The interaction between island geomorphology and environmental parameters drives Adélie penguin breeding phenology on neighboring islands near Palmer Station, Antarctica

Abstract Despite many studies on Adélie penguin breeding phenology, understanding the drivers of clutch initiation dates (CIDs, egg 1 lay date) is limited or lacks consensus. Here, we investigated Adélie penguin CIDs over 25 years (1991–2016) on two neighboring islands, Torgersen and Humble (<1 km apart), in a rapidly warming region near Palmer Station, Antarctica. We found that sea ice was the primary large‐scale driver of CIDs and precipitation was a secondary small‐scale driver that fine‐tunes CID to island‐specific nesting habitat geomorphology. In general, CIDs were earlier (later) when the spring sea ice retreat was earlier (later) and when the preceding annual ice season was shorter (longer). Island‐specific effects related to precipitation and island geomorphology caused greater snow accumulation and delayed CIDs by ~2 days on Torgersen compared to Humble Island. When CIDs on the islands were similar, conditions were mild with less snow across breeding sites. At Torgersen Island, the negative relationship between CID and breeding success highlights detrimental effects of delayed breeding and/or snow on penguin fitness. Past phenological studies reported a relationship between air temperature and CID, assumed to be related to precipitation, but we found air temperature was more highly correlated to sea ice, revealing a misinterpretation of temperature effects. Finally, contrasting trends in CIDs based on temporal shifts in regional sea ice patterns revealed trends toward earlier CIDs (4–6 day advance) from 1979 to 2009 as the annual ice season shortened, and later CIDs (7–10 day delay) from 2010 to 2016 as the annual ice season lengthened. Adélie penguins tracked environmental conditions with flexible breeding phenology, but their life history remains vulnerable to subpolar weather conditions that can delay CIDs and decrease breeding success, especially on landscapes where geomorphology facilitates snow accumulation.


| INTRODUC TI ON
Breeding phenology in seabirds may be driven by internal and external cues, both biotic and abiotic, which theoretically aid in aligning food demand for chicks with peaks in food availability (Dawson, 2007;Perrins, 1970;Visser & Both, 2005). The ability of a species to synchronize their breeding chronology to optimal local conditions is thus considered crucial to population and breeding success, especially in polar regions where strong variability in seasonality, both natural and climate-driven, can lead to short periods of favorable or unfavorable conditions, the latter possibly amplifying the potential for ecological mismatch (Visser & Both, 2005, Cushing, 1974, Schwartz, 2003Thackeray et al., 2016). The recent and rapid warming at the poles in particular has inspired a number of studies to not only investigate the drivers of phenological trends on both global and regional scales, but also to understand the processes that affect the extent to which a species may be vulnerable to climate change (Black et al., 2018;Dunn & Winkler, 2010;Keogan et al., 2018;Parmesan & Yohe, 2003;Youngflesh et al., 2017). While many studies suggest the climate signal controlling spring breeding phenology is well understood (Walther et al., 2002;Wormworth & Sekercioglu, 2011), other studies report difficulties in teasing out the degree to which environmental factors drive phenology (Youngflesh et al., 2018), indicating further research is still necessary (e.g., identifying the correct time window or obtaining climate data at the appropriate scale).
Populations of Adélie penguins (Pygoscelis adeliae) along the Western Antarctic Peninsula (WAP) inhabit one of the most rapidly warming regions on Earth (Turner, Maksym, Phillips, Marshall, & Meredith, 2013;Vaughan et al., 2003), thus offering the opportunity to investigate the drivers of Adélie penguin breeding phenology in relation to a strongly trending climate change signature.
The Adélie penguin is an ice-dependent, migratory seabird with a circum-Antarctic breeding distribution and a short temporal window in which to breed after returning to natal breeding colonies to rear chicks in the austral spring following migration from distant, generally ice-covered wintering habitats (Fraser & Trivelpiece, 1996;Fraser, Trivelpiece, Ainley, & Trivelpiece, 1992). Being long-distance migrants, Adélie penguins have little or no local information on breeding conditions at their natal colonies during winter, but it is likely that conditions at overwinter sites influence the timing of arrival at their breeding colonies by affecting the timing of migration (Ainley, 2002). Along the WAP, air temperatures have warmed, sea ice has declined and precipitation has increased (Obryk et al., 2016;Stammerjohn, Massom, Rind, & Martinson, 2012;Thomas et al., 2017), factors long-known to negatively impact the demography and sustainability of Adélie penguin populations (Cimino, Fraser, Irwin, & Oliver, 2013;Cimino, Moline, Fraser, Patterson-Fraser, & Oliver, 2016;Fraser & Patterson, 1997;Fraser, Patterson-Fraser, Ribic, Schofield, & Ducklow, 2013;Fraser et al., 1992;Patterson, Easter-Pilcher, & Fraser, 2003). In concert with these trends, evidence is accumulating to suggest that some spring-associated events such as snowfall are changing in timing, magnitude, and in its persistence on the landscape (Kirchgäßner, 2011). These factors are episodic, but often produce devastating effects on breeding populations of Adélie penguins (Fountain et al., 2016;Massom et al., 2006).
Motivated by an interest in understanding how environmental factors influence phenology, we used 25 years  of Adélie penguin breeding phenology data to investigate and reveal the environmental drivers of clutch initiation date (CID), and the subsequent consequences to breeding success near Palmer Station, Anvers Island, WAP. We investigated the drivers of CID on two neighboring islands that are <1 km apart but differ significantly in landscape attributes that may affect breeding habitat quality. We hypothesized that ambiguity in previous findings could be related to the influence of island-specific geomorphology on nest site microclimate   Figure 1), which thus far has not been considered. In contrast to previous studies, our analyses also capitalized on local measurements of environmental parameters obtained as part of the Palmer Long-term Ecological Research (LTER) program. This is noted because previous studies have typically used distantly recorded environmental conditions as proxies for local conditions;

K E Y W O R D S
Adélie penguin, breeding phenology, climate change, ecological threshold, environmental drivers, sea ice hence, we wanted to remove any ambiguities resulting from a possible scaling mismatch. Though our study location is only in one region of Antarctica, we aimed to provide useful insights for interpreting phenological dates as a proxy for species demographics or environmental forcing by considering both climate and landscape effects.
Given the large climate signal on the WAP, we also reevaluated the absence of ecologically relevant phenological trends reported in past studies. We hypothesized that Adélie penguins track environmental conditions due to some flexibility in their breeding phenology, but remain vulnerable to subpolar weather conditions that can simultaneously delay breeding and influence breeding success.

| Study site and penguin demographics
We studied Adélie penguin breeding phenology on Torgersen (TOR, 64°46′S, 64°5′W) and Humble (HUM, 64°46′S, 64°03′W) islands near Palmer Station, Antarctica ( Figure 1). The TOR study colony is located on the north side of the island and essentially lies on a lower elevation gently sloped landscape, while the HUM study colony is located on the southeast side of the island on a slope at a slightly higher elevation than the TOR colony ( Figure 1). Thus, HUM receives more wind scour even though both islands have equal exposure to the predominant northeasterly winds (Fraser & Patterson, 1997;Patterson et al., 2003).
Adélie penguins return annually to these rocky islands to begin nesting in mid-October, with egg laying beginning in mid-November.
Adélie penguins typically lay two eggs (Ainley, 2002). In the vicinity of Palmer Station, the Adélie penguin population has declined by ~90% since the 1970s, with the subpopulation on TOR declining more steeply (8,200 breeding pairs; 85% decline) than on HUM (2,485-575 breeding pairs; 77% decline) during our 1991-2016 study period. Nest sites were monitored by noting the date of egg lay/loss, egg hatch/loss, and chick crèche each year. The crèche stage occurs when chicks (approx. 20 days of age) form groups independent of the nest (Ainley, 2002). The number of nests monitored annually ranged from ~50 to 200 with ~20 nests monitored toward the end of the time series due to population declines. The nest sites in the study were checked for the presence of adults, eggs and chicks every day (but see caveats below). Dated observations were recorded on a custom Julian calendar in which October 1 represented day 1, and the year represents the austral summer field season (e.g., 1991includes October 1991to March 1992. Some of the HUM data analyzed here were used in past phenological studies (Lynch et al., 2009(Lynch et al., , 2012Youngflesh et al., 2017). All protocols were carried out in accordance with the approved guidelines of the Columbia University Institutional Animal Care and Use Committee (Assurance #AAAH8959).
Weather and sea ice conditions often prevented small boats from accessing the study sites. For this reason, lay and hatch dates were divided into two categories, true and estimated. For true dates, weather and ice did not impede daily checks, hence the exact timing of all nesting events were confirmed. However, if weather and ice induced gaps in daily checks, then lay and hatch dates at the next possible nest check were recorded as estimated because the exact dates were unknown. For all records with true dates, we calculated the mean and standard deviation in the lay date of egg 1 and egg 2, the days between these lay dates, the hatch date of chick 1 and chick 2, the days between these hatch dates, and the days between lay and hatch dates (incubation period). We determined whether there were significant differences between phenology at HUM and TOR F I G U R E 1 Palmer Station is located along the Western Antarctic Peninsula. (a) The study sites were located on Humble and Torgersen Island, near Palmer Station. (b) Torgersen Island often has more snow on the ground at the studied Adélie penguin colony than (c) Humble Island (photo credit Megan Roberts). Photos were taken on 16 November 2018 using a linear mixed model (LMM) fit by maximum likelihood using lmer in the lme4 and lmerTest packages (Bates, Maechler, & Bolker, 2011) in the R statistical program (R Core Team, 2017) with the phenological parameter as the response variable, site as a fixed effect and year as a random effect (see Table 1). We also determined mean breeding success each year at each island by calculating the average of the number of chicks that crèched per breeding pair to compare with annual mean CIDs, or the day the first egg was laid.
Because Adélie penguins have an incubation period of approximately 30-40 days (Ainley, 2002), there is a strong linear relationship between lay and hatch dates, which we used to adjust the estimated or unrecorded lay dates when weather or sea ice prevented colony access. We used all true lay and hatch dates to estimate the relationship (lay date ~ hatch date) using linear regression (R 2 = .88, Figure S1). Combining lay and hatch dates from both islands was appropriate because we found no significant difference between the incubation period at the two islands across years (Table 1, LMM t-statistic = 0.82, p = .41). For this linear regression between lay and hatch dates, 91% of the predicted lay dates were within ±2.5 days of the true lay dates, 83% were within ±2 days and 70% were within ±1.5 days. An error range of ~2 days may seem considerable but it should be noted that a 1.5-day interval was within our measurement resolution (e.g., if the colony was visited on Day 1 in the morning and on Day 2 in the evening). Therefore, we used this linear regression model to adjust estimated lay dates and dates where the true/ estimated identifier was not recorded. The latter was especially true for dates prior to 1995, or before protocols were refined to accommodate the environmental conditions that might affect island access and thus how date-specific data were interpreted. Thus, prior to 1995, we used the hatch dates to check the lay dates, and if the predicted lay date was within ±2.5 days of the recorded lay date, the record was retained (95% of the data) and all others were removed from further analysis. Similarly, after 1995, if both lay and hatch dates were estimated, we only retained records if the predicted and recorded lay dates were within ±2.5 days (57% of the data).
We computed mean CIDs for each island for each year. We removed 2013 (for HUM and TOR) and 2005 (for HUM) from further analysis due to low sample sizes resulting from unusually poor weather or sea ice conditions. For both years, <35% of the data had a true hatch or lay date, and the number of trues was <5. For all other years, we compared adjusted and unadjusted mean CIDs by island using Pearson's correlation to determine the strength of the correlation. We tested whether the CIDs differed between sites and years using a LMM as described above with CID as the response variable, site as fixed effect and year as a random effect. Further, to identify which years had significantly different CIDs between sites, we used a separate Welch two sample t test with the Benjamini-Hochberg false discovery rate used as a correction for multiple testing (Benjamini & Hochberg, 1995;Pike, 2011) (deemed significant at p < .05) for each year to test for differences between island means using all CID records within that year. Additionally, for each island, we calculated the anomaly in the mean CIDs by subtracting the time series CID mean from each season's mean CID.
Adélie penguin chick fledging mass was measured at HUM as chicks gathered on beaches (for methods and drivers see Chapman et al., 2011;Cimino et al., 2014;Salihoglu et al., 2001). Because Cimino et al. (2014) did not include phenology as a candidate driver of chick mass, we used linear regression to determine whether there was a significant linear relationship between annual mean chick mass and CID on HUM. We also used linear regression to determine Note: Incubation length is number of days between egg lay to egg hatch, lay interval is the number of days between egg 1 and egg 2 lay day, and hatch interval is the number of days between chick 1 and chick 2 hatch day. Linear mixed model (LMM) results show the phenology delays at TOR compared to HUM with year as a random effect (gray shading signifies significance at p < .05). The sample size (n) is in parentheses.
TA B L E 1 Mean and standard deviation of breeding phenology dates (days since September 30) at Humble (HUM) and Torgersen (TOR) Island from 1995 to 2016 when true dates were recorded whether there was a relationship between annual mean CID and breeding success on HUM and TOR.

| Environmental predictors of clutch initiation dates
We aimed to understand if CID was related to environmental drivers. Adélie penguins generally return to their colonies in October for courtship and nest building prior to laying eggs in November.
Therefore, conditions in the late austral winter and early spring may influence CIDs. Other studies have shown that CID is related to October air temperature, hypothesized to be indicative of snow on the ground (Hinke et al., 2012;Lynch et al., 2009Lynch et al., , 2012. It was thought that warm October temperature facilitates snowmelt, making for suitable nesting areas where penguins can build nests. Therefore, we used mean October air temperature and November snow depth as predictors of CIDs, both measured locally at Palmer Station and obtained from the Palmer LTER data archives (http:// pal.ltern et.edu/data/). We also calculated the number of days with precipitation (rain or snow >0 cm) in October to compare with observations at nearby Faraday/Vernadsky Station (discussed below).
We obtained sea ice parameters (sea ice retreat day, concentration and area in October and November), because these may influence access to the colony, access to open water from the colony, be related to wintering locations, parental condition, and/or migration distance/timing of arrival. Sea ice retreat day within 200 km of Anvers Island (Stammerjohn, Martinson, Smith, & Iannuzzi, 2008) was defined as the day in which sea ice concentration fell below 15%. Monthly sea ice area was defined as the area covered by sea ice with concentrations >15% within the Palmer LTER study grid.
Monthly sea ice concentration within 200 km of Anvers Island was also used. Sea ice records used here from 1979 to 2016 are available on the Palmer LTER data archive.
Climate indices may be related to CID through teleconnections due to both sea ice and weather being linked to the El Niño Southern Oscillation (ENSO) (Renwick, 2002;Stammerjohn, Martinson, Smith, & Yuan, 2008;Steinberg et al., 2015). We obtained El Niño 3.4 (Nino3.4) information from the NOAA National Weather Service,
When the breeding/phenology monitoring sites on HUM and TOR were established in late October or early November from 1998 to 2016, snow depth was usually measured at the colony edge near the groups of monitored nests. This measurement was taken before the mean CID and was an indication of snow depth at the sites before clutch initiation. We used these data to investigate the difference in snow depth between the two locations, which was likely related to the influence of island-specific geomorphology on snow accumulation ( Figure 1; Fraser et al., 2013). We calculated the mean and standard deviation in snow depth by island and year and then calculated the difference between the mean snow depth on HUM and TOR as an indication of island-specific effects. A LMM with snow depth as the response, site as a fixed effect and year as a random effect was used to test whether snow depth differed between sites and years.
To match and extend our Palmer Station precipitation record, we computed the number of precipitation days (solid precipitation in the form of snowflakes or grains) in October from 1979 to 2015.
The amount of snow on the ground in November is related to how much and often it snowed in October (similar to Kirchgäßner, 2011).
Our analyses suggested that rain/snow at Palmer Station equated to snow at Faraday/Vernadsky, likely due to Faraday/Vernadsky being >50 km south of Palmer Station.

| Statistical analyses
We hypothesized that in years without significant differences in mean CIDs (no stars in Figure 2a) between the two islands, the environmental conditions would be more benign (e.g., intermediate sea ice/air temperature and low snow) and more similar to each other than during years when CID differed between islands. Thus, we grouped environmental parameters into two categories by year, (a) significant difference in CID or, (b) nonsignificant difference in CID. We did a t test between the environmental conditions in significant versus nonsignificant years to determine whether there was an environmental difference. We tested all environmental variables measured at Palmer Station described above. Similarly, we tested for differences in breeding success and chick mass during significant versus nonsignificant years using a t test. We also calculated the difference in days between mean CID at HUM and TOR each year and used linear regression to determine whether this difference was related to the environment (Difference in CID ~ environmental variable). A separate linear regression was run for each environmental variable.
Next, we tested whether interannual variability in mean CIDs co-varied with environmental variables from 1991 to 2016. We used generalized additive models (GAMs) using the "mgcv" package in R. The smoothness parameter (s) was estimated using generalized cross-validation (Wood, 2006). GAMs are flexible and capable of fitting complex nonlinear relationships. Prior to GAM fitting, we investigated the relationship between predictor variables by calculating the Pearson correlation coefficients ( Figure   S2) because collinear variables could lead to model overfitting and did not include any variables in the same model when the correlation was above .55. For model selection, the Akaike information criterion for small sample size (AIC c ) identified the models that accounted for the most variation, selected the model with the best balance between bias and precision, and avoided over fitting (Burnham & Anderson, 2002). We considered models with a ΔAIC c < 2 to have substantial support and models with ΔAIC c > 10 to have no support. We also reported the percentage of deviance explained, the Adjusted R 2 and Akaike weight as indicators of model performance. We ran three suites of models using, (a) mean annual CID at TOR, (b) mean annual CID at HUM, and (c) an average of mean CIDs at HUM and TOR as response variables, with a combination of predictor variables. We report all models with ΔAIC c < 10. For each GAM, the effect of each environmental predictor was plotted to visually inspect the functional form to confirm the same relationship was seen within and across the three model suites. This general approach has been used in many studies to look at the environmental drivers of seabird demographics (e.g., Bond et al., 2011;Cimino et al., 2014;Dehnhard, Ludynia, Poisbleau, Demongin, & Quillfeldt, 2013;Santora, Veit, Reiss, Schroeder, & Mangel, 2017).
As studies often report no trends in seabird phenology over short time series (Keogan et al., 2018), we used Faraday/Vernadsky precipitation and sea ice records to extend our time series to 1979 to estimate the possible response in CID. We refitted our two best performing GAMs at TOR by replacing November snow depth with the number of precipitation days at Faraday/Vernadsky from 1991 to 2015 (years with available CID and Faraday/Vernadsky data).
We then applied this model onto the full time series of Faraday/ Vernadsky precipitation days and sea ice data from 1979 to 2015 to estimate CID. We tested for a trend in the estimated Adélie penguin mean CID over time. It has been recently recognized that sea ice trends began leveling off or reversing along the WAP since about 2008-2009 when a local minimum in annual ice season duration was observed, after which there was an increase in the length of the annual ice season (Henley et al., 2019;Schofield et al., 2017), at least until ~2016. Therefore, using linear regression, we tested for trends in observed CIDs and predicted CIDs over time by assessing the time series before and after this local minimum. For the GAM that included sea ice retreat day, we omitted 1989 from the linear regression analysis because 1989 was an extreme outlier and predicted an earlier CID by nearly a month. Such an extreme shift in CID is unlikely, especially given other factors like photoperiod or hormones acting as other cues (Dunn & Winkler, 2010).

| Reproductive phenology
On average, egg 1 was laid 1.8 days later on TOR than HUM, or November 12 (day 43) and November 15 (day 46) for HUM and TOR (Table 1). Egg 2 was laid approximately 3 days later on both islands, on November 16 and 18 for HUM and TOR, which was 1.9 days later on TOR (Table 1). Mean hatch day for chick 1 was ~36 days after egg 1 was laid, on December 19 and 20 for HUM and TOR, which was 1.5 days later on TOR. Mean hatch day for chick 2 was approximately one day after chick 1 hatched, following an incubation period of ~33 days on both islands. On average, chick 2 hatched on December 19 and 21 for HUM and TOR, which was 1.6 days later for TOR (Table 1). There were no significant or biologically meaningful differences between HUM and TOR in the number of days between egg 1 and 2 lay date, days between egg 1 and 2 hatch date, or incubation length for egg 1 and 2.
In general, the annual mean adjusted and unadjusted CIDs at HUM and TOR were similar (Pearson's R, HUM R = .88, p = 1.5e-08; TOR R = .96, p = 7.1e-14) and had the same variability over time ( Figure 2a). The adjusted mean CIDs were often statistically different between islands and years (t test, p < .05; in Figure 2a the stars indicate significance by year). A LMM showed that across all years TOR penguins bred later than HUM penguins by an average of 1.99 ± 0.32 days (t-statistic = 6.13, p = 2.3e-6). The overall mean lay date for all adjusted mean CIDs was November 14 and 16 for HUM and TOR. Anomalies across years in mean CIDs ranged from approximately ±5.6 and ±7.6 days for HUM and TOR (Figure 2b). Because mean CIDs were related to snow depth, we further investigated island-specific conditions by comparing snow depth measured when nest monitoring sites were established from 1998 to 2016.
Mean snow depth measurements at TOR and HUM were correlated (R = .78, p = .0006), but there were years of noticeably more snow on TOR (Figure 4). In many years, mean snow depth at each island was <5 cm at the time of the measurement and in those years, the difference in snow depth between islands was small. However, there were also years with mean snow depth > ~10 cm and in those years, the difference in mean snow depth between islands was often greater ( Figure 4). LMM results revealed snow depth was significantly deeper by 6.32 ± 1.09 cm on TOR than HUM (t-statistic = 5.79, p = 1.7e-08).
In the five years when the mean CID was not significantly different between islands (arrows in Figure 4), these were the lowest absolute differences in snow depth between islands (<0. 75 cm in 1998, 2006, 2008, 2010) and/or were the only cases of HUM having slightly more snow (0.11 and 2.12 cm more in 2010 and 2011).
GAMs were used to relate HUM, TOR and an average of HUM and TOR CIDs to environmental parameters. Our best models (ΔAIC c < 2) performed well and explained most of the variance for HUM (R 2 = ~.6), TOR (R 2 = ~.77) and the average of HUM and TOR F I G U R E 3 Environmental parameters grouped by years that had significant and nonsignificant differences in clutch initiation date (CID) between Humble and Torgersen Island (Figure 2a) Table 2). The best models used only two predictor variables. The models with the highest Akaike weight included snow depth and air temperature for HUM (58%), snow depth and ice retreat day for TOR (49%), and snow depth and ice retreat day for the average of HUM and TOR (43%). The top three models (models 1, 2, and 7) performed better than all other models tested (ΔAIC c < 2, Table 2). The parameters tested were also somewhat related to each other ( Figure S2). For example, Nino3.4 and MEI were related to air temperature (R = .63, .66) and sea ice conditions (R = .6-.7); air temperature was generally negatively related to sea ice variables (R = .5-.8). In contrast, snow depth was not strongly related to any environmental variables (R < .53, Figure S2). The best performing models were in good agreement with the data, being on average within ±1.4 days of observed CIDs across all years ( Figure 5). There were four unique terms in models 1, 2, and 7, including November snow depth (included in all models), ice retreat day, October temperature, and November ice area (Table 1) ice retreat day and ice area had a slight nonlinear relationship with CIDs ( Figure 6). Overall, CIDs were later when snow depth was deeper, ice retreat day was later, air temperature was cooler and ice area was greater ( Figure 6). The fitted relationships were similar when visualizing predictor variables for only HUM or TOR CID models.

| Clutch initiation date versus breeding success
At both TOR and HUM, a negative linear relationship was revealed between CID and breeding success, but it was only statistically significant at TOR (p < .05, Figure 7a). During the five years, when CID did not differ between HUM and TOR (Figure 2a), mean breeding

| Trends in clutch initiation date
For TOR, we tested for a trend in CID over time by extending the precipitation time series to 1979 (Figure 3b). Models A and B included the number of precipitation days and November ice area, and precipitation days and ice retreat day, respectively (Table 3, Figure 8).
As with previous models (Table 2), models A and B performed well and explained a high proportion of the variance (Table 3). CIDs were later when there were more precipitation days, ice retreated later, and ice area was higher (as in Figure 6). We ran linear regressions to test for a trend in expected CID over time using

| D ISCUSS I ON
We believe this is the first study to report differences in phenology and breeding success between colonies on closely neighboring islands, highlighting the importance of small-scale island-specific landscape attributes in a region where storm and precipitation events are becoming more frequent. In the following discussion sections, we consider the relative roles of sea ice and precipitation as primary, large-scale (Section 4.1) and secondary, smaller-scale drivers (Section 4.2), respectively, including the role of the latter especially as a source of delayed breeding on TOR due to interactions with a landscape that is more prone to snow accumulation. In this context, we discuss why air temperature can be indicative of CID but not a main driver (Section 4.3) and explore why environmental triggers are critical to Adélie penguins (Section 4.4). In the last two sections, we discuss the negative association between delayed CIDs and breeding success (Section 4.5) and the ecological/evolutionary relevance of trends in Adélie penguin phenology (Section 4.6).

| Sea ice as the primary driver of clutch initiation date
Sea ice is a key driver of phenology across many trophic levels (Barbraud & Weimerskirch, 2006;Cherry, Derocher, Thiemann, & Lunn, 2013;Ramírez et al., 2017;Saba et al., 2014). We found that CID models that included sea ice and precipitation described most of the variation (~60%-80%) in CIDs, which accords with the early observation that annual sea ice variability is a major driver of Adélie penguin life history patterns (cf. Fraser et al., 1992). As with other seabirds, the initial commencement of the breeding season in Adélie penguins is likely hormonal, induced by a lengthening photoperiod in spring that leads to changes in behavior including migration to natal nesting colonies (Ainley, 2002;Murton & Westwood, 1977;Van Tienhoven, 1961). However, photoperiod alone cannot explain the interannual variability in CIDs we observed because it does not change from year-to-year in comparison with environmental variables such as sea ice, which can have both direct and indirect effects on breeding chronology. During some years, for example, sea ice can physically impede penguins from completing their migration back to natal colonies (e.g., Ainley, 2002;Ainley, LeResche, & Sladen, 1983), thus forcing a delay until the ice cover diminishes to some optimal state (the habitat optimum hypothesis, Fraser & Trivelpiece, 1996). Austral winter/spring sea ice conditions may also influence parental body condition, and in turn affect CIDs via feedback mechanisms that delay breeding in response to food web variability. The presence or absence of sea ice influences the magnitude of phytoplankton blooms and krill abundance, which may impact prey availability at the earliest stages of the breeding cycle (Atkinson et al., 2008;Saba et al., 2014;Steinberg et al., 2015).
Considering such a life history from an evolutionary perspective, it is not surprising that our results suggested that sea ice dynamics were a primary driver of CID variability, even though, admittedly, challenges remain insofar as identifying the exact mechanisms responsible for this association.

| Precipitation as a secondary driver of clutch initiation date
Precipitation was the other significant variable determining CIDs.
However, because Adélie penguins have no local knowledge about snow conditions at their breeding colony when they initiate their return migration, and snow conditions can be both unpredictable and highly variable annually, it is equally unsurprising that precipitation plays a secondary role as a determinant of CID variability. While snow caused delayed breeding on both islands, the effect of snow was magnified on TOR likely due to the influence of island geomorphology on snow deposition (Fraser & Patterson, 1997;Patterson et al., 2003). This island-specific snow accumulation mechanism that caused differences in CIDs on the two islands is supported by longterm studies where island geomorphology and, thus, snow deposition and accumulation, results in differing availabilities of suboptimal habitat on each island. The island in the Palmer Station study region with the greatest amount of suboptimal habitat is Litchfield Island,  1979/91−2009 2010−2015/16 where the Adélie penguin population went extinct in 2007  after a prior occupation of ~500 years (Emslie, Fraser, Smith, & Walker, 1998), emphasizing the Adélie penguin life history is no longer suited to the emerging warm and moist WAP ecosystem.
There also appeared to be a threshold effect between snow depth and CIDs, illustrated by November snow depths >20 cm corresponding to delayed breeding on TOR (Figure 3). In many of those years, the difference in snow depth between the two islands was ~10-25 cm and on average, TOR snow depth was ~6 cm deeper than HUM, which further supports a threshold effect. Biologically, it makes sense that there is a threshold snow depth that delays CIDs, as from an evolutionary perspective birds must respond to thresholds that may affect fitness.
We acknowledge that snow depth in the models was measured at Palmer Station; therefore, the differences between observed and modeled CID as well as model performance at the two islands could be due to the islands having different landscapes than Palmer Station, and Palmer Station snow depth not being fully representative of island conditions. The islands likely receive greater exposure to wind scour than Palmer Station, but Palmer Station snow data does capture the overall regional snow conditions and anomalies each year.

| Air temperature as an indicator of clutch initiation date
Many phenophases of plants and animals (including penguins) correlate with spring temperatures (Hinke et al., 2012;Lynch et al., 2012;Walther et al., 2002). Air temperature was an informative predictor of CIDs but it was also highly correlated to sea ice conditions rather than precipitation ( Figure S2). Interestingly, October air temperature was not as highly correlated to November snow depth at Palmer Station and cannot explain the variation in snow depth recorded on the two islands. Past studies, including some that included the HUM data analyzed here, used October air temperature as a predictor of CIDs (Hinke et al., 2012;Lynch et al., 2009Lynch et al., , 2012 with the assumption that temperature was an indicator of precipitation, but our results suggested this is not the case at Palmer Station. Air temperature could be indicative of snow characteristics (packed or slushy) that can influence nest site flooding but as is the case for a majority of landscape and population-scale wildlife studies (Boelman et al., 2019), we lack the type of snow data that would be required to test this idea. We thus interpret the association between air temperature and CID as being related to sea ice conditions. For other WAP studies that found air temperature was related to CID, the main mechanism could be through sea ice and not precipitation (i.e., Hinke et al., 2012;Lynch et al., 2009;Lynch et al., 2012).

| The importance of environmental variability in driving clutch initiation dates
Phenological variability in penguins can occur with or without environmental variability. While it has been suggested that high variability in penguin breeding phenology is normal under both stable (e.g., penguin colony at a zoo) and variable environments (e.g., penguin colony in nature, Youngflesh et al., 2018), the tight connection between phenology and the environment under variable conditions in our study emphasizes phenology can be a reliable indicator of environmental forcing. For example, at TOR ~ 80% of the variation in CIDs was explained by the environment, suggesting that 20% of the variation may have been due to other factors such as inaccuracies in weather data collection (e.g., snow was not measured on each island), parental age/experience/condition (Ainley & Schlatter, 1972;LeResche & Sladen, 1970;Moreno, Leon, Fargallo, & Moreno, 1998;Trivelpiece, Butler, Miller, & Peakall, 1984) or colony size/nest location (Barbosa, Moreno, Potti, & Merino, 1997;Fargallo et al., 2006).
In contrast, at HUM where only ~ 60% of the variation in CIDs was explained by the environment, there could be more random individual variation because weather impacts were simply less severe.
Therefore, synchronous breeding with lower variability in phenology is more likely to occur when a factor delays breeding, such as penguins waiting for snow to melt to lay eggs at TOR. In comparison, at HUM (or at a zoo), where conditions are often more predictably suitable and birds can begin breeding as they are ready, higher variability in phenology and less synchrony may occur. Notably, and supporting our rationale, a captive penguin population exhibited less synchrony among individuals each year than a wild population (Youngflesh et al., 2018). Penguin life histories are the product of evolution; hence, their selection must scale to environmental thresholds that optimize reproductive strategies (Fraser & Trivelpiece, 1996). Because Adélie penguin life history is tied overall to short seasonal cycles (Fraser & Trivelpiece, 1996), these environmental thresholds may function as critical drivers of breeding phenology that may result in more synchrony and less variability when the specific parameters associated with these thresholds are exceeded.

| Effect of clutch initiation date on breeding success and the role of match-mismatch dynamics
Penguin chick fledging mass and breeding success can be affected by numerous factors (see Hinke et al. (2012) for an overview), which can represent the investment of parents in their chicks and also be due to factors that are unrelated to CIDs (e.g., Youngflesh et al., 2017). We found no relationship between CID and chick fledging mass, which corroborates a past study at this location (Cimino et al., 2014). At both islands, breeding success was higher for earlier breeders but the relationship was not statistically significant at HUM, suggesting the ramifications of snowfall were less severe due to lower snow accumulations.
Similar to our study, on a circumpolar scale, Adélie penguin breeding success was higher when penguins bred earlier (Youngflesh et al., 2017) making it clear that either late CIDs contribute negatively to breeding success, or conditions that contribute to late CIDs also influence breeding success negatively. For example, snow accumulation that caused late CIDs will eventually melt and can then potentially decrease the survival of eggs or chicks through nest flooding. In this instance, synchronous breeding of a colony cannot outweigh the importance of synchronizing lay dates with optimal environmental conditions (in contrast to the hypotheses of Youngflesh et al., 2017), as nest flooding can lead to near-immediate nest failure.
Earlier breeding could also signify other optimal habitat conditions, such as adequate food availability, but Youngflesh et al. (2017) suggested that while mismatches in Adélie penguins are apparent, it is not the main driver of reproductive dynamics.
It may be important to also consider phenological mismatch in the context of the latitudinal climate gradient along the WAP, which is characterized by a warmer, maritime climate in the north that is nearly sea ice-free most of the year and a colder, continental climate in the south where some perennial sea ice still exists (the transition zone is near Palmer Station, Ducklow et al., 2013). Along this gradient the timing of sea ice retreat can affect the timing and magnitude of phytoplankton blooms differently, as well as the potential for temporal mismatches in trophic interactions. For example, the northern WAP showed less favorable environmental conditions for phytoplankton blooms when the ~1970s were compared to the ~1990s, whereas the southern WAP experienced more favorable conditions  (Garzio & Steinberg, 2013), lower lipid content krill (Ruck, Steinberg, & Canuel, 2014), shifts in macrozooplankton communities (Steinberg et al., 2015), decreased Adélie penguin populations (Lynch & LaRue, 2014) and a microbial (vs. krill) based food web (Sailley et al., 2013). Thus, despite large-scale sea ice decreases in extent and duration along the entire WAP , the north and south regions started with different baselines, thus progressed differently along the Adélie optimal-suboptimal curve (e.g., Fraser & Trivelpiece, 1996;Smith et al., 1999).
While Youngflesh et al. (2017) aimed to detect phenological mismatches between Adélie penguins and bloom onset/ice retreat, they did not include bloom magnitude, which has cascading effects on higher trophic levels. Further, it is unclear how these indices relate to macrozooplankton phenology, a factor that Steinberg et al. (2015) admittedly could not address. Therefore, for future studies investigating trophic mismatches, it is important to first consider the baseline from which change is occurring, because change can either be toward less or more favorable conditions (thus affecting trophic interactions differently). If long-term warming persists along the WAP, then less favorable conditions will continue to shift southward, as will the increased potential for trophic mismatches.
Further, flexible reproductive timing may provide some buffer for reproductive success but it may be limited by both CID and snow.
High breeding success was seen when CID was <day 45 and snow depth was <20 cm. In contrast, the two years (2001 and 2015) with the greatest snow depths had some of the lowest breeding success values and latest CIDs; this pattern was also noted at the Copacabana colony (~400 km north of Palmer Station, Hinke et al., 2012). During years of such environmental stress, life history strategies are in effect being tested by the environment, and the high frequency of nest failures highlights an incompatibility between Adélie penguin life history and precipitation above an optimal threshold.

| Trends in clutch initiation dates placed into an environmental context
The WAP has been warming since at least the 1950s with increasing winter air temperature and precipitation (Kirchgäßner, 2011).
During our study period, there was no trend in precipitation, and after decades of sea ice decline from 1970 to 2009 (most notably the ice season became shorter), sea ice began to rebuild along the WAP after reaching a local minimum in 2008-2009 (i.e., the annual ice season became longer post-2008-2009) (Henley et al., 2019;Schofield et al., 2017). In general, to detect an ecologically relevant trend in phenological datasets it may be necessary to consider and evaluate environmental conditions over the same temporal scales. For future studies, after identifying the correct environmental variables (which is inherently difficult in natural systems), we suggest that phenological data should be matched along the same time scales to long-term climate data to determine whether a trend can be expected and detected. It is also important to recognize the impact of landscape geomorphology at a study site to understand how weather (e.g., wind and precipitation) may differentiate impacts on colonies located in close or distant proximity. This is especially relevant when looking at phenology across latitudes, where a delay in phenology at southerly latitudes could be amplified or reduced by landscape effects (e.g., Lynch et al., 2012). The influence of landscape geomorphology on nest site microclimate may also be an important factor in identifying Adélie penguin refugias under future climate change scenarios. Regardless of the mechanism driving phenological variability, reproductive success is generally higher for species able to adjust their breeding phenology and strategy to climate change (Forcada et al., 2008;Visser & Both, 2005) and for Adélie penguins, their responses to climate change may partly depend on this flexibility (Ballerini, Tavecchia, Pezzo, Jenouvrier, & Olmastroni, 2015). The ability of Adélie penguins to track environmental variability demonstrates phenotypic plasticity in their behavior and simultaneously emphasizes vulnerabilities in their life history to increased precipitation. OPP-0523261, OPP-0741351, OPP-0823101, PLR-1326167. We are grateful for the logistical support provided by Palmer Station staff and the many field team members that assisted in data collection.

CO N FLI C T O F I NTE R E S T
The authors declare no competing interests.

AUTH O R CO NTR I B UTI O N S
WRF designed the data collection methodology. WRF and DPF collected the data. MAC and WRF developed the research questions.
MAC analyzed the data, and MAC wrote the initial manuscript with contributions from WRF. SS analyzed the sea ice data. All authors contributed to data interpretation, edited, and reviewed the manuscript.

DATA AVA I L A B I L I T Y S TAT E M E N T
Data reported in this manuscript is either publically available or will be available online through the Palmer LTER Data System (http://pal. ltern et.edu/data/) within one year of publication.