Morphometric and genetic evidence for four species of gentoo penguin

Abstract Gentoo penguins (Pygoscelis papua) are found across the Southern Ocean with a circumpolar distribution and notable genetic and morphological variation across their geographic range. Whether this geographic variation represents species‐level diversity has yet to be investigated in an integrative taxonomic framework. Here, we show that four distinct populations of gentoo penguins (Iles Kerguelen, Falkland Islands, South Georgia, and South Shetlands/Western Antarctic Peninsula) are genetically and morphologically distinct from one another. We present here a revised taxonomic treatment including formal nomenclatural changes. We suggest the designation of four species of gentoo penguin: P. papua in the Falkland Islands, P. ellsworthi in the South Shetland Islands/Western Antarctic Peninsula, P. taeniata in Iles Kerguelen, and a new gentoo species P. poncetii, described herein, in South Georgia. These findings of cryptic diversity add to many other such findings across the avian tree of life in recent years. Our results further highlight the importance of reassessing species boundaries as methodological advances are made, particularly for taxa of conservation concern. We recommend reassessment by the IUCN of each species, particularly P. taeniata and P. poncetii, which both show evidence of decline.


| INTRODUCTION
A recent investigation into global species diversity of birds proposed that the number of species may be underestimated by as much as a factor of two when unrecognized species are accounted for (Barrowclough et al., 2016). This discrepancy exists in part due to the historical application of the Biological Species Concept (BSC) in ornithology. The BSC defines a species as a "group of actually or potentially interbreeding natural populations, which are reproductively isolated from other such groups" (Mayr, 1942). While generally applicable, the BSC is complicated in ornithology by the ability of birds to hybridize with deeply divergent relatives (Prager & Wilson, 1975).
It is also often impossible to test for reproductive isolation in wildlife taxa that do not have overlapping ranges. As a result, the widespread application of the BSC led to an underestimation of avian species diversity. The Phylogenetic Species Concept (PSC), conceived by Cracraft (1983) and applied in Barrowclough et al. (2016), defines a species as "the smallest diagnosable cluster of individual organisms within which there is a parental pattern of ancestry and descent" or more simply as "a group of organisms that have a unique and shared evolutionary history (i.e., monophyletic)." This definition allows for species delimitation without the need to invoke reproductive isolation. Another factor leading to the recognition of greater avian species diversity is the advancement of species delimitation tools, including genomic sequencing and multivariate morphometrics.
Unrecognized (or hidden) species are distinguishable using physical characters but were not previously recognized as full species due to either limitations in analytical methods or historical interpretations of the species concept. Cryptic species, on the other hand, are taxa that cannot be readily identified using physical characters but can be discerned using genetic and/or ecological evidence (Hosner et al., 2018). Cryptic and hidden diversity in birds has been uncovered across the world in recent years by using the PSC in conjunction with integrative taxonomic approaches combining genomics and morphometrics, particularly in biodiversity hotspots such as the Old World tropics and Neotropics (Hosner et al., 2018; Pulido-Santacruz et al., 2018; Singh et al., 2019; Younger et al., 2018, 2019). To manage conservation priorities in light of ongoing environmental change, it is vital to understand the true number of species that exist and their range limits, rather than relying on historic estimates. Given their large geographic range and already noted genetic and morphological differences (Clucas et al., 2018; Stonehouse, 1970), gentoo penguins could be strong candidates for harboring hidden species-level biodiversity. First described by Forster (1781), the gentoo penguin (Pygoscelis papua) is the largest of the three Pygoscelis species and identifiable by its charismatic red-toned bill, black head, and two contrasting white patches on the face. Gentoos have a circumpolar distribution spanning the Antarctic Convergence between 65°16'S and 46°00'S, ranging from the Antarctic Peninsula to the Crozet Islands (Figure 1) (Forster, 1781; Lynch et al., 2012; Woehler, 1994).
Given this geographic spread and the considerable heterogeneity in environmental conditions among extant populations, it is important to understand not only global trends in gentoo penguin numbers, but also how each of the individual populations is faring in the rapidly changing Antarctic climate. Individual populations may also provide evidence for how gentoo penguins adapt to specific environmental conditions which could be missed when generalizing over the polytypic species.
The global population size of gentoo penguins has increased sixfold over the past 40 years, despite a changing ecological landscape due to climate change (McMahon et al., 2019). Newly established colonies on the southern extent of the gentoo range seem to be growing due to the increasing breeding habitat brought about by receding sea ice (Juáres et al., 2019; Lynch et al., 2012). Established populations, however, show varying patterns of success, with populations at Port Lockroy, Kerguelen Island, and Macquarie Island seeing 1.4%, 2.3%, and 1.8% per annum decreases, respectively, based on multi-decadal studies (Bingham, 1998; Dunn et al., 2016, 2018; Juáres et al., 2019; Lescroël & Bost, 2006; Pascoe et al., 2020).
Peters and Paynter (1934) who designated the populations of Macquarie, Heard, Kerguelen, and Marion Islands as taeniata while gentoos from the Falklands, South Orkney, South Shetland, South Georgia, and the Western Antarctic Peninsula were assigned to papua. The next update to the taxonomy was by Murphy (1947), who designated the subspecies P. p. ellsworthi for the populations on the South Shetland Islands and Western Antarctic Peninsula. Stonehouse (1970) then investigated the subspecies boundaries, focusing on morphological variation. Stonehouse focused on the classic hypothesis that revolved around the influence of the Antarctic Polar Front and the extent of pack ice on geographic variation in gentoos, and thus split P. papua into a northern (P. p. papua) and southern subspecies (P. p. ellsworthi), found north and south of 60°S, respectively, while discounting Mathews' or Peters' claim for an eastern subspecies P. papua taeniata (Mathews, 1927; Murphy, 1947; Peters & Paynter, 1934; Stonehouse, 1970).
The analysis used a univariate approach based on six measures (culmen length, foot length, flipper length, flipper area, dorsal plumage, and ventral plumage) and confirmed a north/south gentoo split in line with the Antarctic Polar Front hypothesis, with the South Georgia Island population belonging to the northern subspecies. Individuals measured from Kerguelen and Macquarie Islands were found to be statistically indistinguishable in this study and differed only slightly from those from South Georgia and the Falkland Islands, and therefore were also included in the nominate northern subspecies P. p. papua (Stonehouse, 1970). A recent study found support for a new clade in the sub-Antarctic Indian Ocean based on morphological analyses, but this clade was not formally assigned to a new subspecies (de Dinechin et al., 2012), while investigations into geographic variation in gentoo vocalizations found no patterns connected with regions or subspecies (Lynch & Lynch, 2017).
Recent genetic analyses from across the penguin family have uncovered significant genetic divergence among populations across the Southern Ocean (Clucas et al., 2018; Frugone et al., 2019; Levy et al., 2016; Pertierra et al., 2020; Vianna et al., 2017). These studies, in combination with documented regional heterogeneity in population responses to climate change, highlight the importance of interrogating traditional ideas of subspecies limits within gentoo penguins (Levy et al., 2016; Vianna et al., 2017). Both Clucas et al. (2018) and Pertierra et al. (2020) suggested that cryptic species of gentoo penguins exist based on genetic methodology. Using an integrative taxonomic framework combining contemporary multivariate morphological analyses with previous genomics results, we aim to test whether the four genetic lineages of gentoo penguins described in Clucas et al. (2018) (Kerguelen, Falklands, South Georgia, and South Shetlands/Western Antarctic Peninsula) are morphologically distinct and therefore warrant recognition as distinct species under the Phylogenetic Species Concept. We then take the next step of formally describing distinct species so they will be included in assessment frameworks such as the IUCN Red List and conservation plans.

| Taxon sampling
Our geographic sampling within gentoo penguins (Figure 1) includes representatives from Kerguelen, the Falkland Islands, South Georgia, South Shetland Islands, and the West Antarctic Peninsula. This sampling spans the two currently recognized subspecies, namely the northern gentoo (the nominate subspecies, Pygoscelis papua papua (Forster, 1781)) distributed north of 60°S; and the southern gentoo (Pygoscelis papua ellsworthi), distributed on the Antarctic Peninsula and maritime Antarctic islands south of 60°S (Clements et al., 2019; Murphy, 1947; Stonehouse, 1970). Additionally, we include the putative Indian Ocean subspecies (de Dinechin et al., 2012), which is still classified as P. p. papua (Clements et al., 2019), and the South Georgia population, also classified as P. p. papua, but which multiple genetic studies show to be more closely related to P. p. ellsworthi (Clucas et al., 2014, 2018; Levy et al., 2016). Previous work has shown that gentoo penguin colonies on the South Shetlands and West Antarctic Peninsula are not reciprocally monophyletic in phylogenetic analyses (Clucas et al., 2018; Vianna et al., 2017); therefore, here, we group these populations into one unit for the purposes of this species delimitation study. There are therefore four putative species to be assessed: South Shetlands and the West Antarctic  (Table S1). Birds with evidence of juvenile plumage were excluded.

| Genetic variation
We used Genodive (Meirmans & Van Tienderen, 2004) to calculate the Weir and Cockerham unbiased weighted F_ST estimator (Weir & Cockerham, 1984) between all pairs of populations, with significance calculated using 10,000 permutations of the data. Expected heterozygosity (H_S) was also calculated for each population using Genodive. Principal components analysis (PCA) was used to visualize genetic variation among all individuals, using the adegenet package (Jombart, 2008; Jombart & Ahmed, 2011) in R. Allele frequencies were scaled and centered, and missing values were replaced with the mean allele frequency using the scaleGen function. PCA was computed with the dudi.pca function from the ade4 v1.7-11 package.
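The preprocessing step described above (mean imputation of missing values, then centering and scaling of allele frequencies) can be illustrated with a minimal sketch. This is not the adegenet scaleGen implementation; the function name `scale_allele_frequencies`, the use of `None` for missing values, and the choice of the sample standard deviation (n - 1 denominator) are assumptions made for illustration only.

```python
import math


def scale_allele_frequencies(matrix):
    """Impute, centre, and scale a table of allele frequencies.

    `matrix` is a list of rows (individuals); each row is a list of allele
    frequencies (columns are alleles). Missing values are given as None.
    Hypothetical helper, not part of adegenet or ade4.
    """
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    scaled = [[0.0] * n_cols for _ in range(n_rows)]
    for j in range(n_cols):
        observed = [row[j] for row in matrix if row[j] is not None]
        mean = sum(observed) / len(observed)
        # Replace missing values with the mean allele frequency of the column.
        filled = [row[j] if row[j] is not None else mean for row in matrix]
        # Assumption: scale by the sample standard deviation (n - 1).
        variance = sum((x - mean) ** 2 for x in filled) / (n_rows - 1)
        sd = math.sqrt(variance)
        for i, x in enumerate(filled):
            # Columns with no variation carry no signal and are set to zero.
            scaled[i][j] = (x - mean) / sd if sd > 0 else 0.0
    return scaled


# Toy example: three individuals, two alleles, one missing value.
example = [[0.0, 1.0], [0.5, None], [1.0, 0.0]]
for row in scale_allele_frequencies(example):
    print(row)
```

An imputed value sits exactly at the column mean, so after centering it contributes nothing to any principal component, which is why mean imputation is a common neutral choice before PCA.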
We previously carried out maximum likelihood (ML) phylogenetic analysis and Bayes factor species delimitation for the gentoo penguin SNP dataset (Clucas et al., 2018). In brief, we used RAxML v8.2.7 (Stamatakis, 2014) to infer an ML phylogeny with a SNP ascertainment bias correction applied to the likelihood calculations. Twenty independent ML tree inferences were carried out using the GTRGAMMA model, and the best-scoring topology was then identified and annotated with bootstrap supports from 1,000 replicates.
Coalescent-based, Bayes factor species delimitation was carried out using the BFD* method (Leaché et al., 2014)

| Morphological variation
To determine whether genetic lineages are morphologically distinct, one of us (JY) took nine linear measurements from each museum study skin, representing key morphological traits of both the crania and postcrania (Baldwin et al., 1931): culmen length (CL; taken along the medial line), bill width at the base (BWB), bill height at gonys angle (BH), bill width at gonys angle (BWG), flipper width (FW; shortest distance from anterior surface of flipper above the radiale to the posterior side of the flipper), radius length (RL), manus length (ML; indent at radiale/radius/ulna to distal wing tip), tarsus length (TML; anterior surface), and middle toe length (MTL; digit III excluding nail). Measurements were taken with Mitutoyo Digital Callipers to an accuracy of 0.01 mm. All measurements were repeated three times, checked for outliers (by confirming that all measurements were within one standard deviation), and then averaged. The summary statistics of these measurements for each of the four clades are given in Table S1. All measures were log-transformed before the analyses. To identify traits that significantly differed between sexes, we carried out an analysis of variance (ANOVA) of sex within lineage for each trait (Table S2). Our testing found that only flipper width had a statistically significant difference between sexes (p = .024).
This trait was therefore excluded from subsequent analyses to remove any potential bias introduced by uneven sampling of sexes.
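The replicate handling described above (three repeated measurements, an outlier check, averaging, and log transformation) can be sketched as follows. The text does not state which standard deviation the outlier check refers to, nor the base of the logarithm, so this sketch takes the tolerance as a caller-supplied value and assumes the natural logarithm; the function name `summarise_replicates` is hypothetical.

```python
import math


def summarise_replicates(replicates, tolerance):
    """Average repeated measurements of one trait on one specimen.

    Returns (log_mean, flagged), where `flagged` is True if any replicate
    lies further than `tolerance` (in mm) from the replicate mean.
    Assumption: `tolerance` stands in for the "one standard deviation"
    criterion, whose exact definition is not given in the text.
    """
    mean = sum(replicates) / len(replicates)
    flagged = any(abs(x - mean) > tolerance for x in replicates)
    # Assumption: natural logarithm; the base is not stated in the text.
    return math.log(mean), flagged


# Toy example: three culmen length readings in mm (illustrative values only).
log_cl, needs_recheck = summarise_replicates([50.0, 50.2, 49.8], tolerance=0.5)
print(log_cl, needs_recheck)
```

Returning a flag rather than silently dropping a replicate keeps the decision to remeasure or exclude a value with the person handling the specimen.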
Both univariate and multivariate analyses were used to investigate morphological differentiation between lineages. We carried out pairwise ANOVAs to determine whether any individual traits differed among lineages, and pairwise multivariate analysis of variance (MANOVA) on the combined trait dataset to assess overall morphological differentiation, using the F statistic for significance testing.
These analyses were performed using base R (R Core Team, 2013).
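For readers unfamiliar with the F statistic used in these tests, the following minimal sketch computes it for a one-way ANOVA on a single trait. It is a plain-Python illustration, not the base R routine used in the study; the function name `one_way_anova_f` and the toy values are assumptions for illustration only.

```python
def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups.

    Each group is a list of (log-transformed) values of one trait for
    one lineage. Hypothetical helper for illustration.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Within-group sum of squares: spread of values around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )
    if ss_within == 0:
        raise ValueError("no within-group variation; F is undefined")
    df_between = k - 1
    df_within = n_total - k
    return (ss_between / df_between) / (ss_within / df_within)


# Toy pairwise comparison of two lineages (illustrative values only).
print(one_way_anova_f([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]]))
```

The p-value is then obtained by comparing F against the F distribution with k - 1 and N - k degrees of freedom, which statistical software does automatically.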
Principal components analysis (PCA) and linear discriminant analysis (LDA) were used as dimension-reduction methods to aid visualization and prediction, with lineage as the grouping variable, using the fviz_pca_biplot function in factoextra and the lda function in MASS in R (Kassambara & Mundt, 2017; R Core Team, 2013; Venables & Ripley, 2002). Confusion matrices and cross-validation tests were constructed and performed using the predict function in the MASS package in R (Venables & Ripley, 2002).
for the four-taxa model compared to the current taxonomy, and of 1,231 over the next most supported model (the three-taxa model) (Table 2). Note that a Bayes factor of 10 is considered decisive (Kass & Raftery, 1995). The currently accepted taxonomy had the lowest marginal likelihood estimate.

| Morphological variation
Our pairwise MANOVA tests determined that all genetically distinct populations are significantly morphologically distinct from each other overall (p < .05; Table 3). Our PC and LD analyses show some overlap in morphospace among the four lineages (Figures 5 and 6).
Our pairwise ANOVA testing of traits showed that several individual traits enable discrimination of all four lineages (

| DISCUSSION
Our integrative taxonomic approach has revealed four deeply diver-
There are currently two recognized subspecies of gentoo penguin: P. papua papua and P. papua ellsworthi, representing the classic north/south split within gentoos (Stonehouse, 1970). Other subspecies of P. papua have been previously proposed, including P. papua taeniata, which has included various combinations of island populations since its inception in 1927 (Mathews, 1927; Peters & Paynter, 1934). The Falkland Islands lineage will retain the name P. papua, given that P. papua was originally described from the Falkland Islands (Forster, 1781 (Pertierra et al., 2020). The Kerguelen lineage was previously described as the subspecies P. p. taeniata (Mathews, 1927; Peters & Paynter, 1934), which fell out of usage in the 1970s (Stonehouse, 1970), following which the Kerguelen gentoos have been classified as P. p. papua. We suggest the Kerguelen lineage now be designated as P. taeniata accordingly. We note that Mathews (1927) and Peters and Paynter (1934)  Georgian gentoos. We therefore describe this for the first time as.

Adult male collected by Robert C. Murphy at South Georgia, South Atlantic Ocean on 11th March 1913. The specimen was prepared as a museum flat skin and was used in the morphological analysis.
Stonehouse (1970) but are now confirmed with modern multivariate methods.
In addition to morphology and genetics, there are notable ecological differences among the lineages. These include breeding habitat, which separates the flat beach and tussock grass nests of South Georgia and the Falkland Islands (Croxall & Prince, 1980; Reilly & Kerle, 1981) from the low-lying gravel beaches and dry moraines of the South Shetlands and West Antarctic Peninsula (Jablonski, 1984; Volkman & Trivelpiece, 1981). Lineages also differ in diet, particularly the proportions of crustaceans, fish, and squid consumed (Ratcliffe & Trathan, 2012). Dietary variability tends to decrease and krill consumption to increase at higher latitudes (Bost & Jouventin, 1990 (Bingham, 1998; Dunn et al., 2016, 2018; Juáres et al., 2019; Lescroël & Bost, 2006).
Gentoo penguins are currently listed as "Least Concern" on the IUCN Red List, with their last assessment in 2018 (BirdLife International, 2018). In order to be listed as Vulnerable, a species must exhibit one or more risk criteria, for example, population size reduction greater than 30%, a limited geographic range, small population size, or evidence of likely extinction in the next 100 years (IUCN, 2015). Here, we have shown that P. papua should be considered as at least four distinct species. While P. papua in the Falkland Islands and P. ellsworthi appear to be stable or increasing (Baylis et al., 2011; Crofts & Stanworth, 2016; Dunn et al., 2016; Juáres et al., 2019), P. taeniata experienced a 30% reduction in numbers between 1987 and 2004 (Lescroël & Bost, 2006). Pygoscelis poncetii may also be declining at South Georgia (Woehler et al., 2001). These two species should therefore be high priority for reassessment by the IUCN.

| CONCLUDING REMARKS
In this paper, we highlight hidden biodiversity within the species P. papua using genetic and morphometric methods, in keeping with recent assessments of hidden species diversity in birds. Our results clearly support the division of gentoo penguins into at least four species. We name a new species of gentoo, P. poncetii, and recommend elevation of three subspecies to species level (P. taeniata, P. papua, and P. ellsworthi). Our results show the importance of reassessing species boundaries as methodological advances are made. These findings have implications for the threat status of these species, and we urge that this diversity be considered in conservation planning for the Southern Ocean.

ACKNOWLEDGMENTS
We thank Bob Zink, an anonymous reviewer, and the editorial team for their detailed and encouraging reviews and comments. This study was funded by the American Ornithological Society, Linnean Society, and American Museum of Natural History Collection Study grants to JY. JT is funded by an Evolution Education Trust Studentship. We gratefully acknowledge the American Museum of Natural History and the Natural History Museum (Tring) for access to their ornithological collections.

CONFLICT OF INTEREST
None declared.

DATA AVAILABILITY STATEMENT
Gentoo morphological data are provided in Supplementary