Molecular insights into the historic demography of bowhead whales: understanding the evolutionary basis of contemporary management practices



Caleb D. Phillips, Department of Biological Sciences, Texas Tech University, Flint and Main, Lubbock, Texas, 79409. Tel: +806-742-2710; Fax: +806-742-2963; E-mail:


Patterns of genetic variation observed within species reflect evolutionary histories that include signatures of past demography. Understanding the demographic component of species' history is fundamental to informed management because changes in effective population size affect response to environmental change and evolvability, the strength of genetic drift, and maintenance of genetic variability. Species experiencing anthropogenic population reductions provide valuable case studies for understanding the genetic response to demographic change because historic changes in the census size are often well documented. A classic example is the bowhead whale, Balaena mysticetus, which experienced dramatic population depletion due to commercial whaling in the late 19th and early 20th centuries. Consequently, we analyzed a large multi-marker dataset of bowhead whales using a variety of analytical methods, including extended Bayesian skyline analysis and approximate Bayesian computation, to characterize genetic signatures of both ancient and contemporary demographic histories. No genetic signature of recent population depletion was recovered through any analysis incorporating realistic mutation assumptions, probably due to the combined influences of long generation time, short bottleneck duration, and the magnitude of population depletion. In contrast, a robust signal of population expansion was detected around 70,000 years ago, followed by a population decline around 15,000 years ago. The timing of these events coincides with a historic glacial period and the onset of warming at the end of the last glacial maximum, respectively. By implication, climate-driven long-term variation in Arctic Ocean productivity, rather than recent anthropogenic disturbance, appears to have been the primary driver of historic bowhead whale demography.


Inference on the demographic history of species is an important component of understanding how evolutionary processes shape contemporary patterns of genetic variation (Waples 2005). The measurable genetic signatures of demographic change are a manifestation of a combination of effects relating to the timing, extremity, and complexity of demographic history. Demographic events must be of sufficient magnitude to produce a statistically detectable genetic effect, and event procession in a dynamic demographic history can erode signal for ancestral events (Johnson et al. 2007; Listman et al. 2007; Heled and Drummond 2008). These parameters weigh heavily on historic demographic inference, yet additional important considerations include the resolving capabilities of different types of genetic markers used for inference (which are in turn dictated by mutation rates, ploidy, mode of inheritance, variability, and the number of loci) and the analytical methods available to interrogate the data. For example, Hoffman et al. (2011) recently demonstrated differences in the utility of mitochondrial sequence and microsatellite data for reconstructing the demographic history of Antarctic fur seals (Arctocephalus gazella) under an approximate Bayesian computation framework. Apart from the above considerations, the frequent absence of strong a priori hypotheses about species' demographic history can occlude model development and result validation. For this reason, species for which strong a priori demographic information is available can provide useful case studies for identifying factors that underlie demographic change over multiple timescales.

Several analytical approaches have been developed to investigate signal for historic demographic change from genetic data. Some of these, developed for sequence data, include commonly applied statistics such as Tajima's D (Tajima 1989), Fu's FS (Fu 1997), and the raggedness index (Harpending 1994). These methods rely on the comparison of average pairwise distances and segregating sites (as θ estimators), allelic diversity, and the frequency distribution of pairwise distances, respectively. The first two of these methods are classically defined as neutrality tests, although demographic changes are also readily detected, and all three can be influenced by factors other than demography such as selection and migration (Tajima 1989; Harpending 1994; Fu 1997). Detecting a statistically significant signal with Tajima's D, Fu's FS, or the raggedness index requires that enough generations have passed for new mutations or genetic drift to produce a measurable effect. The timing of the events they signal cannot be determined using the first two methods and so must be postulated from external data. For population expansions detected through the raggedness index (but not population bottlenecks), timing of the event can be inferred by making assumptions about μ and generation time. An alternative approach, also developed for sequence data, is Bayesian skyline analysis (Drummond and Rambaut 2007), which reconstructs demographic changes over time using the coalescent, working directly from a sequence alignment and the estimated phylogeny. As genetic signal for demography at any given time is related to the genetic variability at that phylogenetic depth, this approach uniquely provides an estimated continuous demographic reconstruction over time without the need for a priori demographic assumptions.
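As an illustration of how Tajima's D contrasts the two θ estimators mentioned above, the following minimal Python sketch computes D from a set of aligned sequences using the standard formulas of Tajima (1989). The function name and the toy sequences are hypothetical and are not part of the analyses reported here; gaps and missing data are assumed to be absent.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for aligned, equal-length sequences without gaps."""
    n = len(seqs)
    # S: number of segregating (variable) sites in the alignment
    s = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if n < 4 or s == 0:
        raise ValueError("need at least 4 sequences and 1 segregating site")

    # pi: mean number of pairwise differences
    n_pairs = n * (n - 1) / 2
    pi = sum(
        sum(a != b for a, b in zip(s1, s2))
        for s1, s2 in combinations(seqs, 2)
    ) / n_pairs

    # Constants from Tajima (1989)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    # D compares pi with Watterson's estimator S / a1
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical toy alignment in which every variant is a singleton,
# the pattern expected after a population expansion (negative D).
toy_expansion = ["AAAAA", "TAAAA", "ATAAA", "AATAA", "AAATA"]
d_expansion = tajimas_d(toy_expansion)
```

An excess of rare variants, as in the toy alignment, pulls the mean pairwise difference below S/a1 and yields a negative D, which is the direction of departure associated with expansion or purifying selection.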

Other tests capable of detecting demographic change, explicitly defined as “bottleneck analyses”, exploit expected changes in microsatellite distributions resulting from genetic bottlenecks, including changes in allele frequency, heterozygosity, or allele size distributions. Probably the best known and most widely applied of these is the heterozygosity excess test, developed by Luikart et al. (1998), which detects signal for the expected transient excess of heterozygosity that can arise during genetic bottlenecks. The heterozygosity excess test relies on a transient phenomenon (expected to last 4 × Ne generations, where Ne is the bottleneck population size) that dissipates as genetic drift and new mutations re-establish mutation-drift equilibrium (Luikart et al. 1998 and references therein). A very different approach that can be applied to understanding demographic change using microsatellite and/or sequence data is approximate Bayesian computation (ABC, Beaumont et al. 2002). This method employs coalescent-based simulations of alternative demographic scenarios, which are compared to the observed data through summary statistics. Simulated data producing summary statistics most similar to the observed are retained to estimate the posterior distributions of demographic parameters, such as the timing of events, associated effective population sizes (Ne), and marker mutation rates (μ). ABC provides the flexibility to explore genetic signals over time frames specified a priori, but poor fit to the model can indicate the influence of an unaccounted history. An important caveat of this method is the reliance, to a certain extent, on a priori information that is used to parameterize simulations. This reliance has been considered a limiter of new discovery (Templeton 2009), yet ABC has also been considered valuable for developing strong demographic inference when thoroughly implemented (Nielsen and Beaumont 2009; Csilléry et al. 2010). ABC has proven powerful for understanding bottleneck effects in species with exceptionally well-characterized recent demographic reductions (Chan et al. 2006; Hoffman et al. 2011) among other demographic scenarios (Estoup et al. 2004; Fagundes et al. 2007).

Many species of marine mammals, especially whales, are characterized by recent histories involving drastic population reduction due to unregulated commercial harvesting. As such human-induced population declines are often well documented in terms of duration and severity, demographic studies in these species are often well parameterized, providing clear a priori expectations, at least for recent events. Furthermore, whales possess a unique combination of life-history characteristics, including long generation times, a parameter that is important in determining the genetic response to population reduction (Allendorf 1986). Baleen whales are also unique in both their large body size and dietary strategy, relying on planktonic communities as a food source, which vary in abundance according to ocean productivity. Productivity dictates the carrying capacity of marine ecosystems, and is itself largely driven by temperature (Behrenfeld et al. 2006). Consequently, historic demographic changes in filter-feeding baleen species could potentially mirror long-term climatic variations. From a practical standpoint, general public interest and continued aboriginal, scientific, and commercial harvest of some whale species also provide increasing relevance to demographic studies of whales.

The bowhead whale, Balaena mysticetus, is the second largest species of animal, has an estimated generation time of more than 50 years (Taylor et al. 2007), and may routinely live for over 100 years, with a maximum age possibly exceeding 200 years (Fig. 1; George et al. 1999; Rosa et al. 2011). Bowheads are an important food source for some native communities along the coast of NW Alaska and E Chukotka. These communities take part in an annual subsistence harvest under a quota system regulated through a co-management agreement by NOAA and the Alaska Eskimo Whaling Commission and the International Whaling Commission (Suydam et al. 2010). There are five recognized stocks of bowhead whales, which are characterized by differences in geographic distribution, migration patterns, and demographic trends (Alter et al. in press). Of these, the Bering-Chukchi-Beaufort Seas stock (BCB) is the largest and currently most intensely harvested. In relation to the BCB, genetic investigations have demonstrated that while the Okhotsk and Atlantic stocks are isolated and significantly differentiated, sharing little migration with the BCB, connectivity between the BCB and the Canadian stock is considerably higher (Bickham et al. 2009; Givens et al. 2010). Within the BCB, population structure among localities has been thoroughly investigated, and no support for such structure was found (Givens et al. 2010). Observation and sampling of individuals at localities through annual harvests occur during seasonal migratory events. Previous studies have investigated differences in the relatedness of groups of migratory individuals throughout spring and autumn migration (Jorde et al. 2007; Givens et al. 2010). One of these studies (Jorde et al. 2007) found elevated genetic differentiation between whales collected about a week apart. However, Givens et al. (2010), using a panel of 22 microsatellites developed specifically for bowhead whales, did not recover this pattern, and concluded after careful analysis that this pattern was confined to the separate set of 10 loci used by Jorde et al. (2007). Thus, the BCB population is currently managed as a single stock.

Figure 1.

Photograph of a bowhead whale taken near Point Barrow, Alaska in 2005 (photographer: J. C. George, Scientific Research Permit 782–1719). Note the whitish markings on the peduncle and scars on the back. Both are indications of advancing age. The maximum age of bowhead whales has been estimated at ~200 years with an average age at sexual maturity >20 years.

Records of past events and previous scientific investigations suggest that the BCB may have experienced a dynamic demographic history. Unregulated commercial whaling from 1848 to 1914 killed a total of 18,684 bowheads and resulted in an estimated 93% population reduction in the BCB, reducing the population to an estimated 1000 individuals (Woodby and Botkin 1993; Rooney et al. 2001; Punt 2006). Koski et al. (2010) estimated the contemporary census size of the BCB (in 2004) to be 12,631, and George and Zeh (2012) estimated the current rate of population increase to be 3.5% annually. No studies have been able to detect a genetic signal for the anthropogenic population depletion (Rooney et al. 1999; Hunter 2005; Givens et al. 2010). Through investigation of more ancient demographic phenomena, Rooney et al. (1999) reported signal for a population expansion estimated to have occurred 8500 years before present (ybp). These authors note that the timing of this proposed population expansion is coincident with the formation of the M'Clintock Channel Sea Ice Plug, which would have isolated the BCB from the eastern Canadian Arctic stock (Dyke et al. 1996).

This study uses information about harvest-induced population reduction, contemporary census population size increase, and climatic chronology to explore the factors determining demographic signal and how this signal can be detected through different analytical approaches. Autosomal microsatellites and mitochondrial sequence data are the two most widely applied marker types in demographic studies. For this study, a panel of 22 microsatellite loci specifically designed for B. mysticetus (Huebinger et al. 2008; Givens et al. 2010) and three mitochondrial gene regions (Cytb, ND1, and HVRI) are investigated. Specifically, we test the hypothesis of a recent bottleneck and explore historic demographic trends in relation to episodes of climate change. The results of this study provide insights into how life-history characteristics of B. mysticetus have been central in determining the genetic response to demographic change.

Materials and Methods

Sample collection and DNA extraction

Tissues were collected from 324 whales from the BCB stock of bowhead whales. All specimens were analyzed for 22 microsatellites and 168 were analyzed for three mitochondrial gene regions. A total of 305 individuals were previously genotyped by Givens et al. (2010) and the remainder were genotyped for this study. Control region sequences (HVRI) were previously reported by LeDuc et al. (2008) and the protein coding genes (Cytb and ND1) are reported here for the first time. Sequences are available at GenBank under accession numbers JX470203-JX470262. The majority of the samples were obtained via necropsy sampling of whales that were part of Alaskan aboriginal subsistence harvests. Six tissue samples were obtained from remote biopsy darting. Table 1 presents sampling locations (8 Alaskan villages where whales were landed) and number of samples collected from each location. Tissues and DNA samples were stored at −80°C. DNA was extracted from skin slices or biopsy plugs using the GenElute mammalian genomic DNA purification kit (Sigma-Aldrich; St. Louis, MO).

Table 1. Summary of number of samples and sampling location for each population for microsatellites and mitochondrial DNA (mtDNA) datasets
Location          N microsatellites   N mtDNA
Little Diomede    1                   0
Point Hope        7                   1

Microsatellite genotyping

Genotypes for 22 microsatellite loci were generated as detailed in Huebinger et al. (2008) and Givens et al. (2010). The loci are described in Table 2. Allele sizes were determined by fragment separation on an ABI3100 DNA Analyzer (Applied Biosystems, Inc., Foster City, CA) using GeneScan-400 (ROX) size standard. Alleles were assigned in GeneMapper version 4.0 (Applied Biosystems, Inc.). Samples that produced poor quality chromatograms or failed to amplify were reanalyzed. A thorough description of the microsatellite dataset was previously reported by Givens et al. (2010).

Table 2. Details of the 22 microsatellite loci used in this study, including literature sources and polymorphism characteristics in 324 bowhead whales. P-values significant at α < 0.05 without correction for multiple statistical tests are marked with an asterisk (*)
Locus     Number of alleles   Observed heterozygosity (HO)   Expected heterozygosity (HE)   Hardy–Weinberg equilibrium probability   Probability of homozygote excess   Null allele frequency
Bmy10_1   22                  0.890                          0.926                          0.386                                    0.022*                             0.019
Bmy14_1   6                   0.503                          0.551                          0.172                                    0.015*                             0.045
Bmy26_1   22                  0.895                          0.925                          0.385                                    0.028*                             0.016
Bmy33_1   13                  0.798                          0.807                          0.007*                                   0.508                              0.005
Bmy41_1   22                  0.896                          0.905                          0.088                                    0.035*                             0.004
Bmy42_1   11                  0.722                          0.781                          0.397                                    0.033*                             0.039
Bmy54_1   8                   0.680                          0.707                          0.237                                    0.022*                             0.019
Bmy55_1   6                   0.682                          0.710                          0.005*                                   0.040*                             0.019
Bmy57_1   9                   0.566                          0.602                          0.006*                                   0.000*                             0.029
Bmy58_1   27                  0.929                          0.926                          0.032*                                   0.450                              −0.002

Mitochondrial DNA (mtDNA) sequencing

Sequences from three mitochondrial gene regions, HVRI (397 base pairs), Cytb (1140 base pairs), and ND1 (957 base pairs) were generated using PCR and Sanger sequencing protocols. Detailed laboratory methods are described in LeDuc et al. (2008) for HVRI and Phillips et al. (2011) for the two protein coding genes.

Classical sequence-based demographic tests

Concatenation of the mitochondrial gene regions resulted in a 2494-base pair sequence for each individual. One hundred sixty-eight individuals sequenced for all three gene regions were included for analyses (Table 1). Arlequin version (Excoffier et al. 2005) was used to calculate a variety of summary statistics. As initial descriptors of variability, number of haplotypes, number of variable sites, number of pairwise differences, and nucleotide diversity (Tajima 1983) were calculated. Tajima's D, useful for detecting departures from population equilibrium, selection, and rate heterogeneity, was calculated based on uncorrected pairwise comparisons. Significance was assessed by randomly generating samples under the hypothesis of neutrality and observing the proportion of simulated values less than or equal to the observed (Hudson 1990; Excoffier et al. 2005). Fu's FS, a test statistic that is sensitive to departure from population equilibrium, was also calculated. While Tajima's D relies on the comparison of estimators of θ, Fu's FS relies on comparison of the observed allelic abundance with that obtained through simulation under neutrality. As for Tajima's D, significance for Fu's FS is obtained from the proportion of times simulated FS values are equal to or smaller than the observed. A mismatch distribution and the associated raggedness index were also calculated. The mismatch distribution was based on uncorrected pairwise differences, and significance was based on the sum of squared deviations of the observed and expected mismatches that were obtained by generating 1000 random samples according to the estimated demography. Because the raggedness index was non-significant, the timing of the postulated demographic expansion was estimated assuming a generation time of 52 years (Taylor et al. 2007), an overall mutation rate (μ) of 6.14 × 10^−7 per nucleotide per generation (obtained by averaging independent rate estimates for each gene region obtained through Bayesian demographic reconstructions, see below), and following the algorithms of Watterson (1975), Rogers (1995), and Schneider and Excoffier (1999) for estimating τ (τ = 2μt, where t is the time to demographic expansion).
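The conversion from τ to an expansion time can be sketched as follows. This is a minimal illustration rather than the calculation performed in Arlequin: it assumes that μ in τ = 2μt is the per-sequence rate, taken here as the per-nucleotide rate multiplied by the 2494-bp alignment length, and it uses the values reported in this study (τ = 4.34, an average rate of 1.18%/million years, and a 52-year generation time). The function and variable names are hypothetical.

```python
def expansion_time_years(tau, mu_per_site_per_gen, n_sites, gen_time_years):
    """Solve tau = 2 * u * t for t, with u the per-sequence rate per generation."""
    u = mu_per_site_per_gen * n_sites      # assumed: per-site rate x alignment length
    t_generations = tau / (2 * u)
    return t_generations * gen_time_years


rate_per_site_per_year = 1.18e-8           # 1.18% per million years
gen_time = 52                              # years (Taylor et al. 2007)
mu_gen = rate_per_site_per_year * gen_time # about 6.14e-7 per site per generation

t_years = expansion_time_years(
    tau=4.34, mu_per_site_per_gen=mu_gen, n_sites=2494, gen_time_years=gen_time
)
```

Under these assumptions the sketch gives an expansion time of roughly 74,000 years, close to but not identical to the 75,296 ybp reported in the Results; the small difference presumably reflects rounding of the inputs or details of the original calculation that are not recoverable from the text.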

Extended Bayesian skyline plot

To explore signal contained in the mitochondrial dataset for demographic change over time, an extended Bayesian skyline plot (EBSP) was calculated. Bayesian skyline plots are based on the coalescent and use the estimated phylogeny to reconstruct demographic change without prior assumptions on the timing of events (Drummond and Rambaut 2007). The EBSP is an implementation of the Bayesian skyline plot that incorporates multi-locus information in the demographic reconstruction. As discussed by Heled and Drummond (2008), multi-locus perspectives about demographics empower reconstructions. Although the entire mitochondrial genome is functionally a single locus, EBSP was implemented to allow separate phylogenetic reconstructions for each mitochondrial gene using an HVRI mutation rate of 2.8%/million years (95% CI = 1.5–4.6%). This value was previously estimated by Bickham et al. (2009) following the methods of Alter and Palumbi (2009). ND1 and Cytb were assigned relaxed molecular clocks (as deemed appropriate through clock testing in MEGA version 5; Tamura et al. 2011) and their rates were estimated from the data in relation to that assumed for HVRI. Fifty million MCMC iterations were conducted, with sampling every 1000 iterations. Effective sampling of prior and posterior tree distributions was confirmed in Tracer (Rambaut and Drummond 2007). The last 10,000 iterations of simulations were retained for demographic reconstruction. Simulations and skyline plotting were repeated three times and inferred demographic trends were compared across analyses for consistency.

Microsatellite data checking

Tests for deviation from Hardy–Weinberg equilibrium and for linkage disequilibrium were implemented using GENEPOP v. 3.1d (Raymond and Rousset 1995). Bonferroni adjustments (Hochberg 1988) with an α level of 0.05 were carried out on all tabulated results. GENEPOP was also used to determine expected and observed heterozygosity (HE and HO, respectively). Null allele frequencies were calculated following Chakraborty (1988) using the program Micro-checker (Van Oosterhout et al. 2004). Microsatellite error rates for the previously published portion of this dataset were investigated by Morin et al. (2009).
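The effect of the table-wide correction can be illustrated with a short Python sketch. It assumes a plain Bonferroni adjustment (α divided by the number of tests), which may differ in detail from the procedure applied through the cited reference; the helper name is hypothetical. With 22 loci there are 22 single-locus Hardy–Weinberg tests and 231 pairwise linkage disequilibrium tests.

```python
from math import comb


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a plain Bonferroni correction."""
    return alpha / n_tests


n_loci = 22
n_pairs = comb(n_loci, 2)                        # pairwise locus comparisons

hwe_threshold = bonferroni_threshold(0.05, n_loci)
ld_threshold = bonferroni_threshold(0.05, n_pairs)

# Smallest Hardy-Weinberg P-value among the loci listed in Table 2
smallest_hwe_p = 0.005
survives_correction = smallest_hwe_p < hwe_threshold
```

Under this assumption the Hardy–Weinberg threshold is about 0.0023, so a nominal P-value of 0.005 would not remain significant after correction.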

Classical microsatellite-based bottleneck tests

To test for evidence of a recent demographic decline or expansion, we analyzed the microsatellite data for deviations from expected heterozygosity at mutation-drift equilibrium within the program BOTTLENECK v 1.2.02 (Piry et al. 1999). Six different mutation models were evaluated: the strict Stepwise Mutation Model (SMM, Kimura and Ohta 1978), the Infinite Alleles Model (IAM, Kimura and Crow 1964), and four intermediate Two-Phase Models (TPMs) with 1%, 5%, 10%, and 30% IAM mutations, respectively. For each mutational model, the heterozygosity of each locus expected at equilibrium given the observed number of alleles (Heq) was determined using 10,000 simulations and then compared against observed heterozygosity (He). We then recorded the number of loci for which He was greater than Heq and smaller than Heq, and determined whether the overall set of deviations was statistically significant using sign, standardized differences, and Wilcoxon signed-rank tests. Finally, BOTTLENECK was also used to generate a qualitative descriptor of whether the observed allele frequencies at each locus deviate from the L-shaped distribution expected under mutation-drift equilibrium (Luikart et al. 1998).
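The per-locus comparison underlying these tests can be sketched in Python as follows. This is a simplified illustration and not the BOTTLENECK implementation: heterozygosity is computed here as Nei's unbiased gene diversity from allele counts, and the equilibrium values (Heq), which BOTTLENECK obtains by coalescent simulation, are supplied as hypothetical numbers. All function names and example values are assumptions.

```python
def gene_diversity(allele_counts):
    """Nei's unbiased gene diversity from allele counts at one locus."""
    n = sum(allele_counts)               # gene copies (2 x individuals for diploids)
    sum_sq = sum((c / n) ** 2 for c in allele_counts)
    return n / (n - 1) * (1 - sum_sq)


def tally_deviations(he_values, heq_values):
    """Count loci with heterozygosity excess (He > Heq) and deficiency (He < Heq)."""
    excess = sum(he > heq for he, heq in zip(he_values, heq_values))
    deficiency = sum(he < heq for he, heq in zip(he_values, heq_values))
    return excess, deficiency


# Hypothetical allele counts for three loci and hypothetical Heq values
loci_counts = [[30, 10, 8], [20, 20, 5, 3], [40, 8]]
he = [gene_diversity(counts) for counts in loci_counts]
heq = [0.50, 0.70, 0.25]
n_excess, n_deficiency = tally_deviations(he, heq)
```

The resulting counts of loci with excess and deficiency are the quantities on which the sign test operates; the standardized differences and Wilcoxon tests additionally use the magnitudes of He − Heq.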

Approximate Bayesian computation

Approximate Bayesian computation (ABC), originally introduced by Beaumont et al. (2002), was implemented in DIYABC v1 (Cornuet et al. 2008, 2010). Demographic models were defined to capture any signal present in the genetic data specific to an anthropogenically induced population reduction. Information on the timing and severity of this reduction served as a priori demographic information around which models tested through ABC were developed. Underlying parameters to ABC analyses are generation time and sex ratio. Here, generation time was defined as 52 years, as reported by Taylor et al. (2007), and a 1:1 sex ratio was implemented based on data from Nerini et al. (1984), Heide-Jørgensen et al. (2010), and J. C. George (unpubl. data). Two demographic models were simulated for comparison, one being a model enforcing a population size reduction having priors for timing and severity bracketing known values, whereas in the second model, this population size reduction was not enforced. Models are graphically depicted in Figure 2 and summarized below. These models were defined by identical priors on time to account for historic events. Specifically, time priors employed represented a demographic event uniformly distributed between one and six generations ago (encompassing the timing of the known period of unregulated whaling), and two additional time priors uniformly distributed with equivalent prior distributions between 7 and 357 generations ago (population expansion, and ancestral population size). The upper bound on these time parameters extended to 18,500 ybp, liberally surrounding the timing of the postulated population expansion of Rooney et al. (1999). Prior distributions on Ne associated with each time window (and including the estimate of contemporary Ne) were defined by uniform distributions between 1 and 20,000, with the exception of the bottleneck model, which incorporated a uniform Ne prior of 1–2000 associated with the initial time prior (1–6 generations). 
The purpose of confining the prior distribution on Ne at this time period was to enforce Ne values during this interval to correspond with knowledge of the census population size at that time. As such, demographic models were identical with the exception of Ne associated with the bottleneck time prior. Ne parameters are subsequently referred to as Ne(contemporary), Ne(bottleneck), Ne(historicA), Ne(historicB), with the latter two being the Ne parameters associated with identical but independent time priors. Microsatellite μ was defined as the generalized stepwise mutation model (Estoup et al. 2002) with a mean rate uniformly distributed between 1.00 × 10^−5 and 1.00 × 10^−3 substitutions/generation. Although the lower bound on this prior distribution is lower than that generally assumed for microsatellites, previous studies have indicated a reduced molecular evolutionary rate in whales as compared with that usually observed in mammals (Alter and Palumbi 2009). For mitochondrial sequences, the employed model of evolution determined through model testing in MEGA 5 (Tamura et al. 2011) was HKY + I (0.5) + G (0.05). The substitution rate was uniformly distributed between 1 × 10^−8 and 1 × 10^−7 substitutions/site/generation.
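The prior structure of the two models can be summarized in a short Python sketch. This is not DIYABC configuration syntax; the dictionary keys and the function name are hypothetical, times are assumed to be drawn as whole generations, and no ordering is imposed between the two historic time parameters. The only difference between the models is the upper bound on Ne during the whaling period.

```python
import random

GENERATION_TIME = 52  # years (Taylor et al. 2007)


def draw_priors(rng, enforce_bottleneck):
    """Draw one parameter set from the uniform priors described in the text."""
    ne_bottleneck_upper = 2000 if enforce_bottleneck else 20000
    return {
        # Times in generations before present
        "t_bottleneck": rng.randint(1, 6),
        "t_historicA": rng.randint(7, 357),
        "t_historicB": rng.randint(7, 357),
        # Effective population sizes
        "Ne_contemporary": rng.randint(1, 20000),
        "Ne_bottleneck": rng.randint(1, ne_bottleneck_upper),
        "Ne_historicA": rng.randint(1, 20000),
        "Ne_historicB": rng.randint(1, 20000),
        # Mean microsatellite mutation rate per generation
        "mu_microsat": rng.uniform(1e-5, 1e-3),
    }


rng = random.Random(42)
bottleneck_draw = draw_priors(rng, enforce_bottleneck=True)
no_bottleneck_draw = draw_priors(rng, enforce_bottleneck=False)

# Upper bound of the historic time priors expressed in years
oldest_time_years = 357 * GENERATION_TIME
```

Expressed in years, the upper bound of 357 generations corresponds to about 18,500 ybp given the 52-year generation time.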

Figure 2.

Models of demographic history of bowhead whales tested through ABC.

Sensitivity to prior assumptions in ABC inference has previously been acknowledged by other authors (Chan et al. 2006; Hoffman et al. 2011). To explore the influence of prior assumptions about Ne on posterior estimates, additional simulations were performed incorporating a range of prior bounds on Ne. Subsequent simulations were performed in which the Ne parameters originally defined between one and 20,000 were confined to an upper bound of 10,000 (a biologically reasonable upper bound given estimates of ancestral census size), or extended to 50,000, and then to 100,000 in final simulations. Although these latter prior bounds could be interpreted as overly generous, Roman and Palumbi (2003) reported considerably larger Ne estimates for several whale species than is generally discussed for B. mysticetus. Comparative analysis of all sets of simulations allowed for a diagnosis of how the availability of biological information about Ne (and prior assumptions) influences posterior inference.

By ranking simulated summary statistics in relation to the observed following Cornuet et al. (2010), preliminary simulations showed that, while simulated microsatellite data fit the observed data reasonably well, simulated mitochondrial data were a poor fit for the observed data. This is consistent with a previous study of Antarctic fur seals by Hoffman et al. (2011), and indicated that the mitochondrial data did not contain resolution for the recent time frames being investigated in this analysis. Because of this, simulations exploring ranges of Ne assumptions over relatively recent times focused on the microsatellite portion of the dataset. The mitochondrial data were instead analyzed using the alternative methodologies described above, which were not confined to specific time periods.

For all ABC analyses, 1 million simulated datasets were generated for each demographic model. Heterozygosity and the mean number of alleles were then computed as summary statistics for the observed and simulated datasets. These statistics were specifically selected because they are known to be influenced by changes in effective population size (Luikart et al. 1998). Model comparisons implemented the local linear regression method introduced by Beaumont et al. (2002). Type I and II error rates in model selection were calculated by simulating 500 datasets under the parameters of each model and assuming the given model was the correct model (Bertorelle et al. (2010) have discussed the advantage of using pseudo-observed datasets for evaluating the accuracy of analyses). From the best supported model, 10,000 datasets with the smallest Euclidean distances from the observed were retained to build posterior parameter distributions, which were smooth weighted using the Locfit function within R version 2.9.1 (R Development Team 2005).
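The rejection step described above (retaining the simulations closest to the observed summary statistics) can be illustrated with a toy Python sketch. It is not the DIYABC procedure: the coalescent simulator is replaced by a stand-in that returns the equilibrium heterozygosity expected under a stepwise mutation model, 1 − 1/√(1 + 8Neμ), plus Gaussian noise; only one summary statistic is used; the mutation rate is fixed at a hypothetical value; and no regression adjustment or smoothing is applied. A pseudo-observed dataset with known Ne stands in for real data.

```python
import math
import random


def toy_simulator(ne, mu, rng, noise_sd=0.01):
    """Stand-in for a coalescent simulation: noisy SMM equilibrium heterozygosity."""
    h = 1 - 1 / math.sqrt(1 + 8 * ne * mu)
    return (h + rng.gauss(0, noise_sd),)


def abc_rejection(observed, n_sims, n_keep, rng, mu=5e-4):
    """Keep the n_keep prior draws whose summary statistics lie closest to observed."""
    draws = []
    for _ in range(n_sims):
        ne = rng.uniform(1, 20000)                 # uniform prior on Ne
        stats = toy_simulator(ne, mu, rng)
        draws.append((math.dist(stats, observed), ne))
    draws.sort()                                   # smallest Euclidean distance first
    return [ne for _, ne in draws[:n_keep]]


rng = random.Random(1)
pseudo_observed = toy_simulator(5000, 5e-4, rng)   # known "true" Ne of 5000
posterior = abc_rejection(pseudo_observed, n_sims=20000, n_keep=200, rng=rng)
posterior_median = sorted(posterior)[len(posterior) // 2]
```

Because the pseudo-observed data were generated with a known Ne, comparing the posterior median with that value gives a simple check on accuracy, in the spirit of the pseudo-observed dataset evaluation mentioned above.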


For the microsatellite dataset, moderate to high levels of genetic variability were found, with each locus yielding between six and 28 alleles (mean = 15.7, Table 2) and expected heterozygosity ranging from 0.566 to 0.938 (mean = 0.808). Weakly significant deviations from Hardy–Weinberg equilibrium were detected at four loci (Table 2), but none of these remained significant following table-wide Bonferroni correction for multiple statistical tests (Hochberg 1988). Similarly, no evidence was found for null alleles being present at high frequencies in any of the loci. Tests for linkage disequilibrium yielded 19 significant P-values (P < 0.05) of 231 pairwise comparisons, only one of which remained significant following Bonferroni correction (loci Bmy10 and Bmy26). From the concatenated three-gene mitochondrial dataset including 168 individuals, 86 haplotypes were identified (Appendix 1). Nucleotide diversity was estimated at 0.004 ± 0.002, and mean number of pairwise differences was 9.75 ± 4.49. Haplotypes were derived from 102 variable positions, including 95 transitions, six transversions, and one indel in the HVRI gene region.

Classical sequence-based demographic tests

Both Tajima's D and Fu's FS were negative and statistically significant (Tajima's D = −1.4, P < 0.05; Fu's FS = −24.3, P < 0.0001). The mismatch distribution was unimodal with a raggedness index of 0.003 (the probability that simulated raggedness was greater than or equal to the observed raggedness was 0.89). Average μ for the concatenated mitochondrial gene regions was estimated at 1.18%/million years, and τ was estimated at 4.34. Assuming a generation time of 52 years (Taylor et al. 2007), the estimated time to the demographic expansion was calculated at 75,296 ybp.

Extended Bayesian skyline plot

Inspection of effective sample sizes in Tracer showed values greater than 200 for all parameters, indicating sufficiently deep sampling (Drummond and Rambaut 2007). Posterior estimates of mutation rates for HVRI, Cytb, and ND1 were calculated at 2.52%, 0.58%, and 0.49%/my, respectively. Reconstructions indicated a demographic history involving a few major episodes of population increase and decline, which were corroborated by independent replicates of the analysis yielding the same results. The demographic reconstruction included an increase in Nef estimated to have begun between 50,000 and 75,000 ybp that continued until about 15,000 ybp, which was followed by a population reduction (Fig. 3). From this reconstruction, current Nef was estimated at a median value of 20,000; however, the corresponding 95% highest posterior densities during this period were large.

Figure 3.

Extended Bayesian skyline reconstructions of Nef, timing of six past glacials (Rohling et al. 1998; gray bars denote glacials), CO2 (Petit et al. 1999), and CH4 (Loulergue et al. 2008) atmospheric concentrations plotted over time. For the demographic reconstruction, the gray area denotes the 95% highest posterior densities for the estimates, the hashed line represents the mean, and the solid line represents the median estimates.

Classical microsatellite-based bottleneck tests

Analysis of the microsatellite dataset within the program BOTTLENECK yielded virtually identical results regardless of whether the full dataset was used or the analysis was restricted to the 18 loci in HWE (Table 3). There was also strong consistency among P-values obtained from the sign, standardized differences, and Wilcoxon tests. However, the results were highly dependent on the mutational model specified, with a significant excess of heterozygosity being detected under the IAM, but a significant deficiency of heterozygosity being found under the SMM. Similarly, the intermediate TPM models indicated a significant excess of heterozygosity when strongly influenced by the IAM (e.g., the TPM70 model) and a significant deficiency of heterozygosity when mutations were predominantly SMM (e.g., the TPM99 and TPM95 models).

Table 3. The number of loci exhibiting heterozygosity excess and test probabilities obtained using a range of mutational models (see 'Materials and Methods' for details) within the program Bottleneck. Results are presented for separate analyses based on (a) the entire dataset; and (b) only the 18 loci that did not deviate significantly from HWE prior to correction for multiple tests. The mode test revealed normal L-shaped distributions under all of the scenarios tested. P-values significant at α < 0.05 without correction for multiple statistical tests are highlighted in bold
Mutational model | No. of loci with heterozygosity excess | No. of loci with heterozygosity deficiency | Sign test P-value | Standardized differences test P-value | Wilcoxon test P-value
All 22 loci:
IAM | 22 | 0 | <0.0001 | <0.0001 | <0.0001
TPM70 | 18 | 4 | 0.02129 | 0.01204 | 0.00528
TPM95 | 8 | 14 | 0.03066 | 0.02695 | 0.13749
TPM99 | 4 | 16 | 0.00279 | <0.0001 | 0.00192
SMM | 5 | 17 | 0.0006 | <0.0001 | 0.0003
18 loci in HWE:
IAM | 22 | 0 | <0.0001 | <0.0001 | <0.0001
TPM70 | 18 | 4 | 0.02129 | 0.01204 | 0.00528
TPM95 | 8 | 14 | 0.03066 | 0.02695 | 0.13749
TPM99 | 4 | 16 | 0.00279 | <0.0001 | 0.00192
SMM | 5 | 17 | 0.0006 | <0.0001 | 0.0003

Approximate Bayesian computation

Comparison of demographic models indicated the model not enforcing a recent Ne reduction produced simulated datasets yielding summary statistics most similar to the observed. This model received a posterior probability of 0.85, while the model enforcing a genetic bottleneck yielded a posterior probability of 0.15. Type I and Type II error rates for the selection of the best supported model were 0.28 and 0.3, respectively. The selected model was described by an ancient Ne with a median value of 8980 (95% CI 1700–18,600; very similar values were obtained for both Ne(historicA) and Ne(historicB); Table 4). No resolution was recovered for posterior estimates for the times associated with these Ne estimates (data not shown because posterior distributions were flat). Similarly, posterior distributions associated with the whaling period time prior, its associated Ne(bottleneck), and Ne(contemporary) were all uninformative.
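The relative support for the two models can also be expressed as a Bayes factor. The sketch below assumes the two models were given equal prior weight, which is not stated in the text; the function name is illustrative.

```python
def bayes_factor(post_a: float, post_b: float, prior_a: float = 0.5, prior_b: float = 0.5) -> float:
    """Posterior odds divided by prior odds for model A versus model B."""
    return (post_a / post_b) / (prior_a / prior_b)


# Posterior probabilities reported for the two ABC models.
P_NO_BOTTLENECK = 0.85
P_BOTTLENECK = 0.15

bf = bayes_factor(P_NO_BOTTLENECK, P_BOTTLENECK)
print(f"Bayes factor (no bottleneck vs. bottleneck): {bf:.2f}")
```

With equal priors the Bayes factor equals the posterior odds, here a little under 6 in favor of the model without an enforced bottleneck. That level of support should be read alongside the Type I and Type II error rates of roughly 30%.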

Table 4. Point estimates and 95% credibility intervals for all Ne and μ obtained through simulations invoking different prior assumptions on Ne. *Because the posterior distributions for these parameters were mostly flat (Fig. 2), point estimates for these parameters are weak.
Parameter | Mean | Median | Lower 95% | Upper 95%
Ne (1–10,000)
Ne(contemporary)* | 5.33 × 10^3 | 5.37 × 10^3 | 9.53 × 10^2 | 9.54 × 10^3
Ne(bottleneck)* | 6.22 × 10^3 | 6.38 × 10^3 | 2.13 × 10^3 | 9.66 × 10^3
Ne(historicA) | 5.71 × 10^3 | 5.77 × 10^3 | 1.38 × 10^3 | 9.50 × 10^3
Ne(historicB) | 5.74 × 10^3 | 5.82 × 10^3 | 1.45 × 10^3 | 9.56 × 10^3
μ | 6.80 × 10^−4 | 6.97 × 10^−4 | 3.24 × 10^−4 | 9.70 × 10^−4
Ne (1–20,000)
Ne(contemporary)* | 1.04 × 10^4 | 1.05 × 10^3 | 1.41 × 10^3 | 1.90 × 10^4
Ne(bottleneck)* | 1.17 × 10^4 | 1.20 × 10^3 | 3.07 × 10^3 | 1.93 × 10^4
Ne(historicA) | 9.60 × 10^3 | 8.98 × 10^3 | 1.70 × 10^3 | 1.86 × 10^4
Ne(historicB) | 9.71 × 10^3 | 9.12 × 10^3 | 2.03 × 10^3 | 1.87 × 10^4
μ | 5.41 × 10^−4 | 5.24 × 10^−4 | 1.74 × 10^−4 | 9.46 × 10^−4
Ne (1–50,000)
Ne(contemporary)* | 2.52 × 10^4 | 2.50 × 10^4 | 2.93 × 10^3 | 4.74 × 10^4
Ne(bottleneck)* | 2.74 × 10^4 | 2.79 × 10^4 | 5.50 × 10^3 | 4.77 × 10^4
Ne(historicA) | 2.01 × 10^4 | 1.69 × 10^4 | 3.01 × 10^3 | 4.57 × 10^4
Ne(historicB) | 2.04 × 10^4 | 1.74 × 10^4 | 3.01 × 10^3 | 4.60 × 10^4
μ | 4.10 × 10^−4 | 3.54 × 10^−4 | 7.29 × 10^−5 | 9.09 × 10^−4
Ne (1–100,000)
Ne(contemporary)* | 5.10 × 10^4 | 5.08 × 10^4 | 5.73 × 10^3 | 9.53 × 10^4
Ne(bottleneck)* | 5.32 × 10^4 | 5.39 × 10^4 | 8.84 × 10^3 | 9.55 × 10^4
Ne(historicA) | 3.62 × 10^4 | 2.78 × 10^4 | 3.66 × 10^3 | 9.17 × 10^4
Ne(historicB) | 3.60 × 10^4 | 2.79 × 10^4 | 3.64 × 10^3 | 9.08 × 10^4
μ | 3.51 × 10^−4 | 2.77 × 10^−4 | 3.80 × 10^−5 | 8.93 × 10^−4

To explore sensitivity to priors on Ne and μ, we conducted a series of supplementary ABC simulations (see 'Materials and Methods' for details). In all cases, the same demographic model was supported as in the initial simulations. However, broader prior assumptions on Ne yielded larger posterior estimates for ancestral Ne coupled with smaller posterior estimates for average μ. The posterior distributions of time and Ne parameters from all simulations are available as joint plots in Figure 4 and are listed in Table 4.

Figure 4.

Plots of the posterior estimates of Ne values and μ under various prior assumptions on Ne. There were five Ne priors with associated time priors defined for analysis (see 'Materials and Methods' for details). The posterior estimates for these parameters are depicted in panels of the left column, where Ne(contemporary) = black line, Ne(bottleneck) = long dashes, Ne(historicA) = short dashes, and Ne(historicB) = dotted. The panels of the right column depict the posterior estimates for average μ, and the organization of panels within rows (a–d) corresponds to the prior bounds on Ne values that were assumed.


Demographic studies are inherently complicated by the fact that the histories of many species consist of multiple, temporally stratified events of different magnitudes, which are estimated from a single source of variation (i.e., the genome). This study used 22 highly polymorphic microsatellite loci and 2494 base pairs of mtDNA to explore historic demographic phenomena over multiple timescales using a variety of contrasting but complementary approaches. We recovered a signal of ancient demographic change, but no signature of recent anthropogenic exploitation. The observed demographic patterns probably reflect the unique life-history characteristics of B. mysticetus, as discussed in detail below.

Evidence for historic population expansion

Analyses of the mitochondrial dataset using various statistical approaches yielded consistent results. Tajima's D and Fu's FS were both significantly negative, which is indicative of population expansion. Furthermore, the observed unimodal mismatch distribution not only corroborated these results but also provided an estimated time of expansion of roughly 75,000 ybp. This estimate was supported by results of the EBSP, in which a similarly timed population expansion was reconstructed. This result confirms the population expansion detected by Rooney et al. (1999), but contradicts their hypothesis that formation of the M'Clintock Channel Sea Ice Plug roughly 8500 years ago was implicated in the timing of the expansion. A subsequent study by Rooney et al. (2001) provided an alternative analysis and hypothesis for population expansion. These authors constructed lineage-through-time plots and estimated the coalescence time of B. mysticetus. The coalescence time was estimated at approximately 270,000 ybp, and it was assumed that the coalescent event was directly followed by a population expansion that is only now beginning to wane (inferred from the continual estimated increase in the number of lineages through time). In this study, we confirm the previous reports of a historic population expansion, but also provide a more robust estimate of the nature and timing of this expansion.

The EBSP, although providing estimates of Nef that are likely inflated (potential bases for inflated effective population sizes are discussed by Kuhner et al. (1998) and Ho et al. (2005)), was valuable for providing improved resolution on the demographic history of Nef, while also providing further support for a population expansion that began about 75,000 years ago and peaked around 25,000 years ago. This historic event is putatively the source of the population expansion signal obtained from multiple analyses. It should be noted that unknown immigration from other stocks could contribute a false signal of population expansion (e.g., Hutchinson et al. 2003). However, this seems relatively unlikely because a consistent signal of population expansion was found using a variety of analytical approaches, each of which considers different aspects of the data.

In addition, the EBSP described a subsequent population reduction, estimated to have taken place over the past 15,000 years. This postulated reduction pre-dates the period of known anthropogenic reduction in the 19th and 20th centuries. Although a tentative explanation for this pattern could be depletion by pre-historic humans, all available data indicate that whaling was not practiced by natives until around 2000 years ago (Stoker and Krupnik 1993). Therefore, it is more likely that natural biological cycles associated with carrying capacity and/or environmental change are implicated. Fortunately, accurate climatologic reconstructions over the past several thousand years have been corroborated using multiple data types through multiple studies (Rohling et al. 1998, 2009; Siddall et al. 2003). The major population expansion recovered by EBSP falls directly within the second previous glacial (70,000–65,000 years ago; Rohling et al. 1998), and the subsequent population decline (estimated to have begun 15,000 years ago) coincides with the warming and sea level rise starting at the end of the last glacial (15,000 years ago; Severinghaus and Brook 1999 and references therein).

Plotted within Figure 3 are estimated historic levels of atmospheric CO2 (Petit et al. 1999) and CH4 (Loulergue et al. 2008), measured from Antarctic ice cores. Comparison of these plots with the EBSP reveals that the period of population expansion broadly coincides with a time interval of low gas concentrations, while the timing of the population decline corresponds closely to increased atmospheric gas concentrations (it should be recognized that the highest posterior density intervals around population sizes, as well as error attributable to mutation rate and generation time, add uncertainty to these qualitative comparisons). It is notable that although gas concentrations fluctuate over the entire data range, demographic reconstructions are uninformative over deeper timescales. This likely reflects that more recent demographic events have eroded the signal of more ancient phenomena.

Trends observed between effective population size and climate change are most readily explained through their relationship with ecosystem carrying capacity and available habitat. This connection is particularly evident for B. mysticetus, in which baleen filtering of planktonic communities is required to support a large biomass. As ocean productivity is largely determined by temperature (Behrenfeld et al. 2006), long-term climatic oscillations have likely shaped trends in the carrying capacity of the northern oceans. Behrenfeld and Falkowski (1997) estimated that the entire global ocean phytoplankton biomass is transferred through marine ecosystems (partly by grazing) every 2–6 days. The biomass turnover documented by these authors indicates the close connection between climate and effective population size of marine species. These climatologically directed ecosystem changes appear to be manifested in our reconstructions of B. mysticetus effective population size over time. Greenhouse gases such as methane and carbon dioxide are suitable indicators of past climate, although related oceanographic factors (e.g., ice cover, meltwater, salinity, circulation) also likely contributed to environmental conditions, and hence to bowhead whale historic demography in the Arctic.

Evaluation of an anthropogenic bottleneck scenario

Initial bottleneck analysis applied the heterozygosity excess test under a range of mutation models, finding a signal for a genetic bottleneck only under the IAM. Given that this model is unrealistic for most “real” microsatellites (Di Rienzo et al. 1994) and that a TPM comprising ~5% IAM mutations is more likely (Piry et al. 1999), results from this analysis appear most consistent with the population having undergone a historic expansion. Moreover, a shift in the allele frequency distribution away from an L-shaped distribution was not observed, suggesting that the population was not recently bottlenecked.

Results of the ABC analysis, conducted as an alternative bottleneck assessment, were generally in agreement with the classical bottleneck tests, yet provided greater demographic resolution. Given that the compared models differed only in the prior constraint on Ne(bottleneck), and that Type I and II error rates were both estimated at approximately 30%, results indicated that a population model involving a recent Ne reduction is not well supported by the observed distribution of genetic variation in B. mysticetus. In addition, the ABC analysis found no evidence for the population expansion circa 8500 ybp postulated by Rooney et al. (1999). The absence of these events in the recent demographic history is reflected in the posterior distributions of parameters. Similar posteriors for Ne(historicA) and Ne(historicB) and flat distributions of the time posteriors associated with these parameters indicate a lack of signal for a shift in population size during this time frame (i.e., no population expansion 8500 ybp). Similarly, we found no clear signal for Ne during the past 300 years (Ne(contemporary) and Ne(bottleneck)). The absence of a strong posterior estimate for Ne over recent times reflects that a wide range of Ne values over this time period, in conjunction with estimated historic Ne values, produces summary statistics similar to those observed in the BCB. In other words, no plausible range of population sizes over the past 300 years is sufficient to generate a detectable genetic signal in bowhead whales. This is an important point, as it reflects the relationships among genetic signal, generation time, and μ, which are clearly manifested in this analysis because approximate Bayesian computation allows for the parameterization of models incorporating temporally stratified events.

The results of bottleneck analyses could be viewed as contrary to what would be expected given what is known about stock depletion through parts of the 19th and 20th centuries. The period of unregulated whaling, which lasted nearly 70 years, resulted in greater than 90% population depletion. However, measured in B. mysticetus generations, this period spans only about 1.3 generations. Thus, it seems that the brief duration of the bottleneck has prevented the loss of genetic variation in B. mysticetus. This hypothesis is supported by theoretical work by Allendorf (1986), who showed through simulation that the reduction in heterozygosity resulting from population bottlenecks is largely determined by bottleneck duration. A few empirical studies have also observed a buffering effect of long generation time against bottlenecks in certain species (Dinerstein and McCracken 1990; Hailer et al. 2006; Lippé et al. 2006). By comparison, demographic studies investigating population depletion in the gray whale (Alter et al. 2007, 2012) have indicated population size reductions; however, the contribution of human disturbance to these patterns is not clear.

To illustrate the effect of bottleneck duration, we carried out a set of post hoc simulations using SPAms (Parreira et al. 2009). The major parameters of the simulations were μ (assumed to be 1 × 10^−4), Ne(ancestral), Ne(contemporary), and generations since the change in population size. Multiple sets of simulations were conducted to encompass a range of combinations of the latter three parameters. Figure 5 shows that the set of simulations most similar to that characterizing B. mysticetus resulted in no loss of heterozygosity. In fact, much smaller contemporary Ne coupled with bottlenecks of greater duration would be required to appreciably reduce heterozygosity.

Apart from this simple relationship between generation time and bottleneck duration, which has contributed to the lack of genetic signal in bowhead whales, an additional consideration is that contemporary sampling includes individuals that were born both before and after the population reduction (Givens et al. 2010). That individuals born during or before the depletion are still living and potentially breeding provides additional buffering against bottleneck effects and limits the ability to detect a genetic signature of the depletion.

Figure 5.

Box plots of heterozygosity values obtained through 1000 simulations for various combinations of Ne(historic), Ne(contemporary), and generations since population reduction. The combinations of parameters labeled 1–9 on the horizontal axis were 1 = 10,000, 50, 5; 2 = 10,000, 50, 50; 3 = 10,000, 50, 500; 4 = 10,000, 500, 5; 5 = 10,000, 500, 50; 6 = 10,000, 500, 500; 7 = 10,000, 1000, 5; 8 = 10,000, 1000, 50; and 9 = 10,000, 1000, 500, respectively. Error bars indicate maximum and minimum observed values, gray boxes are 50th and 75th percentiles, and the median value is denoted as a black horizontal line. Simulation 7 represented the combination of parameter values most similar to that postulated for the BCB stock of bowhead whales.
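The dependence of heterozygosity loss on bottleneck size and duration can be approximated with the classical drift expectation H_t/H_0 = (1 − 1/(2Ne))^t. The sketch below is not the SPAms simulation used for Figure 5: it ignores mutation and the historic Ne, and it assumes that the second and third values of each Figure 5 combination are the contemporary Ne and the number of generations since the reduction. All names are illustrative.

```python
def retained_heterozygosity(ne: float, generations: float) -> float:
    """Expected fraction of heterozygosity remaining after drift at constant Ne."""
    return (1.0 - 1.0 / (2.0 * ne)) ** generations


# Duration of unregulated whaling expressed in bowhead generations.
WHALING_YEARS = 70
GENERATION_TIME_YEARS = 52
whaling_generations = WHALING_YEARS / GENERATION_TIME_YEARS

# Assumed (contemporary Ne, generations since reduction) for combinations 1-9.
scenarios = {
    1: (50, 5), 2: (50, 50), 3: (50, 500),
    4: (500, 5), 5: (500, 50), 6: (500, 500),
    7: (1000, 5), 8: (1000, 50), 9: (1000, 500),
}

retained = {label: retained_heterozygosity(ne, t) for label, (ne, t) in scenarios.items()}

print(f"whaling period: {whaling_generations:.2f} generations")
for label, fraction in retained.items():
    ne, t = scenarios[label]
    print(f"combination {label}: Ne={ne}, t={t} -> {fraction:.3f} of H retained")
```

Under this approximation, combination 7 retains more than 99% of its heterozygosity, whereas only the smallest and longest bottleneck (combination 3) removes nearly all of it, which is in line with the qualitative pattern described for Figure 5.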

An additional observation from the ABC analysis was the influence of prior assumptions on posterior estimates, and hence on biological interpretation. This sensitivity is not unique to ABC but is characteristic of Bayesian methods in general. In this study, a series of simulations that invoked differing degrees of constraint on Ne values showed that the level of constraint applied was reflected in the posterior estimates. Specifically, tighter and lower prior bounds on Ne yielded lower point estimates for this parameter. In addition, the inverse relationship between μ and Ne was illustrated by the posterior estimates obtained under a range of prior assumptions. In the case of B. mysticetus, or any species of management concern with a tendency for small effective population sizes, assumptions about (and uncertainty in) these parameters can directly influence the biological results upon which management decisions are made. As a priori information about Ne in bowhead whales and μ for whales is available, results obtained from some of the simulations can likely be excluded. For example, although posterior estimates for historic Ne (5700) stemming from simulations invoking a 1–10,000 Ne prior could be considered biologically plausible, a rather large value of μ was required, which is unlikely given the generally low rates of molecular evolution previously observed in whales (Alter and Palumbi 2009). The true value of historic Ne likely lies within the estimates provided by the simulations with Ne priors of 1–20,000 and 1–50,000, which yielded mean estimates for historic Ne of about 10,000 and 20,000, respectively. However, the ability to further pinpoint historic Ne within this range would require a firmer assumption on μ.
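The trade-off between Ne and μ across prior sets can be inspected directly from Table 4. The sketch below takes the first value listed for Ne(historicA) and for μ under each prior (assumed here to be the posterior means) and computes θ = 4Neμ, the usual diploid scaled mutation rate; using θ as the summary is an assumption of this illustration and is not a quantity reported in the text.

```python
# Prior range on Ne -> (Ne(historicA), mu), first value listed in Table 4.
table4 = {
    "1-10,000": (5.71e3, 6.80e-4),
    "1-20,000": (9.60e3, 5.41e-4),
    "1-50,000": (2.01e4, 4.10e-4),
    "1-100,000": (3.62e4, 3.51e-4),
}


def theta(ne: float, mu: float) -> float:
    """Diploid scaled mutation rate, theta = 4 * Ne * mu."""
    return 4.0 * ne * mu


ne_values = [ne for ne, _ in table4.values()]
mu_values = [mu for _, mu in table4.values()]
thetas = {prior: theta(ne, mu) for prior, (ne, mu) in table4.items()}

for prior, value in thetas.items():
    ne, mu = table4[prior]
    print(f"prior {prior}: Ne={ne:.3g}, mu={mu:.3g}, theta={value:.1f}")
```

As the prior on Ne widens, the Ne estimate rises and the μ estimate falls, but their product is not constant across prior sets, so the compensation between the two parameters is only partial.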

Practical implications

Our results have important implications for understanding the evolutionary basis of contemporary management practices. For B. mysticetus, an annual harvest quota for the Alaskan and Russian aboriginal hunt is provided by the International Whaling Commission (IWC) following the Aboriginal Whaling Management Procedure (AWMP). Following the AWMP, the existing B. mysticetus assessment is driven by the past catch series, the current estimate of abundance in absolute terms, the current rate of population increase, and other life-history parameters. The current population growth rate is estimated to be 3.5% (George and Zeh 2012). However, the long-term growth rate over the past century is dependent upon the size of the population at its nadir. Taken together and under the conventional population model assumption of compensatory dynamics (the larger the population, the lower the per-capita growth rate), there is an indication that the population was small at the end of the 19th century (D. Butterworth, pers. comm.). Despite abundance estimates not being available for this period, the results of this study suggest that the period of unregulated harvest was initiated during, or perhaps at the end of, a natural population reduction, which corresponds to climatic oscillations. Thus, the current population increase is likely a response to the compound effects of natural and anthropogenic occurrences.

The estimated population size prior to commercial whaling is another important component of the IWC's management strategy. In 1974, the IWC established that a species' original population size is to be used as an index for comparison of the estimated current population size to classify species into management categories and for setting harvest quotas (International Whaling Commission 1976). Woodby and Botkin (1993) reviewed the previous estimates of pre-whaling population size of the BCB population and developed a “simple recruitment model” to obtain a best estimate of the range of possible population sizes. These authors conclude that the estimated population size ranged from 10,400 to 23,000. Brandon and Wade (2006) suggest much the same range, but favor estimates near 14,000 using a Bayesian model averaging approach. However, they note “there is no [visual] evidence in the abundance estimates for a reduction in trend” suggesting that a value larger than 14,000 is likely (which is close to present abundance of ~12,600; Koski et al. 2010). Brandon and Wade (2006) do not estimate the population nadir. These estimates agree closely with the historic Ne of 10,000–20,000 estimated from the simulations in this article following estimates that the mature proportion of the BCB population is about 40% (Bockstoce and Botkin 1983; Woodby and Botkin 1993; Koski et al. 2006). This gives further reassurance to the current strike limit established by the IWC as population estimates based on both genetic data and historic catch data (Woodby and Botkin 1993) are based on entirely different sets of assumptions.


Important findings have emerged from this course of study pertaining to different aspects of demographic reconstruction. Demographic responses to both anthropogenic and natural environmental pressures are dictated by magnitude, duration, and species-specific life-history characteristics. For B. mysticetus, long generation time and dietary strategy are life-history characteristics that appear to be central to this species' demographic history. While long generation time served as a buffer to population bottleneck effects, putative long-term changes in Arctic carrying capacity drove a dynamic historic demography that is readily detected using a variety of statistical approaches.


We acknowledge the efforts of the Alaska Eskimo Whaling Commission (AEWC) for helping secure funding for genetic and stock structure studies through NOAA (Award Number: NA05NMF4391108). The North Slope Borough Department of Wildlife Management also contributed substantial funding for this study. We thank the whale hunters of Alaska and Russia for their collaboration and support. We appreciate the efforts of our collaborators and their support staff at the AEWC, NMML, NOAA SWFSC, Colorado State University, Purdue University, Texas A&M University, University of Washington, and North Slope Borough for supporting the bowhead genetic/stock structure research. We are grateful to the late Alaska Senator Ted Stevens for his support of our work, and to Stanley Speaks with the Bureau of Indian Affairs for helping secure additional funding (through BIA) for this project. We also thank the National Park Services Program, Shared Beringian Heritage Program, for funding support for the biopsy work in Chukotka (through the North Slope Borough) for the years 2003–2006. The bowhead whale photograph was taken under Scientific Research Permit 782-1719 issued to NMML under the provisions of the US Marine Mammal Protection Act and Endangered Species Act.

Conflict of Interest

None declared.

Appendix 1

Sample identifications and corresponding haplotypes for each mitochondrial gene region. Haplotype sequences are available at Genbank under accession numbers [numbers pending]. Lettering within each sample ID indicates collection locality, with B = Barrow, KK = Kaktovik, G = Gambell, S = Savoonga, N = Nuiqsut, H = Point Hope, WW = Wainwright.