Are we underestimating the occurrence of sympatric populations?

Sympatric populations are conspecific populations that coexist spatially. They are of interest in evolutionary biology because they may represent the first steps of sympatric speciation, and they are important to identify and monitor in conservation management. Reviewing the literature pertaining to sympatric populations, we find that most cases of sympatry appear coupled to phenotypic divergence, implying ease of detection. In comparison, phenotypically cryptic, sympatric populations seem rarely documented. We explore the statistical power for detecting population mixtures from genetic marker data, using commonly applied tests for heterozygote deficiency (i.e., Wahlund effect) and the STRUCTURE software, through computer simulations. We find that both tests are efficient at detecting population mixture only when genetic differentiation is high, sample size and number of genetic markers are reasonable and the sympatric populations happen to occur in similar proportions in the sample. We present an approximate expression based on these experimental factors for the lower limit of FST, beyond which power for STRUCTURE collapses and only the heterozygote-deficiency tests retain some, although low, power. The findings suggest that cases of cryptic sympatry may have passed unnoticed in population genetic screenings using numbers of loci typical of the pre-genomics era. Hence, cryptic sympatric populations may be more common than hitherto thought, and we urge that more attention be directed to their detection and characterization.


KEYWORDS
biodiversity monitoring, conservation management, genetic biodiversity, population genetic structure

1 | INTRODUCTION
Sympatric populations represent conspecific populations that coexist spatially during at least a part of their life cycle (Futuyama & Mayer, 1980; Mallet, Mayer, Nosil, & Feder, 2009). Such populations are of great interest in studies of ecological interaction and microevolutionary processes since their existence may represent the first steps of sympatric speciation processes (Maynard Smith, 1966; Via, 2001).
From perspectives of management and conservation, sympatric populations are important to identify and monitor; they represent population diversity below the species level and such diversity has been documented to contribute to the portfolio effect in ecosystem stability (Schindler, Armstrong, & Reed, 2015; Schindler et al., 2010).
Further, genetic diversity is identified as the basis for all biological variation that should be protected and sustainably managed according to international agreements, such as the Convention on Biological Diversity (www.cbd.int). Sympatric populations have been described in a wide range of taxa and ecosystems. Marine examples include sympatric killer whale populations specializing on different diets (Ford et al., 1998) and blue whales (Attard, Beheregaray, & Möller, 2016) and beluga whales on summer foraging migration (Hauser, Laidre, Ruydom, & Richard, 2014).
Host-specific races are also known for the brood-parasitic common cuckoo, although it is unclear whether these "gentes" represent different populations or genetic polymorphism within populations (Fossøy et al., 2016). So-called "chromosome races" or "cytotypes" are known from small mammals that coexist in sympatry at least in zones of contact (house mouse: Corti & Rohlf, 2001; common shrew: Orlov et al., 2012). In plants, there is a large literature on co-occurring populations that differ in ploidy (Schönswetter et al., 2007). A common pattern in most, but not all, of these instances is that members of the sympatric populations differ to some extent in visual characteristics and this appears to have been a key feature for detecting such populations. Sympatric populations may be described as cryptic when casual inspection has not previously revealed clear morphological or behavioural differences between them (Bickford et al., 2007). In such situations, the detection of sympatric populations typically requires some form of genetic data. Whether cryptic or not, there is a problem of demarcating sympatric populations against sympatric, closely related, sister species. Researchers adhering to a strict interpretation of the biological species concept may classify all sympatric, reproductively isolated populations as full species. There are thus likely to be differences among taxa, ecosystems and fields of research in the detection of cryptic biodiversity and how this diversity is recognized at the species level or below (Bickford et al., 2007; Struck et al., 2018). There is also the problem of defining sympatry: At what spatial and temporal scales should coexistence be defined? Sympatric populations may coexist in the same area only relatively briefly, for example during seasonal feeding migration (beluga whales: Hauser et al., 2014), or during their entire lifespan (brown trout: Ryman et al., 1979; Palmé, Laikre, & Ryman, 2013).
Sympatry is more readily defined within a confined environment such as a lake or an island than in the open ocean or in open terrestrial landscapes, and this may add to differences among taxa and environments in recognition of the existence and occurrence of sympatry.
We hypothesize that cryptic sympatric populations may have gone largely undetected and therefore might be under-reported in the literature. First, sympatric populations in general may be perceived as somewhat of an exception under the dominating ecological view emphasizing niche specialization and competitive exclusion (Harding, 1960), possibly diverting attention away from a systematic search for them. Hence, phenotypically cryptic, sympatric populations may go unnoticed except when detected by chance in genetic screenings carried out for other purposes, for example genetic diversity assessment.
Second, the statistical power for detecting sympatric populations may be relatively low, at least in the absence of obvious phenotypic differences. Without observable phenotypic differences and using genetic data alone, the classical test for the presence of more than one population in a sample of individuals relies on the Wahlund effect, that is, a deficiency of heterozygotes relative to the Hardy-Weinberg expectation (Waples, 2015). Deviations from Hardy-Weinberg genotype proportions may have gone unnoticed due to low power of detection (Fairbairn & Roff, 1980). Indirect evidence that the power of genetic methods has been weak is the observation that most cases of reported sympatry appear to be coupled to phenotypic differences (Taylor, 1999; this study). Individuals can then be grouped according to phenotype, and potential genetic differences between groups are investigated. This kind of comparison is frequently associated with higher statistical power than a general exploration of Hardy-Weinberg deviations (Palmé et al., 2013).
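The size of the Wahlund effect can be made concrete with a small calculation. The sketch below is an illustration only, not part of the study's own programs. It assumes a single di-allelic locus, two populations that are each in Hardy-Weinberg proportions on their own, and a pooled sample in which a fraction `w` comes from population 1; the function name `wahlund_deficit` and its arguments are hypothetical.

```python
def wahlund_deficit(p1, p2, w):
    """Heterozygosity in a pooled sample from two populations (one di-allelic locus).

    p1, p2: allele frequency in population 1 and 2 (illustrative inputs).
    w:      fraction of the pooled sample that comes from population 1.
    Returns (H_obs, H_exp, F_IS) for the pooled sample.
    Assumes the pooled sample is polymorphic, so that H_exp > 0.
    """
    p_bar = w * p1 + (1 - w) * p2                        # pooled allele frequency
    h_obs = w * 2 * p1 * (1 - p1) + (1 - w) * 2 * p2 * (1 - p2)
    h_exp = 2 * p_bar * (1 - p_bar)                      # Hardy-Weinberg expectation
    f_is = 1 - h_obs / h_exp                             # relative heterozygote deficit
    return h_obs, h_exp, f_is


even = wahlund_deficit(0.3, 0.7, 0.5)     # populations equally represented
skewed = wahlund_deficit(0.3, 0.7, 0.05)  # 5/95 representation
print(even, skewed)
```

Under these assumptions the absolute deficit equals 2w(1 − w)(p1 − p2)², so it shrinks both when the allele frequency difference is small and when one population dominates the sample.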
Third, as microsatellites became the marker of choice over allozymes, it became clear early on that technical artefacts (allelic dropout: Taberlet et al., 1996; short allele dominance: Wattier, Engel, Saumitou-Laprade, & Valero, 1998; stutter bands: Miller & Yuan, 1997) and segregating null alleles (Chapuis & Estoup, 2007) could all lead to deficiencies of heterozygotes unrelated to any population mixture (Band & Ron, 1997). Concerns over such artefacts may have led researchers to dismiss real heterozygote deficiencies as well and thereby overlook signals from population mixtures in their samples (Waples, 2015).
More generally, studies reporting heterozygote deficiencies often fail to follow up on those observations, which leaves the possibility of population mixture unresolved (Castric, Bernatchez, Belkhir, & Bonhomme, 2002).
Finally, statistical tools beyond the Hardy-Weinberg test have traditionally been lacking for detecting mixtures of phenotypically cryptic populations occurring in sympatry. Mixture of genetically differentiated populations leads not only to non-random association of alleles within loci, but also among alleles at different loci (so-called "linkage" disequilibrium or LD; Makela & Richardson, 1977) and potentially more powerful methods that explore both effects to detect population mixture were not generally available until the turn of the century (i.e., the STRUCTURE software: Pritchard, Stephens, & Donnelly, 2000). However, little is presently known about the statistical power of STRUCTURE relative to tests for heterozygote deficiency.
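The association among loci created by mixing can be illustrated in the same way as the heterozygote deficit. The following sketch is illustrative only and assumes two di-allelic loci, two populations that are each in linkage equilibrium, and a mixing fraction `w`; the function name `mixture_ld` is hypothetical.

```python
def mixture_ld(pa1, pb1, pa2, pb2, w):
    """Disequilibrium D between loci A and B in a mixture of two populations.

    pa1, pb1: frequencies of allele A and allele B in population 1.
    pa2, pb2: the same frequencies in population 2.
    w:        fraction of the mixture that comes from population 1.
    Each population is assumed to be in linkage equilibrium on its own.
    """
    p_ab = w * pa1 * pb1 + (1 - w) * pa2 * pb2   # AB gamete frequency in the mixture
    p_a = w * pa1 + (1 - w) * pa2                # pooled frequency of A
    p_b = w * pb1 + (1 - w) * pb2                # pooled frequency of B
    return p_ab - p_a * p_b


print(mixture_ld(0.3, 0.3, 0.7, 0.7, 0.5))
```

Under these assumptions D = w(1 − w)(pA1 − pA2)(pB1 − pB2), so unlinked loci become associated in the pooled sample whenever both differ in frequency between the populations. This is the additional signal that clustering methods can draw on.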
The purpose of the present paper is twofold. First, we review literature pertaining to sympatric populations. As pointed out above, there may be considerable differences among taxa and ecosystems with regard to how researchers recognize and interpret biological diversity.
To maintain a level of consistency and uniformity, we therefore limit our review to freshwater salmonids, for which we have the most experience, sympatry is fairly easily defined and a relatively rich literature exists. This review is pursued to summarize documentation of sympatric populations, particularly comparing the detection of cryptic sympatric populations with that of non-cryptic ones, and further to find out whether commonly used genetic markers might have led to underdetection and hence under-reporting of sympatric populations. Second, we assess the statistical power of detecting phenotypically cryptic populations from genetic data using computer simulations, focusing on realistic levels of genetic divergence, numbers of gene markers and sample sizes as revealed by the literature survey. The question addressed by these computer simulations is: what is the probability of detecting population admixture/structure from genotype data from a single sample or locality, that is, without additional information on habitat or phenotype differences?

| Literature survey
We carried out a literature survey on evidence for sympatric populations of salmonid fishes in freshwater environments using the Web of Science.
We performed six topic searches using keyword combinations of "sympatric populations" AND either of the following "salmonid," "trout," "char," "charr," "whitefish" OR "salmon." The search included all years available in the database and was carried out in April 2018. In a next step, we examined the papers obtained for relevance with respect to our focus, that is, occurrence of sympatric populations in freshwater habitats.
Further, we added nine papers that we knew of, but which did not appear in the searches. All in all, we included 80 studies in our survey. We classified the sympatric populations reported in these studies as cryptic if they were initially detected through genetics only, without prior identification of, or grouping based on, phenotypic, ecological or other divergence. The sympatric populations were classified as non-cryptic if the basis for detection was phenotypic differences and as ambiguous if they could not be classified as either cryptic or non-cryptic based on the information given in the studies.

| Computer simulations
Simulations employing an in-house computer program were used to assess statistical power of detecting a significant indication of population mixture in genetic data from a sample of individuals when no phenotypic or non-genetic cues to population membership exist.
Simulated sample data sets were generated by random sampling from two interconnected populations in approximate migration-drift equilibrium, and statistical tests included tests of Hardy-Weinberg equilibrium and tests derived from cluster analyses.
Each simulated population consisted of N = 1,000 diploid, sexually reproducing individuals with discrete (non-overlapping) generations.
Populations were initiated with even sex ratios and with a number L of freely combining (i.e., unlinked) loci with a specified allele frequency profile. Various numbers of alleles (2 or 20) and loci (up to 100) were used to represent popular genetic marker types and numbers commonly used in past and present population screenings as identified in our literature review (Table 1). In particular, we consider a set of 10 loci with (initially) 20 alleles each and refer to this set as the "microsatellite panel" and a set of 100 di-allelic loci referred to as the "SNP panel." Smaller numbers of di-allelic loci were also simulated in order to represent allozyme-based studies. For the microsatellite panel, we incorporated mutations by randomly changing genes from their allelic state to one of the (19) other allelic states. We used a mutation rate of u = 0.0005, implying that one of the 2,000 genes in the population mutated per generation on average. The SNPs and allozymes were simulated without mutations. Simulations were initiated with even allele frequencies and were run for a sufficient number of generations (1,000) to thoroughly redistribute alleles within and among loci. Each generation after initiation (generation t = 0), N haploid gametes, including L loci plus the sex-determining locus, were drawn with replacement from males and from female parents, respectively, and merged into N diploid offspring which immediately replaced the parental generation. Thus, generations were discrete (non-overlapping) and population size was kept exactly constant, while the sex ratio varied randomly (i.e., binomially with a mean of 0.5 and a standard deviation of 0.0158). Migration was simulated by exchanging a fixed number (M) of diploid individuals between the two populations each generation, following reproduction and mutation. 
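The reproduction scheme and the two numbers quoted above can be sketched as follows. This is a minimal illustration and not the in-house program used in the study: the data layout, the function name `next_generation` and the assignment of offspring sex by a fair coin (instead of an explicit sex-determining locus) are simplifying assumptions.

```python
import math
import random

N = 1000         # individuals per population (from the text)
U = 0.0005       # mutation rate per gene, microsatellite panel (from the text)
N_ALLELES = 20   # allelic states per microsatellite locus (from the text)

# Arithmetic quoted in the text
sex_ratio_sd = math.sqrt(0.5 * 0.5 / N)   # binomial SD of the proportion of one sex
mutations_per_locus = 2 * N * U           # expected mutant genes per locus and generation


def next_generation(pop, rng, u=U, n_alleles=N_ALLELES):
    """Produce one discrete generation of the same size as pop.

    pop is a list of (sex, genotype) tuples; genotype is a list with one
    (allele, allele) pair per unlinked locus, alleles coded 0..n_alleles-1.
    """
    males = [g for s, g in pop if s == "M"]
    females = [g for s, g in pop if s == "F"]
    if not males or not females:
        raise ValueError("both sexes are needed to reproduce")

    def gamete(parent):
        out = []
        for pair in parent:
            allele = rng.choice(pair)           # free recombination among loci
            if rng.random() < u:                # mutate to one of the other states
                allele = rng.choice([a for a in range(n_alleles) if a != allele])
            out.append(allele)
        return out

    offspring = []
    for _ in range(len(pop)):
        mother = rng.choice(females)            # parents drawn with replacement
        father = rng.choice(males)
        genotype = list(zip(gamete(mother), gamete(father)))
        offspring.append((rng.choice("MF"), genotype))  # assumption: fair-coin sex
    return offspring


print(round(sex_ratio_sd, 4), mutations_per_locus)
```

Because offspring replace the parents one for one, population size stays constant while the sex ratio fluctuates binomially, as described in the text.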
A range of levels of genetic divergence between populations (FST from 0.00025 to 0.39) was generated by exchanging different numbers of migrants (M = 43, 23, 12, 5, 2.5, 1 or 0 per generation). Fractional numbers of migrants (e.g., 2.5) were accommodated by passing on the fractional part to the subsequent generation. Thus, in the case of M = 2.5, the actual number of migrants alternated between two and three in successive generations for an average of 2.5 per generation.
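The carry-over of fractional migrants can be sketched in a few lines. The function name `migrant_schedule` is hypothetical, and the sketch assumes that the whole-number part of the accumulated count is exchanged each generation while the remainder is carried forward.

```python
def migrant_schedule(m, generations):
    """Whole numbers of migrants exchanged per generation for an average of m."""
    carry = 0.0
    schedule = []
    for _ in range(generations):
        carry += m            # add this generation's quota to the carried fraction
        k = int(carry)        # exchange the whole-number part
        schedule.append(k)
        carry -= k            # pass the fractional part on to the next generation
    return schedule


print(migrant_schedule(2.5, 6))   # [2, 3, 2, 3, 2, 3]
```

With M = 2.5 this reproduces the alternation between two and three migrants described above; whole-number values of M are returned unchanged every generation.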
When sampling from the two populations, n1 and n2 diploid individuals were drawn from population 1 and 2, respectively, in generation t = 1,000 and both samples were pooled into a common file for statistical analyses. Different proportions of the two populations in samples were explored, from 1:1 (i.e., even representation) to 1:19 (highly skewed representation). The total sample size was set to cover the range over most empirical studies (Table 1), from 20 to 400 individuals combined (n1 + n2). When testing the case of no divergence (for assessment of alpha errors), that is FST = 0, a single, isolated population was simulated for 1,000 generations before the sample was drawn. Samples were drawn with replacement, in accordance with common (but typically not explicitly stated) assumptions of estimation procedures (Weir & Cockerham, 1984). The realized divergence (F̂ST) between the two populations in the sample …

| Statistical analyses of simulated data
Tests for heterozygote deficiency in the pooled samples (sample size = n1 + n2) utilized the sampled genotypes, anonymized with respect to population of origin by erasing the population identifiers from the input file prior to statistical analyses. The calculations were carried out with GENEPOP option 1 (Hardy-Weinberg exact test) suboption 4 (global tests for heterozygote deficiency), with default dememorization number (10,000), number of batches (20) and iterations per batch (5,000).
Results were summarized as the proportion of the 5,000 replicate simulation runs that yielded a significant, at the 5% level, global test.
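The summary step amounts to counting significant replicates. The sketch below assumes that the global P-values of the replicate runs have been collected in a list; the function name `power_estimate` and the reported binomial standard error are illustrative additions, not outputs described in the study.

```python
import math


def power_estimate(p_values, alpha=0.05):
    """Proportion of replicate runs with P < alpha, with its binomial standard error."""
    runs = len(p_values)
    significant = sum(1 for p in p_values if p < alpha)
    power = significant / runs
    se = math.sqrt(power * (1 - power) / runs)
    return power, se


print(power_estimate([0.01, 0.20, 0.03, 0.50]))   # (0.5, 0.25)
```

When the same calculation is applied to runs simulated from a single population, the proportion estimates the alpha error instead of the power. With 5,000 replicates the standard error of such a proportion cannot exceed sqrt(0.25/5000), roughly 0.007.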
The pooled and anonymized samples were further analysed for population structure with the command line version of the STRUCTURE software (v. 2.3: Falush, Stephens, & Pritchard, 2003; Pritchard et al., 2000). The software was run with the default number of BURNINS (10,000) and NUMREPS (20,000) and, as per default, with the following settings activated (i.e., set to 1): FREQSCORR, COMPUTEPROBS, INFERALPHA or deactivated (set to 0): NOADMIX, USEPOPINFO, LOCPRIOR. For each simulation run, STRUCTURE was employed three times, with assumed number of populations (K) set to 1, 2 and 3, respectively. We chose 3 as an upper limit, partly to limit the computational burden (nearly 90% of CPU time was spent on the STRUCTURE analyses) and partly because few empirical investigators would consider a large number of populations in a single sample as a biologically realistic proposition. The posterior probability of K = 1 (i.e., the probability of the sample representing a single biological population) given the data was calculated from the reported Ln Prob(data|K) using Bayes' rule, as described in the manual (Pritchard, Xiaoquan, & Falush, 2010, sec. …). Results of simulation runs were summarized as the proportion of replicate runs that yielded a Prob(K = 1|data) less than 5% and interpreted as a significant (at the alpha = 5% level) detection of population mixing in the sample. In simulations involving a single population, that is with no true population mixing, the proportion of significant runs was interpreted as the alpha error of the test.

… number of clusters or populations (K) in the data. We did not find any application in the population genetics literature of using these BIC values to calculate posterior probabilities for the models, but the procedure is described in the general statistics literature on model selection (Burnham & Anderson, 2004, p. 275; Raftery, 1995) and is similar to that used for STRUCTURE.
For testing the null hypothesis of a single population (K = 1) in the sample, we calculated

$$\mathrm{Prob}(K = 1 \mid \mathrm{data}) = \frac{\exp\left(-\tfrac{1}{2}\,\Delta\mathrm{BIC}_{K=1}\right)}{\sum_{i=1}^{10} \exp\left(-\tfrac{1}{2}\,\Delta\mathrm{BIC}_{K=i}\right)},$$

where exp is the exponential function, $\mathrm{BIC}_{K=i}$ is the BIC value reported by the find.clusters function for K = i genetic clusters or populations, $\Delta\mathrm{BIC}_{K=i}$ is the difference between the BIC value for K = i and the lowest BIC value, and the summation is over i = 1…10. We calculated the power and alpha errors of this test as the proportions of replicate runs that yielded Prob(K = 1|data) < 0.05.
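The calculation of Prob(K = 1|data) from a set of BIC values can be sketched as below. The function names are hypothetical and the input is assumed to be a plain list of BIC values ordered as K = 1, 2, …. The second function shows how the same normalisation would apply to Ln Prob(data|K) values of the kind STRUCTURE reports, assuming a uniform prior on K.

```python
import math


def prob_k1_from_bic(bic):
    """Prob(K = 1 | data) from BIC values ordered as K = 1, 2, ..., len(bic)."""
    best = min(bic)
    # exp(-1/2 * delta BIC) for each candidate K; subtracting the minimum
    # keeps the exponentials from underflowing all at once
    weights = [math.exp(-0.5 * (b - best)) for b in bic]
    return weights[0] / sum(weights)


def prob_k1_from_lnprob(ln_prob):
    """Same normalisation for log-probabilities, assuming a uniform prior on K."""
    return prob_k1_from_bic([-2.0 * x for x in ln_prob])


print(prob_k1_from_bic([100.0, 90.0, 95.0]))        # K = 1 clearly disfavoured
print(prob_k1_from_lnprob([-1000.0, -1000.0]))      # 0.5: no preference
```

A run would then be counted as significant when the returned value falls below 0.05.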

| Literature survey
Review of the 80 papers identified in our literature survey revealed that for the case of salmonid fishes in freshwater habitats, sympatric populations have been reported in 136 cases in 135 localities in 17 countries, including at least 17 separate species (Table 2; Supporting Information Table S1). Arctic charr is the species with the largest number of reported sympatric cases with 39 localities where such existence has been documented. Sympatric populations have most commonly been found in freshwater lakes (108 cases), whereas river and creek habitats have been less commonly reported to harbour such populations (12 and 15 cases, respectively).

TABLE 1 Summary information on sample size (loci and individuals) and FST found in the studies of the literature search (…

Note: Loch Awe in Scotland houses both two cryptic Arctic charr and two non-cryptic brown trout populations and is included in the total number of localities for both cryptic (9) and non-cryptic (98) populations. Thus, the total number of localities with sympatric populations identified in this study is 135.

… & Fraser, 2016; Palmé et al., 2013; Ryman et al., 1979; Smith & Engle, 2011; Wilson et al., 2004; Table 1). We classify 29 cases concerning Arctic charr and brook charr as ambiguous. Most commonly, only two coexisting sympatric populations have been documented. A total of 23 cases with three or more coexisting populations have been found, and all these refer to non-cryptic populations (Supporting Information Table S1).
Difference in resource use has been reported in several cases of non-cryptic sympatry. In lake habitats, such differences include food niches (21 lakes), spawning time (3 lakes), spawning place (4), anadromous vs. resident strategy (2), both spawning time and place differences (3), and both spawning time and habitat differences (2).
All non-cryptic sympatry in creeks is associated with anadromous vs. resident life history strategy (11 cases). In rivers, such differences are also found (three rivers), but here differences in spawning time are another diverging factor (three rivers), whereas food niche separation has not been reported in creeks or rivers (Supporting Information Table S1).
In the cases of cryptic sympatry, clear life history strategy differences have only been observed in the case of Atlantic salmon in the Teno River (Aykanat et al., 2015). There are some indications of trophic divergence in sympatric charr in Lochs Maree and Stack (Adams, Wilson, & Ferguson, 2008), and extensive screening of the two cases of cryptic brown trout populations in tiny mountain lakes in Sweden found no trophic divergence but small growth and maturation differences as well as a tendency for spatial separation at spawning (Andersson, Johansson, Sundbom, Ryman, & Laikre, 2017; Palmé et al., 2013; Ryman et al., 1979). Growth differences between cryptic, sympatric populations have been reported in a total of six cases (five lakes and one river).
Microsatellites and allozymes were the most frequently used markers for investigating genetic structure and had been employed in 40 and 21 studies, respectively. Typically, 1-16 loci were employed for allozymes and 5-22 for microsatellites. Four studies had used SNPs, five employed gene sequencing and several studies used combinations of different markers (Supporting Information Table S1). Studies identifying cryptic sympatric populations were based exclusively on heterozygote deficiency in one case (no heterozygotes observed; Ryman et al., 1979) and exclusively on STRUCTURE software in two cases (Aykanat et al., 2015; Marin et al., 2016). Two studies used a combination of heterozygote-deficiency tests and the STRUCTURE software (Palmé et al., 2013; Wilson et al., 2004), and one study applied an assignment software exclusively (ONCOR; Smith & Engle, 2011).
We wanted to find out if studies reporting cryptic vs. non-cryptic sympatric populations differed with respect to sample size, number of loci or degree of genetic divergence. For such a comparison, we selected studies reporting all the relevant quantities, that is number of fish, number of loci and significant FST (or equivalent). We limited our selection to studies using allozymes and/or microsatellites, since these were the most frequently applied markers. Of the 80 studies, 35 fulfilled these criteria and they represent 58 localities, seven with cryptic populations and 51 with non-cryptic ones (Supporting Information Table S2). FST was consistently higher among cryptic populations as compared to non-cryptic ones using allozymes, microsatellites or a combination of both (Table 3). However, statistical significance was only obtained for allozymes using a t test (median test non-significant). Similarly, a larger number of individuals had been sampled in studies reporting cryptic populations based on allozymes or both markers as compared to studies reporting non-cryptic populations. However, this difference was only significant for the median test. The number of loci employed was essentially the same (Table 3).

| Computer simulations
The overall impression from the computer simulations (Supporting Information Table S3) is that STRUCTURE was superior to DAPC and also more powerful than the heterozygote-deficiency test to detect … (Figure 1). On the other hand, statistical power was very low when levels of genetic divergence were low and particularly so for DAPC and STRUCTURE. Our implementation of DAPC in these simulations was always inferior to the two other tests and often did not yield a meaningful result at all (i.e., a power of zero, unity or no estimate at all: cf. Figure 1; Supporting Information Table S3). Thus, the approach implemented in DAPC is not considered further in the present paper.

TABLE 3 Results from comparisons of genetic divergence (FST) and number of individuals and loci sampled between sympatric populations that were classified as cryptic or non-cryptic using information reported in the literature. Information from a total of 35 studies involving seven cases of cryptic populations and 59 cases of non-cryptic populations was used (Supporting Information …

The two genetic marker panels that are the focus of the present … Statistical power of the two marker panels was therefore, as expected, fairly similar (cf. Figure 1).
The power of STRUCTURE to detect population mixtures fell rather rapidly with declining levels of genetic divergence, and the rate of … Table S3). For a similar sample size but using the SNP panel, the major drop in power (from 77.6% to just 2.3%) occurred at a somewhat lower FST level, between 0.025 and 0.010 (cf. Figure 2, lower panel and Supporting Information Table S3). These major drops in statistical power indicate the existence of certain "thresh- … (Figure 4, bottom panels), such differences among tests rarely occurred and runs that were significant for the least powerful test (here, the heterozygote-deficiency test) were almost always significant also for the more powerful one (STRUCTURE). With …

FIGURE 2 Power of detecting significant evidence for … different size (n individuals), genotyped for the microsatellite panel (top) and the SNP panel (bottom) and tested with the GENEPOP heterozygote-deficiency test (left) and STRUCTURE test for K > 1 (right). Axes: FST and power. Dots on the left margins indicate proportions of significant runs from a single, panmictic population and represent the alpha errors of the tests (note that some dots overlap).

… extent (Figure 5). STRUCTURE was somewhat more affected by uneven representation than was the heterozygote-deficiency test, but for highly divergent populations (FST > 0.1) power remained reasonably high (>0.5) for both methods even with highly skewed representation (5/95 proportions).
Alpha errors, that is, the proportion of simulation runs with a single, panmictic population only but which nevertheless resulted in a significant test outcome for mixture, were always close to the nominal alpha (5%) level for the heterozygote-deficiency test (cf. … Table S3). The STRUCTURE test also tended to display alpha errors in the vicinity of the alpha level, but was more variable and ranged between 0.01 and 0.16 (Figure 2; Supporting Information Table S3).

| DISCUSSION
In our literature case study of sympatric populations using salmonid fishes in freshwater habitats as models, we found that the majority of reported cases (98 out of 136) refer to non-cryptic populations that were identified by differences in phenotypic and/or behavioural traits. Only nine of the 136 examples that we found refer to cryptic, sympatric populations, leaving the impression that such populations are rare. Moreover, we found that genetic divergence was on average higher between cryptic than between non-cryptic populations. This is contrary to expectation because phenotypically cryptic populations are commonly thought to be evolutionarily young (see review and discussion by Fišer, Robinson, & Malard, 2018) and therefore less differentiated at neutral loci. Thus, the finding that cryptic populations instead tended to be more differentiated suggests that reported cases provide a biased view and represent situations where statistical power was high. Ecological divergence in sympatric populations appears to differ with respect to habitat, but in the case of cryptic sympatry, obvious genetic differences are typically associated with only weak and unclear divergence in resource use, leaving the evolutionary mechanisms behind such structuring presently unclear.
Recent works that were not included in our literature review find refined food niche separation in three sympatric genetically divergent groups of brown trout in Loch Leidon, Scotland (Piggott et al., 2018), and genomic signals indicating selection between the non-cryptic life history forms of brown trout of Loch Maree, Scotland (Jacobs, Hughes, Robinson, Adams, & Elmer, 2018). Evidence for sympatric genetic divergence between non-cryptic Arctic charr populations was reported by Salisbury et al. (2018) in Ramah Lake in Labrador, Canada, and Guðbrandsson et al. (2018) found gene expression divergence during early development among non-cryptic populations of this species in Lake Thingvallavatn in Iceland.
We used computer simulations to evaluate the statistical power of methods that utilize genetic markers for detecting sympatric populations in a sample of individuals, without prior groupings. Among methods, the Hardy-Weinberg test represents the classical approach and different variants of this test have been developed. For the particular purpose of detecting Wahlund effects, the exact heterozygote-deficiency test … was used, as this seems the most appropriate and has seen wide use in empirical studies (the original paper was cited >600 times at Web of Science by June 2018, but most papers using this method probably cite the GENEPOP papers, Ray- … (Chakraborty & Zhong, 1994; Salanti, Amountza, Ntzani, & Ioannidis, 2005).
Methods that simultaneously utilize LD and heterozygote deficiencies have been developed, most notably in the STRUCTURE software (Pritchard et al., 2000), and been widely applied (>16,000 citations).
Simulation studies characterizing statistical properties of STRUCTURE include Castric et al. (2002), Manel, Berthier, and Luikart (2002), Evanno, Regnaut, and Goudet (2005), Latch, Dharmarajan, Glaubitz, and Rhodes (2006), Patterson, Price, and Reich (2006), Waples and Gagiotti (2006), Anderson and Dunham (2008) … Castric et al., 2002; Latch et al., 2006; Waples & Gagiotti, 2006). More specifically, Patterson et al. (2006) … are not entirely independent and that more alleles occurred in low frequencies as compared to the di-allelic SNPs. The expression is nevertheless useful also for microsatellites as a rough guideline for conditions where the statistical power of the STRUCTURE test becomes too low to be of practical use. However, this guideline refers to population mixtures in equal proportions and our simulation results show that power of STRUCTURE is reduced when populations are unequally represented in the sample (Figure 5, right), as also found earlier (Puechmaille, 2016; Wang, 2017). This reduction in power and subsequent increase in threshold FST appears to be directly related to the relative proportion of the two populations in the sample, or r = n1/n2 (where n2 is the larger, so that r ≤ 1). Using r, we may tentatively consider a modified expression for the threshold FST for STRUCTURE: …

… ranged from 5 to 25 with a mean of 10, and for allozymes, the number of (polymorphic) loci was even lower. Combined with a moderate sample size, we conclude that most studies would not have been able to detect sympatric populations from genetic data alone. In the present genomics era, increasing the number of loci substantially is no longer a problem, although sample sizes still tend to be moderately low.
Perhaps more problematic from a planning perspective is our finding that uneven population representation in the sample reduces power of detection substantially. Unless there is some unknown biological reason for sympatric populations to occur in even proportions in the sample area, simple combinatorics dictate that most samples will contain populations in uneven proportions and often highly so. As a case in point, both the sympatric brown trout populations in Lakes Bunnersjöarna and in Trollsvattnen occurred in very similar proportions, averaging 45% and 55% (Ryman et al., 1979) and 47% and 53% (Palmé et al., 2013), respectively. Although the alternative fixation of the LDH-1 alleles in Lake Bunnersjöarna makes statistical power in that particular case largely irrelevant, the high proportion of both types certainly brought attention to the phenomenon as not just a technical artefact with a few samples.
A complicating factor relating to detection of cryptic sympatry that we have not addressed here is that the degree of divergence most likely differs among regions of the genome. Such differences might explain the difficulty in detecting the two cryptic populations of brown trout in Lakes Trollsvattnen that we have reported and monitored over time (Andersson, Jansson et al., 2017; Andersson, Johansson et al., 2017; Jorde & Ryman, 1996; Palmé et al., 2013) with six microsatellites as compared to 14 allozyme loci. In fact, the degree of divergence between these populations using allozymes is estimated as FST = 0.1, whereas when applying ~3,000 SNPs, we find a lower FST = 0.03 (Andersson, Jansson et al., 2017). Clearly, more research is needed into the issue of cryptic sympatry to understand the evolutionary background to their existence. From the perspective of conservation management, mapping the existence of this type of biodiversity over space and monitoring it over time is important.

ACKNOWLEDGEMENTS
We thank three anonymous reviewers and the subject editor for important comments on a previous version of this manuscript.

DATA ACCESSIBILITY
Results of the computer simulations and R-scripts for generating the