Mate choice by phenotype matching, whereby individuals prefer a mate whose phenotype is similar to their own, should facilitate speciation with gene flow. This is because the genes that control mate signal (the phenotype being matched) also determine the preferred mate signal (“mate preference”). Speciation is made even easier if phenotype matching is based on a trait under divergent natural selection. In this case, assortative mating should readily evolve as a byproduct of divergent selection on the trait. Previous observational studies of assortative mating between sympatric, hybridizing threespine stickleback species (Gasterosteus aculeatus complex) suggested that phenotype matching might occur by body size, a trait under divergent natural selection. To test this, we used experimental manipulation of body size to rule out the effects of confounding variables. We found that size-manipulated benthic and limnetic stickleback females prefer mates whose body size more closely matches their own. It is thus likely that assortative mating by phenotype matching has facilitated the origin and persistence of benthic and limnetic threespine sticklebacks in the face of gene flow.

Preference for mates having a similar phenotype to one's self, which we term “mate choice by phenotype matching” (after “phenotype matching” in Lacy and Sherman 1983), is interesting in the context of speciation, because of what it implies about the genetics underlying mate signal and mate preference. The evolution of assortative mating between diverging populations that still experience gene flow can be hindered by recombination between alleles for mate preference and those for mate signal (e.g., model 2 in Kondrashov and Kondrashov 1999, Doebeli 2005). When mate choice occurs by phenotype matching, however, such recombination cannot occur, because mate preference is based on one's own signal phenotype (e.g., model 2 in Dieckmann and Doebeli 1999, model 1 in Kondrashov and Kondrashov 1999). This implies that the genes underlying mate signal also determine mate preference values and, therefore, recombination cannot dissociate the two. Mate choice by phenotype matching should thus facilitate the evolution of assortative mating between diverging populations. Furthermore, if mate preference is for matching mate signal traits (mate choice by phenotype matching), and divergent natural selection acts on the mate signal (a.k.a. magic trait; reviewed in Servedio et al. 2011) then ecological speciation with gene flow (Schluter 2001) is greatly facilitated (e.g., model 1 in Dieckmann and Doebeli 1999). This is because a target(s) of divergent selection and both components of mate recognition are all controlled/determined by the same genes and assortative mating may readily evolve as an automatic byproduct when divergent selection acts.

Here we test the hypothesis that mate choice by phenotype matching based on body size (hereafter “size matching”) occurs in the benthic and limnetic species pair of threespine stickleback (Gasterosteus aculeatus complex) residing in Paxton Lake on Texada Island, British Columbia (BC). For convenience, and following previous practice, we refer to benthics and limnetics as species because they are almost completely reproductively isolated in the wild (Schluter 1993; McPhail 1994; Schluter 1995; Nagel and Schluter 1998; Hatfield and Schluter 1999; Vamosi and Schluter 1999; Rundle et al. 2000; McKinnon and Rundle 2002; Boughman et al. 2005; Gow et al. 2007), although they have not been formally designated. The Paxton Lake pair is one of several stickleback species pairs found in small lakes of coastal BC (Schluter and McPhail 1992). In each case, one of the species, the limnetic, is small-bodied and feeds primarily on plankton in the open water whereas the second species, the benthic, is large-bodied and feeds primarily on benthos in the littoral and benthic zones (Schluter and McPhail 1992; Schluter 1993). The lakes formed shortly after the Pleistocene glaciers receded (10,000–12,000 years ago) and likely experienced two separate invasions by ancestral stickleback. Thus, an initial period of allopatry has likely contributed to divergence between the sympatric species (Schluter and McPhail 1992; McPhail 1994; Taylor and McPhail 2000). Nonetheless, since the establishment of sympatry, these pairs have been subject to gene flow (Taylor and McPhail 1999; Gow et al. 2006), yet they have evolved and/or maintained large differences in morphology, ecology, and mate recognition (McPhail 1992; Schluter 1996; Taylor and McPhail 2000). 
A conspicuous body size difference is important not only because it represents adaptation to the alternate foraging habitats (big fish have higher foraging efficiency and growth rate in the littoral zone, whereas small fish have the advantage in the open water; Schluter 1993, 1995; Hatfield and Schluter 1999), but also because it appears to be an important mate signal trait involved in assortative mating between the two species (Nagel and Schluter 1998; Boughman et al. 2005; Albert 2005). Previous observational studies suggest that size matching may occur. Using no-choice mating trials between wild-caught individuals, Nagel and Schluter (1998) found that females hybridized only when placed with a male of the opposite species that was similar to them in size, indicating that preference changes with the size difference between members of a mating pair. However, hybridization events tended to occur late in the season when the smallest benthics and the largest limnetics were in breeding condition, so any effect of date on the level of discrimination is confounded with differences in body size (Nagel and Schluter 1998). Boughman et al. (2005) found the same negative correlation between size difference and mating compatibility in the Paxton Lake pair and two other independently evolved species pairs. However, the effects of trial date were not investigated. Furthermore, other variables that tend to be correlated with body size in the wild, such as trophic niche and age, add uncertainty to the conclusion, based on observational studies, that size matching occurs.

To overcome this problem, we used experimental manipulation of body size followed by mate choice trials to determine whether size matching contributes to assortative mating between benthic and limnetic sticklebacks. This experimental approach was used previously by McKinnon et al. (2004), who found that in world-wide pairs of stream and marine threespine stickleback populations, female preference changed when female size was manipulated; females were more likely to prefer a male of the opposite ecotype when they were manipulated to be more similar in size to him. This approach, however, has never been used for the benthic and limnetic pairs, a study system that has become an important model of species divergence and persistence in the face of gene flow. Here, we compare the propensity of benthic and limnetic females to accept males of the opposite species when the females were manipulated by diet to be either more similar or more different in size to those males. Importantly, within each species, all females came from the same population and were randomly assigned to treatments. Thus, the effects of genotype, age, early-life experience, and other differences potentially affecting mate preferences were randomized. If size matching is not occurring, then manipulation of females’ body size should not affect their probability of accepting the heterospecific male. However, if size matching is occurring, then females manipulated to be similar in size to the heterospecific male will be more likely to accept him than females manipulated to be different in size from him.



In the fall of 2009, we collected juvenile fish from two ponds at the UBC Experimental Pond Facility, one containing only Paxton benthics and the other containing only Paxton limnetics. All individuals were thus naive to the other species. Progenitors of the pond populations had been collected from the wild and introduced to the ponds in April of 2008. We collected 400 juveniles of each species and randomly assigned them to tanks of their own species and to size-manipulation groups (either “abundant-food” or “reduced-food”). In total, for each of the size-manipulation groups there were 10 benthic tanks and 6 limnetic tanks.


To generate size differences between groups within a species, we provided them with different amounts of food (a mix of blood worms and mysis shrimp). We periodically took measures of standard length, body mass, and condition factor (a scaled ratio of body mass to standard length, $(\text{body mass} \times 10^{5})/\text{standard length}^{3}$; Williams 2000) from a sample of the fish from each group. To determine the amount of food for the abundant-food group that would maximize their growth rate, we periodically gave them ad libitum feedings and converted the amount consumed into the percent of the group's estimated average body weight that was consumed per fish on average. Each day, we gave each abundant-food tank this amount of food multiplied by the number of fish in the tank. We fed the reduced-food group one-half to one-third of the amount (corrected by their group's estimated average body weight) given to the abundant-food group of the same species. For any given period of time, the exact reduction in food availability to the reduced-food group depended on our desired condition and growth rate for them, with the goal being that they be fed as little as possible, while preventing significant reductions in condition factor relative to the abundant-food group.
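The condition factor and the daily ration described above can be sketched as follows. This is only an illustration: the function and parameter names are hypothetical, and the units (grams and millimetres) are an assumption, because the text does not state them.

```python
def condition_factor(body_mass_g, standard_length_mm):
    """Scaled mass-to-length ratio: (body mass x 10^5) / standard length^3.

    Units (g, mm) are assumed here; the text does not specify them.
    """
    return body_mass_g * 1e5 / standard_length_mm ** 3


def daily_ration_g(fraction_of_body_mass, mean_body_mass_g, n_fish,
                   reduction_factor=1.0):
    """Food given to one tank per day.

    fraction_of_body_mass: share of the group's average body mass eaten per
        fish during an ad libitum feeding (supplied by the caller).
    mean_body_mass_g: estimated average body mass of the group being fed.
    reduction_factor: 1 for abundant-food tanks, 2 to 3 for reduced-food tanks.
    """
    return fraction_of_body_mass * mean_body_mass_g * n_fish / reduction_factor
```

Passing the reduced-food group's own average body mass reflects the correction by group body weight mentioned in the text.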


We overwintered the fish in a temperature-controlled chamber to synchronize the onset of breeding condition upon release from winter conditions. Overwintering took place from 15 February through 26 June 2010. At the start of “winter,” temperature was gradually lowered (1°C/day) from 17°C to 8°C, and photoperiod was gradually decreased (1 h of light/day) from 16:8 L:D to 8:16 L:D. At the end of winter, the temperature and photoperiod were gradually increased, reversing the above changes.
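As a rough sketch of the ramp into winter conditions, the daily set points can be generated as below. The helper name is hypothetical, and the one-step-per-day schedule is taken directly from the rates given above.

```python
def ramp(start, end, step_per_day):
    """Daily set points from start to end, changing by step_per_day each day."""
    n_steps = round(abs(end - start) / step_per_day)
    direction = 1 if end > start else -1
    return [start + direction * step_per_day * day for day in range(n_steps + 1)]


temperature_c = ramp(17, 8, 1)  # degrees C: 17, 16, ..., 8
light_hours = ramp(16, 8, 1)    # hours of light: 16, 15, ..., 8
```

Under these rates the temperature ramp takes nine days and the photoperiod ramp eight days; calling `ramp(8, 17, 1)` and `ramp(8, 16, 1)` gives the reversed schedule for the end of winter.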


We conducted no-choice trials, in which a single female and a single male were allowed to interact. No-choice trials result in a score reflecting female acceptance of the male, and comparison of such scores between treatments is generally thought to provide a good estimate of the female preference function (Wagner 1998; Bush et al. 2002). Females used in the trials came from both abundant-food and reduced-food tanks. All benthic males were from the reduced-food group (making them more similar in size to limnetics) and all limnetic males were from the abundant-food group (making them more similar in size to benthics). Thus, the two treatments for each species of female were “different size” (control) and “similar size” (experimental) relative to the heterospecific male (Fig. S1). In total, 15 successful trials were conducted for benthic females similar in size to a limnetic male and 19 for benthic females different in size to a limnetic male. Thirteen successful trials were conducted for limnetic females similar in size to a benthic male and 10 for limnetic females different in size to a benthic male.

As a result of providing different amounts of food to generate size differences among treatments, the comparison of female mate preference between large and small experimental fish was confounded with that between abundant- and reduced-food fish. However, one feature of the experimental design provides a control for the effects of diet manipulation, nutrition, and other correlated effects. In the larger benthic species, the control trial involved a large (abundant-food) female and the experimental trial involved a small (reduced-food) female. Conversely, in the smaller limnetic species, the control involved a small (reduced-food) female and the experimental trial involved a large (abundant-food) female. Thus, if diet/nutrition were to bias female mate preference in a predictable direction (e.g., reduced-food fish prefer larger males) then we would expect the direction of effect to be consistent with female food amounts in both species rather than control versus experimental group.

Mating trials took place in 110-L tanks that were visually isolated from neighboring tanks. Each tank contained limestone gravel covering the bottom, small sprigs of plastic plants located in the two rear corners, a nesting dish filled with sand and soil located in the rear left corner, and a small bunch of java moss anchored next to the right side of the nesting dish. In addition, tanks were regularly replenished with short pine needles, which stickleback males use for structural support in nest building.

We added males showing signs of breeding condition (i.e., red throat color) to trial tanks and gave them five full days to build a nest. A gravid female contained in a jar was placed in the tank of each male for at least 15 min/day. These “motivator females” were not used in mating trials. If a male built no nest within 5 days, we returned him to his rearing tank and replaced him.

Gravid females were randomly assigned to heterospecific males with nests. At the start of a trial, we released the female into the male's tank, as far from the male as possible. In all trials conducted for this study, the male began to court the female within 5 min. Trials lasted 40 min or until spawning. Because spawning occurred only once, at around 39 min, all trials were about the same duration. We recorded the following well-established courtship behaviors (see Tinbergen 1952; Rowland 1994 for a more complete description) if and when they occurred using the software Event Recorder (Berger and Bleed 2003). Male behaviors: male approaches female, male bites female, male zigzags toward female, male attempts to lead female to his nest, male performs nest-maintenance behaviors, and male creeps through his nest. Female behaviors: female exhibits “head-up” posture, female follows male toward his nest, female approaches male, female inspects male's nest, and female deposits eggs in nest.

Immediately after a trial, we recorded a score to visually qualify males’ red nuptial color brightness/intensity, ranging from 1 to 3. We also measured standard length and body mass for both the female and the male. Finally, if a female did not spawn, we gently squeezed the eggs from her oviduct at the end of the trial to confirm receptivity (Nagel and Schluter 1998). Any female found not ready to mate was excluded (this occurred in 15 of 72 trials). Among those deemed ready to mate, variation in degree of readiness likely existed. However, this variation was randomized among treatments.

Each female was used in only one trial. However, due to male limitation, we used some males in a second trial with the opposite type of female 1–3 days later. For males used twice, we measured standard length and body mass after their second trial. To ensure that using some males twice did not affect the results of the experiment, the data were also analyzed using only the first trial for each male.

We used a dichotomous score to describe female acceptance of the heterospecific male. A score of 0 indicates no acceptance and a score of 1 indicates that some degree of acceptance was shown. More specifically, a score of 1 was assigned if the female reciprocated in courtship with one or more of the following behaviors: female exhibits head-up posture, female follows male to his nest, female approaches male at his nest, female inspects male's nest, female enters male's nest/deposits eggs. Alternatively, a score of 0 was assigned if the female did not reciprocate with any courtship behaviors. We chose this scoring method to maximize the amount of variation available to be analyzed (spawning occurred in only 1 of 57 trials and courtship rarely proceeded to the penultimate step of mating). Female courtship behaviors are known to proceed in a fixed sequence (as listed above) and females may terminate courtship at any point along this sequence. Therefore, it can be inferred that with any given threshold for a score of 1 along this sequence, the females with a score of 1 would be more likely to eventually mate than those with a score of 0. To demonstrate that our results were not dependent on the particular threshold chosen, we also analyzed the data using a more stringent requirement for a score of 1, whereby the female had to at least follow a male to his nest.
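The two scoring thresholds can be illustrated with a minimal sketch. The behavior labels below are hypothetical shorthand for the behaviors listed above, not the codes used in Event Recorder.

```python
# Female courtship behaviors in the fixed order given in the text
# (labels are hypothetical shorthand).
SEQUENCE = [
    "head_up",
    "follows_to_nest",
    "approaches_at_nest",
    "inspects_nest",
    "enters_nest_or_deposits_eggs",
]


def acceptance_score(observed, threshold="head_up"):
    """Return 1 if any observed behavior is at or beyond the threshold step, else 0."""
    cutoff = SEQUENCE.index(threshold)
    return int(any(SEQUENCE.index(behavior) >= cutoff for behavior in observed))
```

With the default threshold any reciprocated behavior scores 1; passing `threshold="follows_to_nest"` gives the more stringent score, under which a female showing only the head-up posture scores 0.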


For benthic females only, there was a significant difference in the dates of control and experimental trials ($F_{1,32} = 14.194$, $p = 7 \times 10^{-4}$), because females manipulated to be large began to come into and go out of breeding condition earlier in the season than those manipulated to be small. This difference did not exist for limnetic females ($F_{1,21} = 0.1716$, $p = 0.683$). To completely eliminate trial date as a confounding variable, we also analyzed a dataset containing all trials with limnetic females and only those trials with benthic females that occurred after the first trials involving benthic females from the reduced-food regime (“similar-size” treatment) and before the last trials with the abundant-food regime (“different-size” treatment). This eliminated 10 of 19 benthic control trials and 3 of 15 benthic experimental trials. In this dataset, no statistical difference in the dates of control and experimental trials was found ($F_{1,42} = 0.2275$, $p = 0.601$).
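The date restriction applied to the benthic trials can be sketched as a simple window filter. The function name and the data layout are hypothetical, and treating the two boundary dates as inclusive is an assumption, since the text does not say how trials on the boundary dates were handled.

```python
from datetime import date


def restrict_to_overlap(trials):
    """Keep trials between the first 'similar' trial and the last 'different' trial.

    trials: list of (trial_date, treatment) pairs, with treatment either
    'similar' (reduced-food benthic females) or 'different' (abundant-food).
    Boundary dates are kept (an assumption, not stated in the text).
    """
    first_similar = min(d for d, t in trials if t == "similar")
    last_different = max(d for d, t in trials if t == "different")
    return [(d, t) for d, t in trials if first_similar <= d <= last_different]
```

Only benthic trials would be passed through this filter; all limnetic trials are retained as described.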


We used linear models to test for the effects of size manipulation on standard length, body mass, and condition factor. We used logistic regression in a generalized linear model context to test for effects on female acceptance. Our main analysis tested for the effects of experimental treatment, as well as female species and the interaction of treatment and female species. We further examined whether any other measured explanatory variables correlated with female acceptance after accounting for the effects of treatment, by including them one-at-a-time as the second term in a model also including treatment as the first term. These variables included breeding date, male nuptial color, and six male courtship behaviors. Finally, to determine the effects of treatment on male behavior, we used linear models to analyze data from only males’ first trials. All analyses were performed in R v2.12.1 (R Development Team 2010).
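For readers who want to see how a treatment effect of this kind is tested, here is a minimal sketch of a likelihood-ratio (deviance) test for a logistic model with a single binary predictor. It is not the authors' R code: it assumes the reported chi-square values are deviance-based, uses only the Python standard library, and exploits the fact that with one binary predictor the fitted probabilities are simply the group proportions. All function names are hypothetical.

```python
import math


def binomial_loglik(successes, n):
    """Maximized binomial log-likelihood for `successes` out of `n` trials."""
    ll = 0.0
    for count in (successes, n - successes):
        if count > 0:  # a zero count contributes nothing (0 * log 0 = 0)
            ll += count * math.log(count / n)
    return ll


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def treatment_lrt(accept_similar, n_similar, accept_different, n_different):
    """Likelihood-ratio test of treatment (similar vs. different size)."""
    ll_full = (binomial_loglik(accept_similar, n_similar)
               + binomial_loglik(accept_different, n_different))
    ll_null = binomial_loglik(accept_similar + accept_different,
                              n_similar + n_different)
    chi2 = max(0.0, 2.0 * (ll_full - ll_null))  # guard against rounding below 0
    return chi2, chi2_sf_df1(chi2)
```

As a consistency check, `chi2_sf_df1(20.95)` is roughly 5 × 10^-6, in line with the p-value reported for treatment in Table 1. Models with additional terms, such as the treatment by species interaction, require an iterative fit and are outside this sketch.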



The size manipulation resulted in nearly nonoverlapping female size distributions between treatments within each species, with highly significant differences in mean for both standard length (Fig. 1) and body mass (Table S1). Condition factor was slightly, but not significantly, greater in females manipulated to be large than in those manipulated to be small (Table S1), suggesting that the manipulation successfully affected the extent of growth without greatly compromising the relative mass for a given size.

Figure 1. Resulting standard length distributions after manipulation of body size. (A) Benthic females manipulated to be large and small compared with the limnetic males (manipulated to be large) that they were paired with. (B) Limnetic females manipulated to be large and small compared with the benthic males (manipulated to be small) that they were paired with.


Treatment (similar size vs. different size) had a highly significant effect on the female acceptance score (Table 1). Females paired with a male of the opposite species were more likely to reciprocate courtship behaviors when they were manipulated to be similar in size to him than when they were manipulated to be different in size to him (Fig. 2). This treatment effect was present in all models, and no other explanatory variables significantly correlated with female acceptance either before (tests not shown) or after the effects of treatment were accounted for (Table 1). The analyses using (1) the more stringent female acceptance score (Table S2), (2) only males’ first trials (Table S3), and (3) a subset of the data for which trial date was not a confounding variable (Table S4) produced the same results.

Table 1. Logistic regressions to test for the effects of explanatory variables on female acceptance scores. Each model includes treatment and the one additional variable indicated. Treatment was entered first and so has identical effects in all models except model 1i, due to missing data points for male nuptial color. The second explanatory variables in models 1c–h are male courtship behaviors. N = 57 mate choice trials.

| Model | Explanatory variable | df | χ² | p |
|---|---|---|---|---|
| 1a–h | Treatment (similar vs. different size) | 1 | 20.95 | $5 \times 10^{-6}$ *** |
| 1i | Treatment (similar vs. different size) | 1 | 18.48 | $2 \times 10^{-5}$ *** |
| 1a | Female species | 1 | 2.14 | 0.14 |
| 1a | Treatment × female species | 1 | 0.57 | 0.45 |
| 1b | Trial date | 1 | 0.58 | 0.45 |
| 1c | No. of approaches | 1 | 0.62 | 0.43 |
| 1d | No. of zig-zags | 1 | 0.10 | 0.75 |
| 1e | No. of bites | 1 | 0.04 | 0.84 |
| 1f | No. of leads to nest | 1 | 0.08 | 0.78 |
| 1g | No. of nest maintenance events | 1 | 0.23 | 0.63 |
| 1h | No. of nest creep-throughs | 1 | 3.28 | 0.07 |
| 1i | Male nuptial color | 1 | 0.20 | 0.29 |

Figure 2. Mean female acceptance score for treatments in which the female was manipulated to be different in body size versus similar in body size to a male of the opposite species, with 95% confidence intervals. Means for limnetic females are connected by a solid line, and for benthic females by a dashed line.


Males did not behave significantly differently toward heterospecific females of different size groups except in 1 of 12 comparisons (Table 2): limnetic males maintained their nest slightly more often when paired with similar-sized females than when paired with different-sized females, whereas there was no detectable difference in their other five behaviors. Benthic males did not behave detectably differently toward females in different treatments.

Table 2. Linear models to test for effects of female treatment (similar vs. different size) on the number of times males exhibited each of six courtship behaviors, using data from their first trial only.

| Model | Response variable | Benthic males: F (df) | Benthic males: p | Limnetic males: F (df) | Limnetic males: p |
|---|---|---|---|---|---|
| 2a | No. of approaches | 0.06 (1,14) | 0.81 | $5 \times 10^{-4}$ (1,25) | 0.98 |
| 2b | No. of zig-zags | 1.03 (1,14) | 0.33 | 0.02 (1,25) | 0.90 |
| 2c | No. of bites | 0.20 (1,14) | 0.66 | 1.44 (1,25) | 0.24 |
| 2d | No. of leads to nest | 0.02 (1,14) | 0.89 | 0.22 (1,25) | 0.65 |
| 2e | No. of nest maintenance events | 0.80 (1,14) | 0.39 | 4.23 (1,25) | 0.05 * |
| 2f | No. of nest creep-throughs | 0.02 (1,14) | 0.90 | 1.10 (1,25) | 0.30 |


This study provides experimental evidence that both benthic and limnetic stickleback females prefer mates of the opposite species whose body size more closely matches their own. Males appeared to exhibit little-to-no preference, and no effect of male courtship behaviors on female acceptance scores was detected. Unlike previous studies, these results cannot be explained by timing in the mating season, or by any other traits that were not directly affected by manipulation of body size. Our results imply that body size, a trait under divergent natural selection that functions as a mate signal (Schluter 2001; McKinnon and Rundle 2002), also determines a female's preferred size (referred to as simply “mate preference” below) via phenotype matching.

Along with body size, other traits that distinguish the species, such as shape, color, and behavior are likely involved in assortative mating as well. Indeed, Southcott et al. (pers. comm.) attribute a large increase in premating reproductive isolation to interactions of some other species-specific trait(s), with body size and, in benthics, color. Conspicuous shape differences, which have not traditionally been quantified in stickleback mate choice studies, are an interesting candidate. Other studies too have found effects of male nuptial color on female mate preference (Boughman 2001; Boughman et al. 2005). However, note that any traits that were not directly affected by body size were randomized among treatments and therefore cannot explain our results.

Size matching in benthic and limnetic sticklebacks is interesting because of what it implies for the genetics of divergence with gene flow between them. Size matching implies that the genes underlying a target of divergent natural selection, mate signal, and mate preference are one and the same and, thus, these traits cannot be dissociated during divergence with gene flow. Although the phenomenon of size matching itself may have a separate genetic basis (encoding, for example, a mate choice rule that dictates: “prefer others whose size is like mine”), such loci effectively transfer the determination of mate preference to body size loci. As a result, divergent selection on body size should lead to assortative mating by body size as a byproduct. Thus, the genetics of divergence between the Paxton Lake species is, in this one way, extremely favorable for the speciation process. This finding helps to explain the evolution and persistence of the species pair in the face of gene flow. These results also agree with those of a similar study of marine and stream pairs of threespine stickleback (McKinnon et al. 2004). Because the marine population is ancestral to most freshwater forms, this raises the possibility that the evolution of size matching predated and subsequently facilitated the evolution of the benthic and limnetic species pair and, indeed, the entire adaptive radiation of threespine sticklebacks into freshwater environments around the northern hemisphere in the late Pleistocene.

How do stickleback females know their own size? They may learn their own size by “self-referencing” (Hauber and Sherman 2001), for example, by directly judging how well-matched their own size (or some other phenotype that changes as a byproduct of changing body size) is with that of others. Termed “self-referent mate choice,” “self-referent assortative mating,” or “self-referent phenotype matching,” this mechanism has been an important component in many theoretical models of speciation with gene flow (e.g., Gavrilets and Boake 1998; Dieckmann and Doebeli 1999; Kondrashov and Kondrashov 1999; Kirkpatrick and Nuismer 2004; Verzijden et al. 2005; Kisdi and Priklopil 2011). However, as yet, there have been no tests of self-referencing mechanisms in stickleback. Alternatively, stickleback females may indirectly learn their own size via sexual imprinting or social learning by associating with other stickleback that tend to have the same phenotype as they do, such as their father or siblings (Lacy and Sherman 1983; Hauber and Sherman 2001). This mechanism too has received a good deal of attention in both the theoretical and empirical literature on speciation with gene flow (Laland 1994; Irwin and Price 1999; Owens et al. 1999; Verzijden et al. 2005). Our study and others can shed light on the role that sexual imprinting and/or social learning may play in size matching. Two independent studies in benthics and limnetics have failed to find evidence that daughters sexually imprint on their father's body size (Albert 2005; Kozak et al. 2011). Furthermore, in our study, any effects of sexual imprinting were randomized between treatments. However, individuals in our study were raised with conspecifics, and from late adolescence to adulthood were kept in tanks with other individuals in the same size-manipulation group. Thus, it is possible that social learning from conspecifics contributes to size matching.

Because our experimental sticklebacks were naïve to heterospecifics up until their mate choice trial, our results suggest that experience with heterospecifics is not necessary for the development of size matching. Interestingly, Kozak and Boughman (2008) found that juveniles raised with mostly conspecifics spent more time shoaling with conspecifics of similar size than with those of different size, whereas juveniles raised with mostly heterospecifics spent equal amounts of time shoaling with conspecifics of similar and different sizes. Their results suggest that juvenile experience with conspecifics may be necessary for the development of size matching and thus that a social learning component exists, at least in the case of shoal member preferences. Nonetheless, the relative contributions to size matching of self-reference and learning from siblings or conspecifics remain to be determined.

In general, phenotype matching has previously been studied extensively as a means of kin recognition, inbreeding avoidance, and mate choice for an optimal complementation of major histocompatibility complex (MHC) alleles (Bateson 1978; Lacy and Sherman 1983; Holmes 1986; Heth et al. 1998; Petrie et al. 1999; Hauber and Sherman 2001; Landry et al. 2001; Mateo and Johnston 2001; Milinski 2006; Le Vin et al. 2010). Its potential role in assortative mating and speciation has received less attention. Theoretical models of speciation have implemented various types of mate choice by phenotype matching (Laland 1994; Gavrilets and Boake 1998; Dieckmann and Doebeli 1999; Kondrashov and Kondrashov 1999; Kirkpatrick and Nuismer 2004; Verzijden et al. 2005; Kisdi and Priklopil 2011), but empirical tests are still lacking. Our results, along with those of McKinnon et al. (2004), indicate that assortative mating by size matching may have greatly facilitated the rapid and repeated diversification of sticklebacks in freshwater ecosystems where body size is under divergent selection. Other similar examples involve sexual imprinting (reviewed in Irwin and Price 1999). For example, Darwin's finches sexually imprint on their father's song, which is directly affected by beak shape, a trait under divergent selection (Podos 2001; Grant and Grant 2008). Although the signal trait is actually song, a pattern of beak shape phenotype matching emerges because song changes as a byproduct of changing beak shape. Divergent selection on beak shape should readily lead to assortative mating by beak shape as a byproduct. More studies are greatly needed to determine the prevalence of similar phenomena in nature, as there have been few tests to date. For example, while many studies have looked at whether a trait under divergent selection is also a mate signal, whether mate preference is determined by that trait through phenotype matching has rarely been investigated.


Special thanks to J. Best for help with the set-up and implementation of the experiment. Thanks to M. Arnegard and G. Velema for help with collection of fish. M. Arnegard, G. Blackburn, R. Fitzjohn, T. Ingram, J. Lee-Yaw, L. M'Gonigle, S. Otto, K. Samuk, M. Servedio, L. Southcott, M. Whitlock, and S. Via provided valuable comments on the manuscript. The authors have no conflict of interest to declare. This research was funded by the Natural Sciences and Engineering Research Council (NSERC) of Canada. GLC was supported by the NSERC Collaborative Research and Training Experience Program (CREATE).