Male-specific DNA markers provide genetic evidence of an XY chromosome system, a recombination arrest and allow the tracing of paternal lineages in date palm


Author for correspondence:

Frédérique Aberlenc-Bertossi

Tel: +33 467 416193



  • Whether sex chromosomes are differentiated is an important aspect of our knowledge of dioecious plants, such as date palm (Phoenix dactylifera). In this crop plant, the female individuals produce dates, and are thus the more valuable sex. However, there is no way to identify the sex of date palm plants before reproductive age, and the sex-determining mechanism is still unclear.
  • To identify sex-linked microsatellite markers, we surveyed a set of 52 male and 55 female genotypes representing the geographical diversity of the species.
  • We found three genetically linked loci that are heterozygous only in males. Male-specific alleles allowed us to identify the gender in 100% of individuals. These results confirm the existence of an XY chromosomal system with a nonrecombining XY-like region in the date palm genome. The distribution of Y haplotypes in western and eastern haplogroups allowed us to trace two male ancestral paternal lineages that account for all known Y diversity in date palm.
  • The very low diversity associated with Y haplotypes is consistent with clonal paternal transmission of a nonrecombining male-determining region. Our results establish the date palm as a biological model with one of the most ancient sex chromosomes in flowering plants.


Dioecy in plants, equivalent to gonochory in animals, is characterized by male (staminate) and female (pistillate) functions on different individuals. In contrast to animals where the rigorous separation of the sexes predominates, most plants are hermaphroditic, producing both male and female organs within the same flower. Only 6% of angiosperm species are dioecious (Renner & Ricklefs, 1995). Dioecy may have evolved several times in plants (Weiblen et al., 2000) from bisexual (hermaphroditic or monoecious) ancestors, by two independent successive mutations: a first mutation causing male sterility generating a gynodioecious population and a second mutation resulting in a decreased female fertility in the population leading to functional dioecy (Charlesworth, 1991). Theoretical predictions have shown that sex chromosomes can evolve in dioecious species only when the two sex determination genes are closely linked on the same chromosome, thus avoiding the production of sterile individuals (Charlesworth & Charlesworth, 1978). The mutations should also have complementary dominance, with the alleles controlling the heterogametic sex being dominant, whereas those controlling the homogametic sex are recessive. Sex chromosomes may then evolve a nonrecombining region around the sex-determining genes (Bull, 1983), for example, by chromosomal rearrangements such as inversions, translocations, deletions, and duplications (Lemaitre et al., 2009). The XY chromosome system and heterogametic males prevail in dioecious plant species (Westergaard, 1958). Here, we studied the sex-determining chromosomal system in the dioecious date palm, Phoenix dactylifera (Arecaceae), by searching for sex-specific genetic markers. This diploid species (2n = 2x = 36) has pronounced floral sexual dimorphism. The floral buds appear potentially bisexual, producing primordia of both sexes, and selective abortion of flower organs then occurs (Daher et al., 2010).

Date palm is currently the main crop of arid and semiarid countries of North Africa and the Middle East, producing fruits with high nutritional value and maintaining fertile areas of life in deserts. Sexual propagation has been practiced in the Near East since the Neolithic period (Tengberg, 2003), but the resulting progeny do not conserve the organoleptic fruit traits of the mother plant. Vegetative propagation through offshoots or tissue culture has thus been set up to maintain desired fruit characteristics. Today, date palm breeding is threatened by severe loss in genetic diversity, so it is essential to set up breeding programs to select varieties tolerant to biotic and abiotic stresses and to enrich the germplasm.

Early sex identification of female plants (which produce dates) would allow new breeding approaches via controlled crossings and facilitate marker-assisted selection (MAS) and genetic association studies. For these reasons, sex-linked markers have been sought in date palm for decades (Bekheet & Hanafy, 2011). Recently, Al-Mahmoud et al. (2012) identified male-linked markers, but this requires confirmation, as the markers were validated only in a small sample and with an unsatisfactory accuracy of 90%.

The sex ratio of date palm progeny is 1 : 1 (Saadi, 1990), which suggests genetic sex determination through a single locus. An XY chromosome system was proposed, based on heteromorphic chromocentres in male interphase nuclei (Siljak-Yakovlev et al., 1996), karyotype studies (AbdAlla & Abd El-Kawy, 2010) and on single nucleotide polymorphisms that appeared to segregate with gender (Al-Dous et al., 2011). The sex-determining mechanism is thus still uncertain and there is no reliable way to determine the sex of date palm plants before reproductive age.

In studies of other dioecious plants, associations between markers and sex provided the first reliable indication of the presence of sex chromosomes, for example, in Dioscorea tokoro (Terauchi & Kahl, 1999), Carica papaya (Parasnis et al., 1999) and Asparagus (Spada et al., 1998).

We thus searched for sex-specific genetic markers in P. dactylifera. Because of the date palm's long life cycle, cultivation practices favouring females, and the resulting lack of families with reliable segregation information, we used a population approach over a broad geographical range, along the lines of genetic association studies. We identified three sex-linked simple sequence repeat (SSR) loci which are reliable markers of sex. They demonstrate an XY chromosome system, reveal the existence of nonrecombining XY-like regions, and allow us to trace date palm paternal lineages.

Materials and Methods

Plant material

Date palms (Phoenix dactylifera L.) were collected over the geographical distribution of the species in order to capture the greatest possible genetic diversity, including 52 males and 55 females from traditional western (Tunisia, Morocco, and Italy) and eastern (Djibouti, Oman, Iraq and Syria) areas of cultivation (Supporting Information, Table S1).

DNA extraction

Leaf samples were freeze-dried for 72 h with an Alpha1-4LD Plus lyophilizer (Fisher Scientific, Illkirch, France) and ground with a Tissue Lyser System (Qiagen). DNA extraction was carried out using the DNeasy plant mini kit (Qiagen) according to the manufacturer's instructions. The DNA was quantified with a Tecan GENios™ fluorescence microplate reader (Tecan, Männedorf, Switzerland). All samples were adjusted to a concentration of 10 ng μl−1 for subsequent analyses.

Identification of autosomal and sex-linked SSR sequences

We performed an in silico search for SSR sequences in the whole date palm genome sequence (Al-Dous et al., 2011) using the Websat program, and we retained three autosomal SSRs (mPdIRD031, mPdIRD033, mPdIRD040). We added the locus mPdCIR078, previously randomly identified by Billotte et al. (2004). Sex-linked SSRs were searched for in the 24 gender-linked scaffolds identified by Al-Dous et al. (2011). They are located in noncoding regions. Primer 3 software included in the Websat program was used to design primers flanking the potential SSRs. Features of the SSR loci are presented in Table S2.

Genetic analyses

Polymerase chain reactions were performed in an Eppendorf (AG, Hamburg, Germany) thermocycler. The reaction volume was 20 μl and contained 10 ng of genomic DNA, 10× reaction buffer, 2 mM MgCl2, 200 μM dNTPs, 0.5 U polymerase, 0.4 pmol of the forward primer labelled with a 5′ M13 tail, 2 pmol of the reverse primer, 2 pmol of the fluorochrome-marked M13 tail, and MilliQ water. A touchdown PCR was carried out with the following parameters: denaturation for 2 min at 94°C, followed by six cycles of 94°C for 45 s, 60°C for 1 min and 72°C for 1 min, then 30 cycles of 94°C for 45 s, 55°C for 1 min and 72°C for 1.5 min, then 10 cycles of 94°C for 45 s, 53°C for 1 min and 72°C for 1.5 min, and a final elongation step at 72°C for 10 min. PCR products were analysed using an ABI 3130XL Genetic Analyzer (Applied BioSystems, Foster City, CA, USA). Allele size scoring was performed with GeneMapper software v3.7 (Applied BioSystems). We then compared allelic frequencies and allele size distributions between male and female individuals using the GenAlEx 6.41 program (Peakall & Smouse, 2006; Table S3); only loci generating sex-specific alleles were retained for the subsequent study.

We compared the genetic structure of the whole population inferred from the autosomal and the sex-linked SSRs, including comparisons between the male and female plants.

Genetic differentiation of the population was estimated by calculating Rst according to the stepwise mutation model (Slatkin, 1995; Table 1). The AMOVA procedure implemented in the GenAlEx 6.41 program (Peakall & Smouse, 2006) was carried out using a microsatellite distance matrix as data input for the calculation of Rst. The significance of Rst was determined by running 10 000 permutations (Fig. S1). Correspondence analysis (CA) was also performed using the GENETIX program (Belkhir et al., 2004).

Table 1. Genetic structuring of the date palm (Phoenix dactylifera) population generated by each sex-linked and autosomal microsatellite locus between the two sexes

Locus        Rst (a)
mPdIRDP52    0.461
mPdIRDP50    0.182
mPdIRDP80    0.171

(a) The Rst index measures the genetic differentiation generated by the two sets of loci (autosomal and sex-linked). The values of Rst obtained with the autosomal loci are around zero, which indicates an absence of structuring and is consistent with a high level of allele exchange between sexes. By contrast, Rst values obtained with the sex-linked loci (mPdIRDP52, mPdIRDP50 and mPdIRDP80) are significantly higher than zero and clearly show a genetic structuring of the population in relation to sex.

Table 2. Genetic variation generated by each sex-linked and autosomal microsatellite locus among male and female groups of date palm (Phoenix dactylifera)

Group              Locus      N   A   Ho     He     Fis (a)  Fis (b)  P-value (c)
Male, sex-linked   mPdIRDP52  52  11  1.000  0.803  −0.237   −0.102   0.000
Male, sex-linked   mPdIRDP50  52  9   1.000  0.744  −0.336   −0.243   0.000
Male, sex-linked   mPdIRDP80  52  4   1.000  0.743  −0.3380  −0.327   0.000

N, sample size; A, number of alleles; Ho, observed heterozygosity; He, expected heterozygosity.
(a) Weir & Cockerham's (1984) estimate.
(b) Robertson & Hill's (1984) estimate.
(c) Exact P-values estimated by the Markov chain method.

Ho and He generated with autosomal loci are similar in the male and female groups. Ho values range from 0.173 to 0.769 for the male group and from 0.182 to 0.800 for the female group, which reveals a wide genetic diversity within each group. By contrast, Ho values generated by sex-linked loci in the male group are equal to 1, which means that all male individuals are heterozygous, with highly significant negative values of the fixation index Fis (P = 0.000), revealing a departure from Hardy–Weinberg equilibrium toward the fixation of heterozygous genotypes. In the female group, He and Ho values are similar to those generated by autosomal loci, and Fis estimates are not significant (P > 0.05).

For the male and female groups, we compared the observed frequencies of heterozygotes (Ho) with those expected assuming Hardy–Weinberg genotype frequencies (He) using the GenAlEx 6.41 program (Peakall & Smouse, 2006). The excess of heterozygotes in the two groups was evaluated by two estimates of Fis, Weir and Cockerham's estimate (Weir & Cockerham, 1984) and Robertson and Hill's estimate (Robertson & Hill, 1984; Table 2). The latter has a lower variance under the null hypothesis (Robertson & Hill, 1984). The significance of Fis was determined by calculation of P-values estimated by the Markov chain method (Guo & Thompson, 1992; Table S4). These computations were performed with the Genepop 4.0.10 program (Raymond & Rousset, 1995; Rousset, 2008).
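The heterozygosity comparison can be illustrated with a minimal Python sketch. It uses the simple estimate Fis = 1 − Ho/He with uncorrected He, which is an assumption of this illustration and not the Weir & Cockerham or Robertson & Hill estimators computed by Genepop. The function name and the toy genotypes are hypothetical.

```python
from collections import Counter

def heterozygosity_stats(genotypes):
    """Return (Ho, He, Fis) for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid individual.
    He is the uncorrected expected heterozygosity 1 - sum(p_i^2).
    Fis is the simple estimate 1 - Ho/He (an assumption of this sketch).
    """
    n = len(genotypes)
    # Observed heterozygosity: fraction of individuals with two different alleles.
    ho = sum(a != b for a, b in genotypes) / n
    # Allele frequencies over all 2n gene copies.
    counts = Counter(allele for g in genotypes for allele in g)
    total = 2 * n
    he = 1.0 - sum((c / total) ** 2 for c in counts.values())
    fis = 1.0 - ho / he if he > 0 else float("nan")
    return ho, he, fis

# Toy data (hypothetical): every male carries one shared allele and allele 213.
males = [(311, 213), (320, 213), (311, 213), (320, 213)]
ho, he, fis = heterozygosity_stats(males)
```

With every individual heterozygous, Ho is 1 and the estimate of Fis is negative, which is the pattern reported for the sex-linked loci in males.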

To determine which alleles of the three sex-linked loci, mPdIRDP80, mPdIRDP50 and mPdIRDP52, were the most frequently associated, we computed haplotype frequencies for unrelated populations with the EM algorithm (Excoffier & Slatkin, 1995). We used the PowerMarker program, in which the EM algorithm is implemented to estimate haplotype frequencies and to assign phase tables with the probability that each genotype could be resolved into each of the possible haplotype pairs (Notes S1).

To compare X and Y genetic diversity, we calculated the estimator θ = 4Neμ (Table S5) assuming a stepwise mutation model, as commonly assumed for microsatellites (Haasl & Payseur, 2010). Under this model, θ can be estimated from the expected heterozygosity, He:

\[ \theta = \frac{1}{2}\left[\frac{1}{(1 - H_e)^2} - 1\right] \]

(Kimura & Ohta, 1978).

To do so, we computed allelic frequencies and estimated He for X and Y alleles separately. Because we found a strong population differentiation between western and eastern individuals for Y-linked alleles (Fig. 2) we computed He and θ for each group separately (Notes S2, Tables S5, S6).
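As a hedged illustration of this step, the sketch below inverts the stepwise-mutation relation He = 1 − 1/√(1 + 2θ) to obtain θ from He. The function name and the input values are hypothetical and are not taken from Tables S5 or S6.

```python
def theta_from_he(he):
    """Estimate theta = 4*Ne*mu from expected heterozygosity under the
    stepwise mutation model, by inverting He = 1 - 1/sqrt(1 + 2*theta)."""
    if not 0.0 <= he < 1.0:
        raise ValueError("He must lie in [0, 1)")
    return 0.5 * (1.0 / (1.0 - he) ** 2 - 1.0)

# Hypothetical heterozygosities for a low-diversity and a high-diversity allele set.
theta_low = theta_from_he(0.2)
theta_high = theta_from_he(0.8)
```

Because θ rises steeply as He approaches 1, modest differences in He between X-linked and Y-linked alleles can translate into large differences in θ.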

Results and Discussion

Identification of sex-linked markers

To test for loci with sex-specific alleles, we identified 34 SSRs in putatively gender-linked scaffolds of the date palm genome, three of which (mPdIRDP80, mPdIRDP50 and mPdIRDP52) were found to be potentially sex-linked. These three loci showed significantly higher genetic differentiation between the sexes, as measured by the Rst index (Slatkin, 1995; Table 1, Fig. S1), than the four SSRs randomly identified in the genome (mPdCIR078, mPdIRD031, mPdIRD033 and mPdIRD040); the latter showed low and nonsignificant Rst values. Correspondence analysis (CA) highlighted a clear structure among the alleles of the first three loci, with distinct male and female subgroups suggesting sex linkage (Fig. 1a), compared with a homogeneous male/female distribution for the latter four, without any clustering of individuals based on gender (Fig. 1b), consistent with autosomal locations of these loci. For these loci, we compared allele frequencies and sizes between our male and female samples. Of the four mPdIRDP80 alleles, two (mPdIRDP80_311, mPdIRDP80_320) were shared between males and females, but alleles mPdIRDP80_213 and mPdIRDP80_329 appeared strictly limited to the male phenotype, suggesting Y-linkage (Fig. 2a). The mPdIRDP50 locus also had two male-specific alleles, that is mPdIRDP50_199 and mPdIRDP50_201 (Fig. 2b); surprisingly, eastern males always had three alleles, two of which were male-specific (Figs 2b, S2), suggesting that the Y chromosome carries a duplication of this locus. The mPdIRDP52 locus yielded four male-specific alleles, also with a duplicated allele in eastern males (Figs 2c, S2).
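The comparison of alleles between sexes amounts to a set difference. The following is a minimal sketch, assuming genotypes are stored as allele sizes per sample; the function name, sample identifiers and genotype combinations are hypothetical, and only the mPdIRDP80 allele sizes come from the text.

```python
def sex_specific_alleles(genotypes, sexes):
    """Return alleles seen in males but never in females at one locus.

    genotypes: dict sample_id -> iterable of allele sizes
    sexes: dict sample_id -> "M" or "F"
    """
    male, female = set(), set()
    for sample, alleles in genotypes.items():
        (male if sexes[sample] == "M" else female).update(alleles)
    return male - female

# Hypothetical genotypes at mPdIRDP80, using the allele sizes named in the text.
genotypes = {
    "m1": (311, 213), "m2": (320, 329),
    "f1": (311, 320), "f2": (311, 311),
}
sexes = {"m1": "M", "m2": "M", "f1": "F", "f2": "F"}
male_only = sex_specific_alleles(genotypes, sexes)
```

An allele flagged this way is only a candidate for Y-linkage; confidence depends on how many females from diverse origins were sampled without it.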

Figure 1.

Correspondence analysis graphs showing genetic structuring within the studied date palm (Phoenix dactylifera) population generated by sex-linked simple sequence repeats (SSRs) (a) and autosomal SSRs (b). Blue, female subgroup; yellow, male subgroup. Each entry corresponds to one individual of the subgroup. (a) The graph shows a clear clustering according to sex phenotype. (b) Individuals are distributed on a scatterplot without any clustering.

Figure 2.

Allelic distribution of P80 (a), P50 (b) and P52 (c) loci within the studied date palm (Phoenix dactylifera) population. The left side corresponds to male genotypes and the right side corresponds to female genotypes. Each dot represents an allele. Alleles shared between male and female individuals (X) are represented in red, and male-specific alleles (Y) are represented in blue. Female individuals only have alleles shared between males and females, while male individuals have shared alleles and male-specific alleles. Western and Eastern male genotypes have one and two male-specific alleles, respectively.

Importantly, all males were heterozygous for all three loci that appeared to be sex-linked on the basis of having male-specific alleles. As shown in Table 2, all three have Ho = 1, and their genotype frequencies differ significantly from Hardy–Weinberg expectations, based on testing the null hypothesis (Fis = 0). By contrast, females showed no significant departures from Hardy–Weinberg genotype frequencies (Fis did not differ from 0, see Table 2), nor did the four autosomal markers. We did not observe any genotype carrying only the male-specific allele (YY) at the sex-linked loci. The four other markers showed no evidence of sex-linkage.

Our loci represent the first set of reliable and validated sex-differentiating molecular markers for date palms. The sizes of the male-specific (Y-linked) alleles and of the alleles shared by males and females (X-linked) found in our sampling are given in Table S7. It is possible that more samples would reveal additional allele patterns. However, the three markers with Y-linked alleles provide a high degree of confidence for sexing genotypes of various origins, shortening the time necessary for selecting female plants and facilitating genetic improvement of the species.
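In practice, sexing a seedling with these markers reduces to checking for a male-specific allele. The sketch below is one hypothetical way to encode that rule. It uses only the male-specific allele sizes named in the text for mPdIRDP80 and mPdIRDP50, and omits mPdIRDP52 because its male-specific sizes are listed only in Table S7. The dictionary and function names are assumptions.

```python
# Male-specific allele sizes reported in the text (mPdIRDP52 omitted, see above).
Y_ALLELES = {
    "mPdIRDP80": {213, 329},
    "mPdIRDP50": {199, 201},
}

def predict_sex(sample):
    """Call 'male' if any scored locus shows a male-specific allele.

    sample: dict locus -> iterable of allele sizes.
    Returns 'unknown' when none of the loci in Y_ALLELES was scored.
    """
    scored = [locus for locus in sample if locus in Y_ALLELES]
    if not scored:
        return "unknown"
    for locus in scored:
        if Y_ALLELES[locus] & set(sample[locus]):
            return "male"
    return "female"

# Hypothetical seedlings.
call_a = predict_sex({"mPdIRDP80": (311, 213)})
call_b = predict_sex({"mPdIRDP80": (311, 320)})
```

Returning "unknown" for unscored samples is a design choice of this sketch, so that failed amplifications are not silently called female.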

This population approach offers an alternative to conventional genetic approaches for searching for sex-linked markers in long-lived plants that had been previously excluded from study because of the near impossibility of obtaining pedigrees with a sufficient number of generations, or large enough family sizes, to ensure that genes closely linked to the sex-determining locus will recombine. In date palm, it now provides the first conclusive genetic evidence of an XY chromosome system, with males being the heterogametic sex, confirming previous indirect results (Siljak-Yakovlev et al., 1996; AbdAlla & Abd El-Kawy, 2010; Al-Dous et al., 2011).

Evidence of recombination arrest

The exclusive clustering of male alleles in the ‘Y haplotypes’ without any mixing with shared alleles confirms the recombination arrest between the Y and X regions carrying the mPdIRDP80, mPdIRDP50 and mPdIRDP52 loci. The male-specific alleles, and the strict heterozygosity of males for these alleles, indicate that there is a region in which the X and Y homologues do not recombine. Therefore, as expected, we could infer haplotypes for the three loci that appear to be fully sex-linked (Tables S8, S9). We found five ‘Y haplotypes’ carrying only male-specific alleles, and 16 ‘X haplotypes’, carrying alleles shared by male and female plants (Fig. 3). As also expected, no such sex-specific phase assignment was possible for the autosomal SSRs (data not shown).
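Because every male carries both shared and male-specific alleles at each sex-linked locus, phase can in principle be read directly from the genotype. The sketch below illustrates that shortcut; it is not the EM procedure run in PowerMarker. The function name is hypothetical, and the shared allele size 195 at mPdIRDP50 is a placeholder, not a reported value.

```python
def split_xy(male_genotype, y_alleles):
    """Split a male multilocus genotype into inferred X and Y haplotypes.

    male_genotype: dict locus -> tuple of allele sizes
    y_alleles: dict locus -> set of male-specific allele sizes
    Returns (x_hap, y_hap), each a dict locus -> sorted tuple of alleles.
    """
    x_hap, y_hap = {}, {}
    for locus, alleles in male_genotype.items():
        y_part = tuple(sorted(a for a in alleles if a in y_alleles[locus]))
        x_part = tuple(sorted(a for a in alleles if a not in y_alleles[locus]))
        if not y_part or not x_part:
            raise ValueError(f"{locus}: expected both shared and male-specific alleles")
        x_hap[locus], y_hap[locus] = x_part, y_part
    return x_hap, y_hap

y_alleles = {"mPdIRDP80": {213, 329}, "mPdIRDP50": {199, 201}}
# Hypothetical eastern-type male: three alleles at mPdIRDP50, two of them male-specific.
male = {"mPdIRDP80": (311, 213), "mPdIRDP50": (195, 199, 201)}
x_hap, y_hap = split_xy(male, y_alleles)
```

Keeping the Y part as a tuple lets the duplicated locus seen in eastern males be represented without forcing one allele per locus.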

Figure 3.

Haplotype distribution within the studied date palm (Phoenix dactylifera) population. Light red and dark red histograms represent X female and X male haplotypes, respectively. Blue histograms correspond to Y haplotypes.

A lack of recombination between Y-linked loci predicts that both genetic and haplotypic diversity should be lower than for X-linked loci (Charlesworth & Charlesworth, 2000). Accordingly we found much lower theta values for Y-linked than for X-linked alleles, especially in the western group (see later; Table S5), and haplotype diversity is three times lower for inferred Y than X haplotypes (Table S8). Taken together, our results strengthen the hypothesis of physical linkage of the three sex-linked loci in a male-determining region that might be called a Y chromosome, as in some other dioecious plants (Westergaard, 1958) and theoretical predictions for the evolution of sex-determining regions (Charlesworth et al., 2005). However, to date, we have no information on the size of the predicted nonrecombining region, notably because of the absence of genetic and physical maps.

Y-linked population structure

We noticed that the five male-specific haplotypes could be subdivided into just two haplogroups: an ‘A’ haplogroup with the two haplotypes from western males and a ‘B’ haplogroup composed of the three haplotypes from eastern males (Fig. S3). The same strong geographic structure was also observed for each locus separately, with male-specific alleles presenting a strict east/west distribution (Fig. 2). In humans, the male-specific region of the Y chromosome (MSY) is transmitted clonally from father to son (Hughes & Rozen, 2012), and MSY haplogroups of the human Y chromosome allowed reconstruction of the evolution of paternal lineages. In date palm, the clonal transmission of Y-linked regions highlighted two potential original Y haplotypes arising from two different ancestral male lineages underlying the western and eastern haplogroups that gave rise to the global Y diversity in date palm. Strong population structure for Y-linked but not X-linked alleles also supports strong allelic effective number reduction for the Y-linked genomic region.

Evolution of the XY system in the genus Phoenix

All species in the genus Phoenix are dioecious, suggesting the existence of a common dioecious ancestor before speciation of the genus. Divergence of the genus Phoenix within the Coryphoideae subfamily has been dated at c. 50 million yr ago using molecular methods (Couvreur et al., 2011). In addition, fossils of Phoenix male flowers have been identified in sediments dating from the middle Eocene period (55.8–33.9 million yr ago; Ogg, 2004; Dransfield et al., 2008). The unisexuality of Phoenix flowers, and dioecy, is therefore probably very ancient.

Westergaard (1958) defined different stages of plant sex chromosome evolution, and our results make it clear that the date palm has an XY system that has reached the stage of recombination arrest and strict absence of YY genotypes. The observed duplications of two loci in eastern males also support the view that this Y chromosome is undergoing genetic degeneration which starts after the arrest of recombination. A threefold reduction in diversity is expected purely because of the threefold higher number of X vs Y chromosomes in populations (assuming a 1 : 1 sex ratio, as observed in date palms). In addition, if degeneration is in progress, Y diversity is expected to be reduced further because of hitchhiking effects resulting from the arrest of recombination (Charlesworth & Charlesworth, 2000; Gordo & Charlesworth, 2001).
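The threefold expectation follows from counting chromosome copies. As a worked sketch, assume N breeding individuals with a 1 : 1 sex ratio, so that the numbers of females and males are N_f = N_m = N/2, and assume equal mutation rates on X and Y and neutrality (the symbols N, N_f, N_m, n_X and n_Y are introduced here for illustration only):

\[ n_X = 2N_f + N_m = \frac{3N}{2}, \qquad n_Y = N_m = \frac{N}{2}, \qquad \frac{n_X}{n_Y} = 3 . \]

Under these assumptions neutral diversity scales with the number of copies, so a ratio θ_X/θ_Y of about 3 is the baseline, and any larger ratio would point to additional processes such as hitchhiking.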

Furthermore, cytological studies have suggested that the date palm Y chromosome is smaller than the X (AbdAlla & Abd El-Kawy, 2010), an unusual situation in plants, although Cycas revoluta is a known case (Segawa et al., 1971). Much more detailed physical mapping of the Y will be needed to determine if this is the case. In some species, heteromorphism has remained undetected until very detailed data can be obtained, for example, in stickleback fish (Ross & Peichel, 2008).

Taken together, these data indicate that, in date palm, sex chromosomes are at least partly differentiated and that the Y chromosome has started degenerating. This situation is consistent with the probable ancient origin of dioecy in the genus Phoenix, which could therefore possess one of the most ancient sex chromosomes encountered in angiosperms.


We thank Ali Zouba, Adbelmadjid Rhouma, Karim Kadri, Ahmed Othmani (CRRAO, Tunisia), Abdourahman Daher (CERD, Djibouti), Claudio Littardi (CRSP, Italy), Marco Ballardini (CRA-FSO, Italy), Robert Krueger (USDA, USA), Al-Ghaliya Al-Mamari and Sean Mayes (University of Nottingham, United Kingdom) for providing plant material; Muriel Latreille (INRA, France) for technical assistance; and Alain Rival, James Tregear and Thomas Couvreur for corrections to the manuscript. The authors acknowledge the two anonymous referees for their comments. This work was supported by the AUF-Mersi Project and the ‘Ministère de l'Enseignement Supérieur et de la Recherche’ of Tunisia. In memory of the late Mokhtar Trifi, one of the initiators of the project.