Gene expression divergence and nucleotide differentiation between males of different color morphs and mating strategies in the ruff

By next generation transcriptome sequencing, it is possible to obtain data on both nucleotide sequence variation and gene expression. We have used this approach (RNA-Seq) to investigate the genetic basis for differences in plumage coloration and mating strategies in a non-model bird species, the ruff (Philomachus pugnax). Ruff males show enormous variation in the coloration of ornamental feathers, used for individual recognition. This polymorphism is linked to reproductive strategies, with dark males (Independents) defending territories on leks against other Independents, whereas white morphs (Satellites) co-occupy Independent's courts without agonistic interactions. Previous work found a strong genetic component for mating strategy, but the genes involved were not identified. We present feather transcriptome data of more than 6,000 de-novo sequenced ruff genes (although with limited coverage for many of them). None of the identified genes showed significant expression divergence between males, but many genetic markers showed nucleotide differentiation between different color morphs and mating strategies. These include several feather keratin genes, splicing factors, and the Xg blood-group gene. Many of the genes with significant genetic structure between mating strategies have not yet been annotated and their functions remain to be elucidated. We also conducted in-depth investigations of 28 pre-identified coloration candidate genes. Two of these (EDNRB and TYR) were specifically expressed in black- and rust-colored males, respectively. We have demonstrated the utility of next generation transcriptome sequencing for identifying and genotyping large number of genetic markers in a non-model species without previous genomic resources, and highlight the potential of this approach for addressing the genetic basis of ecologically important variation.


Introduction
Understanding the genetic mechanisms controlling stable morphological or behavioral polymorphisms in natural population is currently a very active field of research (Stapley et al. 2010). Both nucleotide divergence and differences in expression rates of genes may be involved in such variation. For example, a recent study of horned beetles (Onthophagus spp.) found that patterns of gene expression in growing ornaments between two male morphs (that also represent different mating strategies) were as divergent as they were between males and females (Snell-Rood et al. deposition of pigments have been identified. In the beach mouse (Peromyscus polionotus), for example, a mutation in the MC1R gene causes color pattern variation (Hoekstra et al. 2006); and in gray wolf (Canis lupus), a mutation in a beta-defensin gene is involved (Anderson et al. 2009). A natural color polymorphism (white or tan crown stripe) in the white-throated sparrow (Zonotrichia albicollis), which is also linked to dominance and mating strategies, is genetically determined through a dominant (for white) chromosomal polymorphism on autosome 2 (Watt et al. 1984;Tuttle 2003). This chromosomal inversion of almost 100 Mb has been studied in some detail, but the genes responsible for color and behavioral polymorphisms have yet not been identified (Thomas et al. 2008;Romanov et al. 2009). Rather simple genetic mechanisms for color polymorphisms and a link to behavior and reproductive strategies have also been found in a number of other birds (Roulin 2004) and other animals (McKinnon and Pierotti 2010). However, it remains largely unknown whether such correlated genetic polymorphisms arise due to shared regulatory mechanisms (such as variation in transcription factors influencing both traits), joint hormone pathways or physical linkage of different causative mutations. In the Soay sheep (Ovis aries), there is a clear genetic linkage between the gene responsible for coat color polymorphism (TYRP) and size (Gratten et al. 2008), indicating that several genetic polymorphisms important to fitness are co-localized in a small genomic region.
The ruff Philomachus pugnax is a lek-breeding shorebird with uniquely hyper-variable male breeding plumage colorations and patterns, which are permanent features of individuals ( Fig. 1). Each spring, males grow elaborate ornamental neck ruffs and ear tufts, which vary independently from each other in color and pattern Lank and Dale 2001). The background color within the ruff and tuft feathers is essentially white (no melanin), reddish brown ("rust," phaeomelanin), or black (eumelanin). The individual ruff and head tuft feathers can be either plain or patterned with thick or thin bars or spots, and may be diluted. The ruff and head tufts as feather tracts may consist of uniform feather types or regular or irregular mosaics. This extensive and complex variation means that male ruffs within a lek are morphologically distinct, suggesting that plumage variation may be used in individual identification. Ruff mating displays are silent, and the plumage variation may have replaced the vocal signals commonly used by birds for this function . Polymorphism in genes determining coloration has been proposed to be maintained by negative frequency selection in the ruff system. Given that a major function of the plumage polymorphism is to promote individual identification, novel mutations would provide more information used to distinguish individuals, and would thus be selected for .
Ranges of color variation in breeding plumage in ruff are tightly correlated with a fixed behavioral dimorphism in male mating strategy, which is strongly heritable and controlled by a simple Mendelian genetic polymorphism. An autosomal gene (the S locus) co-determines predominantly light versus dark coloration of the elaborate breeding plumage and male mating behavior (Lank et al. 1995(Lank et al. , 1999. Males with substantial amounts of black in either their ruff or head tufts are "Independents," which defend small lek-mating courts against other Independents. Males with predominantly white plumages behave as non-territorial "Satellites," which form uneasy transient alliances with court holders. Co-occupied courts attract more females than do individual territorial males, stabilizing some level of cooperation between these reproductive competitors (van Rhijn 1991;Hugie and Lank 1997;Widemo 1998). Recently, a remarkable third "Faeder" morph was also described; these males resemble slightly large unadorned females morphologically and refrain from obvious male courtship displays, behaving as "sneakers" instead (Jukema and Piersma 2006).
The inheritance of ruff plumage variation has not yet been formally modeled, but the tight association between behavioral morph and coloration strongly suggests genetic determination. Also, inspection of pedigrees clearly shows a strong heritable component to coloration of the male ornamental feathers (D.B.L unpubl. data). Like other scolopacidae, feather colorations are produced by combinations of eumelanin and phaeomelanin. Only a few genes that affect plumage pigmentation or patterning in birds have so far been identified (Hubbard et al. 2010), and there are no obvious candidates for the ruff loci at present. The specific genetic mechanism that maintains the association between the behavioral strategy and plumage coloration for ornamented males is also unknown. Identifying the genetic region(s) involved would thus provide a novel mechanistic link between morphology and behavior variation (Bertossa 2010).
Next generation sequencing is currently revolutionizing the field of adaptation genetics (Stapley et al. 2010). With the novel techniques available, it is now possible to identify molecular markers on a genome-wide scale. Genomic scans for genes and markers segregating between morphs or populations can also be conducted at a very reasonable cost, even in non-model organisms without prior genomic information. One big advantage of this strategy, compared with previous genome scans (that relied on anonymous AFLP markers or microsatellites) is that the markers can be annotated and the genes or genetic regions with positive results can easily be identified, especially if the transcriptome (all expressed genes) is being sequenced (Wheat 2010;Ekblom and Galindo 2011). Such a transcriptomic approach, commonly known as RNA-Seq (Wang et al. 2009), can yield information, not only about nucleotide polymorphisms and genetic structure but also on differential gene expression levels (t Hoen et al. 2008).
The aim of this study was to identify genetic regions that might be involved in determining variation in color morph and mating strategies in males of the ruff. We have used 454 sequencing to characterize the transcriptome from feather samples of several individuals in a captive ruff population, and use these data to investigate both gene expression level divergence and nucleotide sequence differentiation between different males. The genes identified here as potential candidates for regulating color and mating strategy polymorphism can be investigated in more depth in the future using a candidate gene approach. To this end, we also identify several hundred Single Nucleotide Polymorphisms (SNPs) and microsatellites from the ruff transcriptome, markers that can be used in follow-up studies of this intriguing study species.

Materials and Methods
Study population, behavior observations, and feather sampling Actively growing ornamental feathers were plucked from 11 males in a captive breeding population of ruffs maintained by DBL, and immediately placed into RNAlater (Table 1, Fig. 2). The founders of the captive breeding flock were 56 males and 64 females hatched from eggs collected near Oulu Finland in 1985Finland in , 1989Finland in , and 1990, and bred continuously thereafter. Population size has varied from 34 in 1985 to ca. 175-200 in 2009 when feathers were sampled. The feathers sampled were classified as: black, rust, or white; two were patterned. Black ruff feath- Figure 1. Typical male color and feather pattern variation in the ruff (Philomachus pugnax). All pictures are from the captive population used in this study. Color pattern and variation differs between individuals, between and within morph types in this species. The two rightmost individuals on the lower panel have the Satellite mating strategy and the rest of the males are Independents. Table 1. Sequence data on each sample separately, Mean length of reads are shown within brackets. Information on color morph and mating strategy of the individual males is also provided. Individual 313 with color "Straw" was included in the "Rust" category in the analyses of color pattern, whereas individuals 294 and 306, which have multiple colors (barred or flecked), were included both in the "Black" and in the "Rust" categories. ers are typically iridescent, produced by keratin structure. As we were interested in the genes active when the feather was plucked, we classified feathers based on the color growing at the time they were plucked; two feathers with patterns at the tip were called uniform because the growing lower part of the feather were solid. The behavioral morphs of males as Independents or Satellites were known from behavioral observations in previous years.

cDNA library preparation and sequencing
Total RNA was extracted from the feather samples using the miRvana kit (Ambion, Carlsbad, California). The RNA was treated with TURBO DNase (Ambion) to eliminate any contamination of gDNA. Synthesis of cDNA was performed using the SMART kit (Clontech, Saint-German-en-Laye, France), following the manufacturers' protocol and using 22 cycles for the final amplification step. Each cDNA library was individually tagged and sequenced on half a plate of a 454 Genome Sequences FLX system (Roche, Branford, Conneticut) at the sequencing facility in Liverpool (School of Biological Sciences, University of Liverpool). All raw 454sequencing reads are available through the GenBank/SRA database under accession number SRA049313.1.

Assembly and annotation
A total of 350,648 sequence reads were produced from the 11 males. After trimming away bad quality, SMART primer and poly A sequence, using NGen 2.0 (DNASTAR Figure 2. Photographs of the male ruffs and sampled feathers (inserted frames) used in this study. Clockwise numbering: 267, 294, 242, 318, 314, 306, 301, 190, 240, 313, 1241. 2488 Inc. Madison, Wisconsin), 274,325 reads with a mean read length of 297 nucleotides remained (see Table 1 for information about number and length of reads for each individual separately). De-novo assembly of trimmed reads (from all individuals together) was performed using NGen 2.0 (DNASTAR Inc.) using parameters specified in Ekblom et al. (2010). The sequences of all contigs produced are available upon request from the corresponding author. Mapped assemblies for sequence data from each individual separately were also performed using the contigs from the de-novo assembly as a template. All contigs and singletons were annotated using a blast approach. The contig and singleton sequences were compared with the chicken protein database (WASHUC2.56, downloaded from the ENSEMBL ftp site; http://www. ensembl.org/info/data/ftp/index.html) using stand alone version 2.2.18 of blastx (Altschul et al. 1997). Only the best blast hit for each query sequence was kept and only hits with an e-value below 10 À5 , and where the difference in e-values between the best blast hit and the second best hit was at least one order of magnitude. For contigs with SNPs (see below) that could not be annotated from the chicken protein database, we also blasted sequences against zebra finch gene predictions, human protein database, and the nr (non-redundant) sequence database using the web blast interface of NCBI (http://blast.ncbi.nlm.nih.gov/Blast.cgi).

Expression divergence
The level of transcription for each gene and individual was measured by counting up the number of reads from that individual that mapped to the gene in question. Expression analyses were performed using the bioconductor, edgeR package (Robinson et al. 2010), using a common dispersion parameter and manually adjusting the library size to the number of trimmed reads entering the templated assembly. Differential expression of genes between different color morphs and mating strategies were tested using an exact test for the negative binomial distribution and applying false discovery rate (FDR) correction (Benjamini and Hochberg 1995).

Nucleotide differentiation
Single nucleotide polymorphisms were identified from the ruff transcriptome using the PanGEA software (Kofler et al. 2009), using mostly default settings. First, all the reads were mapped to the contigs using the homopolymer Smith-Waterman algorithm. Then, a de-novo SNP identification was performed using the "454 SNP-identification mode" and allowing only one SNP (the one with highest coverage) per contig, no indels, at least 10 sequences coverage at the SNP site, a minimum of two reads with the minor allele and a maximum of two alleles at a given site. We only used one SNP per contig to avoid problems of pseudo-replication due to linkage of closely situated markers, a minimum of 10 reads per SNP to be able to confidently score genotypes and a minimum of two reads with the minor allele to account for sequencing errors. All individuals where both alleles occurred were considered to be heterozygotes, whereas all individuals where only one of the alleles was found were scored as homozygotes. For analyses of nucleotide differentiation, only SNPs that were scored in at least nine individuals were considered. One hundred nucleotides upstream and downstream of the SNP (or to the end of the contig sequence if less than 100 nucleotides) were extracted from the consensus contig sequences. These were then mapped onto the chicken genome sequence (using blastn) to assess the genomic locations of the SNPs. In general, there are high levels of synteny even between quite divergent bird species like the chicken and the zebra finch (Warren et al. 2010), and the location of the marker in the chicken genome can thus be used as an estimation of the location in the ruff as well.
Tests for genetic differentiation between groups of individuals with different color and mating strategy were performed using the "population differentiation" option in GenePop (http://genepop.curtin.edu.au/) (Raymond and Rousset 1995) testing for both genic differentiation (differences in allele frequencies) and genotypic differentiation (differences in allele combinations). Test for genotypic differentiation was also conducted using the G-test implemented in the SAM software (Joost et al. 2008). LOSITAN (Antao et al. 2008) was used to conduct an F ST outlier analysis, testing each loci for deviations in structure (between color morphs and mating strategies) from neutral expectations of the relationship between heterozygosity and F ST . Positive selection was inferred from the LOSITAN analysis if the given P-value was higher than 0.975 and the estimated F ST was above 0.25.

Coloration candidate gene annotation
We searched the literature for genes involved in bird coloration and/or territorial aggression and compiled a list of 28 candidates (Appendix A1). The 454-sequence blast results were then manually searched for these genes using the Ensembl chicken protein identifiers. We extracted both information of expression of the candidate genes and sequence data from contigs and singletons mapping to the genes in question and used this in manual annotation of these genes. We also aligned all reads and contigs mapping to a candidate gene with the chicken coding sequence for the gene (downloaded from ENSEMBL BioMart: http:// www.ensembl.org/biomart) using the ClustalW algorithm (Thompson et al. 1994).

Microsatellite identification
Microsatellites were identified from the transcriptome 454-sequence data (both contigs and singletons) using the program MsatCommander (Faircloth 2008). We used a minimum number of 10 repeats for di-nucleotide repeats, 8 repeats for tri-nucleotides, and 4 nucleotides for tetra-, penta-, and hexa-nucleotides.

Results
Sequencing and assembly cDNA libraries from 11 different ruff males were sequenced using the 454 technology (Table 1). A total of 274,325 reads remained after trimming. Of these, 193,929 (70%) assembled into 8943 contigs with a mean contig length of 828 nucleotides (range 42-4837) and a mean of 21.7 reads per contig (Fig. 3). As expected, there was a strong positive correlation between (log) contig length and (log) contig depth (r p = 0.55, df = 8941, P < 0.0001; Fig. 4).
We were able to identify 6309 genes in the Ruff transcriptome by blasting the contigs and singletons against the chicken protein database. This represents about onefourth of all currently annotated chicken genes, but most of these were only partly covered with sequence reads (the mean length of chicken genes are around 1800 base pairs while our mean contig size was only 828 base pairs). Note, however, that several contigs may map to different parts of the same gene. A total of 371 genes were present in the transcripts of all 11 sampled individuals, while 2286 genes were present in the transcripts of one individual only.

Expression divergence
The differential expression analysis revealed 22 genes that were up-regulated and 20 down-regulated (unadjusted P < 0.01) in black males compared with others (Fig. 5a). In rust-colored individuals, 7 genes were found to be up-regulated and 50 down-regulated compared with individuals with other colors (Fig. 5b). Six genes had higher expression in Satellite males compared with Independents and 25 had higher expression in Independents (Fig. 5c). However, none of these expression differences between male mating strategies remained statistically significant after multiple test correction (P > 0.05).
Out of the 28 pre-identified coloration candidate genes (Appendix S2), 11 had observed expression in the feather tissue. However, most of these had very low levels of transcription with only one or a few reads present in the dataset. The Endothelin receptor B gene (EDNRB; known to be involved in mice coat color variation) was expressed in two uniformly black males (out of three), but was completely absent from all eight differently colored males. This is suggestive of differential expression in pure black morphs, but as expression was so low, this could not be verified using the expression divergence test. The gene coding for tyrosinase (TYR; coding for an important catalytic enzyme in the melanin synthesis pathway) was represented by five reads in two rust-colored males (out of four), while none of the other males showed expression of this gene. This expression difference was actually significant (P = 0.0039), but did not remain significant after the false discovery rate correction in the test across all transcripts.

Nucleotide differentiation
SNPs were identified in 822 of the 8943 contigs (only one SNP per contig and only sites with a depth of more than 10 reads were considered, see methods; Appendix A2, A3). These polymorphisms represented 681 transitions and 141 transversions, and the GC content at the SNP sites were 50.67%. Minor allele frequencies (MAF) ranged between 0.045 and 0.5 (mean = 0.20; Fig. 6). Three hundred and sixty-six of the SNPs could be scored in at least nine individuals (mean MAF = 0.18) and were used for nucleotide differentiation analyses.
Fifteen SNP markers were significantly differentiated between black and non-black individuals for either allele or genotype frequencies using GenePop (Raymond and Rousset 1995) (Fig. 7a, Table 2). One of these (c01440) was also identified as having significant structure using SAM (Joost et al. 2008). This marker is located in the Xg blood-group gene. Two SNPs with significant structure between black and non-black males were located in feather keratin genes (c00285 and c00777).
Comparing rust-colored individuals against other individuals, there were 15 significantly differentiated markers (Fig. 7b, Table 3). Four of these were also structured , and Independent (c) males have negative log-fold expression levels. The blue horizontal lines represent four-fold differences in expression between morphs. Genes that are only expressed in one of the morphs are plotted in yellow smears to the left in the graph. Red points represent genes with expression divergence between morphs (P < 0.01, without multiple test correction). None of the genes were significantly differentially expressed after applying false discovery rate correction.
according to the SAM analysis. Several of the differentiated markers (c00401, c00533, c1172, c16211) were found to be situated in the same gene; "similar to type 1 hair keratin KA31." These markers were all heterozygous in most of the rust-colored individuals, whereas homozygous for the major allele in males with other coloration. Two other keratin genes ("Feather keratin 2" and "similar to feather keratin") were also among the differentiated markers.
Seven SNPs were significantly differentiated between Independent and Satellite males (Fig. 7c, Table 4). Five markers were also found to have structure between these mating strategies using the SAM software, but only one of these overlapped with the GenePop results. Several of the structured SNPs were matching to un-annotated genes (c00255, c00275) or chicken cDNA clones of unknown origin (c00063, c05665).

F ST outlier analysis
Another way of investigating genetic structure between morphs is the F ST outlier approach. We performed such an analysis using the software LOSITAN (Antao et al. 2008). Here, 13 SNPs were identified as having higher structure between black and non-black morphs compared with neutral expectations based on heterozygosity (Fig. 8a, Table 2). This is an indication of positive selection acting on the genes where these are situated (or closely linked genes). Nine of these overlapped with the markers were identified using the GenePop approach (Table 2). For rust-colored individuals, there were 23 F ST outliers, of which eight were overlapping with the genetic structure analyses (Fig. 8b, Table 3). Finally, for males with the Independent mating strategy compared with Satellite males, there were 14 LOS-ITAN outliers, five of which overlapped with PopGen/SAM results (Fig. 8c, Table 4).
There was no correlation between gene-wise nucleotide differentiation and expression divergence between color morphs (black vs. others, P genic differentiation against logfold difference in expression: r s = 0.107, df = 186, P = 0.145; rust vs. non-rust, r s = 0.09, df = 186, P = 0.902) or between different mating strategies (Independents vs. Satellites: r s = 0.072, df = 186, P = 0.3299). No SNPs were identified in any of the 28 pre-identified coloration candidate genes.

Microsatellite identification
A total of 567 microsatellites were identified from the contigs and singletons, but only 199 had sufficient flanking sequence information to allow for primer design (Appendix A4). The most common type of microsatellite was tetra-nucleotide repeats with 252 identified loci; the least common was hexa-nucleotide repeats with only 28 loci (Table 5).

Discussion
We have successfully sequenced, assembled, and annotated the feather transcriptome from a non-model bird species, the ruff, where there was previously very limited genomic data or tools available. By acquiring transcriptome sequence data separately from 11 different males, we were able to analyze differential gene expression as well as genetic sequence divergence between different color morphs and mating strategies of this lekking bird. Most studies that have been characterizing transcriptomes in non-model organisms so far have been very descriptive (Vera et al. 2008;Ekblom and Galindo 2011), addressing for example, tissue specific gene expression . A few studies have also used this approach to investigate genomics of speciation and adaptation by comparing transcriptome libraries between different subspecies or ecotypes of the same species Hohenlohe et al. 2010a;Wolf et al. 2010). For example, Goetz et al. (2010) used an approach very similar to ours, to address genetic bases for morphological differentiation between two forms of lake trout (Salvelinus namaycush). They identified a number of genes (mainly related to lipid metabolism and immunity) with differential expression between the two morphs. To the best of our knowledge, the only RNA-Seq study so far published that has investigated variation in sexually selected traits in a non-model organism was performed on antler growth in the Sika deer (Yao et al. 2012).
We observed no significant differential gene expression between males of different color morphs or mating

2492
strategies, but some of the coloration candidate genes showed tendencies toward such differences (see below). Our small sample size and the large individual variation in gene expression severely limited our power to detect small differences in gene expression between morphs, especially for lowly transcribed genes. A number of recent studies have showed significant gene expression divergence between individuals differing in ecologically important traits, such as disease resistance (Bonneaud et al. 2011) and coloration (Wolf et al. 2010). In a recent study on the cheetah (Acinonyx jubatus), gene expression was compared between two sequenced cDNA libraries from skin tissue. One of these was from a black-pigmented spot and the other from an adjacent yellow pigment region. Several genes downstream of the MC1R gene in the melanin synthesis pathway were identified as having higher expression in the black tissue compared with the light one (Hong et al. 2011). Interestingly, one of the most highly expressed genes in our dataset belongs to a gene family (connexins) that have recently been identified as important in color pattern development in zebra fish (Watanabe et al. 2012). Due to our limited sequencing effort, we could only make robust inferences of gene expression differences in a small fraction of the total number of genes identified in the ruff transcriptome. Only 371 of 6309 identified genes showed evidence of expression in all 11 sampled individuals and out of these, a mere 117 had a mean transcript count of more than 10 reads per individual. With more sequencing (for example using an Illumina Hiseq instrument), future studies will be able to make better estimations of gene expression levels between and within individuals, as well as obtaining expression information from a larger number of genes. For future studies, it will also be preferable to include comparisons between samples from differentially colored feathers from the same individual to control for inter-individual effects.
Several SNP markers showed significant nucleotide divergence between males of different color morphs and mating strategies. Most striking was, perhaps, the large number of hits on feather keratin genes. The keratin gene family has been well characterized in chicken (Presland et al. 1989) and keratin structures are closely associated with both structural (Prum et al. 2009) and melanic (Bonser 1995) coloration. Black ruff feathers commonly show iridescence, which results from structural patterning. The only gene to be identified as having structure between color morphs using all analysis approaches was the Xg blood group. To the best of our knowledge, the Xg gene is not known to function in any coloration pathway. This highly polymorphic gene is positioned on the X chromosome in humans (Cartron and Colin 2001), but is not sex linked in chicken. Because of the limited sampling of individuals and color morphs, the genes identified as significantly structured in our study should only be considered as a list of potential candidates for involvement in ornamental color variation. There are likely to be false positives in these analyses as well as a number of important genes that were not sampled due Table 2. SNP markers identified as being genetically structured between black and non-black ruff males. "Allelep" and "genotypep" are outputted P-values from GenePop using the gene and genotype option. "G" and the p value for G (pG) are calculated from the SAM software. Heterozygosity (Het), F ST (Fst), and P were calculated in LOSITAN and markers with pos.sel = 1 were identified as being under positive selection with this software. Markers highlighted with bold text were identified as structured using at least two of these independent tests.

Locus
Allelep to the limited sequence coverage of many transcripts. Future studies should try to verify the importance of these, for example using a candidate gene approach. Many of the transcripts identified as being genetically structured between the two different mating strategies of ruff males (Independents and Satellites) belong to non-annotated genes or sequenced cDNA clones with unknown function. These genes may be specific to birds or are too rapidly evolving to identify the mammalian homologs. Follow-up studies need to verify if these are truly important for mating strategy decisions, and if so what their molecular functions are. Table 3. SNP markers identified as being genetically structured between rust colored and other ruff males. "allelep" and "genotypep" are outputted p values from GenePop using the gene and genotype option. "G" and the p value for G (pG) are calculated from the SAM software. Heterozygosity (Het), F ST (Fst), and P were calculated in LOSITAN and markers with pos.sel = 1 were identified as being under positive selection with this software. Markers highlighted with bold text were identified as structured using at least two of these independent tests.

Locus
Allelep The different approaches taken to identify markers with significant nucleotide structure (GenePop, SAM, LOSIT-AN) gave only partially overlapping results. In particular, the F ST outlier analysis performed using the LOSITAN software (Antao et al. 2008) often identified a larger amount of high-divergence loci compared with the other programs used. It is not unusual to get conflicting results using different outlier detection approaches. Simulations have shown that both type I and type II errors occur for many of the methods routinely used, calling for some caution when interpreting the results (Hohenlohe et al. 2010b;Narum and Hess 2011). The approach taken here, to use multiple software, can, to a certain extent, guard against inferring falsely positive results (Luikart et al. 2003).
We also used a candidate gene approach to investigate variation in known coloration genes in more depth. Previous studies have investigated variation specifically in the MC1R gene involved in the melanin synthesis pathway (MacDougall-Shackleton et al. 2003;Nadeau et al. 2007). However, other types of pigmentation such as genes involved in carotenoid coloration (Wade et al. 2009) have also been specifically targeted. Two recent studies adopted a "multiple candidate gene approach" to investigate several genes involved in avian pigmentation and vision genes, to search for elevated levels of genetic structure (Skoglund and Höglund 2010;Lehtonen et al. 2012). We specifically investigated 28 pre-identified coloration candidate genes. We could not identify any nucleotide sequence variation in any of these, but two (EDNRB and TYR) showed signs of differential expression. The EBNRB gene, which has previously been shown to be involved in mice coat color variation, by regulating melanocyte formation (Cook et al. 2005), was only expressed in black males. In contrast, we were only able to detect expression of the TYR gene, a catalyst in the melanin synthesis pathway (Sato et al. 2007), in individuals with rust color. Table 4. SNP markers identified as being genetically structured between ruff males with Independent and Satellite mating strategies. "allelep" and "genotypep" are outputted p values from GenePop using the gene and genotype option. "G" and the P-value for G (pG) are calculated from the SAM software. Heterozygosity (Het), F ST (Fst), and P were calculated in LOSITAN and markers with pos.sel = 1 were identified as being under positive selection with this software. Markers highlighted with bold text were identified as structured using at least two of these independent tests.

Locus
Allelep Although only half a plate of Roche 454 sequencing was used, we still managed to identify and annotate over six thousand genes expressed in ruff feathers (but note that there was very limited sequence coverage for many of these), which represent about a quarter of all genes annotated from the very well-studied chicken transcriptome, where data are available from a range of different tissues. Almost ten thousand of the chicken genes have been identified as being expressed in skin tissue (Chan et al. 2009), which might be expected to show similar expression patterns as feathers. In the zebra finch (Taeniopygia guttata), 6460 genes were identified as being expressed in skin tissue using a similar sized RNA-Seq dataset as presented here. Biological functions related to cytoskeletal structures and cell proliferation were overrepresented in genes primarily expressed in zebra finch skin . Feather pulp was among the sampled tissues analyzed using an RNA-Seq approach in the bobwhite quail (Colinus virginianus). In this species, 8825 unique genes were identified and functionally annotated (Rawat et al. 2010).
From our ruff transcriptome sequence data, we were also able to identify more than 800 SNPs and almost 200 microsatellites with flanking sequences. These molecular markers, together with the presented inference of genes important in coloration and mating strategy variation, will provide a valuable resource for further studies of Red points in the graphs represent candidates for positive selection (higher differentiation than expected by chance) and names of some of the strongest candidates are given in red labels. Note that some markers have identical F ST and He, and are thus overlapping in the graphs. ecology and genetics in this extremely interesting and ecologically well-studied species.