Locus-specific view of flax domestication history

Crop domestication has been inferred genetically from neutral markers and increasingly from specific domestication-associated loci. However, some crops are utilized for multiple purposes that may or may not be reflected in a single domestication-associated locus. One such example is cultivated flax (Linum usitatissimum L.), the earliest oil and fiber crop, for which domestication history remains poorly understood. Oil composition of cultivated flax and pale flax (L. bienne Mill.) indicates that the sad2 locus is a candidate domestication locus associated with increased unsaturated fatty acid production in cultivated flax. A phylogenetic analysis of the sad2 locus in 43 pale and 70 cultivated flax accessions established a complex domestication history for flax that has not been observed previously. The analysis supports an early, independent domestication of a primitive flax lineage, in which the loss of seed dispersal through capsular indehiscence was not established, but an increase in oil content had likely occurred. A subsequent flax domestication process occurred that probably involved multiple domestications and includes lineages that contain oil, fiber, and winter varieties. In agreement with previous studies, oil rather than fiber varieties occupy basal phylogenetic positions. The data support multiple paths of flax domestication for oil-associated traits before selection of the other domestication-associated traits of seed dispersal loss and fiber production. The sad2 locus is less revealing about the origin of winter tolerance. In this case, a single domestication-associated locus is informative about the history of domesticated forms with the associated trait but only partially informative about forms less associated with the trait.


Introduction
Genetic studies of crop domestication have increased in the last two decades, largely thanks to the development of many informative molecular techniques (Zeder et al. 2006; Burke et al. 2007; Purugganan and Fuller 2009). Considerable research has been performed to investigate domestication events using selectively neutral and genome-wide molecular markers (Heun et al. 1997; Badr et al. 2000; Matsuoka et al. 2002; Morrell and Clegg 2007; Fu 2011). However, there has been an increasing trend to reconstruct the evolutionary history of domestication through the loci that have been subject to selection (Sang 2009; Gross and Olsen 2010; Blackman et al. 2011), as the genetic bases have been uncovered for many domestication-associated traits, including plant structural changes (Wang et al. 1999; Doebley et al. 2006; Li et al. 2006) and food quantity and quality (Sweeney et al. 2007; Shomura et al. 2008; Kovach et al. 2009). In the case of plants, the domestication process is increasingly considered to have been a protracted process (Allaby et al. 2008), and the assemblage of domestication-associated traits a staggered process (Fuller 2007). Under this scenario, there is an increased likelihood that each trait may have a quite disparate evolutionary history (Allaby 2010). Crops that have multiple purposes are interesting in this respect because genes governing traits relating to specific groups of the crop may carry variable domestication signatures. Thus, a locus-specific inference of different trait groups should reveal not only the group-specific domestication history but also the correlated domestication processes.
Flax (Linum usitatissimum L.) is a good example of a multipurpose crop, being utilized for both oil and fiber. It was one of the eight "founder crops" of agriculture, was a principal source of oil and fiber from prehistoric times until the early 20th century, and still remains a crop of considerable economic importance (Zohary and Hopf 2000; Muir and Westcott 2003). The archaeological record shows that flax was domesticated for oil and/or fiber use more than 8000 years ago in the Near East (Helbaek 1959; van Zeist and Bakker-Heeres 1975) and suggests that its wild progenitor is pale flax (L. bienne Mill., previously L. usitatissimum L. subsp. angustifolium (Huds.) Thell.; Hammer 1986). The earliest reliable evidence of human use of pale flax comes from Tell Abu Hureyra 11,200-10,500 years before present (yBP) (Hillman 1975), although recent claims for much earlier usage have been made and are disputed (Kvavadze et al. 2009; Bergfjord et al. 2010).
Morphological, cytological, and molecular characterizations confirm that pale flax is the wild progenitor of cultivated flax (Tammes 1928; Gill 1966, 1987; Diederichsen and Hammer 1995; Fu et al. 2002; Fu and Allaby 2010). Pale flax is a winter annual or perennial plant with narrow leaves and dehiscent capsules, and usually displays large variation in the vegetative plant parts and variable growth habit (Diederichsen and Hammer 1995; Uysal et al. 2011). Recent studies that expanded the available pale flax germplasm (Uysal et al. 2010, 2011) are informative about flax domestication syndromes (Hammer 1984). Generally, cultivated flax has variable seed dormancy, grows fast with large variation in the generative plant parts, and has early flowering, almost indehiscent capsules, and large seeds. In addition to oil and fiber varieties, flax accessions with winter hardiness and capsular dehiscence are also available for research (Diederichsen and Fu 2006). Domestication-associated genes offer an approach to reconstructing the specific domestication history of a crop with an associated trait with phylogenetic and phylogeographic resolution. Such approaches have provided insights into the independent origins of various traits in rice (e.g., Shomura et al. 2008; Kovach et al. 2009) and in sunflower (Blackman et al. 2011). However, flax presents several problems for this type of approach. First, no domestication-associated genes have yet been identified in flax. Second, the variety of uses of flax implies that different subsets of cultivated flax have different trait combinations, and consequently it is not clear to what extent a single domestication gene may be used to infer the domestication history of the crop as a whole.
The sad2 locus is a potential domestication target in flax because of its role in fatty acid metabolism. The sad2 gene is responsible for converting stearoyl-ACP to oleoyl-ACP by introducing a double bond at C9 and thus can increase the unsaturated fatty acid content of the plant (Ohlrogge and Jaworski 1997; Jain et al. 1999). This gene has been well characterized due to commercial interest in the manipulation of unsaturated fatty acids in major crop plants (Shanklin and Sommerville 1991; Knutzon et al. 1992; Singh et al. 1994). Preliminary molecular evidence based on the sad2 locus in a relatively small sample of flax suggested that the initial purpose of flax domestication was for its oil use (Allaby et al. 2005). This study was limited because it considered very few pale flax accessions and only oil, fiber, and landrace varieties of cultivated flax. More recently, expressed sequence tag-derived simple sequence repeat (EST-SSR) markers (Cloutier et al. 2009; Fu and Peterson 2010) were applied to the expanded pale flax germplasm (Fu 2011). This study established that the primitive dehiscent type of cultivated flax assumed a basal position in genome-wide marker phenograms, suggesting that these varieties were important in the early stages of domestication. However, the overall resolution was not high.
The aim of this study was to assess whether the sad2 locus increases oil production in cultivated flax and whether a reconstruction of the domestication history of this trait based on the gene is widely informative for cultivated flax using a broad sample of accessions representing the recently expanded pale flax germplasm set and the four cultivated flax groups.

Materials and Methods
All flax accessions studied here were obtained from the flax collection at the Plant Genetic Resources of Canada (PGRC; Table 1). They include 43 pale flax accessions and 70 cultivated flax accessions. The pale flax accessions were selected largely from recently acquired pale flax accessions from Turkey and Greece, in addition to those representing the old pale flax collection in PGRC. This expanded set of pale flax accessions represents only part of the species' natural distribution, which spans western Europe and the Mediterranean, north Africa, western and southern Asia, and the Caucasus regions (Diederichsen and Hammer 1995). The cultivated flax accessions were selected to represent five major groups of cultivated flax (landrace, fiber, oil, winter, and dehiscent). The landrace group represents a collection of local oil and/or fiber varieties from different countries. The winter flax accessions sampled cultivated flax developed with winter hardiness from 12 countries. The dehiscent flax accessions represent the primitive form of cultivated flax with dehiscent capsules and have long been accumulated through flax cultivation in the cultivated flax gene pool (Hegi 1925). For this study, the dehiscent flax accessions were empirically verified for capsular dehiscence and the selected pale flax accessions were assessed for their taxonomic identity in the greenhouse. The accession selection process also took into account the country of origin to widen the genetic diversity sampled in this study.

Oil profile
The oil profile data used in this study were collected from two separate characterization efforts. The first one was completed before 2006 on 2934 accessions of cultivated flax (Diederichsen and Raney 2006) and the second one was performed [...]. Both characterizations employed the same experimental procedures as described in Diederichsen and Raney (2006). Briefly, the seed oil content was measured using continuous wave nuclear magnetic resonance spectroscopy based on a sample of 10 g of flax seed at 3-4% water content. The fatty acid composition of the seed oil was analyzed by gas chromatography.
[Table 1 footnote: Country of origin follows the ISO 3166-1 alpha-3 country code. UN = unknown origin, but the seed source is shown with a number in parentheses: Greece.]

DNA extraction
Plants were grown from seed for 2-3 weeks for cultivated flax and up to 2 months for pale flax in a greenhouse at the Saskatoon Research Centre, Agriculture and Agri-Food Canada. Young leaves were individually collected, freeze-dried [in a Labconco Freeze Dry System (Kansas City, MO, USA) for 1-3 days], and stored at -20 °C. A freeze-dried leaf sample of one individual plant from each accession was selected, and its genomic DNA was extracted with the DNeasy Plant Mini kit (Qiagen, Mississauga, Ontario, Canada). Extracted DNA was quantified with a Thermo Scientific NanoDrop 8000 spectrometer (Fisher Scientific Canada, Toronto, Ontario, Canada).

PCR and sequencing
The protocols and procedures to amplify and to sequence the sad2 locus were given in Allaby et al. (2005). Briefly, two sets of PCR primer pairs were applied to amplify the whole region of the sad2 locus. PCR was performed on either a DYAD or PTC-200 thermocycler (Bio Rad, Mississauga, Ontario, Canada) and the PCR products were separated on 2% agarose (Sigma, Oakville, Ontario, Canada). Amplicons were excised from agarose gel, purified using a QiaQuick Gel purification kit (Qiagen), and resuspended in 16-μl Qiagen elution buffer. Sequencing was done using an Applied Biosystems capillary DNA sequencer (DNA Technologies Unit, Plant Biotechnology Institute, National Research Council of Canada, Saskatoon, Saskatchewan, Canada).

Sequence analysis
All sequencing products were assembled with Vector NTI Suite's ContigExpress v9.0.0 (Invitrogen, Carlsbad, CA) and aligned using MUSCLE v3.6 (Edgar 2004). All aligned sequences were deposited into GenBank under accessions JN653341-JN653453. Population genetic analyses of aligned DNA sequences were performed using the DnaSP program (Librado and Rozas 2009). Several measures of sequence variation were obtained: the number of segregating sites, haplotype number, nucleotide diversity (π; Tajima 1983), the signal of selection (i.e., deviation from neutrality; Tajima 1989; Fu and Li 1993), and the frequency of recombination (i.e., the minimum number of recombination events; Hudson and Kaplan 1985). Comparative diversity analyses were also done for various groups of flax germplasm. Haplotype analyses with and without gaps and indels were performed using the DnaSP program. The positions of SNPs and indels for each haplotype were generated. An analysis of molecular variance (AMOVA) was also performed using Arlequin v3.01 (Excoffier et al. 2005) to quantify nucleotide variation between species and among various groups of Linum accessions. Three models of genetic structuring were considered: pale versus cultivated flax, two originating groups of pale flax, and five groups of cultivated flax. The significance of variance components and intergroup genetic distances for each model was tested with 10,010 random permutations.
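The diversity and neutrality measures named above can be illustrated with a minimal Python sketch. It is not the DnaSP implementation; it assumes a gap-free alignment of equal-length sequences, and the function names and the toy alignment are hypothetical. It computes the number of segregating sites, the mean number of pairwise differences (nucleotide diversity π is this value divided by the alignment length), and Tajima's D from the standard formula of Tajima (1989).

```python
from itertools import combinations
from math import sqrt


def segregating_sites(seqs):
    """Count alignment columns that contain more than one nucleotide state."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def mean_pairwise_differences(seqs):
    """Average number of differing sites over all pairs of sequences (k)."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total / len(pairs)


def nucleotide_diversity(seqs):
    """Nucleotide diversity (pi): pairwise differences per site."""
    return mean_pairwise_differences(seqs) / len(seqs[0])


def tajimas_d(seqs):
    """Tajima's D for a gap-free alignment; returns 0.0 if nothing segregates."""
    n = len(seqs)
    s = segregating_sites(seqs)
    if s == 0:
        return 0.0
    k = mean_pairwise_differences(seqs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n ** 2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # Difference between the two estimators of theta, scaled by its standard deviation
    return (k - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical toy alignment, for illustration only (not sad2 data)
toy_alignment = ["AAAA", "AAAT", "AATT", "TTTT"]
print(segregating_sites(toy_alignment))
print(nucleotide_diversity(toy_alignment))
print(tajimas_d(toy_alignment))
```

A positive D, as reported for intron 2 in pale flax, indicates an excess of intermediate-frequency variants relative to the neutral expectation; significance is normally judged against tabulated or simulated critical values, which this sketch does not provide.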
A network analysis was applied to display phylogenetic relationships among taxa because this approach allows for extant ancestral states, in which taxa occupy internal node positions, and for reticulate relationships caused by character conflict, such as those resulting from recombination events. Briefly, networks describe character conflict (instances where characters support different trees) graphically as reticulations. The resulting graphs may then be interpreted either as containing all the most parsimonious trees or as a visualization of recombination events. The phylogenetic network of the studied accessions was constructed as described previously (Allaby and Brown 2001; Allaby et al. 2005). A deletion of 46 nucleotides occurred at position 562 of the alignment, which was used as a character in building the network. The phylogenetic topology of the network was confirmed through maximum likelihood (fastDNAml; Olsen et al. 1994), neighbor-joining analyses (NEIGHBOR; Felsenstein 1989), and NeighborNet (SplitsTree; Huson and Bryant 2006).
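Character conflict between two sites can be illustrated with the four-gamete criterion, which also underlies the minimum number of recombination events of Hudson and Kaplan (1985). The following Python sketch is not the network construction procedure used in this study; the function names and the toy haplotypes are hypothetical, and column indices are zero-based. Two biallelic sites conflict when all four two-site combinations are present, which cannot arise on a single tree without recurrent mutation or recombination.

```python
from itertools import combinations


def four_gamete_conflict(seqs, i, j):
    """Return True if biallelic columns i and j show all four two-site combinations."""
    states_i = {s[i] for s in seqs}
    states_j = {s[j] for s in seqs}
    if len(states_i) != 2 or len(states_j) != 2:
        return False  # the criterion is only defined here for biallelic sites
    gametes = {(s[i], s[j]) for s in seqs}
    return len(gametes) == 4


def conflicting_pairs(seqs):
    """List all pairs of columns (zero-based) that fail the four-gamete test."""
    n_sites = len(seqs[0])
    return [
        (i, j)
        for i, j in combinations(range(n_sites), 2)
        if four_gamete_conflict(seqs, i, j)
    ]


# Hypothetical toy haplotypes, for illustration only (not sad2 data)
toy_haplotypes = ["AAC", "ATC", "TAG", "TTG"]
print(conflicting_pairs(toy_haplotypes))
```

In a network, each such conflicting pair would appear as a reticulation; whether it reflects recombination or homoplasy has to be judged separately.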
The date estimates for nodes from the network were corroborated using the Bayesian MCMC approach implemented in BEAST v1.4 (Drummond and Rambaut 2007). Maximum clade credibility (MCC) phylogenies were generated using the node II/III split calibrated under a uniform prior with a range of 11,000-9500 yBP, under a GTR model with gamma distribution for site heterogeneity and a relaxed uncorrelated lognormal clock. Three tree prior models were investigated: (1) with tree prior as constant size; (2) with tree prior as expansion growth; and (3) with tree prior as exponential growth. The rest of the options were applied with default values. The Bayesian MCMC approach should be more informative for dating a lineage involving recombination events, as it directly calculates ultrametric phylogenies based only on sequence data and model parameters and incorporates both the branch length errors and the topological uncertainties (Rutschmann 2006).

Oil profile
A considerably higher ratio of 18:1 oleoyl-ACP to precursor saturated fatty acids was obtained for the assayed groups of cultivated flax than for the pale flax samples (Table 2). The ratio of total unsaturated to saturated precursor fatty acids, although the difference was not statistically significant, was generally higher in the cultivated than in the pale flax samples. These two sets of oil data can be interpreted as either an increase in total unsaturated fatty acids, or a decrease in the saturated acid precursors, or both. Figure 1 illustrates the salient features of fatty acid metabolism considered here. While an increase in the unsaturated fatty acid products would naturally be expected to lead to a decrease in saturated precursors, a second sink for the latter occurs through the production of long-chain saturated acids. However, the ratio of long-chain saturated fatty acids to the precursors is generally lower in the cultivated than in the pale flax samples (Table 2), indicating that this path did not explain the decrease in precursor saturated fatty acids and indeed that fewer long-chain saturated fatty acids were produced in cultivated flax. Thus, there was an increase in the product of the sad2 locus, 18:1 oleoyl-ACP. Such an increase could be due either to a higher productivity of the sad2 locus or to a decrease in productivity downstream in the metabolic pathway at loci such as fad2, fad3, or fae1, causing an accumulation of oleoyl-ACP. However, if the latter were to entirely explain the high levels of oleoyl-ACP, one would not expect the observed increase in the overall unsaturated to saturated fatty acid ratio. The downstream products in the metabolic pathway after the production of oleoyl-ACP are present in quantities approximately 20-fold higher than oleoyl-ACP. Thus, it is not surprising that the shift in ratio to saturated precursors is less pronounced for unsaturated acids as a whole than for oleoyl-ACP alone.
The oil composition data indicate that an increased productivity of the sad2 locus in cultivated flax was associated with the increased unsaturated fatty acid content. Based on this oil profile, we can reason that the sad2 locus is a candidate domestication-associated locus.

Nucleotide polymorphism
The aligned nucleotide sequences of sad2 amplified from 113 accessions are 2560 bp in length, covering three exons, two introns, and the upstream and downstream flanking regions.
[Figure caption (beginning lost): ... Table 1), and the composition in columns 3-41. The pale flax samples from western Turkey are highlighted in bold italic. The numbers in the composition columns are the positions of substitutions. Different background colors were used to make haplotype identification easier.]
[...] accessions collected from Turkey and the remaining 62 accessions of cultivated flax. These findings support the existence of two distinctive backgrounds in pale flax germplasm collected from Turkey (Uysal et al. 2011). A large set of pale flax accessions was more closely related to cultivated flax (Fig. 2). An overall average pairwise nucleotide diversity of 0.00371 was obtained across all 113 samples. Deviation from neutrality was not significant with Tajima's D of 0.8856 (at P > 0.10), but was significant by Fu and Li's D* and F* test statistics (D* = 2.0778 at P < 0.02 and F* = 1.9155 at P < 0.05, respectively; Fu and Li 1993). There were 13 synonymous mutations observed in all three exons, but only one nonsynonymous change at exon 3 (at position 2245) from serine to proline in the sad2 protein was detected in cultivated flax. Such a nonsynonymous change was also observed in the GenBank accession AJ006958. [...] revealed several more patterns of genetic diversity at the locus (Table 3). First, as expected, there was more genetic variation in pale flax than in cultivated flax. The pale flax had 34 polymorphic sites with a nucleotide diversity of 0.00515, while the cultivated flax had 17 polymorphic sites with a nucleotide diversity of 0.00167. There were 14 polymorphic sites shared by the two species, 20 unique to pale flax, and only four unique to cultivated flax (Fig. 2). Second, a significant deviation from neutrality was found only in intron 2 with Tajima's D of 2.4867, resulting in an overall significant signal of selection in the 43 pale flax accessions. However, this deviation from neutrality disappeared when pale flax accessions were separated by country of origin into those from Turkey and those from other countries. Clearly, the pale flax accessions from Turkey had more variation than those from other countries, with nucleotide diversity values of 0.00435 and 0.00274, respectively.
Third, among the various groups of cultivated flax, winter flax had the largest nucleotide diversity at the locus (0.00099) with four haplotypes, followed by oil flax (0.00077) with four haplotypes, landrace flax (0.00074) with three haplotypes, and fiber flax (0.00036) with three haplotypes. All eight dehiscent flax samples had the same haplotype and one monomorphic site (at position 740) unique to this group (Table 3; Fig. 2). Quantifying nucleotide differences by AMOVA revealed 32.9% nucleotide variation present between the two flax species, 44.5% between pale flax samples from Turkey and those from other countries, and 65% among the five groups of cultivated flax. The largest nucleotide difference in cultivated flax was due to the unique haplotype in dehiscent flax samples, and removing the dehiscent flax samples generated nonsignificant nucleotide differences among the other four groups of cultivated flax.
In summary, the observed nucleotide polymorphism at the locus indicates that cultivated flax has undergone a reduction in genetic diversity, either through a population bottleneck or through selection undetected in this study, probably during the domestication process. The genetic diversity is not bilaterally partitioned between pale and cultivated flax, suggesting that the cultivated flax gene pool represents multiple samples of the pale flax gene pool during the domestication process. However, evidence for selection at the sad2 locus was not strong as revealed with Tajima's D or Fu and Li's D* and F* test statistics.

Phylogenetic network
A network was constructed from the 113 samples in this study (Fig. 3). Eleven nodes labeled from I to XI were detected and represented 11 haplotypes across all the samples. The pale flax had five private nodes (I, II, IV, V, XI) and one node (IX) shared with cultivated flax. The largest pale flax node (I), including all four accessions from Greece and some accessions from Turkey and other countries, was distant from cultivated flax, while the others were closely associated with some groups of cultivated flax. Node II of four pale flax samples was closely associated with node III of eight dehiscent flax samples. The other four nodes (IV, V, X, XI) of 24 pale flax samples collected largely from northern Turkey formed some degree of reticulation with cultivated flax. For cultivated flax, the oil flax occupied four nodes (VI, VII, IX, and X), fiber flax three nodes (VIII, IX, X), and winter flax four nodes (VI, VIII, IX, X). Also, the oil flax appeared to have one private node (VII), while the fiber flax had none. The largest node IX had 36 samples representing the pale flax from Turkey and four groups of cultivated flax (landrace, oil, fiber, and winter). Node X was generated by a nonsynonymous substitution (at position 2245) and consisted of four groups of cultivated flax (landrace, oil, fiber, and winter). Moreover, the oil and winter flax samples shared one more substitution (at position 332) with the pale flax samples and seem to be more directly linked to the pale flax from Turkey than the fiber flax. The most likely domestication common ancestor (DCA) detected with the shortest branch length (i.e., with the fewest homoplasies) was consistent with that from the previous analysis with an outgroup (Allaby et al. 2005). It is interesting to note that the dehiscent flax samples from four different countries shared the same distinct genotype, while the locus-specific divergence among the other four groups of cultivated flax was not clear-cut.
The network has one large area of reticulation. The recombination analysis (Hudson and Kaplan 1985) performed with the DnaSP program revealed at least three recombination events between the sites (26, 332), (335, 1508), and (1729, 2349), respectively. Five recombination events were detected following the methodology of Fu and Allaby (2010) and labeled in Figure 3. The probability of a mutation being homoplasious in the alignment is (1/3)P (Forster et al. 1996), where P = 1/2560 in this study. The network contains 39 substitutions, and the probability of any one mutation being homoplasious in the network is 0.005 [= (1/3) × (1/2560) × 39], leading to an expectation of 0.19 homoplasies in total. However, a total of seven homoplasies were observed in the network (P = 1.14 × 10⁻⁹). For each recombination event, two ancestral nodes are possible, corresponding to opposite corners of the associated reticulation. All reticulations were found to be significant at the 5% level, and R3-R5 were significant at the 1% level (Table 4). Our analysis revealed two more recombination events than those detected following Hudson and Kaplan (1985). Therefore, the reticulations in the network largely represent recombination events rather than homoplasies.
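The homoplasy arithmetic above can be checked with a short Python sketch. It assumes, as an interpretation rather than a statement of the authors' procedure, that the reported probability of 1.14 × 10⁻⁹ is the binomial probability of observing exactly seven homoplasies among the 39 substitutions; the variable and function names are hypothetical.

```python
from math import comb

ALIGNMENT_LENGTH = 2560   # sites in the sad2 alignment
N_SUBSTITUTIONS = 39      # substitutions in the network
OBSERVED_HOMOPLASIES = 7  # homoplasies observed in the network

# Probability that a given mutation hits the same site with the same change
# as one of the 39 substitutions already in the network.
p_site = 1 / ALIGNMENT_LENGTH
p_homoplasy = (1 / 3) * p_site * N_SUBSTITUTIONS

# Expected number of homoplasies across all substitutions
expected_homoplasies = p_homoplasy * N_SUBSTITUTIONS


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


# Assumption: the reported P value is this exact-count binomial probability
prob_seven = binomial_pmf(OBSERVED_HOMOPLASIES, N_SUBSTITUTIONS, p_homoplasy)

print(f"{p_homoplasy:.4f}")
print(f"{expected_homoplasies:.2f}")
print(f"{prob_seven:.3g}")
```

Under this assumption the sketch gives a per-mutation probability near 0.005 and an expectation just under 0.2 homoplasies, in line with the values in the text, so seven chance homoplasies would be very unlikely.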

Dating flax haplotypes
[Figure 3 caption (beginning lost): ... Table 1). The size of node circles relates to sample frequency. The numbers by branches indicate the positions of substitutions detected for that branch. Character conflicts are described as reticulations within the network. The position of the most likely domestication common ancestor (DCA) to all alleles in cultivated flax is indicated by DCA. Five recombination events were detected and numerically labeled.]
Pale flax samples were closely associated with groups III and IX, offering some inference of divergence time for these lineages. It seems likely that a group of pale flaxes close to node VI either still exists but has not been sampled or has gone extinct, given the proximity in the network of pale flaxes to the other cultivated groups. The node leading to group III for the dehiscent type of cultivated flax appears to be the deepest in the network for which there are closely related pale flaxes. This suggests that this group is probably the oldest of the cultivated flax groups, which is in agreement with previous EST-SSR data (Fu 2011). Therefore, the node leading to group III makes the most reasonable calibration point using a domestication period of 10,000 yBP (Hillman 1975), which yields a reasonable rate of 3.9 × 10⁻⁸ subs/site/year (Wolfe et al. 1989). Given this rate of change, the DCA becomes 33,000 years old and the cultivated lineage founded in group IX began roughly 3300 years ago. This rate estimate contrasts with previous estimates (Allaby et al. 2005) in which the DCA node on a much simpler network was used as the basis of a 10,000-year calibration leading to a rate estimate of 1.71 × 10⁻⁷ subs/site/year, which is about 10-fold higher than would be expected for a plant synonymous substitution rate.
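The calibration can be illustrated with a small Python sketch. It assumes that the calibrating branch leading to group III carries a single substitution over the 2560-bp alignment; this assumption is not stated in the text, but it reproduces the reported rate. The function names are hypothetical.

```python
ALIGNMENT_LENGTH = 2560        # sites in the sad2 alignment
CALIBRATION_AGE = 10_000       # years before present for the group III node
CALIBRATION_SUBSTITUTIONS = 1  # assumption: one substitution on the calibrating branch


def substitution_rate(n_substitutions, n_sites, years):
    """Rate in substitutions per site per year from a dated branch."""
    return n_substitutions / (n_sites * years)


def branch_age(n_substitutions, n_sites, rate):
    """Age in years implied by a number of substitutions at a given rate."""
    return n_substitutions / (n_sites * rate)


rate = substitution_rate(CALIBRATION_SUBSTITUTIONS, ALIGNMENT_LENGTH, CALIBRATION_AGE)
years_per_substitution = branch_age(1, ALIGNMENT_LENGTH, rate)

print(f"{rate:.2e}")
print(round(years_per_substitution))
```

Because age scales linearly with the number of substitutions at a fixed rate, any node age in the network follows from its average substitution depth under this simple strict-clock sketch; the Bayesian analysis relaxes that assumption.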
These dates were further investigated with BEAST v1.4, using as a calibration point the II/III group split at a time ranging from 11,000 to 9500 yBP. Three MCC trees obtained using BEAST v1.4 are shown in Figure 4. Interestingly, the three models correctly identified each member of the 11 haplotypes with one exception and generated similar topologies that are compatible with the network shown in Figure 3. The exception occurred under the exponential expansion model where groups IX and X become an unresolved polytomy, which is a minor deviation in topology due to the effects of recent expansion. The topology of the constant sized population matched the topology of the network most closely, while the expansion growth model placed group VIII as a sister taxon to group X, and the exponential growth model failed to resolve groups IX and X. The ages of the nodes are shown on the trees expressed as yBP. The age estimates obtained overlap with the estimates from the network in the case of the constant population size and expansion models. The exponential model, however, poorly fitted the data, with the formation of the dehiscent and associated pale flax clade being close to the root of the entire tree despite only two substitutions along these branches. Under the constant population size model, the origin of the IX lineage is around 6045 yBP, and the DCA node is dated to around 15,861 yBP with a wide margin of error extending to around 47,000 yBP. The expansion model, which is perhaps more likely to reflect the true underlying population process for cultivated flax at least, yielded younger dates, with the IX lineage being around 3097 yBP, which is close to the network estimate. In this case, the DCA node is very young at around 6247 yBP, also with a wide error extending to 26,000 yBP.
© 2011 The Authors. Published by Blackwell Publishing Ltd.
In reality, it is likely that the true underlying population process that gave rise to this phylogeny would have been a combination of long-term constant population size for the pale flax populations, followed by an expanding cultivated population. The existence of group XI suggests that the DCA node would have been represented more likely by pale flax rather than cultivated flax. Consequently, the DCA node probably relates to a time before an expansion process associated with cultivation would have taken place, leading to the optimum date under the expansion model of 6247 years probably being inappropriate.

Flax domestication history at the sad2 locus
The level of unsaturated fatty acids in flax seeds increased during domestication involving an apparent increased productivity of the sad2 locus, indicating that the sad2 locus may be considered a candidate domestication locus. However, the mechanism of increased productivity has not yet been discovered. It is highly possible that the sad2 locus is not the only candidate locus, as other cis- and trans-acting loci may have been involved in fatty acid metabolism. Thus, the true contribution of the sad2 sequence variation to the difference observed in fatty acid composition between the cultivated and pale flax samples remains unknown. Either way, the phylogenetic reconstruction of the sad2 locus reflects a specific history associated with increased unsaturated oil production in cultivated flax. The network analysis involving a large set of pale flax and four groups of cultivated flax revealed a complex domestication history of flax that has not been previously observed. The pale flax displayed two different groupings in agreement with other studies (Uysal et al. 2011). One group represents two lineages (I and II), includes pale flax samples collected from different countries including western Turkey and Greece, and has an indel shared with the dehiscent type of cultivated flax (III). Comparison with the sad1 sequence, which does not have the deleted character state, indicates that this indel is a deletion in the branches leading to groups I to III rather than an insertion in groups IV and above. The second group (XI) represents only the pale flax samples from northern Turkey along the Black Sea coast and has a genetic background shared more closely with the indehiscent groups of cultivated flax (VI, VII, VIII, IX, and X). These genetic associations expand on the previous observation of the basal position of the dehiscent flax group (Fu 2011). These data indicate that the dehiscent cultivated flax lineage should be regarded as an independent domestication.
The molecular dating used in this study confirms the early nature of this domestication, before the subsequent domestication process that led to the indehiscent cultivated flax groups. Interestingly, the oil profile data (Table 2) indicate that the increase in unsaturated fats had occurred in the dehiscent cultivated flax lineage, but loss of seed dispersal through capsular indehiscence had not. This suggests that selection for oil composition came before loss of seed dispersal, and that the dehiscent cultivated flax lineage represents an alternative or incomplete domestication trajectory as compared to the other cultivated flax groups. This order of trait fixation is similar to the case of cereals in which loss of seed dispersal was a trait that was fixed late in the domestication process (Tanno and Willcox 2006; Fuller 2007).
The indehiscent cultivated flax groups appear to represent a domestication process that may have involved more than one domestication. The close proximity of group IX to the pale flax group XI suggests that this is a separate domestication from that associated with group VI. An alternative explanation could be that the indehiscent cultivated flax was domesticated from a genetically diverse population (Charlesworth 2010) that has maintained two distinct lineages. However, the distinct geographical clustering of the pale flax suggests that the pale flax populations tend not to be so diverse, making this a less likely explanation based on current evidence.
Oil flax varieties occur in both the indehiscent cultivated flax lineages, but fiber varieties appear to be restricted to the IX-X lineage. Note that group VIII, which includes fiber varieties, was formed through a recombination event between groups IX and VII, so these fiber accessions should be considered as part of the IX-X lineage. This phylogenetic restriction suggests that flax was used for oil before fiber, which agrees with previous studies (Allaby et al. 2005;Fu and Allaby 2010). If the sad2 locus was directly responsible for the increase in unsaturated oil composition, then the phylogenetic pattern in Figure 3 suggests that parallel changes happened in the dehiscent cultivated flax and indehiscent cultivated flax groups. The data support multiple independent pathways of domestication of flax for oil composition. Therefore, it is likely that fiber varieties evolved from a lineage of flax domesticated for oil. In support of this scenario, the oil profile data indicate that fiber flax also showed increased unsaturated fatty acid content despite its usage, suggesting that this is a vestigial feature of fiber flax. The dating analysis further supports this scenario, with an origin of the fiber lineages occurring around 3000 years ago.

Domestication-associated locus-specific analysis
The analysis of the sad2 locus revealed a complex history of cultivated flax that is informative about the origins of oil, fiber, and dehiscent varieties despite the apparent restriction that the locus is specifically associated with oil production. The results expand on, rather than conflict with, earlier studies (Allaby et al. 2005;Fu 2011). It may be relevant that the trait of oil composition is clearly primary, and so underlies all the flax varieties, which all have the trait despite not necessarily being exploited for it. However, it is less clear what is revealed about winter tolerance. The winter tolerant varieties were not topologically restricted in the network as in the case of fiber varieties. It therefore seems likely that winter tolerance preceded fiber production in these lineages, but we have no resolution between oil production and winter tolerance in the indehiscent cultivated flax samples. The oil profile data demonstrate that oil production has been enhanced in all varieties, making it much more likely that this domestication-associated trait occurred before winter tolerance. It was not until flax spread from the Near East into the Danube valley some time after the initial domestication that winter tolerance was required (Helbaek 1959;Diederichsen and Hammer 1995). In this case, it is likely that increased phylogenetic resolution could be obtained through the study of a locus specifically associated with winter tolerance.
We suggest that there are some general principles for domestication-locus specific studies, which should be considered in future studies. First, the domestication processes influenced flax traits, and loci governing different traits may have different patterns of genetic diversity, depending on different selection processes and the underlying genetics of the target traits (Fu 2011). Thus, it is important to analyze as many domestication loci as possible to infer the processes in which domestication traits were acquired by cultivated plants. This is particularly true for the candidate domestication loci without direct functional evidence. The sad2 locus is a candidate domestication locus, but not necessarily the most informative one. Inferences based on other related candidate loci such as the fad loci (Banik et al. 2011) may help to expand the historical view presented here. Second, a direct inference of causative domestication loci should always be encouraged for high resolution. However, most loci governing domestication traits are not cloned and sequenced in crops such as flax and so may not be accessible to such inference, which thus limits the power of the locus-specific analysis (Blackman et al. 2011). Third, the pattern of genetic diversity in an influenced locus depends in part on the degree of human-mediated selection that has acted on various target traits, and any single locus may capture a variable domestication signal. In this case, the sad2 locus may carry more information on oil selection than the other domestication traits of fiber production, winter habit, and dehiscence, so a bias toward oil selection could exist.

Conclusion
The domestication-associated locus-specific analysis in this study has revealed a complex picture of flax domestication involving multiple paths of domestication, initially for oil. An independent alternative or incomplete domestication trajectory occurred in the dehiscent flax group in which the loss of seed dispersal did not occur. It may be the case that the human-mediated selection pressures were different for these plants than for the indehiscent cultivated flax. Furthermore, a recent origin of fiber varieties is apparent, probably on the order of 3000 yBP. Consequently, it is apparent that despite being a locus that is associated with oil rather than fiber or dehiscence, sad2 has been informative to a degree for more than just oil varieties. However, there is a clear limit to the resolution achievable: little could be resolved about winter tolerance other than that it arose before fiber varieties. The locus-specific approach would be enhanced by considering more loci relevant to the other traits, within the wider context of genome-wide information.