Although the Compositae harbours only two major food crops, sunflower and lettuce, many other species in this family are utilized by humans and have experienced various levels of domestication. Here, we have used next-generation sequencing technology to develop 15 reference transcriptome assemblies for Compositae crops or their wild relatives. These data allow us to gain insight into the evolutionary and genomic consequences of plant domestication. Specifically, we performed Illumina sequencing of Cichorium endivia, Cichorium intybus, Echinacea angustifolia, Iva annua, Helianthus tuberosus, Dahlia hybrida, Leontodon taraxacoides and Glebionis segetum, as well as 454 sequencing of Guizotia scabra, Stevia rebaudiana, Parthenium argentatum and Smallanthus sonchifolius. Illumina reads were assembled using Trinity, and 454 reads were assembled using MIRA and CAP3. We evaluated the coverage of the transcriptomes using BLASTX analysis of a set of ultra-conserved orthologs (UCOs) and recovered most of these genes (88–98%). We found a correlation between contig length and read length for the 454 assemblies, and greater contig lengths for the 454 compared with the Illumina assemblies. This suggests that longer reads can aid in the assembly of more complete transcripts. Finally, we compared the divergence of orthologs at synonymous sites (Ks) between Compositae crops and their wild relatives and found greater divergence when the progenitors were self-incompatible. We also found greater divergence between pairs of taxa that had some evidence of postzygotic isolation. For several more distantly related congeners, such as chicory and endive, we identified a signature of introgression in the distribution of Ks values.
The domestication of plants represented a critical development in human history that permitted the establishment of large, sedentary civilizations. Consequently, unravelling the origin of crops, as well as the molecular and genetic changes that accompany domestication and crop diversification, represents an important undertaking. Until recently, such studies were confined to a handful of crops of major economic and nutritional importance, such as maize, wheat and rice (Burger et al. 2008). However, advances in next-generation sequencing technology have allowed the extension of genomic knowledge beyond these species to a wider array of crops and their wild relatives (e.g. Dempewolf et al. 2010; Agarwal et al. 2012; Scaglione et al. 2012). This information will not only be an important agronomic resource, but it will also improve the understanding of the genomic basis of domestication and adaptation.
The Compositae (Asteraceae) is one of the largest and most successful flowering plant families. Despite the large number of species in this family, only two – sunflower and lettuce – have become major food crops. However, there are many other species in the Compositae that have been cultivated by humans and attained various degrees of domestication. Although the number of species in the Compositae that have been strongly domesticated is disproportionately small compared with some other groups, such as the Fabaceae or Poaceae, no other family has been cultivated for such a wide variety of uses (Dempewolf et al. 2008). Species in the family have been domesticated for seed oil (e.g. sunflower), edible leaves (e.g. lettuce), edible inflorescences or stems (e.g. globe artichoke), tubers and roots (e.g. yacon), phytochemicals (e.g. guayule), and ornamental flowers (e.g. gerbera). This diversity of uses makes investigations into the genomic basis of domestication in this group particularly interesting.
Here, we describe the development of genomic resources for 12 Compositae species (Table 1): Cichorium endivia (endive, wild and cultivated), Cichorium intybus (chicory, wild and cultivated), Echinacea angustifolia, Iva annua (sumpweed), Helianthus tuberosus (Jerusalem artichoke), Dahlia hybrida, Leontodon taraxacoides, Glebionis segetum (corn chrysanthemum), Guizotia scabra ssp. schimperii, Stevia rebaudiana (sweetleaf), Parthenium argentatum (guayule) and Smallanthus sonchifolius (yacón). Most of these species are crops, or crop wild relatives, and have been cultivated for a wide variety of uses. Like lettuce, chicory and endive are native to the Old World and are grown mainly for their edible leaves, although chicory is also grown for its tubers (Kiers et al. 2000). Echinacea, native to North America, is cultivated for its purported immunostimulatory properties (Percival 2000). There are several oil-producing seed crops in the Compositae, including sunflower, safflower (Carthamus tinctorius) and noug (Guizotia abyssinica), of which G. scabra ssp. schimperii is thought to be the closest living wild relative. Sumpweed was once cultivated by North American First Nations people for its edible seeds, but was abandoned prior to the arrival of Europeans, perhaps due to its allergenic properties (Diamond 1997). Yacón and Jerusalem artichoke, both of which are New World crops, have been domesticated for their inulin-rich tuberous roots (Dempewolf et al. 2008). Many species of the Compositae are cultivated as ornamentals, including dahlias, originating mainly in Mexico (Saar et al. 2003), and chrysanthemums. Corn chrysanthemum, native to Europe and the Mediterranean, is grown as an ornamental, but is no longer considered to be part of the economically important florist chrysanthemum genus (Paciolla et al. 2010). Sweetleaf, native to Paraguay, is propagated as a sweetener (Brandle et al. 1998), and guayule, native to the south-west United States and Mexico, is cultivated as a source of natural rubber (Ray 1993). We have also sequenced the transcriptome of Leontodon taraxacoides, a weed originating in Europe and introduced into the United States, which is being developed as a small genome model for the Compositae. The genome of this species (0.29 Gb, E. Baack & L. H. Rieseberg, unpublished) is dwarfed by those of other members of the family, which usually exceed 1 Gb (Bennett & Leitch 2012).
Table 1. Location information for Compositae crops and their wild relatives targeted in this study and the tissue type sampled
| Taxon | Common name | Collection locality | Collection ID | Tissue type |
|---|---|---|---|---|
| Cichorium endivia ssp. pumilum (wild) | Endive | Pakistan | PI 652029 | Seedling |
| Cichorium endivia ssp. endivia (cultivar) | Endive | Germany | PI 503595 | Seedling |
| Cichorium intybus (wild) | Chicory | Krasnodar, Russian Federation. Latitude 45.033, Longitude 35.977 | PI 652028 | Seedling |
| Cichorium intybus (cultivar) | Chicory – Witloof | Germany | PI 504468 | Seedling |
| Dahlia hybrida | Dahlia ‘Thomas Edison’ | NA | NA | Leaves, flowers |
| Echinacea angustifolia (wild) | Coneflower | Oklahoma, United States, Section 24, T19N, R2W, Logan County. Latitude 36.1, Longitude -97.367 | PI 421331 | Seedling |
| Glebionis segetum | Corn chrysanthemum | Cleden-cap-Sizun, Finistere, France | PI 586603 | Seedling |
| Glebionis segetum (wild) | Corn chrysanthemum | Porto, Portugal. Between Lordelo do Ouro and Porto, Douro Litoral Province. Latitude 41.15, Longitude -8.633 | PI 641689 | Seedling |
| Iva annua | Sumpweed | Granite City, IL. Latitude 38.804, Longitude -90.114 | NA | Seedling |
| Leontodon taraxacoides Lam. ssp. saxatilis | Lesser hawkbit | Oregon, Benton City, OSU campus, vacant lot at corner of SW 11th St and Washington | NA | Seedling |
| Helianthus tuberosus (wild) | Jerusalem artichoke | Ohio, United States, Hwy. 81W, 16.8 km west of Ada, Allen County. Latitude 40.733, Longitude -84.017 | PI 547230 | Seedling |
| Guizotia scabra ssp. schimperii (wild) | Mech | Jimma, Ethiopia. 5 km from Jimma on the way to Bonga, 1775 m elevation. Latitude 7.626, Longitude 36.760 | RC-4 | Seedling |
| Stevia rebaudiana | Sweetleaf | Garden origin, West Coast Seeds, B.C., Canada | NA | Seedling |
| Parthenium argentatum | Guayule | NA | PI 478640 | Roots, leaves, flowers, stem |
| Smallanthus sonchifolius | Yacón | Peru | CIP 205029 | Roots, leaves, flowers, stem |
The Compositae also harbours many of the world's most notorious weeds, and several Compositae crops are closely related to weedy taxa. Genomic resources will be valuable for detecting gene flow between various crops and their wild and weedy relatives. Such gene flow can have implications for the spread of genetically engineered genes from crops into wild species (Ellstrand 2003; Snow et al. 2003) or contamination of seed lots by foreign germplasm (Bateman 1947a,b; Warburton et al. 2011). More generally, the study of gene flow between domesticated species and their progenitors could give insight into the strength of reproductive barriers and the process of speciation (Dempewolf et al. 2012), as well as the evolutionary consequences of hybridization and introgression (Hufford et al. 2013). Although an increasing number of studies have used genetic markers to estimate gene flow between cultivated and weedy populations (Arias & Rieseberg 1994; Ellstrand 2003; Song et al. 2003; Hufford et al. 2013), there have been few genome-wide studies, especially across multiple crop/wild species pairs.
Two Compositae crops, lettuce and sunflower, are particularly interesting with respect to their histories of domestication and invasiveness. Sunflower was domesticated in North America, yet today, H. annuus and many other taxa in the genus are naturalized or invasive in Europe (Rehorek 1997; Forman 2003). High levels of gene flow between cultivated and weedy sunflower are known to occur (Arias & Rieseberg 1994; Linder et al. 1998; Burke et al. 2002), fuelling debates about the role of transgene escape in the evolution of ‘super weeds’ (e.g. Burke & Rieseberg 2003; Ellstrand 2003; Snow et al. 2003). Lettuce was domesticated in the Mediterranean region, yet its progenitor, prickly lettuce (L. serriola), and several other taxa in the genus are considered weeds in the United States. There are also concerns about wild-crop gene flow in a number of other Compositae crops, including chicory (Kiær et al. 2007, 2009) and safflower (Berville et al. 2005a). These concerns are often well founded due to sympatry of crops with weedy and wild relatives, high outcrossing rates, and few postzygotic barriers between crops and their progenitors or feral weeds.
The genomic resources that we have developed will be valuable for future population and comparative genomic analyses of crop and weed evolution. We also compared de novo assemblies across platforms and correlated features of the read sets with features of the final assemblies to examine the impact of different sequencing strategies on assembly quality. In addition, we used the resulting assemblies, as well as others previously generated by the Compositae Genome Project, to examine divergence between crops and their putative progenitors and to consider evidence for introgression between crops and their wild relatives.
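One measure of assembly completeness used in this study, the proportion of ultra-conserved orthologs (UCOs) recovered by BLASTX, can be illustrated with a minimal Python sketch. It is not the pipeline used here. It assumes that contigs were the BLASTX queries and UCO proteins the subjects, and that hits are in the standard 12-column BLAST tabular format (subject ID in column 2, E-value in column 11). The function name `uco_recovery`, the UCO identifiers and the E-value cutoff of 1e-10 are hypothetical choices made for illustration only.

```python
import csv
import io


def uco_recovery(blast_tsv, uco_ids, max_evalue=1e-10):
    """Return the fraction of UCO ids with at least one hit at or below max_evalue.

    blast_tsv is any iterable of tab-separated lines in 12-column BLAST
    tabular format. The cutoff is an illustrative assumption.
    """
    wanted = set(uco_ids)
    recovered = set()
    for row in csv.reader(blast_tsv, delimiter="\t"):
        if len(row) < 12:
            continue  # skip blank or malformed lines
        subject_id, evalue = row[1], float(row[10])
        if subject_id in wanted and evalue <= max_evalue:
            recovered.add(subject_id)
    return len(recovered) / len(wanted) if wanted else 0.0


# Toy input: three hits, one of which (UCO_B) fails the E-value cutoff.
example = io.StringIO(
    "contig1\tUCO_A\t90.0\t100\t10\t0\t1\t300\t1\t100\t1e-50\t200\n"
    "contig2\tUCO_B\t40.0\t50\t30\t0\t1\t150\t1\t50\t0.5\t30\n"
    "contig3\tUCO_C\t85.0\t120\t18\t0\t1\t360\t1\t120\t1e-40\t180\n"
)
fraction = uco_recovery(example, ["UCO_A", "UCO_B", "UCO_C", "UCO_D"])
print(f"{fraction:.0%} of UCOs recovered")
```

In this toy input, two of the four hypothetical UCOs have a qualifying hit, so the reported recovery is 50%. A real analysis would read the BLASTX output file in place of the in-memory string, and the result would depend on the chosen cutoff.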
L.H.R., J.M.B., R.W.M. and Z.L. designed the research. Z.L., L.O. and M.S. performed the laboratory work. K.A.H., M.B. and N.C.K. performed the assemblies. A.K. developed the UCOs set. K.A.H. conducted the analysis and wrote the paper. D.W.S., R.V.K., L.H.R., Z.L., M.S. and H.D. provided tissue. All authors contributed to the final version of the study.