Ancient DNA reveals the timing and persistence of organellar genetic bottlenecks over 3,000 years of sunflower domestication and improvement

Abstract Here, we report a comprehensive paleogenomic study of archaeological and ethnographic sunflower remains that provides significant new insights into the process of domestication of this important crop. DNA from both ancient and historic contexts yielded high proportions of endogenous DNA, and although archaeological DNA was found to be highly degraded, it still provided sufficient coverage to analyze genetic changes over time. Shotgun sequencing data from specimens from the Eden's Bluff archaeological site in Arkansas yielded organellar DNA sequence from specimens up to 3,100 years old. Their sequences match those of modern cultivated sunflowers and are consistent with an early domestication bottleneck in this species. Our findings also suggest that recent breeding of sunflowers has led to a loss of genetic diversity that was present only a century ago in Native American landraces. These breeding episodes also left a profound signature on the mitochondrial and plastid haplotypes in cultivars, as two types were intentionally introduced from other Helianthus species for crop improvement. These findings gained from ancient and historic sunflower specimens underscore how future in‐depth gene‐based analyses can advance our understanding of the pace and targets of selection during the domestication of sunflower and other crop species.


| INTRODUCTION
Over the last 12,000 years, human populations in many different regions of the world independently domesticated local plant species by selecting for desirable traits, in many cases initiating a symbiotic partnership that formed the economic foundation of complex societies (Zeder, 2015). Researchers have identified over a dozen centers of plant domestication (Purugganan & Fuller, 2009), and gaining a refined understanding of the varied evolutionary trajectories that have led to the emergence of key crops requires investigating the cultivars and the archaeological context found in each of the world's independent centers of domestication. Eastern North America (ENA) presents a useful case to examine initial plant domestication and millennial-scale changes in agriculture (Smith, 2011), in part because its archaeological record challenges the paradigm that domestication is an evolutionary strategy implemented when expanding human populations experience declining resource catchments (Smith, 2016).
Starting around 4000 years before present (BP), a crop complex consisting of acorn/crookneck squash (Cucurbita pepo L. ssp. ovifera D.S. Decker), goosefoot (Chenopodium berlandieri Moq.), marshelder (Iva annua L.), and the common sunflower (Helianthus annuus L.) was grown by low-level food-producing societies inhabiting the watershed of the Mississippi River (Smith, 2006). Archaeobotanical remains from ENA sites exhibit telltale signs of the so-called domestication syndrome (Hammer, 1984), a suite of traits that commonly distinguishes domesticates from their wild progenitors and that may include larger seeds and disruption of natural seed dispersal mechanisms. Of the four core species of the ENA crop complex, sunflower is particularly well suited for in-depth domestication research thanks to the existence of rich archaeobotanical collections (Smith, 2014), a century of breeding experiments (Heiser, 1976; Škorić, 1992), and the development of many germplasm and genomic resources for genetic investigations (Badouin et al., 2017; Burke, Tang, Knapp, & Rieseberg, 2002; Kane et al., 2011; Rieseberg & Seiler, 1990; Wills & Burke, 2007).
Through human selection, the weedy H. annuus ssp. annuus was transformed from a highly branching plant with numerous small disks, also known as heads or capitula, to H. annuus ssp. macrocarpus (D.C.) Ckll., the cultivated sunflower, which is typically characterized by strong apical dominance and a single massive disk that can produce hundreds to thousands of achenes. Sunflowers served important nutritional, ceremonial, medicinal, cosmetic, and structural purposes in Native American cultures. For instance, an account from 1615 by French explorer Samuel de Champlain indicates that peoples of the Iroquois Confederacy of Nations in the Great Lakes region of North America cultivated sunflower, grinding and eating the seeds as well as processing them into oil used ceremonially for anointing the hair (Heiser, 1951). After roasting sunflower achenes in clay pots or reed baskets, the Mandan, Arikara, and Hidatsa peoples of the Missouri River basin would make sunflower flour or boil the achenes with maize, beans, and squash to make a porridge (Heiser, 1976). The Hopi people of the American Southwest were unique in extracting a dye from the deeply purple-colored achenes of their landraces (Heiser, 1951, 1976).
Archaeological sunflower remains have been excavated from dozens of ENA sites, enabling temporal and spatial investigations on the origins of sunflower domestication. The Koster site in Illinois yielded the oldest known sunflower remains, with two achenes and one kernel dating between 8500 and 5800 BP (Asch & Asch, 1985; Smith, 2014) (Figure 1). Based on their small size, these specimens likely reflect the collection of wild resources (Smith, 2014). The oldest evidence for sunflower cultivation comes from the Hayes site in central Tennessee, dating to 5034-4583 BP (95% confidence interval, CI) (Crites, 1993). Kernels from the site are larger than commonly observed in wild sunflowers, suggesting the initial steps of sunflower domestication were underway circa 4800 BP (Smith, 2014). Three other sites provide evidence of sunflower cultivation before 3000 BP (Figure 1): 3800 BP at the Riverton site in Illinois (Smith & Yarnell, 2009), 3300 BP at the Newt Kash Shelter in Kentucky (Smith, 2014), and 3050 BP at the Marble Bluff Shelter in Arkansas (Fritz, 1997).
Based on archaeological, morphological, and geographical data, Heiser (1951) concluded that sunflower was domesticated once in ENA, a hypothesis that has been supported by population genetics studies of modern elite-bred cultivars, extant Native American landraces, and wild H. annuus populations. For instance, Rieseberg and Seiler (1990) demonstrated with isozymes and chloroplast markers that domesticated landraces share haplotypes with wild sunflowers from ENA and show a signature of a genetic bottleneck. Although archaeological remains putatively identified as sunflower, some dating to 4130 BP, were subsequently recovered from excavations in Mexico and raised the possibility of an independent domestication event (Lentz, Pohl, Pope, & Wyatt, 2001), population genetic studies that include extant Mexican wild and cultivated germplasm have only found evidence that extant cultivars derive from a single ENA domestication event. Wills and Burke (2006) showed that domesticated populations have one common and two rare chloroplast microsatellite marker haplotypes that cluster with wild ENA rather than wild Mexican sunflowers.
Patterns of sequence variation at nuclear microsatellite markers and candidate domestication loci have likewise reinforced the conclusion that all extant landraces, whether collected in ENA or Mexico, descend from a single origin most likely occurring from ancestral wild populations in the eastern and central USA (Blackman et al., 2011; Harter et al., 2004).
Although archaeological and genetic data predominantly point to a single domestication event in ENA, there is much more to unearth about how sunflower domestication proceeded. It remains to be determined which traits were of primary interest to early farmers, whether sunflower domestication was rapid or protracted, and how proto-domesticates responded to the new selection regime. Genetic characterization of archaeological plant remains with ancient DNA (aDNA) methodologies has the potential to answer these questions by providing windows into past temporal dynamics. Paleogenomic research has grown tremendously in the past decade due to the rapid development of high-throughput sequencing technologies (Der Sarkissian et al., 2015), and the application of paleogenomic methods to archaeobotanical remains has been a particular success (Brown et al., 2015). For example, in reconstructing complete genomes of 6,000-year-old barley grains excavated in Israel, Mascher et al. (2016) determined the ancient samples were closely related to modern cultivars in the region and that the major steps of barley domestication were completed by this point in time. Similarly, Ramos-Madrigal et al. (2016) and Vallebueno-Estrada et al. (2016) characterized genomes of 5,000-year-old maize cobs from the Tehuacán Valley, but they instead found that many domestication-related genes had the ancestral form rather than the derived maize form, suggesting a stepwise process of domestication. Although these paleogenomic studies indicate archaeological remains could be invaluable for understanding sunflower's domestication and ancient cultivation, different plant species have the potential to confound aDNA research through species- and tissue-specific secondary compounds that interfere with DNA extraction and library preparation.
To examine the paleogenomic potential of archaeobotanical sunflower remains, we screened a collection of archaeological and ethnographic specimens with a shotgun sequencing strategy. The sequencing data generated from these ancient and historic specimens were analyzed to determine variability in endogenous content, DNA damage, and sources of exogenous DNA. In addition, following precedents in mammalian aDNA projects (Dabney, Knapp, et al., 2013; Gilbert et al., 2007) and genome skimming of modern samples (Bock, Kane, Ebert, & Rieseberg, 2014; Straub et al., 2012), we leveraged the sequencing data to characterize variation in high copy number mitochondrial and plastid genomes, allowing us to investigate how these and other archaeological and historic specimens may enrich our understanding of the domestication process.

| Archaeological sunflower specimens
Although archaeobotanical remains are most often preserved by charring or carbonization, such materials are generally incompatible with paleogenomic analyses (Nistelberger, Smith, Wales, Star, & Boessenkool, 2016). Therefore, we only obtained and processed desiccated specimens for this study. We tested 15 sunflower disk fragments, one pericarp (seed coat), and one kernel, all of which originate from the Eden's Bluff archaeological site in northwestern Arkansas (Figures 1 and S1; Table 1). The specimens have been under the curation of the University of Arkansas Collections Facility (UARK) and the University of Michigan Museum of Archaeological Anthropology (UMMAA). Thirteen disks that were sufficiently intact to enable diameter measurements ranged in size from 35 to 110 mm (mean = 75.5 mm) and were all larger in this dimension than disks of a well-defined wild H. annuus population (Smith, 2014), indicating the archaeological disks represent plants cultivated by humans. Likewise, the dimensions of the archaeological pericarp (length = 9.1 mm) and kernel (length × width = 6.5 × 3.6 mm) are consistent with origin from domesticated sunflowers.
FIGURE 1 Map of sampling locations and archaeological sites. Ethnographic samples (and number of accessions sampled) are in red, and landraces are in blue. Archaeological sites with ancient sunflower material discussed in the text are marked by yellow circles. Eden's Bluff, the site from which all archaeological remains detailed in this article were sampled, is bolded.
Eden's Bluff (state site ID: 3BE6) was excavated in 1932 and 1934 as a part of expeditions led by the University of Arkansas focused on the so-called Ozark Bluff-Dweller sites, as coined by Harrington (Harrington, 1924a, 1924b, 1960). These sites are renowned for their preservation of organic remains, including desiccated plant tissues (Fritz, 1986; Gilmore, 1931). Native Americans likely used the rockshelters and caves specifically because their dry conditions were well suited for long-term food storage; despite the name, the sites are unlikely to have served as seasonal dwellings (Brown, 1984). The chronology of the Ozark Bluff-Dweller sites is not fully understood, due to the limited number of radiocarbon dates (Davis, 1967). As part of her rigorous archaeobotanical analyses, Fritz (1986) Figure S2). All AMS dates from this and other reports were calibrated to calendar years before present (calBP) using OxCal v4.3.2 (Bronk Ramsey, 2009) and the IntCal13 (Reimer et al., 2013) calibration curve.

| Ethnographic landrace achenes
Eleven accessions of sunflower landraces were acquired from ethnological collections at the National Museum of the American Indian (NMAI) and UMMAA (Table 2, Figure 1). These specimens consist of achenes sourced from Native Americans and via various intermediaries in the first half of the twentieth century by Gilmore (1919) and Heiser (1951). At the time, Heiser (1951, p. 441) lamented that "few aboriginal strains of the cultivated sunflower are still in existence, and… it is likely that the few remaining ones will disappear unless steps are taken to preserve them." While his efforts propagated many sunflower landrace lineages, some of the achenes he attempted to grow were not viable, including seed originating from the Six Nations reserve in Ontario. Thus, these ethnographic achenes offer a unique opportunity to investigate genetic relationships of putatively extinct landraces to living sunflower lineages.

| DNA extraction and sequencing
Archaeological specimens were processed at a dedicated paleogenomics laboratory at the University of Copenhagen. The laboratory meets the standards for aDNA research (Cooper & Poinar, 2000; Gilbert, Bandelt, Hofreiter, & Barnes, 2005), such as being physically separated from modern DNA and post-PCR laboratories, being outfitted with air filtration and nightly UV irradiation equipment, and requiring researchers to wear coveralls to minimize contamination.
DNA was extracted using a method that has been shown to work well on a range of species and tissue types (Wales, Andersen, Cappellini,
TABLE 1 Archaeological specimens. Accelerator mass spectrometry (AMS) dates are listed in calibrated years before present. Samples with sequencing depth of coverage (DoC) <4 for the plastome were excluded from the plastome analysis. See Figure S1 for images of most samples
and DTT, and 10% proteinase K solution), and then extracted using two rounds of phenol and one round of chloroform. To minimize the effect of co-extracted compounds and pigments, the recovered DNA was purified in a Qiagen MinElute column using optimizations to retain highly fragmented DNA (Dabney, Knapp, et al., 2013). Four extraction blanks were processed with samples to monitor potential sources of contamination. The extracted DNA, including that from the extraction blanks, was converted to Illumina-compatible libraries using a blunt-ended adapter ligation approach and optimizations to retain short molecules. Before indexing PCR, the libraries were tested by quantitative PCR (qPCR) to estimate the appropriate number of cycles to avoid overamplification. qPCR was conducted with a SYBR Green assay as described by Wales et al. (2015), using AmpliTaq Gold (Applied Biosystems, Foster City, CA), primers IS7 and IS8 (Meyer & Kircher, 2010), and a Roche LightCycler 480 Real-time PCR System. Libraries were amplified with AmpliTaq Gold for 10-18 cycles (Table S1) using a P7 indexing oligo with a 6-bp sample-specific barcode to enable multiplex sequencing (Meyer & Kircher, 2010).
Libraries were pooled and shotgun-sequenced on six whole or partial lanes of an Illumina HiSeq 2500 in single-read mode with 81 or 94 sequencing cycles (Table S1).
The 11 ethnographic samples were deemed to be relatively well preserved and thus to pose a potential contamination risk to archaeological samples. Therefore, the achenes were extracted in a sterilized laminar flow hood in a pre-PCR modern DNA laboratory at the University of Copenhagen where sunflowers had not been previously tested. Achenes were frozen in liquid nitrogen and fragmented with a sterile pestle. DNA was extracted with a Qiagen Plant Mini kit following the manufacturer's protocol except that the 65°C incubation was conducted for 2 hr. Many specimens exhibited high-molecular-weight DNA on an agarose gel, so DNA was sheared with a Diagenode Bioruptor using an appropriate number of sonication cycles for each sample (Table S1). One accession (Seneca_striped_12997-682) was processed twice, using a whole achene and an individual kernel. DNA was converted to Illumina libraries following the same protocol used for the archaeological samples and sequenced on one lane of an Illumina HiSeq 2500 in single-read mode with 81 sequencing cycles.

| Sequencing data processing
Raw sequencing reads were processed using Paleomix 1. many endogenous reads to map to multiple loci. As we observed potential erroneous insertions of the organellar genomes in the nuclear assembly, reads were also separately mapped to the plastid genome, mitochondrial genome, and the nuclear genome without unplaced contigs; these alignments were only used for organellar genome and library complexity analyses.
TABLE 2 Ethnographic achenes from Native American sunflower landraces. Three Seneca achenes are reported to have been collected in North Dakota (indicated with an asterisk); however, oral traditions and written records indicate these landraces originated from the traditional lands of the Seneca people near Lake Ontario.
To minimize potential biases arising from differences in sequencing strategies, such as higher theoretical mapping scores from paired-end than single-read data, the paired-end modern data were treated as single-read data by trimming and mapping read mates separately.

| Metagenomic analysis of archaeological and ethnographic samples
To characterize non-sunflower sources of DNA isolated from archaeological and ethnographic specimens, 10,000 randomly selected trimmed, unmapped reads were compared against the NCBI nucleotide collection (nr/nt) database using the BLASTn algorithm (Altschul, Gish, Miller, Myers, & Lipman, 1990). SNPs were analyzed in R 3.3.1 (R Core Team, 2013) using the Pegas (Paradis, 2010) package to identify haplotypes, and then, haplotype relationships were visualized in popart (Leigh & Bryant, 2015) using a minimum spanning network (Bandelt, Forster, & Röhl, 1999). For construction of the haplotype networks, a total of 701 and 413 polymorphic sites were used for the plastome and mitochondrial genome, respectively. One of the oldest samples (Eden-3) together with three other archaeological samples (Eden-1, Eden-4, and Eden-17) did not satisfy our filtering parameters and thus were not included in haplotype network construction.
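The haplotype identification step can be illustrated with a minimal sketch. It is not the Pegas or popart workflow used in the study. It is plain Python that groups samples sharing identical alleles at every polymorphic site and counts pairwise differences between the resulting haplotypes, which are the distances a minimum spanning network is built on. The sample names and genotypes are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def group_haplotypes(snp_calls):
    """Group samples that carry identical alleles at every polymorphic site."""
    groups = defaultdict(list)
    for sample, alleles in snp_calls.items():
        groups[alleles].append(sample)
    return dict(groups)


def pairwise_differences(haplotypes):
    """Count differing sites between every pair of distinct haplotypes."""
    diffs = {}
    for a, b in combinations(sorted(haplotypes), 2):
        diffs[(a, b)] = sum(x != y for x, y in zip(a, b))
    return diffs


# Hypothetical genotypes at five polymorphic sites (not data from the study).
snp_calls = {
    "sample_A": "ACGTA",
    "sample_B": "ACGTA",
    "sample_C": "ACGTC",
    "sample_D": "TCGAC",
}

haplotypes = group_haplotypes(snp_calls)
distances = pairwise_differences(haplotypes)
```

In this toy input, sample_A and sample_B collapse into one haplotype, and sample_C sits one substitution away from it, analogous to the near-identical haplotypes grouped into a single class in the networks.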

| Organellar nucleotide diversity analysis
Nucleotide diversity (pi) at each polymorphic site was computed using VCFtools (Danecek et al., 2011), allowing for haploid genomes (haploid switch). For each group, mean nucleotide diversity was calculated by averaging per-site nucleotide diversity across all sites used in haplotype network construction for the chloroplast or mitochondrial genome.
Landrace diversity metrics were calculated after excluding MexCult7 and MexCult14 because those samples were collected in local markets in Chiapas, Mexico, and are likely modern cultivars as inferred from the haplotype networks.
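As an illustration of the calculation described above, the following sketch computes per-site nucleotide diversity for haploid calls as the fraction of sample pairs that differ at a site, then averages across sites. This is one common estimator and is assumed here for illustration; it is not a reimplementation of VCFtools, and the allele calls and group names are hypothetical.

```python
from collections import Counter


def site_pi(alleles):
    """Per-site nucleotide diversity for haploid calls.

    Returns the fraction of sample pairs carrying different alleles.
    Missing calls (None) are ignored.
    """
    called = [a for a in alleles if a is not None]
    n = len(called)
    if n < 2:
        return 0.0
    total_pairs = n * (n - 1) // 2
    same_pairs = sum(c * (c - 1) // 2 for c in Counter(called).values())
    return (total_pairs - same_pairs) / total_pairs


def mean_pi(sites):
    """Average per-site diversity over a list of polymorphic sites."""
    return sum(site_pi(site) for site in sites) / len(sites)


# Hypothetical haploid calls: one inner list per polymorphic site.
wild = [["A", "A", "T", "T"], ["C", "G", "C", "G"]]
domesticated = [["A", "A", "A", "T"], ["C", "C", "C", "C"]]

# Relative loss of diversity in the domesticated group versus the wild group.
reduction = 1 - mean_pi(domesticated) / mean_pi(wild)
```

Averaging over the same set of sites for every group, as the text describes, keeps the group means directly comparable.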

| Chronology
AMS radiocarbon dating of the archaeobotanical remains demonstrated the specimens originate from three distinct time points: 3100, 1700, and 850 calBP (Figure S2). Eden-3 and Eden-6 are the oldest samples, producing nearly identical AMS dates (Table S1), and thereby provide strong evidence that Eden's Bluff should be added to the short list of archaeological sites with sunflower cultivation before 3000 BP.
Eleven AMS dates fall near 1700 calBP, all of which overlap at a 95% CI from 1736 to 1711 calBP. Thus, the majority of the samples may be derived from a single occupational phase; however, these specimens are recorded as being excavated from multiple contexts, suggesting that some specimens may have been deposited decades or even a few centuries apart. Eden-2 produced the youngest date at ca. 850 calBP (Table 1). While this young disk is an outlier in the chronology of our other AMS dates, Fritz (1986) found similar dates for maize excavated from Eden's Bluff, supporting the inference that this sample belongs to a more recent occupation.

| Shotgun sequencing and endogenous content
We generated 4.

| DNA degradation
Consistent with the findings from previous paleogenomic studies, DNA recovered from the archaeological sunflowers was highly fragmented and displayed varying levels of chemical damage (Figure S3).
The mean read length of endogenous nuclear DNA for archaeological samples ranged from 41.9 to 62.1 bp, with an overall mean of 52.6 bp (Table S1). Cytosine deamination is the principal form of damage observed in aDNA studies (Dabney, Meyer, & Pääbo, 2013), and in circumstances where contamination from modern sources is possible, especially hominin research, damage patterns can be used to discern ancient and modern sequences (Jónsson et al., 2013). During the life of a cell, cytosine residues can spontaneously convert to uracil, but these lesions are corrected by cellular repair mechanisms. After death, uracil residues accumulate, primarily in single-stranded overhangs, and due to the activity of polymerases used in DNA library preparations, apparent C-to-T and G-to-A transitions are observed at the 5′ and 3′ ends of sequencing reads. This damage can be visualized as ski-jump style plots (Figure S3), with steeper slopes indicating more damage. In addition, the δS parameter calculated by mapDamage provides a probability of cytosine deamination in single-stranded contexts (Table S1).
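The ski-jump pattern can be illustrated with a minimal sketch that tallies apparent C-to-T mismatches by distance from the 5′ end of each read. This is a simple counting exercise and not the statistical model implemented in mapDamage. The input format (gap-free reference/read string pairs) and the sequences themselves are hypothetical.

```python
def c_to_t_frequency(alignments, max_pos=5):
    """Per-position C-to-T mismatch frequency at the 5' end of reads.

    alignments: iterable of (reference, read) strings of equal length,
    aligned without gaps and oriented 5' to 3'.
    Returns one frequency per position, or None where the reference
    never shows a C at that position.
    """
    ref_c = [0] * max_pos
    c_to_t = [0] * max_pos
    for ref, read in alignments:
        for i in range(min(max_pos, len(ref), len(read))):
            if ref[i] == "C":
                ref_c[i] += 1
                if read[i] == "T":
                    c_to_t[i] += 1
    return [t / c if c else None for t, c in zip(c_to_t, ref_c)]


# Hypothetical gap-free alignments (reference, read); not data from the study.
alignments = [
    ("CCGAC", "TCGAC"),
    ("CAGCT", "TAGCT"),
    ("CCTGA", "TCTGA"),
    ("CGCAA", "CGCAA"),
]

freqs = c_to_t_frequency(alignments)
```

In this toy input the mismatch frequency is elevated only at the first position and drops to zero further into the read, which is the shape summarized by the ski-jump plots. The complementary G-to-A signal at 3′ ends could be tallied the same way on reversed reads.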

| Exogenous DNA
Metagenomic analysis of unmapped reads revealed a complex mixture of DNA in archaeological and control samples (Figure 2). The chief contaminant across all archaeological samples is bacteria (up to 85%) with Actinobacteria primarily differentiating archaeological samples from ethnographic samples (PC1, Figure S4). The extraction controls are also dominated by bacteria, and taxa such as Proteobacteria, Actinobacteria, and Firmicutes are consistent with species commonly observed as laboratory reagent contaminants (Salter et al., 2014).
Fungi and metazoans also make up a substantial proportion of archaeological contaminants, contributing as much as 30% of read content in several samples.

FIGURE 2 (sample labels from the panel "Ethnographic landraces": Arikara 122976, Arikara 126306, Arikara 14042-874, Arikara broad 12999-682, Arikara/Mandan)
(5.9% and 1.9% compared to 2.0% and 0.5% respectively in archaeological samples). Eden-1 and Eden-2 are differentiated from other archaeological samples (PC2 in Figure S4) by high counts of Gammaproteobacteria (specifically the Pseudomonas stutzeri group in Eden-1 and Pseudomonas putida group and Enterobacteriales in Eden-2). One ethnographic sample, Arikara_122976, more closely resembles archaeological samples with lower endogenous sunflower DNA content (37.9% compared to the ethnographic average 87.6%) and a more substantial fraction of sequences originating from bacterial, fungal, and metazoan contaminants. While Arikara_122976 groups with archaeological samples in the PCoA (Figure S4), it contains nearly twice as many unmapped reads assigned to fungi, with most assigned to the Sordariomycetes, as any other ethnographic sample (Figure 2).

| Plastome analysis
We constructed two haplotype networks, one including and one excluding the archaeological samples (Figure 3). The cultivated sunflower sequences, whether from archaeological or ethnographic remains, extant landraces, or modern cultivars, sort into a few haplotype clusters, which we have denoted Classes 1 through 4, in contrast to the much greater diversity observed in wild Helianthus sequences, which are nearly all unique (Figure 3; Table S3).
All Eden's Bluff archaeological specimens dating to ~1700 calBP fall in Class 1 and share the same or similar haplotypes as many ENA, southwestern, and Mexican landraces; several ethnographic samples; and the majority of modern cultivars (Figure 3a). Although Eden-8, Eden-10, and Eden-14 have distinct haplotypes, they are only one or two substitutions removed from the predominant Class 1 haplotype. Many more substitutions must be inferred to support the reticulate lineages connecting their sequences to the distinct Arikara or San Ildefonso haplotypes or to any other wild H. annuus plastome sequence, and the more resolved structure of Class 1 in the haplotype network excluding the archaeological samples suggests those alternative connections are highly unlikely to reflect the true history of descent (Figure 3b).
The two other Eden's Bluff samples for which sufficient sequence was recovered for plastome analysis, Eden-2 (850 calBP) and Eden-6 (3100 calBP), cohere with the third most common haplotype class, Class 2. This group also includes three Mexican landraces collected from Nahua farmers (MexCult3, MexCult6,
FIGURE 3 Plastome haplotype networks constructed with wild, cultivated, landrace, ethnographic, and archaeological sunflowers (a), and plastome haplotype network constructed without the archaeological sunflowers (b). The size of the circles corresponds to number of individuals present, and the number of polymorphic sites between individual haplotypes is indicated by tick marks. Haplotype classes for each sample are included in Table S3. Class 1 is a core domestication haplotype and is composed of wild Helianthus annuus, archaeological specimens, ethnographic samples, extant landraces, and modern cultivars. Class 2 also represents a haplotype that entered the domestication process thousands of years ago; however, it is not observed in cultivars. Class 3 consists of R-type elite cultivars used in hybrid breeding, and was presumably introduced into domesticated germplasm from H. petiolaris in the 20th century; as discussed in the text, we suspect two Mexican landraces in Class 3 may originate from misidentified cultivars. Class 4 consists exclusively of elite cultivars, and was likely introduced from crop wild relatives, putatively H. argophyllus, during recent breeding for resistance to pathogens and diseases
(Lentz, Pohl, Alvarado, Tarighat, & Bye, 2008). In contrast to Class 2, Class 3, the second most common haplotype class, has a membership consisting nearly entirely of R-type modern cultivars, which are lines carrying a nuclear restorer allele for the cytoplasmic male sterility system used for hybrid sunflower breeding.
Two putative Mexican landraces (MexCult7 and MexCult14) also carry the Class 3 plastome sequence, raising the possibility they are actually elite-bred material. The Class 4 haplotype sequence shared by three modern cultivars (BRS-1, HA-R2, and IR) is most similar to sequences obtained from annual Helianthus species other than H. annuus, likely reflecting a history of introgression as part of a recent breeding program. Finally, the Hidatsa landrace has a unique haplotype compared to other samples analyzed, consistent with the findings of a previous study of sunflower sequence diversity using chloroplast microsatellite markers (Wills & Burke, 2006).

| Mitochondrial genome analysis
When archaeological sequences are excluded, the haplotype network constructed for mitochondria is very similar to the plastome network. Four major cultivated haplotype classes emerge with nearly the same memberships, and thus, we use parallel nomenclature (Figure 4, Table S3). One key difference is that the San Ildefonso ethnographic sample is more similar to the Class 1 cultivated haplotypes than to any other cultivated or wild mitochondrial sequence. Inclusion of mitochondrial sequences from the Eden's Bluff samples in network construction analysis led to poorly resolved, highly reticulate networks. In contrast to the observed plastome sequences, each of the mitochondrial haplotypes from these archaeological samples contained many apparent private mutations causing each sample to appear unique. We suspect these patterns are artifactual, likely reflecting spurious SNPs originating from short exogenous DNA sequence fragments that align to highly conserved regions or, alternatively, SNPs that originate from nuclear inserts of mitochondrial DNA (Hazkani-Covo, Zeller, & Martin, 2010; Thalmann, Hebler, Poinar, Pääbo, & Vigilant, 2004).
FIGURE 4 Mitochondrial haplotype network constructed with wild, cultivated, landrace and ethnographic sunflowers. The size of the circles corresponds to number of individuals present, and the number of polymorphic sites between individual haplotypes is indicated by tick marks. Haplotype classes for each sample are included in Table S3. Class 1 is composed of individuals sharing the same haplotype and also those that diverge by only one or two polymorphic sites. Due to uniparental inheritance of organelles, the mitochondrial classes contain the same individuals as the plastome classes. See Figure 3 for information on the domestication haplotypes (Classes 1 and 2) and those introduced to modern cultivars during 20th-century breeding (Classes 3 and 4).

| Nucleotide diversity
The average pairwise nucleotide diversity (pi) of all groups of domesticated sunflower samples is reduced relative to wild H. annuus, consistent with a genetic bottleneck during domestication (Table 3).
This reduction is comparable for both organellar genomes. For instance, there is a 68% and 72% reduction in diversity in ethnographic samples compared to wild H. annuus in chloroplast and mitochondria, respectively. Within domesticated types, modern cultivars have higher sequence diversity relative to the ethnographic samples and landraces. However, this likely reflects the recent introgression of wild haplotypes by modern breeding, as cultivars and landraces show lower diversity as compared to the ethnographic samples when only the diversity within the major haplotype classes also present in the Eden's Bluff samples (Class 1 and 2) is considered (Table 3). We report a value for pi for the archaeological samples but note that this metric is best suited for analyses of contemporaneous individuals and that diversity within a single site is generally expected to be lower than diversity present in the broader geographical sampling represented by the sequences from wild, ethnographic, or modern cultivated material.

| Sunflower archaeological remains yield quality endogenous DNA
While aDNA studies have revealed important insights into the pace of selection during domestication in some plants (e.g., Mascher et al., 2016; Ramos-Madrigal et al., 2016; Vallebueno-Estrada et al., 2016), recovery of degraded DNA from most crops is not routine, and this project represents the first exploration of how paleogenomic testing of archaeological sunflower remains can be used to understand its unique domestication history. Through paired AMS dating and paleogenomic testing of archaeological specimens from the Eden's Bluff site in Arkansas, we find that many desiccated remains dating back as far as 3100 BP can be valuable sources of DNA. Some specimens yield more than 50% sunflower DNA, although a seemingly random subset of specimens yield levels of endogenous DNA (<1%) that are essentially incompatible with state-of-the-art paleogenomic techniques, such as targeted enrichment of genetic loci of interest (Carpenter et al., 2013). Still, 13 of the 17 specimens yielded >5% endogenous DNA and are therefore well suited for in-depth analysis of nuclear targets that can be defined from genomic and transcriptomic studies of extant sunflower germplasm.
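The screening logic implied by these thresholds can be sketched as follows. The sketch assumes endogenous content is the fraction of sequenced reads that map to the sunflower reference, and it applies the <1% and >5% cutoffs mentioned in the text. The tier labels and read counts are hypothetical.

```python
def endogenous_fraction(mapped_reads, total_reads):
    """Fraction of sequenced reads that map to the target reference."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return mapped_reads / total_reads


def triage(fraction):
    """Assign a screening tier using the 1% and 5% cutoffs from the text."""
    if fraction < 0.01:
        return "poor candidate"
    if fraction > 0.05:
        return "candidate for in-depth analysis"
    return "intermediate"


# Hypothetical read counts (mapped, total); not data from the study.
screen = {
    "specimen_1": (600_000, 1_000_000),
    "specimen_2": (30_000, 1_000_000),
    "specimen_3": (4_000, 1_000_000),
}

tiers = {name: triage(endogenous_fraction(m, t)) for name, (m, t) in screen.items()}
```

A low-depth shotgun run is enough to populate a table like this, which is why screening several specimens before committing to deeper sequencing is practical.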
We suspect the exogenous DNA content obtained from our samples originates from at least four sources: organisms that inhabited the disks and achenes during the life of the plant, such as pathogens; organisms that consumed metabolites, proteins, and other biomolecules in the tissue after the death of the individual; environmental DNA transferred from the archaeological sediment; and modern DNA contamination from excavation, curation, and genetic testing. While it is difficult to distinguish these potential sources, the sequencing of extraction controls provides a means to identify cross-contamination of samples and pervasive DNA in laboratory reagents (Salter et al., 2014).
We observed that DNA degradation patterns are variable in archaeological sunflower, both in terms of DNA fragment length and the frequency of chemical damage, even within one relatively tight time interval. For example, the two oldest specimens (Eden-3 and Eden-6) yielded effectively identical AMS dates of ca. 3100 calBP. However, compared to Eden-6, Eden-3 has slightly shorter endogenous DNA (difference of means = 5.8 bp) and higher levels of cytosine deamination (δS of 0.999 vs. 0.673). Similarly, the youngest sample from the collection, Eden-2, dates to 850 calBP and has DNA that is nearly as short (mean fragment length of 62.1 bp) and as damaged as Eden-9 (mean fragment length of 59.7 bp), which is twice as old. Thus, fragmentation and damage profiles do not necessarily follow straightforward, time-dependent degradation patterns, perhaps reflecting variability in how different remains were treated prior to deposition (e.g., intentional desiccation or heating in antiquity). Together, these findings indicate that multiple samples from the same site and stratigraphic layer ought to be initially tested by low-depth shotgun sequencing to identify promising candidates for in-depth genetic analysis.
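As a small illustration of the fragment-length comparisons above, the sketch below computes a mean fragment length from a read-length histogram and the difference of means between two specimens. The function names and the histogram input format (a mapping from length in bp to read count) are assumptions for illustration; the sketch does not estimate deamination parameters such as δS.

```python
def mean_fragment_length(length_counts):
    """Mean fragment length from a {length_bp: read_count} histogram."""
    total = sum(length_counts.values())
    if total <= 0:
        raise ValueError("histogram contains no reads")
    weighted = sum(length * count for length, count in length_counts.items())
    return weighted / total


def difference_of_means(hist_a, hist_b):
    """Mean fragment length of specimen A minus that of specimen B."""
    return mean_fragment_length(hist_a) - mean_fragment_length(hist_b)
```

A negative result from `difference_of_means(a, b)` indicates that specimen A has the shorter fragments on average.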

| Organellar haplotype networks recapitulate anticipated patterns for extant taxa
Organellar genomes in most plants exhibit uniparental inheritance (Sato & Sato, 2013). Therefore, a one-to-one association of plastid haplotypes with mitochondrial haplotypes is often expected (Mogensen, 1996), and indeed, we observe such a tight correspondence between our defined organellar haplotype classes (Table S3). Because the plastid and mitochondrial genomes are nonrecombining, it can be possible to use organellar loci as markers for taxonomic identification, as is performed with DNA barcoding studies (Avise et al., 1987; CBOL Plant Working Group et al., 2009). Yet, the organellar genomes of the five annual Helianthus species we have sampled do not resolve into (Sambatti, Strasburg, Ortiz-Barrientos, Baack, & Rieseberg, 2012; Whitney et al., 2015). For instance, Bock et al. (2014) observed a similar lack of taxonomic structure in the organellar genomes of perennial Helianthus species, suggesting this is common throughout the genus.

TABLE 3 Nucleotide diversity (pi) for wild, archaeological, ethnographic, landrace, and modern cultivated sunflowers. It is important to note that the archaeological specimens were excavated from one site and are therefore not wholly comparable to population-level measures of pi for the other sunflower groups. Given that Class 3 and 4 haplotypes were likely introduced to domesticated lines during recent breeding, a separate calculation of pi for modern cultivars with Class 1 and 2 haplotypes is provided.
Most modern cultivars carry one of two distinct haplotypes (the most common Class 1 sequence or Class 3), and these assort into inbred line classes developed to facilitate hybrid production. Elite-bred sunflower lines are classifiable into two types: male "R-lines" and female "B-lines," the latter being derived from open-pollinated varieties (OPV) (Korell, Mösges, & Friedt, 1992) (Seiler, Qi, & Marek, 2017).
Class 4 was likely also introduced during crop improvement, potentially from H. argophyllus, the sunflower species which has been most frequently crossed with domesticated lines to impart disease and parasite resistance (Seiler & Fredrick Marek, 2011). Indeed, it is perhaps surprising that additional non-H. annuus haplotypes were not more commonly observed, as breeders have introduced allelic variation for novel traits (e.g., resistance against a range of pathogens) by prolific and repeated introgression of genetic material from other Helianthus species. H. annuus has reportedly been crossed with every annual species and 14 perennial species in the genus (Kaya, 2014). Our finding of only two introgressed haplotypes, one of which was deliberately selected for, likely reflects that H. annuus has predominantly served as the recurrent maternal parent during sunflower improvement.

| Ethnographic and archaeological organellar sequences reveal lost diversity and raise new hypotheses
Although low-depth shotgun sequencing data from ancient samples like those which we report here generally do not enable population-level characterization of nuclear genes of interest, patterns of variation in organellar genomes can be assessed because these DNA sources are found in many copies per cell, increasing their chance of recovery (Hofreiter, Serre, Poinar, Kuch, & Paabo, 2001). Furthermore, analyses of nonrecombining loci from archaeological samples can lead to important insights about the phylogeography and demography of domestication, as demonstrated by aDNA studies of pigs (Larson et al., 2007), cattle (Beja-Pereira et al., 2006), and bottle gourds (Kistler et al., 2014). (Heiser, 1976; Škorić, 1992).
The loss of diversity in extant landraces relative to historic samples also provides a caution and an opportunity for conducting genome scans for domestication genes. By including nuclear DNA recovered from ethnographic specimens, it may be possible to distinguish between genes that experienced selective sweeps as a consequence of the domestication process versus changes in sequence diversity that score similarly by population genetic metrics due to the recent loss of landrace germplasm. The sole modern wild H. annuus sample carrying a Class 1 haplotype is also instructive in this regard. Given the frequency at which domesticated and wild sunflowers interbreed (Arias & Rieseberg, 1994; Linder, Taha, Rieseberg, Seiler, & Snow, 1998)  persuasively, multilocus nuclear genotype data and candidate domestication gene sequences from these three Mexican landraces demonstrate they are more closely related genetically to extant landraces and wild populations from ENA than to wild populations in Mexico (Blackman et al., 2011).
Because we have obtained aDNA sequence for archaeological samples excavated at the same site but that date to three separate time periods, we can compare the Eden's Bluff samples not only to wild germplasm from the modern era but also to each other. In doing so, we observe a pattern of sequence turnover. The samples dated to the earliest and latest time points (3100 calBP and 850 calBP) both carry the Class 2 haplotype, but the many samples dated to the intermediate time interval (1700 calBP) possess the Class 1 haplotype exclusively. This pattern suggests that multiple different domesticated lineages of sunflowers were maintained in the region for millennia and might reflect differential cultivation of these proto-landraces across time. It is interesting to note that these time points generally correspond to major prehistoric cultural periods in the Ozarks and across North America, namely the Late Archaic, Woodland, and Mississippian periods (Sabo & Early, 1990). Despite these potential links to cultural changes, it must be emphasized that we have tested

| CONCLUSIONS AND FUTURE DIRECTIONS
In summary, we have shown that recovery of ancient and historic DNA from archaeological and ethnographic sunflower specimens is feasible and that desiccated specimens frequently contain high levels of endogenous DNA. At present, shotgun sequencing data allow us to infer the relationships between ancient and modern samples for organellar loci. In tandem with sequencing data from modern accessions, we have gained new perspectives on the persistence of plastid lineages for thousands of years under cultivation and the loss of genetic diversity during recent improvement. We recognize that these loci track the maternal lineage and do not document the full domestication history of the sunflower. Future studies that obtain greater depth of coverage for many loci in the nuclear genomes of ancient and historic specimens will allow us to address more nuanced questions about the pace of domestication and specific targets of selection.
Fortuitously, numerous desiccated archaeological specimens have been excavated from dozens of sites in the Ozarks and other parts of ENA (Fritz, 1986; Gilmore, 1931; Smith, 2014), thereby providing the means to identify genetic changes over millennia. Most of the specimens were excavated from rockshelters from 1920 to 1930 (Davis, 1967; Harrington, 1924a, 1924b, 1960), but some of these sites, including Eden's Bluff, have since been inundated by the construction of dams in the mid-20th century or otherwise degraded (Fritz, 1986).
Thus, these curated specimens offer an otherwise unachievable prehistoric perspective on sunflower domestication. Candidate targets of selection during domestication have been reported in several studies (Baute et al., 2015; Blackman, Strasburg, Raduski, Michaels, & Rieseberg, 2010; Blackman et al., 2011; Chapman, Mandel, & Burke, 2013; Chapman et al., 2008), and the identification of additional targets should be accelerated by the expanding genomic resources being generated by the International Consortium for Sunflower Genomic Resources (Badouin et al., 2017; Kane et al., 2011). We therefore anticipate that paleogenomic characterization of archaeological and ethnographic sunflower tissues will soon have tremendous potential to resolve long-standing questions about the demographic and functional history of domestication for this important oilseed crop.