The efficacy of whole human genome capture on ancient dental calculus and dentin

Abstract
Objectives: Dental calculus is among the richest known sources of ancient DNA in the archaeological record. Although most DNA within calculus is microbial, it has been shown to contain sufficient human DNA for the targeted retrieval of whole mitochondrial genomes. Here, we explore whether calculus is also a viable substrate for whole human genome recovery using targeted enrichment techniques.
Materials and methods: Total DNA extracted from 24 paired archaeological human dentin and calculus samples was subjected to whole human genome enrichment using in-solution hybridization capture and high-throughput sequencing.
Results: Total DNA from calculus exceeded that of dentin in all cases, and although the proportion of human DNA was generally lower in calculus, the absolute human DNA content of calculus and dentin was not significantly different. Whole genome enrichment resulted in up to four-fold enrichment of the human endogenous DNA content for both dentin and dental calculus libraries, albeit with some loss in complexity. Recovering more on-target reads for the same sequencing effort generally improved the quality of downstream analyses, such as sex and ancestry estimation. For nonhuman DNA, comparison of phylum-level microbial community structure revealed few differences between precapture and postcapture libraries, indicating that off-target sequences in human genome-enriched calculus libraries may still be useful for oral microbiome reconstruction.
Discussion: While ancient human dental calculus does contain endogenous human DNA sequences, their relative proportion is low when compared with other skeletal tissues. Whole genome enrichment can help increase the proportion of recovered human reads, but in this instance enrichment efficiency was relatively low when compared with other forms of capture. We conclude that further optimization is necessary before the method can be routinely applied to archaeological samples.


KEYWORDS
ancient DNA, genomics, hybridization capture, target enrichment

| INTRODUCTION
The development and application of sequence capture technology has greatly increased the number of archaeological samples that are accessible for genomic studies (e.g., Carpenter et al., 2013; Fu, Mittnik, et al., 2013; Haak et al., 2015; Schroeder et al., 2015).
Typically, the majority of DNA in a given archaeological sample is exogenous (i.e., postmortem environmental) in origin, making untargeted sequencing of these samples inefficient and expensive, with the exception of extraordinarily well-preserved samples and bone elements, such as the petrous bone (Gamba et al., 2014). Targeted sequence capture allows for the selective enrichment of endogenous ancient DNA (aDNA) sequences prior to sequencing, thereby increasing the proportion of desired, on-target molecules in the sequencing run. Sequence capture additionally reduces the amount of material required for destructive analyses, and decreases the experimental workload and cost of aDNA analysis (Avila-Arcos et al., 2011; Carpenter et al., 2013). To date, targeted enrichment of archaeological specimens has resulted in the successful retrieval of ancient mitochondrial genomes (e.g., Briggs et al., 2009; Llamas et al., 2016; Ozga et al., 2016; Slon et al., 2016), ancient pathogen genomes (Bos et al., 2014; Spyrou et al., 2016; Vågene et al., 2018), human genome-wide SNPs (Haak et al., 2015), partial or whole exomes (Burbano et al., 2010; Da Fonseca et al., 2015), entire chromosomes (Cruz-Dávalos et al., 2017; Fu, Meyer, et al., 2013), and partial nuclear genomes (Carpenter et al., 2013; Schroeder et al., 2015).
The majority of these aDNA capture studies have focused on either archaeological bone or dentin as sample material. However, host DNA preservation in these tissues is highly variable (Damgaard et al., 2015; Gamba et al., 2014), and destructive analysis of skeletal remains may be restricted or not permitted in some cases, making archaeogenetic analysis of these populations challenging. Recent research on ancient dental calculus (calcified dental plaque) has shown that it is the richest known source of ancient DNA in the archaeological record, exceeding the DNA content found in bone and dentin by more than an order of magnitude (Mann et al., 2018; Ozga et al., 2016; Warinner, Rodrigues, et al., 2014). Consequently, dental calculus is potentially valuable for studies of ancient and degraded samples, where DNA preservation is limited. Moreover, as dental calculus is a calcified bacterial biofilm, not a human tissue, it might be subject to fewer restrictions with respect to destructive sampling.
Previous proteomic analysis of ancient and modern dental calculus identified a high proportion of immune proteins, particularly from neutrophils, suggesting that human DNA may enter dental calculus as a result of inflammation-related immunological activity, including the release of neutrophil extracellular traps (NETosis) (Warinner, Rodrigues, et al., 2014).
Archaeological dental calculus has been shown to contain sufficient mitochondrial DNA for full mitogenome reconstruction; however, mitogenomes only provide maternal ancestry information. In contrast, genome-wide sequence data provide significantly more information that can be used to determine the sex of individuals (Skoglund et al., 2013), infer genome-wide ancestry and admixture patterns (e.g., Skoglund et al., 2014), establish kinship and genetic relationships (Sikora et al., 2017), and provide evidence for natural selection and human-environment interactions (e.g., Jeong et al., 2016). Establishing whether ancient dental calculus can serve as a viable source of genome-wide nuclear human DNA is thus important in order to evaluate its potential for future population genetics studies.
In this article, we perform whole genome enrichment (WGE) on 24 paired archaeological human dentin and dental calculus samples (Figure 1) that had been previously shown to be well preserved (Mann et al., 2018). In total, we generated approximately 600 million paired-end reads and characterized the quantity and quality of the human genetic data obtained. The samples were sourced from diverse contexts to assess whether patterns of preservation in dental calculus vary across time and space, and to evaluate the performance of WGE on samples with varying levels of preservation. The enrichment was performed using the MYbaits WGE kit (Arbor Biosciences, MI), which uses biotinylated RNA "bait" to capture the human DNA molecules in ancient DNA libraries (Enk et al., 2014).
We find that although dental calculus is an excellent source of both microbial and human endogenous DNA, the relative proportion of human DNA is consistently quite low, making efficient WGE challenging. In this study, we observed only modest enrichments of up to four-fold, which is relatively low when compared with previously published enrichment rates for mitochondrial genome capture or selected SNPs (e.g., Mathieson et al., 2015). Additionally, we find that capture enrichment of dental calculus results in the biased recovery of human reads with significantly higher GC content.

[...] Ziesemer et al., 2015). The same sample set was also evaluated in a separate study on the differential preservation of ancient DNA in dental calculus and dentin (Mann et al., 2018).

| DNA extraction
All samples were extracted in dedicated ancient DNA facilities at the Laboratories of Molecular Anthropology and Microbiome Research (LMAMR) in Norman, Oklahoma. The LMAMR lab operates in accordance with established contamination control precautions and workflows, as previously described (Ziesemer et al., 2015). Prior to extraction, the surface of the tooth was washed with a 2% NaOCl solution, followed by molecular biology grade water. Dental calculus was then removed from the tooth using a dental scaler, and the tooth root was separated from the crown using a Dremel rotary tool. The tooth root and calculus were further decontaminated by UV irradiation in a Crosslinker for 1 min on each side. DNA extraction was performed as described by Ziesemer et al. (2015). In brief, 10-20 mg of dental calculus and approximately 100 mg of dentin were crushed to a coarse powder and washed in 1 ml 0.5 M EDTA under rotation for 15 min to remove loosely bound contaminants. Following centrifugation, the supernatant was removed, and the decontaminated pellet was digested in a solution of 0.45 M EDTA and 10% proteinase K with heating at 37-55 °C for 8-12 hr, followed by room temperature incubation for 5 days until digestion was complete. For dentin samples, the digestion buffer was refreshed after 2 days to avoid saturation of EDTA chelation, and the two buffer fractions were combined after digestion was complete. For all samples, cell pellet debris was separated by centrifugation and the DNA-containing supernatant was further extracted using a phenol/chloroform approach, followed by DNA purification and concentration using a modified Qiagen MinElute silica spin column protocol (Dabney et al., 2013). Extracted DNA was eluted twice into 30 μl of EB for a combined volume of 60 μl, and quantified immediately after extraction using a Qubit fluorometer 2.0 High Sensitivity assay (Life Technologies). DNA quantification performed later in a separate study measured higher concentrations, likely due to evaporation (Mann et al., 2018).

| Illumina library preparation
Approximately 100 [...] in 50 μl reactions. The reaction was incubated for 15 min at 20 °C and purified using Qiagen QiaQuick columns before elution in 30 μl EB. The adapter fill-in reaction was performed in a final volume of 50 μl and incubated for 20 min at 37 °C followed by 20 min at 80 °C.
Libraries were amplified and indexed at the Center for Geogenetics in Copenhagen, Denmark, using a dual-indexing approach (Kircher, Sawyer, & Meyer, 2012) [...]

2.5 | High-throughput sequencing and initial data processing

| mapDamage 2.0 analysis
We used mapDamage2.0 (Jonsson et al., 2013) to quantify postmortem DNA damage patterns (e.g., deamination rates) for each DNA library and to rescale base quality scores of Ts and As based on their probability of resulting from molecular damage.

| Contamination estimates
We used contamMix 1.05 to estimate the level of modern human DNA contamination. In brief, a consensus sequence was built using reads with mapping quality ≥30, base quality ≥20, and a minimum per-site depth of coverage of three. Additionally, sites with consensus concordance lower than 70% were set to N. Next, mtDNA reads were extracted from the original alignments and re-mapped to the newly created mtDNA consensus sequence as described above.
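The consensus-building rule described above can be illustrated with a minimal sketch. It is not contamMix itself, and it assumes the reads have already been filtered by mapping and base quality. The function name `call_consensus` and the input layout (one string of observed bases per site) are hypothetical choices made for illustration.

```python
from collections import Counter


def call_consensus(pileup, min_depth=3, min_concordance=0.70):
    """Call one consensus base per site from pre-filtered base observations.

    pileup: list of strings, one per site, holding the bases observed there
            (hypothetical input layout; quality filtering is assumed done).
    A site becomes 'N' when its depth is below min_depth or when the most
    common base is supported by less than min_concordance of the bases.
    """
    consensus = []
    for bases in pileup:
        if len(bases) < min_depth:
            consensus.append("N")
            continue
        base, count = Counter(bases).most_common(1)[0]
        if count / len(bases) >= min_concordance:
            consensus.append(base)
        else:
            consensus.append("N")
    return "".join(consensus)
```

For example, a site covered by `"GGGT"` has a concordance of 75% and is called `G`, whereas `"AACC"` (50%) and `"AC"` (depth 2) are both masked as `N`.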

| Chromosomal sex determination
To determine chromosomal sex, we used the method described by Skoglund et al. (2013), which calculates the fraction of sex chromosome reads that align to the Y chromosome. The analysis was restricted to reads with a mapping quality ≥30.
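As a rough illustration of this statistic, the sketch below computes Ry as the fraction of sex chromosome reads that align to the Y chromosome. The default thresholds (Ry above 0.09 for males, below 0.02 for females, and at least 3,000 X + Y reads for a confident call) are the values quoted in this article for Table 2. The function name and return layout are hypothetical, and the confidence interval used in the original method is omitted for brevity.

```python
def ry_sex(n_x, n_y, male_min=0.09, female_max=0.02, min_reads=3000):
    """Return (Ry, sex call, confident) from X and Y read counts.

    n_x, n_y: reads aligned to X and Y after mapping-quality filtering.
    Thresholds default to the values quoted in this article; they are
    parameters here so that other cut-offs can be substituted.
    """
    total = n_x + n_y
    if total == 0:
        raise ValueError("no sex chromosome reads available")
    ry = n_y / total
    if ry > male_min:
        call = "male"
    elif ry < female_max:
        call = "female"
    else:
        call = "undetermined"
    # A call is only treated as confident with enough X + Y reads.
    confident = call != "undetermined" and total >= min_reads
    return ry, call, confident
```

With 892 sex chromosome reads, for instance, a call can still be produced, but it is flagged as not confident because the read count is below the recommended minimum.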

| mtDNA haplogroup assignment
To determine mtDNA haplogroups, mtDNA variant sites were called using samtools 1.2 and bcftools 1.9, allowing for recalculation of the extended BAQ, and excluding bases with quality lower than 20 and reads with mapping quality lower than 30. Genotypes were compared with the revised Cambridge Reference Sequence (Andrews et al., 1999), and variants with depth of coverage lower than 3 and "allelic balance" lower than 70% were discarded. HaploGrep2.0 (Weissensteiner et al., 2016) was run on the list of filtered variants to assign the most likely haplogroup for each sample. Because of the low number of mtDNA reads for several individuals, we assessed the accuracy of haplogroup assignment at different depths of coverage by carrying out an in silico downsampling experiment on the full datasets generated for individuals H10 and S40. For each sample, 15,000, 10,000, 5,000, 1,000, 500, 100, and 50 mtDNA reads were randomly subsampled from the total reads, and each downsampled dataset was used to perform haplogroup assignment as described above (Supporting Information Table S4).
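The in silico downsampling step can be sketched as follows. This is an illustration only: the study does not state which tool performed the subsampling, so the function name `downsample_reads`, the use of read identifiers as input, and the fixed random seed are all assumptions.

```python
import random

# Subsample sizes listed in the text.
MT_DEPTHS = (15000, 10000, 5000, 1000, 500, 100, 50)


def downsample_reads(read_ids, depths=MT_DEPTHS, seed=0):
    """Draw one random subsample (without replacement) per target size.

    read_ids: list of identifiers for all mtDNA reads of one individual
              (hypothetical input; any sequence of unique items works).
    Sizes larger than the number of available reads are skipped.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return {n: rng.sample(read_ids, n) for n in depths if n <= len(read_ids)}
```

Each subsample would then be passed through the same variant calling and haplogroup assignment steps as the full dataset.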

| Genome-wide clustering analysis (ADMIXTURE)
We used the model-based clustering algorithm ADMIXTURE (Alexander, Novembre, & Lange, 2009) to estimate genome-wide ancestry proportions. To assess how many reads are needed to obtain consistent estimates, we randomly sampled 5,000,000, 1,000,000, 500,000, 100,000, 50,000, 10,000, and 5,000 reads from the mapped and filtered reads, and obtained 10 independent replicates for each number of reads. For each dataset, we estimated admixture proportions assuming three populations as previously described, and compared the obtained values with the expected proportions estimated on all available reads (Supporting Information Figure S2).
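One way to compare replicate estimates with the expected proportion is sketched below. It assumes that the admixture proportion for a single ancestry component has already been estimated for every replicate. The function name and the choice of summary statistics (mean, standard deviation, and bias relative to the full-data estimate) are illustrative assumptions, not the procedure used in the study.

```python
from statistics import mean, stdev


def summarize_replicates(replicates, expected):
    """Summarize replicate admixture proportions per subsample size.

    replicates: dict mapping number of reads -> list of estimated
                proportions for one ancestry component (two or more
                replicates per size; hypothetical input layout).
    expected:   proportion estimated from all available reads.
    """
    summary = {}
    for n_reads, values in replicates.items():
        avg = mean(values)
        summary[n_reads] = {
            "mean": avg,
            "sd": stdev(values),     # spread across replicates
            "bias": avg - expected,  # deviation from the full-data value
        }
    return summary
```

A growing standard deviation at small subsample sizes would indicate the kind of instability that makes low-coverage estimates unreliable.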

| Microbial profiling
Analysis-ready reads from the full dataset were aligned locally (--no-unal --local) to the subset of the Silva SSU 111 reference dataset (Quast et al., 2013) [...]

[...] (Mann et al., 2018; Ozga et al., 2016). The DNA in dental calculus was primarily of microbial origin, but human reads were also present, and the relative proportion of human DNA was significantly lower in dental calculus (0.005-0.35%) than dentin (0.04-66.95%; Wilcoxon signed-rank test, p < .01; Figure 2b; Supporting Information Table S2). When normalized by sample mass input, however, the estimated absolute quantity of human DNA in dental calculus (0.5-210 pg/mg) and dentin (0.5-130 pg/mg) was similar (Wilcoxon signed-rank test, p > .3), albeit highly variable, with 4 of 12 sample pairs containing more human DNA in dental calculus, and 8 sample pairs containing more human DNA in dentin (Figure 2c). As expected, we observed that the human DNA in dental calculus samples from older contexts (i.e., Camino del Molino, Spain) or warmer climates (i.e., Anse à la Gourde, Guadeloupe) was generally less well preserved (as indicated by lower human DNA contents, shorter average fragment lengths, and higher deamination rates) than the DNA recovered from younger samples or those from colder climates (e.g., Nepal or the Netherlands). However, contrary to our expectations, we did not observe the same pattern for the dentin samples (Supporting Information Table S2).
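The difference between relative and absolute human DNA content can be made concrete with a short calculation. The sketch assumes that the absolute quantity is obtained by multiplying the mass-normalized total DNA yield by the proportion of human reads. That formula is an assumption made for illustration, and the example values in the usage note are hypothetical rather than taken from the study.

```python
def human_dna_pg_per_mg(total_ng_per_mg, human_pct):
    """Estimate absolute human DNA (pg per mg of sample).

    total_ng_per_mg: total DNA yield normalized by sample mass (ng/mg).
    human_pct:       percentage of reads mapping to the human genome.
    Assumes absolute content = total yield x human read fraction.
    """
    if not 0 <= human_pct <= 100:
        raise ValueError("human_pct must be a percentage between 0 and 100")
    return total_ng_per_mg * 1000.0 * (human_pct / 100.0)
```

Under this assumption, a hypothetical calculus sample with 50 ng/mg total DNA and 0.1% human reads and a hypothetical dentin sample with 0.5 ng/mg and 10% human reads would both contain 50 pg/mg of human DNA, which illustrates how a much lower human proportion can be offset by a much higher total yield.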

| Whole human genome enrichment
Whole genome enrichment resulted in uneven enrichments in the human DNA content of both dentin and dental calculus libraries, and for three of the dentin libraries (S108, F349A, and S41) capture did not lead to any significant enrichment at all (see Table 1). Generally, we observed higher enrichment factors for the dental calculus libraries than the dentin libraries, but due to the low starting amount of human DNA in the dental calculus libraries the absolute gains were relatively low (Figure 3a). Enrichment of mitochondrial DNA (up to 140-fold) was higher than nuclear DNA (up to four-fold) (Supporting Information Tables S2 and S3), but still relatively low when compared with previous studies specifically targeting the mitochondrial genome.
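The enrichment factors discussed here are ratios of postcapture to precapture human DNA content. A minimal sketch follows, in which the function names and the 10% tolerance band used to label a library as unchanged are hypothetical choices.

```python
def fold_enrichment(pre_pct, post_pct):
    """Ratio of postcapture to precapture human DNA content (both in %)."""
    if pre_pct <= 0:
        raise ValueError("precapture content must be positive")
    return post_pct / pre_pct


def classify_enrichment(fold, tolerance=0.1):
    """Label a library as enriched, depleted, or unchanged.

    tolerance is a hypothetical band around 1.0 within which the change
    is treated as negligible.
    """
    if fold > 1 + tolerance:
        return "enriched"
    if fold < 1 - tolerance:
        return "depleted"
    return "unchanged"
```

Because the factor is a ratio, a four-fold enrichment of a library that starts at 0.05% human DNA yields only 0.2%, which is why high enrichment factors in calculus translate into small absolute gains.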
As observed elsewhere (Mann et al., 2018), average human DNA fragment lengths in the precapture libraries were found to be significantly shorter (Wilcoxon signed-rank test, p < .01) in dental calculus (73 bp) compared with dentin (85 bp) (Figure 3b). Clonality markedly increased with capture in both dentin and dental calculus (Wilcoxon signed-rank test, p < .001 for dental calculus; p < .01 for dentin) (Figure 3c), and the GC content of human reads was significantly higher in postcapture than precapture libraries (Wilcoxon signed-rank test, p < .01; Figure 3d). The GC content of human reads in dental calculus (median 45% in precapture and 47% in postcapture libraries) was also significantly higher than in dentin (median 40% in precapture and 42% in postcapture libraries; Wilcoxon signed-rank test, p < .05 for both precapture and postcapture). To ensure that the elevated GC content of human reads in dental calculus was not a consequence of mismapping of bacterial reads (which have a higher median GC content) to the human genome, we performed a BLASTn search of 10,000 randomly sampled dental calculus human reads against the NCBI nt database, and confirmed that most (97-98%) uniquely mapped to human (Supporting Information Figure S1).
Terminal damage rates were significantly lower in dental calculus than in dentin (Wilcoxon signed-rank test, p = .01), and WGE did not significantly influence damage rates (Wilcoxon signed-rank test, p > .1) (Figure 3e).

| Contamination estimates
Due to the low sequencing depth per sample, we were only able to estimate mitochondrial contamination rates for half of the dentin samples and none of the calculus samples. In the precapture dataset, contamination estimates were generally low and ranged from 0.6 to 4.1% (Supporting Information Table S2). After capture, we observed higher contamination rates, ranging from 1 to 15.7% (Supporting Information Table S3).

| Sex identification
Previous studies (Skoglund et al., 2013) suggest that a minimum of 3,000 reads mapping to the sex chromosomes are required to accurately identify the chromosomal sex of ancient remains from high-throughput sequencing data. For the calculus libraries, we recovered fewer than 300 sex chromosome reads per sample prior to capture. This increased slightly after capture, to a maximum of 892 reads for S41, but was still not sufficient to obtain reliable estimates (Table 2).
For the dentin libraries, we recovered significantly more sex chromosome reads, averaging around 13,000 reads per sample prior to capture, and 17,000 reads after capture (Table 2). This was sufficient to obtain reliable sex estimates for 8 of the 12 individuals (Table 2), of which 4 were identified as female and 4 as male.

| Mitochondrial haplogroup determination
Similar to obtaining sex estimates from high-throughput sequencing data, a minimum number of reads mapping to the mitochondrial genome is needed to confidently assign a mitochondrial haplogroup. To determine how many mitochondrial reads are needed, we serially downsampled the total mitochondrial reads of two well-preserved dentin samples, H10 and S40, from approximately 15,000 to 50 mitochondrial reads and observed how this affected haplogroup assignment and scoring using the program HaploGrep (Weissensteiner et al., 2016). We found that at least approximately 500 mitochondrial reads are needed to obtain haplogroup assignments with a HaploGrep score greater than 0.5, and at least 1,000 mitochondrial reads are recommended for more confident assignments (Supporting Information Table S4). Depending on aDNA fragment length, this corresponds to an average depth of coverage of approximately 1-4×. However, we note that other factors, such as the number of sequencing errors, deamination rates (postmortem damage), and contamination will also affect the accuracy of the haplogroup assignment.

FIGURE 2 Ancient DNA recovery and human endogenous content from archaeological dental calculus. Bar charts summarizing: (a) Total DNA yield (ng/mg) in dental calculus (filled bars) and dentin samples (hollow bars) on a log2 scale. Dental calculus samples show a higher total DNA yield compared with dentin samples in all cases. (b) Endogenous human DNA content (%) in dental calculus (filled bars) and dentin samples (hollow bars) on a log2 scale. Overall human endogenous content is higher for dentin samples. (c) Estimated absolute human DNA yields (ng/mg) on a log2 scale. Total human DNA per mg is estimated to be higher in dental calculus in 4 out of 12 sample pairs, but both substrates exhibit high variation. Archaeological sites are ordered from left to right by continent (Europe, Americas, Asia). Data are provided in Supporting Information Table S2.

As expected, we recovered significantly more mitochondrial reads from dentin than from dental calculus (Wilcoxon signed-rank test; Supporting Information Table S2). Whole genome capture led to substantial enrichments of mitochondrial reads (6- to 139-fold for dental calculus and 2- to 99-fold for dentin) (Table 1). However, even with capture we still did not recover enough mitochondrial reads from the dental calculus libraries to confidently determine mitochondrial haplogroups, and only one sample, NF217, allowed a low-confidence assignment to haplogroup A2 (HaploGrep score 0.5). For dentin, 5 of the 12 samples yielded sufficient mitochondrial reads without capture to determine the haplogroup with a HaploGrep score greater than 0.6 (Supporting Information Table S5). After capture, we were also able to obtain haplogroup assignments for 5 of the 12 samples. The assignments were consistent before and after capture, and the identified haplogroups are found among contemporary populations in the sampled regions today (Supporting Information Table S5).
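The relationship between mitochondrial read counts and average depth of coverage mentioned in this section can be sketched with a simple calculation. It assumes that depth is approximated as the number of reads multiplied by the mean fragment length and divided by the length of the mitochondrial reference (16,569 bp for the revised Cambridge Reference Sequence). The function name is hypothetical, and duplicates or uneven coverage are ignored.

```python
MT_GENOME_LENGTH = 16569  # length of the rCRS mitochondrial reference, in bp


def mean_depth(n_reads, mean_fragment_bp, genome_length=MT_GENOME_LENGTH):
    """Approximate average depth of coverage from read count and length."""
    return n_reads * mean_fragment_bp / genome_length
```

For example, 500 reads with a mean length of 73 bp give roughly 2.2× coverage under this approximation, so the same read count corresponds to different depths depending on how fragmented the DNA is.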

| Genome-wide ancestry estimation
Ancestry estimation programs like ADMIXTURE (Alexander et al., 2009) rely on reference panels to provide genome-wide ancestry estimates for ancient samples. Among other things, the accuracy of these estimates depends on the intersection between the data that were generated from the ancient samples and the reference panel used.
Simply speaking, the larger the overlap is, the more accurate the estimates are. Using a series of randomly downsampled datasets generated from the total human reads of dentin sample H10 and the Human Genome Diversity Panel (HGDP) as a global reference, we found that at least 100,000 total mapping nuclear reads (corresponding to ~2,500 overlapping sites) are needed in order to obtain consistent admixture proportions at K = 3 with low standard errors using the clustering algorithm ADMIXTURE (Figure 4a,b; Supporting Information Figure S2) (Alexander et al., 2009). Below 100,000 reads, the estimated admixture proportions became increasingly variable with large standard errors and, therefore, unreliable.
In accordance with the mtDNA results, we recovered significantly more nuclear reads from dentin than calculus (Wilcoxon signed-rank test, p < .01 for precapture and postcapture libraries). However, at a sequencing depth of 5 M paired-end reads, only 6 out of 12 (precapture) and 7 out of 12 (postcapture) dentin samples and none of the calculus samples yielded more than 100,000 autosomal reads. Whole genome capture generally led to modest enrichments in autosomal reads for both dental calculus (five-fold) and dentin (four-fold) and, therefore, also led to greater overlap between the ancient samples and the reference panel. In cases where the capture worked, we recovered roughly twice as many sites after capture (Supporting Information Tables S2 and S3), given the same sequencing effort, which also improved the accuracy of the ADMIXTURE-based ancestry estimates. This can be most clearly seen in the case of S40 dentin, where the results from ADMIXTURE analysis performed on the precapture dataset erroneously assigned a higher proportion of European ancestry to an individual of Asian ancestry (Figure 4c).

Notes to Table 2: (a) Total number of reads mapping to the sex chromosomes after removing PCR duplicates and reads with mapping quality <30. (b) Ry: observed fraction of Y chromosome alignments compared with the total number of alignments to the X and Y chromosomes (Skoglund et al., 2013). (c) Typical Ry for males is an Ry over 0.09. Ry values under 0.02 are considered female. (d) Sex predicted by Ry value, but insufficient X + Y reads are available for confident assignment. A minimum of 3,000 X + Y reads are recommended for sex assignment (Skoglund et al., 2013).

| Microbial profile
The effect of WGE on off-target microbial DNA sequences was evaluated by comparing the taxonomic assignment of 16S rRNA gene sequences (16S rDNA) before and after human enrichment on the total sequencing dataset. Contrary to expectations, the proportion of 16S rDNA sequences did not decrease in the captured libraries, as might be expected for this off-target and relatively high GC content gene (Supporting Information Tables S2 and S3). However, taxonomic analysis of these 16S rDNA reads using the QIIME pipeline (Caporaso et al., 2010) revealed that the proportion of 16S rDNA reads assignable to OTUs using a closed reference 97% identity clustering approach did decrease significantly from 45.5 to 23.1% for dental calculus and from 19.6 to 5.3% for dentin (Wilcoxon signed-rank test, p < .00003 for dental calculus, p < .004 for dentin; Figure 3f; Supporting Information Table S2). These results are broadly consistent with analyses performed using the MetaBIT pipeline (Louvel et al., 2016), where the median postcapture OTU assignment rate fell by half for dental calculus (from 44.9 to 23.4%) and by more than four-fifths for dentin (from 21.9 to 3.1%) (Supporting Information Tables S2 and S3).
Although higher rates of OTU assignment for dental calculus compared with dentin in both precapture and postcapture libraries are not unexpected because the host-associated taxa present in dental calculus are better represented in reference databases than the environmental taxa present in dentin, the reason for the overall drop in OTU assignment rates observed after capture is unclear.
As a consequence of the very low numbers of 16S rDNA reads recovered from the dentin samples, further taxonomic analysis was restricted to the dental calculus samples only. Precapture and postcapture dental calculus datasets were rarefied to a depth of 1,697 and assigned taxonomy using the QIIME pipeline and the Greengenes database (Caporaso et al., 2010). Despite significant differences in OTU assignment rates before and after capture, overall phylum-level taxonomic proportions were similar between precapture and postcapture libraries (Figure 5). This suggests that off-target microbial sequences obtained through WGE of dental calculus may be suitable for phylum-level microbial community structure analysis of the ancient human oral microbiome.
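Rarefaction to a common depth, followed by a comparison of phylum-level proportions, can be illustrated with the sketch below. It is a stand-alone illustration rather than the QIIME implementation. The function names, the count-table layout, and the fixed seed are assumptions.

```python
import random
from collections import Counter


def rarefy(counts, depth=1697, seed=0):
    """Randomly subsample a count table to a fixed depth without replacement.

    counts: dict mapping taxon name -> number of assigned 16S reads
            (hypothetical input layout).
    """
    pool = [taxon for taxon, n in counts.items() for _ in range(n)]
    if len(pool) < depth:
        raise ValueError("sample has fewer reads than the rarefaction depth")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return Counter(rng.sample(pool, depth))


def proportions(counts):
    """Convert counts to relative abundances that sum to 1."""
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()}
```

Rarefying the precapture and postcapture tables to the same depth before computing proportions keeps differences in sequencing depth from being mistaken for differences in community structure.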

| DISCUSSION
The principal aim of this study was to evaluate whether dental calculus could serve as a viable alternative source of human DNA for whole human genome reconstruction and to explore the efficacy of WGE on dental calculus. To do so, we characterized the human DNA content in a diverse set of archaeological dental calculus and dentin samples, before and after WGE. In agreement with previous studies (Mann et al., 2018; Ozga et al., 2016), we observed that the total extracted DNA yield of dental calculus far exceeds that of dentin, up to 375 times higher in the case of NF47. Additionally, we found that although the proportion of human DNA in the samples is generally lower in dental calculus than dentin, the absolute amount of human DNA in both substrates is comparable (cf., Mann et al., 2018).
Whole genome capture resulted in up to four-fold enrichment of the human endogenous content in both dental calculus and dentin libraries, with the exception of three dentin samples that either showed no enrichment (S108) or, in fact, were depleted in human DNA content (F349A, S41) following capture. These enrichments are comparable to those observed in previous studies using a whole genome capture approach (Carpenter et al., 2013; Schroeder et al., 2015; Avila-Arcos et al., 2015; Schroeder et al., 2018), but significantly lower than those reported for mitochondrial genome capture or the targeted capture of specific SNPs (Mathieson et al., 2015). This suggests that current WGE methods are not as effective as other forms of targeted enrichment, which might be at least partially explained by the size and complexity of the target, the low copy number of nuclear [...] Tables S2 and S3).
As observed previously, we find slightly higher contamination estimates for postcapture libraries (Supporting Information Tables S2 and S3), and we found that postcapture average fragment lengths were longer (Supporting Information Tables S2 and S3), which is consistent with the known bias toward longer fragments when using a capture approach (Cruz-Dávalos et al., 2016; Ozga et al., 2016). Interestingly, we find that prior to enrichment the median GC content of human DNA in the calculus libraries (45%) is significantly higher than in the dentin libraries (39.5%), as well as higher than the average GC content of the human genome (40.9%). Previous studies have shown that the GC content of DNA retrieved from dentin typically reflects that of the reference genome (Cruz-Dávalos et al., 2016). However, a recent study (Mann et al., 2018) observed an inverse relationship between DNA fragment length and GC content in ancient DNA derived from microbial taxa. The systematic loss of AT-rich fragments in taxa with low- and medium-GC genomes may be partially explained by the susceptibility of short fragments with low GC content to loss through denaturation (Mann et al., 2018). The high GC content of human DNA in dental calculus might be related to factors specific to the manner in which human DNA is incorporated into dental calculus.
However, it remains unclear whether these patterns are produced through the sequencing preparation or a naturally occurring taphonomic process (Mann et al., 2018). Furthermore, we note that capture slightly increased the overall GC content of the ancient DNA libraries.
Regarding postmortem damage, we did not observe any changes in the frequency of typical damage patterns following enrichment (cf., Carpenter et al., 2013). We did, however, detect significant differences in terminal damage rates between dental calculus and dentin, whereby human DNA in dental calculus appears to be less damaged than in dentin (cf., Mann et al., 2018). It is possible, therefore, that the human DNA trapped in dental calculus is somehow more protected from various degradation processes (e.g., hydrolytic damage) than human DNA in dentin. Overall, despite some differences that may be intrinsic to biological differences between dental calculus and dentin, we find that in-solution WGE affects dental calculus in a similar way as dentin or bone.
With respect to sex and ancestry estimates, we find that WGE marginally improved the reliability of these assignments by enabling the generation of more data for the same sequencing effort. As such, we recovered approximately twice as many sex chromosome reads on average with WGE than without. This was sufficient to reliably determine the biological sex of 8 of the 12 individuals, 4 of whom were identified as female, and 4 as male. While we were unable to recover sufficient X and Y chromosome reads from the calculus samples to obtain confident sex estimates, we note that sex chromosome reads were present and that given the appropriate sample size and sequencing effort, high-throughput sequencing of archaeological dental calculus samples could be used for sexing ancient human remains. With respect to genome-wide ancestry estimates, we note that WGE increased the overlap between the ancient samples and the modern reference panel and, therefore, also improved the accuracy of the ADMIXTURE-based ancestry estimates. We also recovered significantly more mitochondrial reads after capture, resulting in more reliable mtDNA haplogroup estimates.
Finally, in regard to the sample microbial profile, we found that although the proportion of 16S rDNA reads assigned to OTUs significantly decreased after capture, no major differences were observed in microbiome profiles at the phylum level, indicating that off-target reads in libraries enriched for the human genome may still be useful for investigating the ancient oral microbiome.
Overall, we note that our WGE experiment was notably less effective than other forms of capture targeting the mitochondrial genome [...] temperatures (50 °C) led to higher enrichment rates. A third option would be to perform two or more consecutive rounds of capture (Li et al., 2013; Templeton et al., 2013) to increase enrichment rates.

| CONCLUSION
Whole human genome capture performed on a set of 24 paired human dental calculus and dentin samples resulted in up to four-fold enrichments of the human endogenous content. These kinds of enrichment rates are orders of magnitude lower than those achieved with other kinds of capture targeting the mitochondrial genome (Maricic et al., 2010; Ozga et al., 2016) or specific SNPs (Mathieson et al., 2015). We conclude that while archaeological dental calculus does contain ancient human DNA, current WGE techniques are not effective at retrieving it, and further optimizations are needed before WGE can be more widely applied. The low relative proportion of human DNA in dental calculus clearly poses challenges for retrieving host genome information using both shotgun and capture enrichment approaches. However, in the absence of other resources or when sampling of other tissues is restricted, dental calculus can serve as a viable source of human ancient DNA. The ability to recover both human DNA and microbial DNA from the same archaeological substrate opens new avenues of research for studying the relationships between the genetic information of the host and microbiome composition, function, and evolution. However, to fully realize the potential of dental calculus for human genome-wide analyses, optimization of DNA enrichment techniques is necessary.

FIGURE 5 Frequency of microbial phyla inferred from dental calculus samples before and after whole genome capture. The microbial profiles prior to (light) and after (dark) human genome capture enrichment were similar, indicating that off-target microbial sequences from postcapture libraries may be suitable for use in ancient oral microbiome studies.