Evaluation of introgressive hybridization among Cervidae in Japan's Kinki District via two novel genetic markers developed from public NGS data

Abstract Hybridization and backcrossing of native populations with introduced species can lead to introgression and genetic alteration. In this study, we evaluated introgression in 43 deer from a potential hybrid zone around Okinoshima Island, Kinki District, Japan. This region witnessed the migration of a hybrid population (cross between the Formosan sika deer [Cervus nippon taiouanus] and other deer species) that could potentially breed with the native Japanese sika deer (C. n. centralis). We used an existing genetic marker for the mitochondrial cytochrome b gene and two novel markers for nuclear DNA, developed using publicly available next‐generation sequencing data. We identified one mainland deer with a mitochondrial haplotype identical to that of the Formosan sika deer as well as nuclear heterozygous sequences identical to those of Formosan and Japanese sika deer. This suggests that the mainland deer is a hybrid offspring of the Okinoshima population and native deer. However, only Japanese sika deer sequences were found in the other 42 samples, indicating limited introgression. Nevertheless, hybridization pre‐ and postintroduction in the Okinoshima population could cause multispecies introgression among Japanese sika deer, negatively affecting genetic integrity. We developed a simple test based on polymerase chain reaction–restriction fragment length polymorphism to detect introgression in natural populations. Our method can accelerate genetic monitoring of Japanese sika deer in Kinki District. In conclusion, to prevent further introgression and maintain genetic integrity of Japanese sika deer, we recommend establishing fences around Okinoshima Island to limit migration, besides a continued genetic monitoring of the native deer.


| INTRODUC TI ON
Invasive species hybridizing with related native species can result in the latter's extinction through introgression, eroding genetic integrity (Rhymer & Simberloff, 1996). In Cervidae, sympatric species easily hybridize and produce fertile offspring (Matsumoto, Ju, Yamashiro, & Yamashiro, 2015). One notable example is the introduction of sika deer (Cervus nippon) into Europe, North America, and New Zealand (McCullough, Kaji, & Takatsuki, 2009). In Scotland, native red deer (C. elaphus) populations are threatened (Abernethy, 1994) by repeated introduction of Japanese sika deer since the 19th century (Goodman, Barton, Swanson, Abernethy, & Pemberton, 1999), resulting in deer hybridization throughout the country (Smith et al., 2018). To conserve the native species, hybridization with the introduced species should be prevented through management methods such as isolation of the two populations.
Contrastingly, molecular genetic evidence for mitochondrial DNA indicated the existence of two major groups, namely the Northern and Southern Japan groups (Nagata et al., 1999). Although genetic evidence was not consistent with the morphology-based classification, the latter is widely accepted at present (Nagata, 2009).
Formosan sika deer (C. n. taiouanus) is a subspecies native to Taiwan but now extinct in the wild (McCullough et al., 2009). In 1955, 10 individuals were introduced in Okinoshima Island, Wakayama Prefecture, Kinki District, Japan (Fukushima et al., 1983;Takatsuki, 1985) for tourism purposes (Tokida, 1998). The Okinoshima population exhibits considerable morphological variation, making identification of subspecies through morphological traits difficult (Matsumoto et al., 2015). Likely originated from a private deer farm in Taiwan, this population has identical mitochondrial cytochrome b and nuclear alpha-lactalbumin sequences to those of Formosan sika deer, Formosan sambar (C. unicolor swinhoei), and red deer (C. elaphus) (Matsumoto et al., 2015).
During 2015-2016, one buck was caught in Sennan City and a doe was caught nearby, in Misaki Town; both locations are in the Honshu (mainland) portion of Kinki District and 4.2 km from Okinoshima Island (Figure 1). Deer were not known to be distributed around these Honshu areas previously. Sika deer can swim up to 12 km (Feldhamer, 1980), and individuals have been observed swimming around the island (Tatsuzawa, 2000). Therefore, there is a distinct possibility that the captured deer originated from the Okinoshima population and could hybridize with the native Japanese sika deer (C. n. centralis) on Honshu. Another population at risk of breeding with the Okinoshima deer is the Japanese sika on Awajishima Island, located 4 km west of Okinoshima Island.
Through genetic data, hybridization between species can be reliably determined (Goodman et al., 1999), and the success rate increases with the increase in the number of genetic markers (Dupuis, Roe, & Sperling, 2012). Mitochondrial DNA is a commonly used marker, but because they can only infer maternal inheritance, nuclear genetic markers such as single-nucleotide polymorphisms (SNPs) should also be incorporated in hybridization analyses to identify paternal origin. However, only one nuclear marker that can distinguish between Formosan and Japanese sika deer is currently available (Matsumoto et al., 2015). Hence, more nuclear markers with moderate polymorphism in Cervidae must be identified.
With the development of next-generation sequencing (NGS), a large genetic dataset is now freely available in public databases such as the NCBI Sequence Read Archive (SRA) (Leinonen, Sugawara, & Shumway, 2011;https://www.ncbi.nlm.nih.gov/ sra). In the present study, we developed two novel nuclear DNA markers using the NGS dataset ( Figure 2) and included an existing mitochondrial marker to evaluate the genetic status of deer populations in Kinki District. Specifically, our aim was to determine the phylogeny and extent of introgression among Okinoshima deer and Japanese sika deer on Honshu and Awajishima Islands.  We also established a new genotyping protocol using the polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) analysis as an alternative to direct sequencing (Amjadi, Varidi, Marashi, Javadmanesh, & Ghovvati, 2012;Meyer, Höfelein, Lüthy, & Candrian, 1995;Partis et al., 2000). Advantages of PCR-RFLP include lower cost and higher speed (Amjadi et al., 2012;Meyer et al., 1995;Partis et al., 2000).

| DNA samples
As a control to evaluate novel DNA markers, blood and muscle samples were collected from three Formosan sika deer (that either died of natural causes or were subjected to medical treatment) at the Hirakawa Zoo in Japan (Frm). Three other control samples (Chb) were obtained from legally hunted Japanese sika deer in Shizuoka and Yamanashi Prefectures (Central Japan). These individuals were selected as they are unlikely to have experienced crossbreeding with the Okinoshima population, given that the sampling sites are 350 km apart.
We collected muscle or antler samples from deer represent-

| Establishing genetic markers based on NGS data
Novel genetic markers were developed using high-quality and highly variable (HQHV) regions with elevated SNP count per 500 bp. A length of 500 bp was selected because one run of Sanger sequencing is sufficient for diagnosing DNA sequences of this size (Luo, Tsementzi, Kyrpides, Read, & Konstantinidis, 2012). To obtain the HQHV region, we first searched the NCBI SRA (https://trace.ncbi. nlm.nih.gov/Traces/sra/sra.cgi?view=software) for NGS raw reads of one European red deer (C. e. hippelaphus) (Bana et al., 2018), four tule elk (C. canadensis nannodes) (Sacks, Lounsberry, Kalani, Meredith, & Langner, 2016), and one white-tailed deer (Odocoileus virginianus) (Seabury et al., 2011) (Table 1). Using the better-established cattle (Bos taurus) reference genome (UMD3.1) from the Bovine Genome Database (http://bovinegenome.org/), we extracted the HQHV region in the deer genome. The raw read data (sra files) were converted to the fastaq format using SRA Toolkit v2.8.2 (Leinonen et al., 2011). These fastq files were then mapped to the bovine reference genome using the BWA-MEM function in bwa 0.7.15 (Li & Durbin, 2009). Subsequently, SNPs were called and extracted from highquality mapped regions using samtools v1.3.1 (Li, 2011) and vcftools v0.1.15 (Danecek et al., 2011). Inferred sequences in the three deer species were constructed from SNPs with EMBOSS cons (http:// www.bioinformatics.nl/cgi-bin/emboss/cons). Coverage depth was calculated for each site using R and bash scripts written by the authors. Regions were removed if the absolute value of mean coverage ± 0.1 was exceeded; this criterion avoids the negative influence of duplicated genomic regions. Regions with <1% N (undetermined F I G U R E 2 Schematic representation of the experimental procedures performed in this study nucleotide site) and heterozygous sites were also excluded to avoid low mapping quality. Regions with >1 SNP per 100 bp were considered highly variable and extracted. Noncorrected phylogenetic distance, p-distance, and its variance were calculated across three pairs (European red deer and tule elk, European red deer and white-tailed deer, and tule elk and white-tailed deer) to remove regions with distance or variance <0.001, as higher phylogenetic distance between inferred sequences is more suitable for species identification. The primers for PCR amplification of HQHV regions were designed using Primer 3 (Untergasser et al., 2012). Two nuclear markers (C02 and C22) were established from these procedures.

| DNA extraction, PCR, and Sanger sequencing
Mitochondrial and nuclear DNA were extracted from all tissue and blood samples using the standard phenol chloroform method following proteinase K treatment. Genomic DNA from antler specimens was

| Constructing the phylogenetic network
All inferred and directly sequenced regions were aligned using MEGA 6.06 (Tamura, Stecher, Peterson, Filipski, & Kumar, 2013) with default settings. A phylogenetic network was constructed in SplitsTree4 (Huson & Bryant, 2006). Noncorrected p-distance was used for all three markers. Default settings in SplitsTree4 were selected for Cytb, but "average states mode" was selected for C02 and C22 regions because the nuclear sequences contained heterozygous sites. Network accuracy was estimated with 100,000 bootstraps.

| PCR-RFLP for genetic assessment
We used five samples-two Okinoshima deer, two control deer (Japanese and Formosan sika deer), and one Misaki Town deer samples-for the PCR-RFLP assessment. Genotyping was performed using PCR-RFLP. The NEBcutter V2.0 tool (http://nc2.neb.com/ NEBcutter2/; New England Biolabs Inc.) identified Rsa I (Takara Bio Inc.) as appropriate restriction enzymes for Cytb and C02, while HpyCH4IV (New England Biolabs Inc.) was identified for C22.

| PCR primer establishment
Three HQHV regions were detected using the NGS data (Table 2), after excluding seven candidate HQHV regions for possessing a high number of N or heterozygous sites and two for lacking diversity. We established three nuclear DNA primer sets for the C02, C18, and C22 regions. However, based on preliminary analyses, we excluded C18 because it was not sufficiently polymorphic. Thus, we used only C02 and C22 for all subsequent analyses. Neither region contained any genes based on the cattle genomic data.
Using the nuclear markers C02 and C22, we sequenced the HQHV regions from both Frm and Chb control populations. In the Frm samples, C02 yielded two (O1 and F1) diploid genotypes, whereas C22 identified one (o1). In the Chb samples, we found two TA B L E 1 Profile of genome assembly using the next-generation sequencing (NGS) data

| Evaluation of the Honshu and Awajishima Island populations
The Misaki Town specimen exhibited a Cytb haplotype identical to the haplotype in Formosan sika deer, also previously found in the Okinoshima population (Matsumoto et al., 2015). The Honshu and Awajishima populations exhibited three diploid genotypes (j1-3) in the C22 region ( Figure 3c, Table 4). The most common genotype was j1 (n = 34), differing from j2 by only one substitution at 64 nt. Similar to C02, the Okinoshima population contained four diploid genotypes in the C22 region. The most common allele in the Okinoshima population was o1; its sequence was identical to that of Formosan sika deer and the inferred sequence of European elk. The phylogenetic analysis revealed that the Okinoshima population has two distinct nodes in the C22 region ( Figure 3c).
These two diploid genotypes were identical to the homozygous sequences (J1 and O1, j1 and o1) found in other populations. The phylogenetic analysis using both C02 ( Figure 3b) and C22 (Figure 3c) indicated that the Misaki sample was between the Okinoshima and Japanese sika deer clades.

| PCR-RFLP
The PCR-RFLP analysis of the five samples confirmed the sequencing results (Figure 4). In the Okinoshima population, the sambar-type Cytb sequence differed from the Formosan sika-type sequences.
However, the Okinoshima-origin fragment patterns were identi-
Furthermore, all the SNP sites of the two nuclear sequences were heterozygous, with an allele from the native Japanese sika deer and an allele from the hybrid Okinoshima population. Our analysis of the alpha-lactalbumin gene (Matsumoto et al. unpublished data) showed a similar heterozygosity. To the best of our knowledge, purebred Formosan sika deer have not been introduced or found around Misaki. Therefore, the Misaki individual is probably a firstgeneration hybrid between Okinoshima deer and native Japanese sika deer.
Male deer disperse more frequently than female deer (Clutton-Brock, Guinness, & Albon, 1982;Ito & Takatsuki, 2009); therefore, it is not surprising that only males have been observed on Jinoshima Island, located between Okinoshima Island and Honshu (Wild Management Organization, 2017). However, the fact that our Misaki sample has a maternal lineage originating from Okinoshima indicates that does also swam to Honshu at some point and mated with Japanese sika males. In contrast, Japanese sika deer have not managed to migrate from Honshu to Okinoshima Island, based on the sequence identified in the Okinoshima population so far (Matsumoto et al., 2015).
Our data suggest introgressive hybridization in the Okinoshima population. First, at least three Cervidae species (Formosan sika deer, Formosan sambar, and red deer) may have been mated for antler production in Taiwanese deer facilities (Matsumoto et al., 2015).
Second, members of this Okinoshima population may have mated with the native Japanese sika deer on Honshu resulting in another hybridization event.

| Evaluation of Wakayama and Awajishima Island populations
All three DNA sequences of the 42 deer from three Japanese populations (Awajishima, North Wakayama, and South Wakayama) were identical to typical Japanese sika deer sequence and were in the same clade as the Chb population. Thus, we found no evidence of introgression in the Wakayama Prefecture or Awajishima Island.
There are two limitations for the detection of introgressive hybridization in this study. First, the detection of other hybridization events may have been impaired by the limited sample sizes.
Further genetic assessments using larger sample sizes are therefore needed. Second, a low number of markers were used and, in general, this may also hinder the detection of the hybridization (Dupuis et al., 2012). Nevertheless, low deer density in Misaki Town and surroundings and the fact that no hybridization without Misaki sample was detected indicate that the hybridization event between Okinoshima population and native Japanese sika deer may be limited. Our genetic assessment using the three markers, which can be used to diagnose hybridizations at least in F1 generation hybrids, will maximize the detection of hybridization in this stage. However, further crosses between these populations may reduce the detection rate of our method. In this case, more markers resulting from NGS data will be needed to diagnose subspecies and determine hybridization rate.

| Conservation of Japanese sika deer
Japanese sika deer is an endemic species in Japan and is highly divergent in terms of phenotypic (Terada et al., 2012) and genetic features (Nagata et al., 1999). A phylogenetic study using mitochondrial DNA suggested that 4.9-6.7 million years may have passed since the divergence of Japanese and other C. nippon subspecies including Formosan sika deer (Randi et al., 2001). In addition to the phylogenetic difference, the "Nara deer," a national treasure in Nara Prefecture, Kinki District, has been protected by law since the 13th century, and the importance of the deer involves historical, cultural, and also tourism perspectives (Torii & Tatsuzawa, 2009). Therefore, further hybridization should be avoided and the original genetic integrity of the Japanese sika deer population should be conserved.

| Management of the Okinoshima population
Recently, a re-introduction program has been implemented for the Formosan sika deer, and the population size has exceeded 1000 individuals in Kenting National Park, Taiwan (Pei & Liang, 2015). However, the genetic diversity of the re-introduced population is low because of the limited number of founders; therefore, outcrossing is necessary (Pei & Liang, 2015).

| Applying NGS data for conservation genetics
Researchers are increasingly using publicly available NGS data in conservation genetics (Allendorf, Hohenlohe, & Luikart, 2010). In the present study, we successfully employed NGS data to establish novel genetic markers for species identification and determining hybridization. The use of NGS data to develop genetic markers offers several advantages. First, given the inclusion of genome-wide polymorphisms in such data, multiple copies of a single genetic region in the genome can be detected. Such duplicated regions will negatively influence phylogenetic analyses; therefore, the use of NGS data allows the exclusion of these regions and the selection of single-copy regions instead (Naumann et al., 2011). In the present study, we searched for and sequenced C02 and C22, both single-copy regions in the Cervidae genome. The use of genomic data in conservation and phylogenetic studies also offers the potential for detecting a genetic region correlated with adaptive phenotypes (Ouborg, Pertoldi, Loeschcke, Bijlsma, & Hedrick,2010 ). For example, a genomic approach in polar bears enabled successful identification of genetic regions under strong selection and linked to reorganization of the cardio-vascular system (Liu et al., 2014). We recommend NGS databased primer establishment coupled with fine mapping to uncover candidate regions related to adaptive phenotypes as such discoveries can greatly benefit conservation efforts.
Mitochondrial data revealed that European red deer, tule elk, and Japanese sika deer are closely related (Ludt, Schroeder, Rottmann, & Kuehn, 2004). In support of this previous finding, we identified a shared o1 diploid genotype of C22 between Frm and European red deer. However, we also found alignment gaps at 23, 24, and 322-329 nt in the C02 sequence. These gaps probably result from mapping errors, a major limitation of NGS-based primer establishment that should be taken into account. Therefore, future studies should obtain actual tissue samples of European red deer, tule elk, and white-tailed deer to directly sequence the C02 region, thereby allowing a more precise estimation of phylogenetic relationships.

| Conclusion
The present study uncovered introgressive hybridization among Cervidae members. Currently, introgression between the Okinoshima population and native Japanese sika deer is likely rare, because we found only one hybrid individual in Honshu.
However, our data suggest that at least one Okinoshima doe swam to Honshu. Moreover, in Misaki Town and surroundings, we found little evidence of sika deer existence, suggesting a low native population density. The Okinoshima deer are also hybrids with at least three parent species, that is, the introgression of their genetic material into a small Japanese sika population could have an outsized effect. However, other hybridization events may not have been detected because of the limited sample size and markers used in the present study. To have a better understanding of introgression in Japanese deer populations, we strongly recommend genetic assessment on a large scale, covering more individuals and expanding across North Kinki district. The PCR-RFLP procedure we developed can accelerate such large-scale assessments. We also strongly recommend the establishment of fence around Okinoshima Island, to prevent further degradation of genetic integrity among the native deer population of Honshu (Wild Management Organization, 2017).

ACK N OWLED G M ENTS
We thank Dr. Daniel Huson for his kind advice on the phylogenetic analysis. We also thank Mr. and Mrs. Yoshida, as well as the Management Office of Tomogashima, for field assistance. We would like to thank Editage (www.editage.jp) for English language editing.
The study was supported by a grant from the Kansai Organization for Nature Conservation in 2017.

CO N FLI C T O F I NTE R E S T
None Declared.