Nuclear microsatellite and mitochondrial DNA analyses reveal the regional genetic structure and phylogeographical history of a sanguivorous land leech, Haemadipsa japonica, in Japan

Abstract Recent molecular studies have indicated that phylogeographical history of Japanese biota is likely shaped by geohistory along with biological events, such as distribution shifts, isolation, and divergence of populations. However, the genetic structure and phylogeographical history of terrestrial Annelida species, including leech species, are poorly understood. Therefore, we aimed to understand the genetic structure and phylogeographical history across the natural range of Haemadipsa japonica, a sanguivorous land leech species endemic to Japan, by using nine polymorphic nuclear microsatellites (nSSR) and cytochrome oxidase subunit one (COI) sequences of mitochondrial DNA (mtDNA). Analyses using nSSR revealed that H. japonica exhibited a stronger regional genetic differentiation among populations (G'ST = 0.77) than other animal species, probably because of the low mobility of land leech. Analyses using mtDNA indicated that H. japonica exhibited two distinct lineages (A and B), which were estimated to have diverged in the middle Pleistocene and probably because of range fragmentation resulting from climatic change and glacial and interglacial cycles. Lineage A was widely distributed across Japan, and lineage B was found in southwestern Japan. Analyses using nSSR revealed that lineage A was roughly divided into two population groups (i.e., northeastern and southwestern Japan); these analyses also revealed a gradual decrease in genetic diversity with increasing latitude in lineage A and a strong genetic drift in populations of northeastern Japan. Combined with the largely unresolved shallow polytomies from the mtDNA phylogeny, these results implied that lineage A may have undergone a rapid northward migration, probably during the Holocene. Then, the regional genetic structure with local unique gene pools may have been formed within each lineage because of the low mobility of this leech species.


| INTRODUC TI ON
The Japanese archipelago comprises four main islands (Hokkaido, Honshu, Shikoku, and Kyushu; Figure 1) and is located off the far eastern coast of the Asian continent. The landmasses that later formed the Proto-Japanese Islands were originally located along the eastern fringe of the Asian continent. Approximately 15 million years ago (Mya), two landmasses-the northeastern and southwestern portions of the Proto-Japanese Islands-began to independently separate from the Asian continent, along with the formation of the Sea of Japan. Before the two landmasses were connected to each other, eastern and western Japan were separated by a sea zone (channel) called the Fossa Magna, which longitudinally traversed central Honshu, Japan (area in gray; Figure 1), during the Miocene (around 15-5 Mya; Tojo, Sekiné, Suzuki, Saito, & Takenaka, 2017). The Fossa Magna was later filled with thick sediments from the Tertiary (Takeda et al., 2004), and the subsequent elevation of the surrounding land during the Quaternary led to the formation of the high mountain chains in central Honshu. The Fossa Magna and orogenic movements are considered important dispersal barriers for many animal species; recent molecular phylogeographical analyses have indicated deep divergences between inter-and intraspecific vicariants in the Fossa Magna region (Sekiné, Hayashi, & Tojo, 2013;Shoda-Kagaya et al., 2010;Suzuki, Sato, & Ohba, 2002;Tojo et al., 2017;Watanabe et al., 2006;Watanabe, Tominaga, Nakajima, Kakioka, & Tabata, 2017). Furthermore, glacial and interglacial cycles have also been considered to affect genetic structures, mainly through the expansion and contraction of organismal distributions and through migrations from the Asian continent across land bridges during the Quaternary (Millien-Parra & Jaeger, 1999;Motokawa, 2017;Qiu, Fu, & Comes, 2011). In this context, the geohistory of Japan, along with biological events such as distribution shifts, isolation, and divergence of populations, is probably engraved within organismal genomes. Most of these phylogeographical studies on animal species were conducted for vertebrate and insect species. However, phylogeography of several other animal species, including terrestrial species of the phylum Annelida, is poorly understood.
A previous study using mitochondrial and nuclear DNA sequences of leech species revealed that the phylogeographical history of three medicinal leeches has been shaped by population differentiation occurred during a postglacial range expansion and by present climatic influences (Trontelj & Utevsky, 2012). However, the lack of phylogenetic and genetic structure of two of these leech species did not allow the complete understanding of their phylogeographical history. Trontelj and Utevsky (2012) then noted the need for an analysis using microsatellites exhibiting high levels of polymorphism. The use of nuclear microsatellites (nSSR) in combination with mitochondrial DNA (mtDNA) sequences for studies on leech species provides an opportunity to track the phylogeographical history, as previously reported for other animal species (e.g., Konishi, Hata, Matsuda, Arai, & Mizoguchi, 2017;López-Uribe, Cane, Minckley, & Danforth, 2016;Michaelides et al., 2015;Rutkowski et al., 2015;Shoda-Kagaya et al., 2010). However, few genetic analyses on leech species using nSSR markers have been conducted (e.g., Liu et al., 2016). Therefore, in the present study, we aimed to reveal the genetic structure and phylogeographical history using nSSR markers and mtDNA sequences for  Tissues from the caudal sucker were used for DNA extraction to prevent contamination from residual blood from recent feedings.

| Sequencing of mtDNA
The mitochondrial cytochrome oxidase subunit one (COI) gene on an ABI 3500 Genetic Analyzer. The obtained DNA sequences were visually inspected, including quality check, and aligned using BioEdit 7.2.5.0 (Hall, 1999). Multiple alignments of sequences were obtained using ClustalW (Thompson, Higgins, & Gibson, 1994) and then manually adjusted. All sequences were deposited in GenBank under accession numbers LC427683-LC427763.

| Nuclear microsatellite genotyping
Nine nSSR loci, previously developed for Haemadipsa japonica by Morishima, Suzuki, and Aizawa (2018), for 798 individuals of leeches from 37 populations were used. Two populations-38 (Hiroshima) and 39 (Wakayama)-were excluded from this analysis because of their small sample sizes (N = 1-2) ( Notes. Lat., Long., and Alt. indicate latitude, longitude, and altitude of the sampling location, respectively; N indicates numbers of samples analyzed for nSSR and mtDNA; numbers in parentheses indicate breakdown of pooled samples; we pooled these locations as single population because two locations were in the same mountain and/or along the same river system and because sample size of these locations was small.  Table 2).
Genotypes were determined using an ABI 3500 Genetic Analyzer and GeneMapper ver. 4.1 (Applied Biosystems).

| Data analysis for mtDNA
A phylogenetic maximum likelihood analysis of the COI haplotypes was performed using PhyML 3.0 (Guindon et al., 2010) on the web server with default settings (http://www.atgc-montpellier.fr/phyml/).
The substitution model GTR+G+I was selected based on the Akaike information criterion (AIC). Bootstrap support was calculated with 1,000 replications. Haemadipsa picta (accession number HQ203177) was included as an outgroup based on Lai, Nakano, and Chen (2011).
Moreover, population differentiation measures (G ST and N ST ) were estimated and the significant difference between them was tested using PERMUT (Pons & Petit, 1996) with 1,000 permutations. A significantly higher N ST than G ST denotes the presence of a phylogeographical structure (Pons & Petit, 1996). Notes. F ST , genetic difference index (Weir & Cockerham, 1984); G ST , a measure of genetic differentiation (Nei, 1973); G′ ST , estimated standardized measure of genetic differentiation (Hedrick, 2005); H O , observed heterozygosity; H S , gene diversity within populations; H T , total genetic diversity over all populations; N A , number of alleles per locus; T A , annealing temperature. Significant deviation of the F ST values from zero was tested using 1,000 randomizations (p < 0.01).

| Data analysis for nSSR
(H T ), and measures of genetic differentiation among the population-F ST (Weir & Cockerham, 1984) and G ST (Nei, 1973)-were calculated using the FSTAT ver. 2.932 software (Goudet, 2001). Statistical significance of F ST was also tested using FSTAT. An estimated standardized measure of genetic differentiation (G' ST ; Hedrick, 2005) was calculated using GenAlEx ver. 6.502 (Peakall & Smouse, 2006). We tested genotypic disequilibrium by using FSTAT for all pairs of loci with 1,000 permutations.
To assess population structure, an individual-based Bayesian clustering algorithm, implemented in the STRUCTURE ver. 2.3.4 (Pritchard, Stephens, & Donnelly, 2000) software, was used. The algorithm available in the STRUCTURE software estimates allele frequencies for each gene pool (cluster) and population memberships for every individual (Hubisz, Falush, Stephens, & Pritchard, 2009).
The STRUCTURE software was run for 100,000 MCMC iterations after a burn-in period of 100,000 on the total dataset. STRUCTURE was run 10 times independently for each cluster (K) (ranging from 1 to 20). The obtained results were harvested using STRUCTURE HARVESTER (Earl & vonHoldt, 2012). The optimal number of clusters K was determined using two alternative approaches based on the change of mean log likelihoods of the data, LnP(D) (Pritchard et al., 2000), and rate of change in LnP(D), ΔK (Evanno, Regnaut, & Goudet, 2005), between successive K values; this is because ΔK is not always a good indicator of the best K as suggested by several studies (e.g., Janes et al., 2017;Wang, 2017). The outputs of 10 independent runs for around optimal K values were integrated using CLUMPP ver.1.1 (Jakobsson & Rosenberg, 2007) and visualized using DISTRUCT ver.1.1 (Rosenberg, 2004

| Genetic structure analysis for mtDNA
Total length of the sequenced mtDNA COI fragments was 658 bp.
A total of 81 haplotypes with 39 substitution sites were identified (Supporting Information Appendices S1 and S2). The maximum like-

| Genetic diversity for nSSR
Nine nSSR loci for 37 Haemadipsa japonica populations showed highly variability ( Table 2) Figure 9); whereas in lineage B, the decrease in these parameters with the increase in latitude was not significant (R 2 = 0.099, p = 0.565 for H E ; R 2 = 0.166, p = 0.968 for Ar [26] ).

| Genetic differentiation of Haemadipsa japonica in Japan
The genetic differentiation measures, F ST and G' ST , of Haemadipsa japonica, estimated using nSSR, were much higher than those of other animals, including species with low dispersal potential, such as the canyon tree frog (Hyla arenicolor), the giant water bug (Abedus herberti), and the three aquatic leeches (Whitmania pigra, Hirudo F I G U R E 5 Genetic structure defined using nine nuclear microsatellite loci and the Bayesian clustering, STRUCTURE, and mtDNA lineages. Individuals are represented by vertical lines. Population codes (1-37) correspond to those shown in Table 1 F I G U R E 6 Estimation of divergence times in Haemadipsidae based on cytochrome oxidase subunit one (COI) mitochondrial DNA sequences. Clade depth indicates the mean nodal age (million years = Myr); nodes are annotated with bars indicating the 95% highest posterior density intervals for node ages. Lineage A, sublineage B1, and sublineage B2 of Haemadipsa japonica were obtained based on a maximum likelihood phylogeny analysis and haplotype network. The unit of scale bar is substitution per site nipponica, and Poecilobdella manillensis) (Table 5). Furthermore, the results of the STRUCTURE analysis at K = 13 showed a regionally unique clustering pattern with populations dominated by unique gene pools (populations 19-21 and 31) ( Figure 5). This regional and strong genetic differentiation among populations, as well as the isolation by distance from H. japonica (Figures 5 and 8), may be a result of this leech species' incapability of dispersing long distances (Borda & Siddall, 2010); this is also implied for aquatic leech species (Liu et al., 2016). Long-distance dispersal of birds may be considered a reasonable explanation for the distribution of duognathous leech species; however, this hypothesis is not plausible because of the species' feeding behavior and because it has only rarely been reported feeding on migratory birds (Borda & Siddall, 2010). The STRUCTURE analysis indicated that two genetically distinct populations (formed by populations 10-12 and populations 13-15) with different gene pools were found in Tochigi Prefecture (Figure 1 and K = 13 or 15 in Figure 5); these two populations were separated by less than 50 km.
Although, in Japan, H. japonica is known to feed on blood from sika deer, wild boar, Japanese serow, and humans (Sasaki et al., 2005;Sasaki & Tani, 2008), our results suggest that H. japonica may not be capable of dispersing several tens of kilometers even if they move with assistance of these host animals. In this context, the low mobility of H. japonica may have resulted in a strong genetic differentiation and a low level of gene flow between populations.

| Phylogeographical history of Haemadipsa japonica in Japan
Our analysis showed that Haemadipsa japonica was comprised of two mtDNA lineages (A and B) with no shared haplotypes within the lineage (Figures 3 and 4). The divergence of these lineages was estimated to have occurred ~700,000 years ago, and sublineages B1 and B2 were estimated to have diverged ~400,000 years ago; both estimated times are in the middle Pleistocene ( Figure 6). Genetic differentiation between northeastern and southwestern lineages has been reported in several common animal species found across Honshu, Shikoku, and Kyushu, such as the Japanese sika deer (Cervus nippon), Japanese hare (Lepus brachyurus), tree frog (Hyla japonica), Pelophylax frog (Pelophylax nigromaculatus), and seed parasitic weevil (Curculio hilgendorfi) (Aoki, Kato, & Murakami, 2008;Dufresnes et al., 2016;Nagata et al., 1999;Nunome, Torii, Matsuki, Kinoshita, & Suzuki, 2010). The existence of two lineages in the aforementioned animal species was proposed based on intraspecific phylogenetic analyses, and it was tentatively explained by two biogeographi-

F I G U R E 7
The change in the mean log likelihoods of the data, LnP(d) (a), and the rate of change in the LnP(d), ΔK (b), between successive K values obtained using the Bayesian clustering and STRUCTURE analysis for Haemadipsa japonica seems unlikely, although the possibility of a later extinction of these Haemadipsa species in the continental areas cannot be ruled out.
Alternatively, the second explanation is likely plausible. The global glacial and interglacial cycles had already started during the middle Pleistocene (Hansen, Sato, Russell, & Kharecha, 2013). In addition, in the middle Pleistocene, islands from the Japanese archipelago, including Honshu, Kyushu, and Shikoku, supposedly formed a single landmass with the Asian continent through a land bridge (e.g., Ota, 1998). In this context, the divergence between lineages A and B hermaphrodites and reproduce by cross-fertilization (Mann, 1962 haplotypes (Toews & Brelsford, 2012). Considering that the populations of sublineage B1 of Haemadipsa japonica were geographically separated from each other and randomly distributed in certain areas of Honshu, adaptive introgression of mtDNA seems an unlikely ex- planation. An alternative explanation is demographic consequences.
This means that, after the divergence of lineages A and B, populations possessing each haplotype may have had secondary contact with mtDNA introgressive hybridization and may have separated from each other again, which was then followed by population fragmentation and genetic drift. As previously discussed in this text, H. japonica is considered to have low mobility; therefore, subsequent gene flow among these populations could have been reduced. As the effective population size of mitochondrial genome in hermaphrodites represents only one half of a nuclear genome (Latta, 2006), the fixation of mtDNA lineages by genetic drift is more likely to occur than that of alleles of nuclear genome. Therefore, the different results for mitochondrial and nuclear data in populations of lineage A from southwestern Japan and sublineage B1 may have resulted from secondary contact with introgressive hybridization and subsequent random genetic drift and fixation. This explanation is in line with the randomly overlapping distribution of sublineage B1 in the distribution of lineage A in southwestern Japan.
In contrast to lineage A of southwestern Japan (populations 25-31 and 33), lineage A of northeastern Japan (populations 1-21) has almost exclusively unique gene pools (in blue and light blue) of nSSR, except for populations 19 and 20 at K = 4 (STRUCTURE analysis; Figure 5). In addition, genetic diversity of nSSR significantly decreased with the increase in latitude (Figure 9), whereas a nonsignificant change with the increase in latitude was found for lineage B.  (Wilson & Eisenberg, 1982) and thrive and reproduce between 21 and 32°C (Keegan, Toshioka, & Suzuki, 1968 (Humphries & Winker, 2010;Lorenzini et al., 2014;Maddison, 1989;McCracken & Sorenson, 2005). In the case of the roe deer species, the poorly resolved phylogenetic relationships were considered to have been F I G U R E 9 Relationship between latitude and the genetic diversity measures for nuclear microsatellite loci-expected heterozygosity (H E ) (a) and allelic richness (Ar [26] ) (b)-in populations of Haemadipsa japonica from lineage A caused by high rates of migration among populations (Lorenzini et al., 2014). Haemadipsa japonica exhibits a strong regional genetic differentiation among populations; in addition, mtDNA lineage A of H. japonica showed several star-like divergences of haplotypes along with the presence of phylogeographical structure ( Figure 4).
Therefore, our results suggested that the largely unresolved shallow polytomies found in lineage A may be because of a rapid northward migration with a rapid divergence of haplotypes probably during the Holocene.

| CON CLUS ION
In this study, Haemadipsa japonica was found to include two mtDNA lineages (A and B), which diverged in the middle Pleistocene. After the LGM, probably during the Holocene, H. japonica underwent a northward migration with a rapid divergence of haplotypes; the subsequent regional strong genetic structure may have been a result of the low mobility of H. japonica.
Since the 1980s, the populations of this leech have expanded, possibly because of the increase in mammal populations, and it has become a serious problem in several prefectures in Japan, as the leech has expanded to areas near human residence (Asada, Ochiai, & Yamanaka, 1995;Sugiyama & Sakaniwa, 2010). The clear genetic structure found in the present study may be used to predict ongoing or future dispersal routes of H. japonica, besides new areas of occurrence toward which leech populations may locally expand.

CO N FLI C T O F I NTE R E S T
None declared.

AUTH O R CO NTR I B UTI O N
K.M. and M.A. designed this study and conducted sampling in field.
K.M. conducted nuclear microsatellite genotyping, sequencing of mtDNA, and analyzed data. All authors contributed to writing this manuscript.

DATA ACCE SS I B I LIT Y
All mitochondrial DNA (mtDNA) sequences have been deposited in Genbank under the accessions numbers LC427683-LC427763; numbers of the mtDNA haplotype data of each population is in the Appendices; all data on microsatellite genotypes have been deposited in Dryad (https://doi.org/10.5061/dryad.b9v801n).