Comparative mitochondrial genomic analyses of three chemosynthetic vesicomyid clams from deep‐sea habitats

Abstract Vesicomyid clams of the subfamily Pliocardinae are among the dominant chemosymbiotic bivalves found in sulfide‐rich deep‐sea habitats. Plastic morphologies and present molecular data could not resolve taxonomic uncertainties. The complete mitochondrial (mt) genomes will provide more data for comparative studies on molecular phylogeny and systematics of this taxonomically uncertain group, and help to clarify generic classifications. In this study, we analyze the features and evolutionary dynamics of mt genomes from three Archivesica species (Archivesica sp., Ar. gigas and Ar. pacifica) pertaining to subfamily Pliocardinae. Sequence coverage is nearly complete for the three newly sequenced mt genomes, with only the control region and some tRNA genes missing. Gene content, base composition, and codon usage are highly conserved in these pliocardiin species. Comparative analysis revealed the vesicomyid have a relatively lower ratio of Ka/Ks, and all 13 protein‐coding genes (PGCs) are under strong purifying selection with a ratio of Ka/Ks far lower than one. Minimal changes in gene arrangement among vesicomyid species are due to the translocation trnaG in Isorropodon fossajaponicum. Additional tRNA genes were detected between trnaG and nad2 in Abyssogena mariana (trnaL3), Ab. phaseoliformis (trnaS3), and Phreagena okutanii (trnaM2), and display high similarity to other pliocardiin sequences at the same location. Single base insertion in multiple sites of this location could result in new tRNA genes, suggesting a possible tRNA arising from nongeneic sequence. Phylogenetic analysis based on 12 PCGs (excluding atp8) supports the monophyly of Pliocardiinae. These nearly complete mitogenomes provide relevant data for further comparative studies on molecular phylogeny and systematics of this taxonomically uncertain group of chemosymbiotic bivalves.

elements controlling the initiation of replication and transcription (Boore, 1999;Wolstenholme, 1992). Due to their fundamental roles in oxidative phosphorylation responsible for energy production (Saraste, 1999), mt PCGs are generally thought to evolve primarily under constant purifying selection (Oliveira, Raychoudhury, Lavrov, & Werren, 2008). During the past decade, mt genomes had been intensively investigated for the study of molecular evolution, population genetics, and inferring phylogeny (Boore, 1999;Shen et al., 2017).
Vesicomyid bivalves occur globally, mostly in sulfide-rich marine substrates found at deep-sea hydrothermal vents, hydrocarbon seeps, and sites of organic enrichment such as whale carcasses (Johnson, Krylova, Audzijonyte, Sahling, & Vrijenhoek, 2017;Peek, Gustafson, Lutz, & Vrijenhoek, 1997;Peek et al., 2000). Most members of this family housing intracellular autotrophic sulfide-oxidizing endosymbionts that provide essentially all of their nutrients and energy supply, making them primary subjects for studying adaptive strategies for chemosynthesis-based nutrition , and host/symbiont coevolution Shimamura et al., 2017). The Vesicomyidae are divided into two subfamilies: Vesicomyinae and Pliocardiinae partially according to their gut and gill structure .
Vesicomyinae including small-sized bivalves are characterized by nonreduced gut and the absence of subfilamental tissue in gills, whereas all Pliocardiinae studied to date have reduced gut systems. By far, more than 100 species have been described in family Vesicomyidae distributed worldwide from sublittoral zone to the hadal depths (Krylova, Sahling, & Janssen, 2010). However, their taxonomy was still fully unresolved owing to the small gene datasets used for the phylogenetic analyses (Johnson et al., 2017). The highly compact and easily accessible mt genomes could provide informative data to define their confused genera and draw their global distribution. Complete (or nearly complete) mt genomes are known from many species of bivalves, but only five are recorded from vesicomyids: Abyssogena mariana, Ab. phaseoliformis, Isorropodon fossajaponicum, Phreagena okutanii, and "Calyptogena" magnifica (single quotes denote a dubious genus assignment) (Liu, Cai, Zhang, & Vrijenhoek, 2015;Ozawa, Shimamura, Takaki, Yokobori et al., 2017).
In this study, nearly complete mt genomes were sequenced from three pliocardiin species assigned to the genus Archivesica: Archivesica sp., Ar. gigas, and Ar. pacifica according to Johnson et al. (2017). The mt genomes were annotated and compared to other available bivalve mt genomes. In this paper, we discuss our findings in nucleotide composition, gene rearrangement patterns, and mt genome evolution at the intrafamily level.

| Sample and DNA extraction
Clam specimens were sampled with human occupied and remotely operated vehicles (Table 1). Specimens were initially transferred to buckets containing 100% ethanol, or preserved at −80°C until used for DNA extraction. Total genomic DNA was extracted using DNA extraction kit (Tiangen, Beijing, China) for marine animals according to manufacturer's protocols.

| Gene annotation and bioinformatics analysis
The protein-coding and rRNA genes are annotated by blast searches in GenBank and aligned to the orthologous mt genes of bivalves.
Start codons of protein-coding genes were set at the first start codon that did not overlap with an upstream gene, and a stop codon was not allowed to overlap with downstream genes. The boundaries of the ribosomal genes rrnL and rrnS were assumed to extend to the boundaries of flanking genes as the ends of ribosomal genes were difficult to be precisely determined by DNA sequencing alone (Boore, 2006). Identification and positional confirmation of the tRNAs was accomplished by using the software of MITFI (Juhling et al., 2012) and ARWEN (Laslett & Canback, 2008).

| Phylogeny
For phylogenetic analysis, we downloaded most of bivalve mt genomes used to reconstruct phylogenetic trees of subclass Heterodonta from the reference (Ozawa, Shimamura, Takaki, Yokobori et al., 2017) and two Pteriomorphia species were set as outgroup. Summarizing, we included in our dataset 32 Veneroida, four Myoida, three Lucinoida, one Pholadomyoida, and the outgroup two Ostreoida. Each PCG, with the exception of atp8, was separately aligned with codon-based multiple alignments implemented in Mega7.0, and the gap and divergent regions were removed using Gblocks (version 0.91b) (Castresana, 2000) under default (stringent) settings. The atp8 gene was not included in the analyses due to the missing of this gene in several bivalves.
Even though the "missing" atp8 might be found after re-annotation (Breton, Stewart, & Hoeh, 2010), several species still lack this gene.
Alignments of individual genes were then concatenated as a combined

| The mt genomes in deep-sea clams
The mt genomes were sequenced by a combination of short and long-PCR amplifications. Initially, partial fragments of four mtDNA F I G U R E 2 Evaluation of codon bias in the mitochondrial genomes of eight vesicomyid species. ENC, effective number of codons; CBI, codon bias index; G + Cc, G + C content of all codon positions; G + C3s, G + C content of the third codon positions genes (nad6, cox1, rrnL, and nad5) were amplified using degenerate primers designed according to known vesicomyid mt genomes.
Then species-specific primers based on the known fragments were redesigned to perform long-PCR amplification. Except for regions located between nad6 and nad5/trnaL1/trnaW, most parts of the mt genomes were successfully sequenced (Appendix S2). The regions that failed to be sequenced were known to contain notable base composition bias and high numbers of tandem repeats (Liu et al., 2015;Ozawa, Shimamura, Takaki, Yokobori et al., 2017)  Typical metazoan mt genomes generally contained 37 genes including 13 protein-coding genes (PCGs), two rRNAs, and 22 tRNAs (Wolstenholme, 1992), whereas vesicomyid clams are reported to Twenty-two tRNAs were detected for Ar. pacifica, while 20 and 21 tRNA were detected for Archivesica sp. and Ar. gigas, respectively, indicating that we have sequenced most mt genes of these three clams.
Nucleotide composition and A − T/G − C proportions were computed for each single gene and for PCGs, rRNA, and tRNAs taken as a whole (Figure 1). Each analyzed gene was consistently biased toward being AT-rich and positive GC-skew. All PCGs were heavily negative AT-skewed, while the tRNA genes were moderately T-skewed, and the rRNA genes are slightly T-or A-skewed. Nucleotide composition bias is also reflected in the codon usage pattern (Appendix S4, S5).
Among 62 amino acid encoding codons, UUU-F, UUA-L, and AUU-I were the most frequently used codons. Relative synonymous codon frequencies (RSCU) revealed that the degenerate codon usage at the third codon positions is generally biased to use more As and Ts than Gs and Cs. However, vesicomyid species used more codon GGG for amino acid Gly than other three degenerate codons (GGA, GGC, and GGU), while some bivalves used GGU or GGA as the most frequently used codon for Gly (Plazzi, Ribani, & Passamonti, 2013;Xu et al., 2010).
To further investigate the codon usage bias among vesicomyid species, we analyzed the correlations between ENC (effective number of codons), CBI (codon bias index), the G + C content of all codons (G + Cc), and the G + C content of the third codon position (G + C3s).

| Protein-coding genes
The vesicomyid mt genomes contained the full set of PCGs (including atp8) usually presented in metazoan mtDNA. We reannotated mt genome of "C." magnifica, which changed the location of start and/or stop codon for nad3 and cox3 (Appendix S3). Except for cox1 and cox2, substantial size variation for each of the PCGs was not found among the eight vesicomyid species compared in this study (Appendices S2 and S3). The cox1 sequence from Archivesica sp., Ar. gigas, Ar. pacifica, and "C." magnifica con-

| Transfer RNA genes
We re-annotated mt tRNAs of "C." magnifica, and a putative trnaS Almost all of the tRNA genes possessed the cloverleaf secondary structure composed of four arms with conserved size, except the two trnaS genes which appeared to lose the DHU arm (Appendix S7).
Similar structures have also been observed in many other bivalve mt genomes (Plazzi et al., 2013). Although the function of tRNAs that lack a DHU arm have not been characterized in bivalves, it have been reported that in the nematode the tRNAs lacking either the DHU or the TψC arm still retained functionality (Okimoto, Macfarlane, Clary, & Wolstenholme, 1992).

| Ribosomal RNA genes
As in other vesicomyids, the large and small rRNA subunits (rrnL and rrnS) from the three newly cloned mt genomes were located between cytb and atp8, and between trnaT and trnaM, respectively. We re-annotated the boundaries of rrnL and rrnS of the five previously reported vesicomyid clams to have the largest frame, but we did not allow them to overlap with adjacent genes. The

| Evolution of the vesicomyid mt genome
The ratio of Ka (nonsynonymous substitution rate) to Ks (synonymous substitution rate) is widely accepted to measure the rate of PCGs sequence evolution: Ka/Ks > 1 means positive selection; Ka/Ks < 1 means purifying selection; and Ka/Ks not significantly different from 1 indicates neutral evolution (Yang & Bielawski, 2000;Zhang et al., 2006). We compared Ka/Ks ratios Low Ka/Ks ratios were also reported for mt PCGs from the deepsea giant isopod Bathynomus sp. (Shen et al., 2017). The mt PCGs work in close association with nuclear-encoded subunits in protein complexes involved in oxidative phosphorylation system (Burton & Barreto, 2012). Malfunctions of these PCGs would be lethal or at least severely affect fitness. Lethal mutation will be excluded, while neutral, mildly deleterious or suited mutation will be retained (Oliveira et al., 2008). Vesicomyid species usually reside in pore water that contains high concentration of toxic chemicals such as hydrogen sulfide and heavy metal, and symbioses with chemosynthetic bacteria. The toxic habitat, in combination with effect of symbionts, might serve as the force of directional selection.
F I G U R E 5 Gene order of mt genomes from eight vesicomyid clams. All species with complete or nearly complete mt genomes are listed. Noncoding region means none coding region. White colors indicate changes compared to the "C." magnifica pattern. Asterisks indicate genes insertion and underlines refer to gene translocation event

| Gene order
Previous work had showed that gene orders in bivalve mtDNA were highly rearranged, and tRNAs were more highly rearranged than PCGs and rRNA genes (Milbury & Gaffney, 2005). As expected, a lot of differences were found in gene arrangement between vesicomyid clams and other bivalves. After excluding tRNA genes from comparison, the eight vesicomyid species had a conversed gene order quite different from the Solemya velum clam (Eisen, Smith, & Cavanaugh, 1992;Plazzi et al., 2013) and Bathymodiolus mussel (Ozawa, Shimamura, Takaki, Yokobori et al., 2017), which were also reported to harbor chemoautotrophic symbionts in its gill tissue for their nutrition. However, they displayed somewhat similar gene order with genus Meretrix, sharing three gene blocks (cox2cytb-rrnL-atp8-nad4-atp6-nad3, rrnS-cox3-cox1, and nad5-nad6) ( Figure 4). They also shared four gene blocks (rrnS-cox3-cox1-cox2, F I G U R E 6 Alignment of sequence located between trnaG and nad2 in eight vesicomyid mt genomes (a), and tRNA detection in the putative sequence based on "Calyptogena" magnifica mt genome after base insertion (b) F I G U R E 7 Phylogenetic relationship of Heterodonta based on the concatenated nucleotide sequences of 12 mt protein-coding genes. Numbers on branches are bootstrap probability of Neighbor-joining method (left) and Maximum-likelihood method (right). Two Ostreidae species belong to the subclass Pteriomorphia are used to root the tree. Red type face indicates the species for which mt genomes are determined in the present study  -nad3-nad1, nad6-nad4L, and cytb-rrnL) with Arctica islandica (Figure 4).
Rearrangement of tRNA was observed within vesicomyid species. The trnaG located at downstream of the blocks nad6-nad4L in most vesicomyid except for I. fossajaponicum, whose trnaG located at upstream of this block ( Figure 5). By alignment, the trnaG in I. fossajaponicum was highly identical to its counterparts in other seven vesicomyid clams (Appendix S6), suggesting a tRNA translocation event.
Rearrangements are random discrete events, and retro-mutation to identical gene orders arising by chance is very unlikely (Boore & Staton, 2002). Thus, these rearrangements of tRNA in vesicomyid species would be a valuable phylogenetic marker. For example, tree topology based on the gene order was similar to that of the ML phylogeny based on the concatenated gene sequences, which was reported in bivalves (Ozawa, Shimamura, Takaki, Yokobori et al., 2017).
When compared to "C." magnifica mt genome, an additional tRNA genes were identified at the location between trnaG and nad2 in Ab. mariana (trnaL3), Ab. phaseoliformis (trnaS3), and Ph. okutanii (trnaM2), between cox1 and trnP in I. fossajaponicum (trnN2), as well as at downstream of trnaW in Ab. phaseoliformis (trnaH2) and I. fossajaponicum (trnaK2) ( Figure 5). Additional gene copy in mt genome could be obtained by gene replication, and different gene copies would share somewhat sequence identify to each other.
However, sequence analysis indicated that these additional tRNA genes showed quite a low similarity to other tRNA genes that encoded the same tRNA, instead, the additional tRNA between trnaG and nad2 display high similarity to the sequences of the same location ( Figure 6a). The interval between cox1 and trnP in these clams except I. fossajaponicum is too short in length, while the intervals between trnaW and NCR (noncoding region) are unknown in some clams, so we do not analyze the sequence of these two intervals.
The interesting point is, with a few mutations, new tRNA would be detected in the sequence between trnaG and nad2. For example, only one insertion at many sites in this region of "C." magnifica mt genome would result in different tRNA detection ( Figure 6b). Recently, comparative genomic analyses have revealed that new genes may derive from ancestral intergenic sequence (Zhao, Saelao, Jones, & Begun, 2014). The additional tRNA located between trnaG and nad2 probably arise from site mutation in this region, even though more molecular evidence is needed.

| Phylogenetic analysis
Phylogenetic analyses were performed with nucleotide and amino acid

| CON CLUS ION
In this study, we sequenced and annotated three nearly complete mt genomes belonging to family Vesicomyidae that colonize hydrothermal vents or cold seeps. Comparative analysis of eight vesicomyid mt genomes showed that gene content, codon usage, base composition, and tRNA structure were highly conserved, whereas gene arrangement displayed slight diversity. A relative low ratio of Ka/Ks exhibited by most PCGs of vesicomyid mt genomes suggested mt genomes of vesicomyid clams might undergo strong purifying selection in deep-sea habitats.
Sequence analysis showed some tRNA of vesicomyid mt genomes probably arise from ancestrally nongenic sequence.

ACK N OWLED G M ENTS
Authors are grateful to Dr. Robert C. Vrijenhoek for his kindness in providing clams. This work was supported by National Natural

CO N FLI C T O F I NTE R E S T
None declared.

AUTH O R CO NTR I B UTI O N S
HL and SY performed all experiments and analyzed the data. HL, JL, and HZ wrote the manuscript. All authors contributed to editing and revising the manuscript. All authors read and approved the final manuscript.

E TH I C S A PPROVA L A N D CO N S E NT TO PA RTI CI PATE
Approval of the experimental protocol of this experiment by the Experimental Animal Ethics Committee of Hainan Province, China was not needed for this experimental approach.