Complete mitochondrial genome sequence of the Himalayan Griffon, Gyps himalayensis (Accipitriformes: Accipitridae): Sequence, structure, and phylogenetic analyses

Abstract This is the first study to describe the mitochondrial genome of the Himalayan Griffon, Gyps himalayensis, which is an Old World vulture belonging to the family Accipitridae and occurring along the Himalayas and the adjoining Tibetan Plateau. Its mitogenome is a closed circular molecule 17,381 bp in size containing 13 protein‐coding genes, 22 tRNA coding genes, two rRNA‐coding genes, a control region (CR), and an extra pseudo‐control region (CCR) that are conserved in most Accipitridae mitogenomes. The overall base composition of the G. himalayensis mitogenome is 24.55% A, 29.49% T, 31.59% C, and 14.37% G, which is typical for bird mitochondrial genomes. The alignment of the Accipitridae species control regions showed high levels of genetic variation and abundant AT content. At the 5′ end of the domain I region, a long continuous poly‐C sequence was found. Two tandem repeats were found in the pseudo‐control regions. Phylogenetic analysis with Bayesian inference and maximum likelihood based on 13 protein‐coding genes indicated that the relationships at the family level were (Falconidae + (Cathartidae + (Sagittariidae + (Accipitridae + Pandionidae))). In the Accipitridae clade, G. himalayensis is more closely related to Aegypius monachus than to Spilornis cheela. The complete mitogenome of G. himalayensis provides a potentially useful resource for further exploration of the taxonomic status and phylogenetic history of Gyps species.


| INTRODUC TI ON
The Himalayan vulture (Gyps himalayensis) from the family Accipitridae has a wide distribution in Southeast Asia. In China it is found along the Himalayas and the adjoining Tibetan Plateau during the breeding season, and from Yunnan province during the winter season (BirdLife International, 2016;Lu, Ke, & Zeng, 2009). It inhabits mountainous areas at altitudes ranging from 1,200 to 6,000 m (Ferguson-Lees & Christie, 2001). The species' population trend is currently stable, but future declines are possible because vultures feed on livestock carcasses containing diclofenac, a nonsteroidal anti-inflammatory pharmaceutical known to cause reproductive failures in birds (Das, Cuthbert, Jakati, & Prakash, 2011;Oaks et al., 2004;Shultz et al., 2004;Swan et al., 2006).
There are currently eight recognized species in the genus Gyps including G. bengalensis, G. indicus, G. tenuirostris, and G. himalayensis from Asia, G. africanus, G. coprotheres, and G. rueppellii from Africa, and G. fulvus from Europe, Africa, and Asia (Ferguson-Lees & Christie, 2001;Pain et al., 2003). So far, the complete mitochondrial genomes of all species from this genus have not been amplified. In addition, the phylogenetic position of Gyps and related genus species has not been fully solved. Phylogenetic relationships in the family Accipitridae have traditionally been proven to be difficult to resolve based on morphological traits (Griffiths, 1994;Jollie, 1976).
The systematic taxonomic status among Falconiform taxa is rather complex and controversial Liu, Li, Du, & Liu, 2017).
Based on his point of view, Falconiformes is polyphyletic, including New World vultures (Cathartidae), and the result is supported by studies of some behavioral traits and molecular data (Wink & Sauer-Gürth, 2004). Although the taxonomy of order Falconiformes is supported morphologically (Livezey & Zusi, 2001;Mayr & Clarke, 2003), recent studies have shown that they do not support close phylogenetic relationships between the Falconidae and other families included in the order Falconiformes (Burleigh, Kimball, & Braun, 2015;Hackett et al., 2008;Jarvis et al., 2014;Mahmood, Mclenachan, Gibb, & Penny, 2014). According to the viewpoint of Hackett et al. (2008) and the South American Classification Committee, the Falconiformes is only included in the family Falconidae, and the remaining four families are placed in the order Accipitriformes. For a long time, the majority view has been to include accipitrids with the falcons in the Falconiformes, but some authorities have begun to recognize them as a separate group, Accipitriformes (Chesser et al., 2010). Additionally, a recent DNA study indicated that falcons not closely to the accipitrids, accordingly some authors propose that it should not be a part of Falconiformes (Chesser et al., 2012).
Instead, it should be placed within Accipitriformes, and include most of the diurnal birds of prey: hawks, eagles, vultures, and many others (Chesser et al., 2012;Liu et al., 2017). However, its phylogenetic position of Sagittarius serpentarius is still confusing, and a lot of controversy about its validity is ongoing (Ferguson-Lees & Christie, 2001).
In this study, the complete mitochondrial genome of G. himalayensis was sequenced and reported for the first time. Based on new data generated from G. himalayensis and obtaining complete mitogenome sequence data from GenBank, we tried to elucidate:

| Sample collection and genomic DNA extraction
Blood of the Himalayan Griffon, G. himalayensis (Figure 1 Figure 2. Five milliliters of blood was sampled from the brachial vein using heparinized syringes and stored in liquid nitrogen in field and stored at −80°C in the laboratory.
After a quality inspection, DNA was purified on 0.8% agarose gel.
After measuring DNA concentration using a spectrophotometer, it was used for the PCR amplification.

| Mitochondrial DNA amplification and sequencing
The complete mitochondrial genome was amplified in 13 overlapping segments by PCR with Taq DNA Polymerase (TaKaRa), using 20 ng of genomic DNA from the sample as a template. Complete mtDNA was amplified as concatenated sequences using selectively amplified mtDNA template and 11 primer pairs from literature (Jiang, Wang, Peng, Peng, & Zou, 2014;, Ast, Dimcheff, Yuri, & Mindell, 1999;Zhao et al., 2012). Partial PCR primers were designed based on the conserved regions of the related species, Aegypius monachus (NC_022957), Buteo buteo (NC_003128), and Spilornis cheela (NC_015887).The PCR amplifications were conducted in 25 μl reactions containing 1 × PCR Buffer, 200 nM of each primer, 400 μM dNTP, 1 μl template (as per kit), and 1 U LA-Taq or Taq DNA Polymerase (TaKaRa). The PCR protocol was as follows: an initial denaturation at 95°C for 3 min, followed by 30 amplification cycles with a S1000TM thermal cycler at 94°C for 30 s, 53-65°C for 30 s, and 72°C for 80-180 s, continued with a final extension at 72°C for 10 min. The PCR products were preserved at 12°C after the last cycle. Each amplicon was then purified using the Omega Cycle-Pure Kit (Omega Bio-Tek) and subjected to automated sequencing using an ABI 3730 sequencer, either directly or following sub-cloning into the pUC19 DNA vector (TaKaRa). To ensure high accuracy, each amplicon was sequenced twice independently with all the PCR primers.

| Sequence assembly, annotation, and analysis of the mitochondrial genome
The complete mitochondrial genome sequence of the Himalayan Griffon, G. himalayensis, was assembled using the SeqMan module  structures (Kumazawa & Nishida, 1993) and anticodons. Ribosomal RNA genes (rRNAs) were identified by NCBI Internet BLAST search and comparisons. Codon usage and nucleotide composition statistics were computed using MEGA 6.06 (Tamura, Stecher, Peterson, Filipski, & Kumar, 2013).

Composition skew analysis was carried out with the formulas AT-skew = [A-T]/[A + T] and GC-skew=[G-C]/
[G + C], respectively (Perna & Kocher, 1995). The complete mitogenome of G. himalayensis was deposited in GenBank under the accession number KY594709.

| Phylogenetic analysis
The sequence data were initially compared using program ClustalX 1.83 (Thompson, Gibson, Plewniak, Jeanmougin, & Higgins, 1997) with default parameters. The mitogenome of G.himalayensis was analyzed in a phylogenetic tree using a total of 32 other order Falconiformes mitogenomes retrieved from NCBI GenBank (Table   S1). The concatenated sequences of the 13 PCGs of complete mitochondrial genomes were used. On the basis of the best-fit model predicted by jModelTestv.0.1.1, the optimal nucleotide substitution model was selected using jModeltest v.0.1.1 (Posada, 2008) and the Akaike information criterion (Posada & Buckley, 2004). Maximum likelihood (ML) analysis was carried out with PhyML v.3.0 (Guindon & Gascuel, 2003). The confidence level (Felsenstein, 1985) at each branch was evaluated by allowing four substitution rate categories and performing 1,000 bootstrap replicates. BI analysis was performed with MrBayes v.3.1.2 (Huelsenback & Ronquist, 2001). The best-fitting model of TVM+I+G (nst = 6; rates = gamma) was selected for subsequent Bayesian phylogenetic analyses. Bayesian posterior probabilities were estimated with the Markov Chain Monte Carlo sampling. The analysis was initiated with randomly generated trees and ran for 2 × 10 7 generations from which a total of 2 × 10 5 trees were sampled at the intervals of every 100 generations, and among the sampled trees, the top 25% were used as burn-in and discarded.
Tracer v.1.4 (Rambaut & Drummond, 2007) was used to check that sampled values of log likelihood, plotted against generation time, reached stationarity. Finally, a majority rule consensus tree was generated and posterior probabilities were computed from the remaining postburn in trees.

| Genome content and organization
The complete mitogenome of G. himalayensis is a closed circular molecule 17,381 bp in length, similar to the other raptors (Li, Liu, et al., 2015;Li, Liu, et al., 2015;Qin, Guan, Shi, Hou, & Qin, 2013). which is similar to that seen in the other Gyps mitogenomes (52.71%-57.65%; Table S1; Haring et al., 2001). The A + T content of the control region was clearly higher than that of PCGs, tRNAs, and rRNAs (Table S1).
Mitochondrial genes overlapped by a total of 151 bp in 12 different locations from 1 to 70 bp, and the longest overlap (70 bp) existed between genes tRNA-Glu and pseudo-control region. The shortest overlap (1bp) existed in ATP6-COXIII, tRNA-Cys-tRNA-Tyr, and tRNA-Gln-tRNA-Met genes. In addition, some proteincoding genes shared 1-20 nucleotides in common with adjacent tRNA genes. This is a common feature of mitochondrial DNA which is very compact and economical (Curole & Kocher, 1999).
Furthermore, 17 intergenic spacers were present in the G. himalayensis mitogenome and involved a total of 85 bp. The longest spacer sequence was 22 nucleotides long and located between ND6 and tRNA-Pro (Table 1).
F I G U R E 3 Complete mitochondrial genome organization and gene arrangement of Gyps himalayensis. Genes coded on the Hstrand are indicated in the outer ring, while the genes coded on the L-strand are indicated in the inner ring. Genes are abbreviated as follows: ATP6 and ATP8 (subunits 6 and 8 of ATPase), COXI-COXIII (cytochrome c oxidase subunits 1-3), Cytb (cytochrome b), ND1-ND6 and ND4L (NADH dehydrogenase subunits 1-6 and 4L), 12S rRNA and 16S rRNA (ribosomal RNA of 12S and 16S), CR (control region); pCR (pseudo-coding region). One-letter amino acid abbreviations were used to label the corresponding tRNA genes AT/CG skews were essentially similar in the mitochondrial protein-coding genes (PCG), 2 rRNA genes, CR and the mitogenomes of other Falconiformes birds, (e.g. A. monachus, F. peregrinus, and Micrastur gilvicollis), except for a few genes (ND3, ND4L and CR;

| Protein-coding genes and codon usage patterns
The 13 mitochondrial PCGs of G. himalayensis were 11,393 bp in length, which accounted for 65.54% of the total mitogenome sequence. The PCG regions of the G. himalayensis mitogenome consisted of 3,797 codons in total. Most of the PCGs had ATG as start codon, except for COXI and ND6, which had GTG and ATT start codons, respectively. This is the case in the other Gyps species as well (Table 1; (Table 3). In contrast, the codons ACG, CCG, UCG, CGG, AAG, GCG, and CAG were rarely used, accounting for only 1.38% of the amino acids. Moreover, just like in other vertebrates (Li, Liu, et al., 2015;Li, Huang, et al., 2015;Zhang, Nie, Wang, & Hu, 2009 (Table S2).

| Transfer and ribosomal RNA genes
In total, the typical 22 tRNA genes with conventional secondary structures were found from the mitogenome of G. himalayensis. The total length of mitochondrial tRNA genes was 1,548 bp and size of individual genes ranged from 65 bp (tRNA-Ser2) to 74 bp (tRNA-Leu1 and tRNA-Leu2). Most of the tRNAs could be folded into the canonical cloverleaf secondary structure, while the tRNA-Ser (S1) contains a predicted secondary structure with the TΨC arm and loop, but lacks the DHU arm and loop. This phenomenon is considered to be a typical feature of vertebrate mitogenomes (Lu, Lu, Li, & Jiang, 2016;Wolstenholme, 1992). Moreover, most of anticodons were identical to those observed in other Gyps species. The CCA 3′-terminal group was added during processing. A total of 24 unmatched base pairs were found in the tRNAs of G. himalayensis, and these contained G-U pairs in the 9 bp AA, 6 bp A-C, 5 bp DHU and 4 bp TψC stems (Figure 4).
Stem mismatches appear to be ordinary in tRNA genes and possibly repaired through a post-transcriptional editing process (Lavrov, Brown, & Boore, 2000). Compared to other birds, most mismatched nucleotides were G-U pairs, which can form weak bonds in tRNAs and noncanonical pairs in tRNA secondary structures (Gutell, Lee, & Cannone, 2002;Li, Liu, et al., 2015;Li, Liu, et al., 2015).
The ribosomal genes (12S rRNA and 16S rRNA) were identified as to location, length, and base composition in the mitogenome of the G. himalayensis. The 12S rRNA gene is situated between tRNA-Phe and tRNA-Val, while the 16S rRNA gene is located between tRNA-Val and tRNA-Leu ( Figure 3). The length of 12S rRNA and 16S RNA were 985 bp and 1,630bp, respectively ( were similar to those in other Accipitridae birds Li, Liu, et al., 2015;Li, Liu, et al., 2015;Song et al., 2015).

| Noncoding regions
The noncoding region of the G. himalayensis mitogenome comprised of two major control regions. First one is the short noncoding region (pseudo-control region, CCR) 619bp in length, which is located after tRNA-Glu gene. The second one (CR) is 1,202 bp long and located between tRNA-Thr and tRNA-Pro ( Figure 3). The CCR sequence is shortest than any other sequences in the mitogenome (Table 4). The CRs of both vertebrates and invertebrates have a high A + T content and the replication initiation feature (Boore, 1999). The base frequencies in the CR of G. himalayensis are 33.1% T, 25.9% C, 23.5% A, and 17.5% G, while those of the pseudo-control region (CCR) are 35.2% T, 21.5% C, 33.9% A, and 9.4% G, respectively.
In general, three distinct CR domains are recognized: (a) the highly variable, left-end domain I; (b) the conserved, central domain (DII); (c) and the right-end domain III (DIII); Baker & Marshall, 1997). These three distinct domains were present also in the griffon vulture CR (Figure 5a).
As shown in Figure 5b, the pseudo-control region of G. himalayensis had a 125 bp nonrepeating region (nr-CCR) in the 5′ end, followed by a cluster of tandem repeats in the 3′ end (r-CCR): 12 times of 11 bp repeats units (followed by a 8 bp incomplete repeat unit), then 120 bp insertion and more than six times of 44 bp repeats units (followed by a 40 bp incomplete repeat unit; Figure 5, Table 4). The type and number of repeat units in CCR is very variable in Falconiformes (Table 4) Sorenson, & Dimcheff, 1998), 338 bp in B. buteo (Haring, Riesing, Pinsker, & Gamauf, 1999), 360 bp in Haliaeetus albicilla (Haring et al., 2001), and 472 bp in Aquila chrysaetos (Masuda et al., 1998). None of them had recognizable repeats from tRNA-Glu or ND6 genes, as was observed in Amazona parrots (Eberhard, Wright, & Bermingham, 2001). These variable tandem repeats have been identified as the main reason for the length variability of mitochondrial genome control regions, as well as for the whole mitogenome (Kim, Seong, Ho, & Ha, 1998;Zhang et al., 2016).

| Phylogenetic relationships
The nucleotide data for phylogenetic analyses included 11,364 nucleotide sites from 13 protein-encoding genes. Of these sites, 5,472 were conservative, 5,889 were variable, and 4,835 sites were parsimoniously informative. So far, based on a total of 33 other complete mitogenome sequences retrieved from NCBI GenBank (Table S1) (Seibold & Helbig, 1995;Sibley & Monroe, 1990;Wink, 1995). However, recent molecular studies have shown that the New World vultures clearly have their affinity with many raptors and not with storks (Burleigh et al., 2015;Hackett et al., 2008;Wink & Sauer-Gürth, 2004). Our phylogeny indicated that C. aura (Cathartidae) was basal to the remaining Accipitriformes, which is consistent with Jiang et al.'s results . According to a relatively short sequence, Wink and Sauer-Gürth (2004) placed S. serpentarius with storks. In our phylogenetic analysis, the S. serpentarius is deepest on the branch with Pandion haliaetus (Pandionidae) and Accipitridae (Figure 7). This result is consistent with some previous studies (Hackett et al., 2008;Lerner & Mindell, 2005;Mahmood et al., 2014). Furthermore, P. haliaetus is the sister to Accipitridae. The relationships among the four families of Accipitriformes are consistent with Burleigh et al.'s results (Burleigh et al., 2015;Jiang et al., 2015). In addition, the phylogenetic results indicate that there were four clades in Accipitridae. Clade 1 was (Gys + Aegypius + Spilornis), Clade 2 was (Aquila + Hieraaetus + Spizaetus), Clade 3 was only Accipiter, and the Clade 4 was (Buteo + Butastur) (Figure 7).
Our results indicated that the relationships at the family level were (Falconidae + (Cathartidae + (Sagittariidae + (Accipitridae + P F I G U R E 7 Results of phylogenetic analyses based on Bayesian inference (BI) and maximum likelihood (ML) analyses for 32 raptor taxa based on 13 PCGs sequences. Phodilus badius (KF961183) and Strix leptogrammica (KC953095) were used as outgroups. Tree topologies produced by BI and ML analyses were identical. Bayesian posterior probability and bootstrap support values from ML analyses, respectively, are shown on the nodes. The asterisks indicate new sequence generated in this study andionidae))), and this inference was supported by both BI (posterior probabilities ≥0.89) and ML (bootstrap ≥85.3%) analyses ( Figure 7). The Falconidae clade contained members from the genera Falco, Phalcoboenus, and Micrastur, and all the Falco species formed a monophyletic cluster (Figure 7). In this clade, S. cheela was a taxon ancestral to G. himalayensis and A. monachus. Furthermore, the results of the phylogenetic analyses support a closer relationship between G. himalayensis and A. monachus, rather than between G. himalayensis and S. cheela (Figure 7). This is in accordance with the results of Jiang et al. (2014) and Wink (1995).
P. haliaetus forms a sister group with the family Accipitridae clade.
Clade Sagittaridae comprised of S. serpentarius as the sister taxon to the Accipitridae and Pandionidae (1.00, 98.5%). Also, this result was consistent with those of the previous studies (Lerner, Klaver, & Mindell, 2008;Liu et al., 2017;Song et al., 2015), but differed from those of Hackett et al. (2008) and Pacheco et al. (2011). The difference is mainly due to the different samples used and molecular markers, and their evolutionary relationships need further investigating and searching for more evidences from binding nuclear gene data and morphological characters. In the current study, all clades were well resolved, with only a few exceptions less than 80%, while Bayesian posterior probabilities were 1.00 (except for one node). Despite their fast evolutionary rates, mitogenomes contain species-specific evolutionary relationships, which can be efficiently recovered by improving taxon sampling (Rubinstein et al., 2013).

| CON CLUS IONS
In short, this is the first study to report and analyze the complete mitogenome of the G. himalayensis. The sequence structure of the G. himalayensis mitogenome is typical for birds and possesses high similarity to other reported Accipitridae mitogenomes. However, at the 5′ end of the DI region, a long continuous poly-C sequence was discovered. Additionally, two types of tandem repeat units were also found in the pseudo-control region. The phylogenetic analyses support a closer relationship between G. himalayensis and A. monachus than that between G. himalayensis and S. cheela. Consequently, the results should provide a useful resource for future study of Gyps mitochondrial genomic evolution. However, the proposed evolutionary relationships among G. himalayensis, A. monachus, and S. cheela based on the findings that emerged in the current study should be accepted with caution due to limited taxon sampling. Many aspects of the phylogeny of the genus Gyps remain to be resolved and further analysis based on more molecular markers (binding nuclear gene) and a large number of taxon sampling is necessary to clarify the phylogenetic relationships among species of genus Gyps, Aegypius, and Spilornis.

CO N FLI C T O F I NTE R E S T
The authors declare that there are no conflicts of interest.

DATA ACCE SS I B I LIT Y
The following information was supplied regarding the deposition of DNA sequences: GenBank accession numbers: KY594709.