- Top of page
- Authors contribution
- Supporting Information
The prevalence of drug-resistance in Mycobacterium tuberculosis is already having a negative impact on the control of tuberculosis. We report the draft genome sequences of two super-extensively drug-resistant M. tuberculosis isolates from China, FJ05194 (lineage 2) and GuangZ0019 (lineage 4), and compare them with the H37Rv reference strain to identify possible sources of genetic variation associated with their extensive drug resistance. Our results suggest that their extensive drug resistance probably results from the stepwise accumulation of resistances to individual drugs.
- Top of page
- Authors contribution
- Supporting Information
Forty per cent of the world's notified cases of tuberculosis (TB) in 2010 arose in India and China (WHO, 2011). In China, 5.7% of new TB cases are multidrug resistant (MDR), and 8% of MDR cases are extensively drug-resistant (XDR) (Zhao et al., 2012). As drug sensitivity testing of Mycobacterium tuberculosis strains FJ05194 and GuangZ0019, isolated from patients in Fujian and Guangdong provinces, respectively, indicated that they are both resistant to four first-line (rifampicin, isoniazid, ethambutol and streptomycin) and four second-line TB drugs (capreomycin, kanamycin, ofloxacin, and ethionamide) (WHO, 2008), we performed genome sequencing to examine the genetic basis of their drug resistance.
Genome sequencing was performed on an Illumina HiSeq 2000 sequencer (Illumina Inc.) by generating paired-end libraries. Paired-end reads were de novo assembled using SOAPdenovo v1.05 (Rohde et al., 2011) and gaps were filled manually. Gene prediction was performed using Glimmer v3.02 (Delcher et al., 2007), and rRNA and tRNA were identified using RNAmmer (Lagesen et al., 2007) and tRNAscan-SE 1.3.1 (Schattner et al., 2005), respectively. Sequencing-related statistics are presented in Table 1.
Table 1. Sequencing statistics for S-XDR isolates from Fujian (FJ05194) and Guangdong (GuangZ0019) provinces, China, and their drug-resistance-associated genes and intergenic regions
|Coverage, based on H37Rv Ref (%)||99.57||99.96|
|Length of raw reads||528 000 000||528 000 120|
|Genome length (nt)||4363 305||4373 311|
|G + C content (mol%)||65.48||65.55|
|Protein-coding genes (CDSs)||4165||4198|
|5S rRNA genes||1||1|
|Previously reported drug resistance-associated genes (TBDream database)b|
|Rv1909c (furA)|| ||T40A|
|Rv3566c (nat)|| ||G207R|
|Rv3919c (gidB)|| ||K163X|
|Rv3794 (embA)|| ||A366E|
We performed a comparative genomic analysis of these two super-XDR (S-XDR) isolates, using M. tuberculosis H37Rv as a reference (GenBank accession no. NC_000962). Raw single nucleotide polymorphisms (SNPs) were identified by aligning clean reads to the H37Rv reference genome with SOAP2 (Li et al., 2009). Polymorphic bases that were present in < 70% of the reads or had a quality score < 20 were filtered out and SNPs called in repetitive regions of the H37Rv genome, defined as exact repetitive sequences of ≥ 25 bp in length, identified using either blast, RepeatMasker or Trf (Benson, 1999; Tarailo-Graovac & Chen, 2009) were excluded. A total of 1287 and 795 SNPs were identified in FJ05194 and GuangZ0019, respectively (Table 1; Supporting Information, Table S1). The lineages of the two isolates were determined by performing a phylogenetic analysis using 22 additional drug-sensitive strains (H37Rv and 21 publicly available genomes) which represent global diversity (Comas et al., 2010); FJ05194 belongs to lineage 2 and GuangZ0019 to lineage 4 (Fig. S1A). Based on the phylogenetic tree we removed SNPs present in the drug-sensitive lineage 2 and lineage 4 isolates from the FJ05194 and GuangZ0019 SNP datasets, respectively (i.e. we retained only those that were unique to FJ05194 or GuangZ0019), to better identify genetic variation related to drug resistance in these two isolates.
The presence of non-synonymous SNPs (nsSNPs) in certain genes is strongly correlated with drug resistance; for example, nsSNPs at the 516, 526 and 531 sites of rpoB are correlated with rifampicin resistance (Heep et al., 2001). We examined the distribution of the unique nsSNPs/intergenic region SNPs (IGR-SNPs) from each isolate (FJ05194: 221; GuangZ0019: 180) in the drug resistance genes included in the TBDream database (Sandgren et al., 2009); 17 nsSNPs and two IGR-SNPs previously associated with drug resistance were found in 13 of these genes and two IGRs, respectively (Table 1). We amplified the 13 nsSNP-containing drug resistance genes by PCR and resequenced them by Sanger sequencing on an ABI3730 sequencer to verify the SNPs. Results were fully consistent with Illumina sequencing results, confirming the quality of the original genome sequences.
Our results suggest that the extensive drug resistance of these isolates has probably arisen due to the accumulation of resistances to individual drugs. Resistance to six of the eight drugs tested in this study could be attributed to mutations in genes previously reported to be associated with these resistances (Table 1). For example, the H526Y and S531L mutations in rpoB, associated with resistance to rifampicin (Heep et al., 2001), and the A89E and H166P mutations in ethA, associated with ethionamide resistance (DeBarber et al., 2000), were present in these isolates. However, we did not find mutations in genes classically associated with resistance to capreomycin and kanamycin (rrs, tlyA and the eis promoter; Maus et al., 2005a,b; Zaunbrecher et al., 2009), suggesting that there are likely to be novel mutations in these isolates associated with these resistances. Of interest, several genes for efflux pumps thought to be involved in the transport of drugs across the membrane carry nsSNPs unique to isolate FJ05194 [Rv0933 (pstB) and Rv2333c (stp)] or GuangZ0019 (Rv2687c) (Raman & Chandra, 2008). Further experimental investigation will be required to identify the key SNPs which confer drug resistance and to determine whether these efflux pumps are involved in resistance mechanisms. The genomic locations of all the drug-resistance-associated genes/IGRs from the TBDream database detected in these two isolates are shown in Fig. S1B.
As the presence of drug-resistance mutations is often associated with a fitness cost (Muller et al., 2013), it is highly likely that there are compensatory mutations present in these two S-XDR isolates. Indeed, the proA-ahpC IGR in FJ05194 carries an SNP that has previously been reported as a compensatory mutation (Gagneux et al., 2006). Previously reported compensatory mutations in rpoA and rpoC, however, were not observed in these isolates, possibly because the S531L and H526Y mutations they carry have previously been reported to have a low fitness cost (Muller et al., 2013). Further analysis of these genome sequences and comparison with the genomes of other well-characterized XDR isolates as they become available should lead to the identification of other compensatory mutations and a better understanding of the evolution and characteristics of extensive drug resistance in M. tuberculosis, leading potentially to more effective control measures.
Nucleotide sequence accession numbers: Sequencing reads are available at the NCBI Sequence Read Archive under the accession code SRA068167. The whole genome shotgun project has been deposited at DDBJ/EMBL/GenBank under accessions ANBL00000000 (FJ05194), and ANFI00000000 (GuangZ0019).