SEARCH

SEARCH BY CITATION

Keywords:

  • tuberculosis;
  • whole genome shotgun sequencing;
  • multidrug resistance;
  • single nucleotide polymorphisms

Abstract

  1. Top of page
  2. Abstract
  3. Introduction
  4. Acknowledgement
  5. Authors contribution
  6. References
  7. Supporting Information

The prevalence of drug-resistance in Mycobacterium tuberculosis is already having a negative impact on the control of tuberculosis. We report the draft genome sequences of two super-extensively drug-resistant M. tuberculosis isolates from China, FJ05194 (lineage 2) and GuangZ0019 (lineage 4), and compare them with the H37Rv reference strain to identify possible sources of genetic variation associated with their extensive drug resistance. Our results suggest that their extensive drug resistance probably results from the stepwise accumulation of resistances to individual drugs.


Introduction

  1. Top of page
  2. Abstract
  3. Introduction
  4. Acknowledgement
  5. Authors contribution
  6. References
  7. Supporting Information

Forty per cent of the world's notified cases of tuberculosis (TB) in 2010 arose in India and China (WHO, 2011). In China, 5.7% of new TB cases are multidrug resistant (MDR), and 8% of MDR cases are extensively drug-resistant (XDR) (Zhao et al., 2012). As drug sensitivity testing of Mycobacterium tuberculosis strains FJ05194 and GuangZ0019, isolated from patients in Fujian and Guangdong provinces, respectively, indicated that they are both resistant to four first-line (rifampicin, isoniazid, ethambutol and streptomycin) and four second-line TB drugs (capreomycin, kanamycin, ofloxacin, and ethionamide) (WHO, 2008), we performed genome sequencing to examine the genetic basis of their drug resistance.

Genome sequencing was performed on an Illumina HiSeq 2000 sequencer (Illumina Inc.) by generating paired-end libraries. Paired-end reads were de novo assembled using SOAPdenovo v1.05 (Rohde et al., 2011) and gaps were filled manually. Gene prediction was performed using Glimmer v3.02 (Delcher et al., 2007), and rRNA and tRNA were identified using RNAmmer (Lagesen et al., 2007) and tRNAscan-SE 1.3.1 (Schattner et al., 2005), respectively. Sequencing-related statistics are presented in Table 1.

Table 1. Sequencing statistics for S-XDR isolates from Fujian (FJ05194) and Guangdong (GuangZ0019) provinces, China, and their drug-resistance-associated genes and intergenic regions
 FJ05194GuangZ0019
  1. a

    Gene prediction was performed using Glimmer v3.02 (Delcher et al., 2007). rRNA and tRNA were identified using RNAmmer (Lagesen et al., 2007) and tRNAscan-SE 1.3.1 (Schattner et al., 2005), respectively.

  2. b

    TBDream database (Sandgren et al., 2009).

  3. c

    IGR-SNPs, position relative to the start codon of the flanking gene.

  4. d

    Heterozygous call in Illumina sequencing results, but confirmed as an nsSNP by Solexa sequencing.

Sequencing statistics
Coverage, based on H37Rv Ref (%)99.5799.96
Sequencing depth110117
Length of raw reads528 000 000528 000 120
Genome length (nt)4363 3054373 311
Scaffolds11297
G + C content (mol%)65.4865.55
Annotationa
Protein-coding genes (CDSs)41654198
tRNAs4343
5S rRNA genes11
16S rRNA21
23S rRNA21
SNPs
Total1287795
Coding regions1140709
nsSNPs725437
sSNPs415272
Intergenic regions14786
Lineage24
Previously reported drug resistance-associated genes (TBDream database)b
Isoniazid
Rv1908c (katG)G421SS315N
Rv1909c (furA) T40A
Rv3566c (nat) G207R
Rv1592cE60D
Rv2427c-Rv2428 (proA–ahpC)C–52Tc
Rifampicin
Rv0667 (rpoB)H526YS531L
Streptomycin
Rv0682 (rpsL)K43R
Rv3919c (gidB) K163X
Ethambutol
Rv3794 (embA) A366E
Rv3795 (embB)E504D
Rv3795 (embB)D1024N
Rv3124 C175G
Rv3793-Rv3794 (embC–embA)C–11Ac
Ofloxacin
Rv0005 (gyrB)N538T
Rv0006 (gyrA)D94GdD94G
Ethionamide
Rv3854c (ethA)A89EH166P

We performed a comparative genomic analysis of these two super-XDR (S-XDR) isolates, using M. tuberculosis H37Rv as a reference (GenBank accession no. NC_000962). Raw single nucleotide polymorphisms (SNPs) were identified by aligning clean reads to the H37Rv reference genome with SOAP2 (Li et al., 2009). Polymorphic bases that were present in < 70% of the reads or had a quality score < 20 were filtered out and SNPs called in repetitive regions of the H37Rv genome, defined as exact repetitive sequences of ≥ 25 bp in length, identified using either blast, RepeatMasker or Trf (Benson, 1999; Tarailo-Graovac & Chen, 2009) were excluded. A total of 1287 and 795 SNPs were identified in FJ05194 and GuangZ0019, respectively (Table 1; Supporting Information, Table S1). The lineages of the two isolates were determined by performing a phylogenetic analysis using 22 additional drug-sensitive strains (H37Rv and 21 publicly available genomes) which represent global diversity (Comas et al., 2010); FJ05194 belongs to lineage 2 and GuangZ0019 to lineage 4 (Fig. S1A). Based on the phylogenetic tree we removed SNPs present in the drug-sensitive lineage 2 and lineage 4 isolates from the FJ05194 and GuangZ0019 SNP datasets, respectively (i.e. we retained only those that were unique to FJ05194 or GuangZ0019), to better identify genetic variation related to drug resistance in these two isolates.

The presence of non-synonymous SNPs (nsSNPs) in certain genes is strongly correlated with drug resistance; for example, nsSNPs at the 516, 526 and 531 sites of rpoB are correlated with rifampicin resistance (Heep et al., 2001). We examined the distribution of the unique nsSNPs/intergenic region SNPs (IGR-SNPs) from each isolate (FJ05194: 221; GuangZ0019: 180) in the drug resistance genes included in the TBDream database (Sandgren et al., 2009); 17 nsSNPs and two IGR-SNPs previously associated with drug resistance were found in 13 of these genes and two IGRs, respectively (Table 1). We amplified the 13 nsSNP-containing drug resistance genes by PCR and resequenced them by Sanger sequencing on an ABI3730 sequencer to verify the SNPs. Results were fully consistent with Illumina sequencing results, confirming the quality of the original genome sequences.

Our results suggest that the extensive drug resistance of these isolates has probably arisen due to the accumulation of resistances to individual drugs. Resistance to six of the eight drugs tested in this study could be attributed to mutations in genes previously reported to be associated with these resistances (Table 1). For example, the H526Y and S531L mutations in rpoB, associated with resistance to rifampicin (Heep et al., 2001), and the A89E and H166P mutations in ethA, associated with ethionamide resistance (DeBarber et al., 2000), were present in these isolates. However, we did not find mutations in genes classically associated with resistance to capreomycin and kanamycin (rrs, tlyA and the eis promoter; Maus et al., 2005a,b; Zaunbrecher et al., 2009), suggesting that there are likely to be novel mutations in these isolates associated with these resistances. Of interest, several genes for efflux pumps thought to be involved in the transport of drugs across the membrane carry nsSNPs unique to isolate FJ05194 [Rv0933 (pstB) and Rv2333c (stp)] or GuangZ0019 (Rv2687c) (Raman & Chandra, 2008). Further experimental investigation will be required to identify the key SNPs which confer drug resistance and to determine whether these efflux pumps are involved in resistance mechanisms. The genomic locations of all the drug-resistance-associated genes/IGRs from the TBDream database detected in these two isolates are shown in Fig. S1B.

As the presence of drug-resistance mutations is often associated with a fitness cost (Muller et al., 2013), it is highly likely that there are compensatory mutations present in these two S-XDR isolates. Indeed, the proA-ahpC IGR in FJ05194 carries an SNP that has previously been reported as a compensatory mutation (Gagneux et al., 2006). Previously reported compensatory mutations in rpoA and rpoC, however, were not observed in these isolates, possibly because the S531L and H526Y mutations they carry have previously been reported to have a low fitness cost (Muller et al., 2013). Further analysis of these genome sequences and comparison with the genomes of other well-characterized XDR isolates as they become available should lead to the identification of other compensatory mutations and a better understanding of the evolution and characteristics of extensive drug resistance in M. tuberculosis, leading potentially to more effective control measures.

Nucleotide sequence accession numbers: Sequencing reads are available at the NCBI Sequence Read Archive under the accession code SRA068167. The whole genome shotgun project has been deposited at DDBJ/EMBL/GenBank under accessions ANBL00000000 (FJ05194), and ANFI00000000 (GuangZ0019).

Acknowledgement

  1. Top of page
  2. Abstract
  3. Introduction
  4. Acknowledgement
  5. Authors contribution
  6. References
  7. Supporting Information

This work was supported by the National Basic Research Program of China (Grant No.: 2012CB518700) and the Chinese Academy of Sciences (Grant No.: KSZD-EW-Z-006).

Authors contribution

  1. Top of page
  2. Abstract
  3. Introduction
  4. Acknowledgement
  5. Authors contribution
  6. References
  7. Supporting Information

N.L., Z.L. and J.Z. all contributed equally to this work.

References

  1. Top of page
  2. Abstract
  3. Introduction
  4. Acknowledgement
  5. Authors contribution
  6. References
  7. Supporting Information

Supporting Information

  1. Top of page
  2. Abstract
  3. Introduction
  4. Acknowledgement
  5. Authors contribution
  6. References
  7. Supporting Information
FilenameFormatSizeDescription
fml12238-sup-0001-TableS1.pdfapplication/PDF537KTable S1. SNPs identified in the FJ05194 and GuangZ0019 isolates.
fml12238-sup-0002-FigS1.tifimage/tif1405KFig. S1. (A) Neighbour-joining phylogeny of 24 human Mycobacterium tuberculosis complex isolates based on 6586 high-quality SNP positions. The tree is rooted with Mycobacterium canettii. Branches are coloured according to the six main phylogeographical lineages of MTBC (Comas et al., 2010): pink, L1 lineage (the Philippines and the Indian Ocean Rim); blue, L2 lineage (East Asia); purple, L3 lineage (India, East Africa); red, L4 lineage (Europe, America); brown, L5, Mycobacterium africanum (West Africa 1); green, L6, M. africanum (West Africa 2). (B) Distribution of SNPs and drug resistance genes/IGRs in the two S-XDR genomes. The outer circle shows SNPs unique to FJ05197 and the inner circle shows those unique to GuangZ0019. Non-synonymous SNPs are represented by red bars, synonymous SNPs by blue bars and intergenic SNPs by yellow bars. The genes and intergenic regions labelled have previously been reported to be associated with drug resistance (Sandgren et al., 2009).

Please note: Wiley Blackwell is not responsible for the content or functionality of any supporting information supplied by the authors. Any queries (other than missing content) should be directed to the corresponding author for the article.