Molecular epidemiology of tuberculosis and other mycobacterial infections: main methodologies and achievements


  • D. Van Soolingen

    1. From the Mycobacteria Reference Department, Diagnostic Laboratory for Infectious Diseases and Perinatal Screening, National Institute of Public Health and the Environment, Bilthoven, The Netherlands

Dr Dick van Soolingen, Mycobacteria Reference Department, Diagnostic Laboratory for Infectious Diseases and Perinatal Screening, National Institute of Public Health and the Environment, PO Box 1, 3720 BA Bilthoven, The Netherlands (fax: +31 30 2744418; e-mail:


Abstract. van Soolingen D (National Institute of Public Health and the Environment, Bilthoven, The Netherlands). Molecular epidemiology of tuberculosis and other mycobacterial infections: main methodologies and achievements (Review). J Intern Med 2001; 249: 1–26.

In the last decade, DNA fingerprint techniques have become available to study the interperson transmission of tuberculosis and other mycobacterial infections. These methods have facilitated epidemiological studies at a population level. In addition, the species identification of rarely encountered mycobacteria has improved significantly. This article describes the state of the art of the main molecular typing methods for Mycobacterium tuberculosis complex and non-M. tuberculosis complex (atypical) mycobacteria. Important new insights that have been gained through molecular techniques into epidemiological aspects and diagnosis of mycobacterial diseases are highlighted.


After decades of decline, tuberculosis case rates are increasing worldwide. Tuberculosis is the most frequent cause of death from a single infectious disease in persons aged 15–49 years, causing a total of 2–3 million deaths annually [1]. One-third of the human population is thought to be infected by the causative agent Mycobacterium tuberculosis, and about 200 million additional people are at risk of developing the disease in the next 20 years if current trends continue [2, 3]. Two major problems regarding the control of tuberculosis are emerging: coinfection with human immunodeficiency virus (HIV) and resistance of M. tuberculosis to the currently used regimen of tuberculostatics.

Molecular methods are increasingly applied to the control of infectious diseases. For the detection of causative microorganisms in clinical material, specific microbial genomic target sequences are amplified in polymerase chain reaction (PCR) or other DNA/RNA amplification assays [4–6]. The disclosure of the genetic basis of resistance to most antibiotics in pathogenic bacteria enabled the development of molecular tests to detect the mutations associated with reduced susceptibility to antimicrobial drugs [4, 7–10]. Species identification of microorganisms is nowadays mainly based on the reading of semiconserved housekeeping genes such as the 16S rRNA gene [11–13]. A large number of strain-specific genetic markers with different levels of discrimination, stability and reproducibility have been identified in the last decade to examine the transmission of microorganisms [14–18].

Until about a decade ago, the only markers available to study the epidemiology of tuberculosis were drug susceptibility profiles and phage types [19, 20]. The use of either method had serious limitations. The drug susceptibility profile of M. tuberculosis strains is a highly unstable feature, because strains frequently gain resistance to antituberculous drugs during treatment. The predictive value of phage typing to link tuberculosis cases is also limited, because only a few phage types can be distinguished amongst M. tuberculosis isolates. In most areas, one phage type predominates amongst M. tuberculosis isolates; related and unrelated cases cannot be distinguished on this basis.

In recent years, a large number of DNA fingerprinting methods have become available to type mycobacterial isolates. Early work used restriction endonuclease analysis (REA). In this method, DNA is cleaved with particular restriction enzymes, and the resulting DNA restriction fragments are separated on agarose gels and analysed by eye [21–24]. Later on, repetitive DNA elements were cloned that could be used as probes to visualize only those restriction fragments that contain the DNA sequence complementary to the probe: restriction fragment length polymorphism (RFLP) typing [25, 26]. The finding of IS986 and IS6110 and the application of these elements in the epidemiology of tuberculosis caused a breakthrough in the molecular epidemiology of mycobacterial infections [27–38].

Typing methods should preferably be rapid, reproducible, easy to perform, inexpensive and directly applicable to clinical material. None of the currently used typing methods meets all these criteria. Furthermore, the degree of discrimination and stability should be appropriate to the research question to be addressed. For outbreak management, other genetic markers are needed than for determination of the evolutionary divergence between microorganisms. Although there is no perfect method, the application of IS6110-based RFLP typing and other molecular typing methods has contributed significantly to our understanding of transmission of mycobacteria. It has also contributed to the identification of rarely encountered M. tuberculosis complex bacteria that were previously difficult to distinguish in biochemical procedures: Mycobacterium bovis BCG, Mycobacterium microti, and Mycobacterium canettii[39–41].

This paper aims to provide an overview of the most frequently used DNA fingerprint methods for mycobacteria currently available, and to review the improved understanding of the epidemiology of mycobacterial diseases gained by the application of DNA fingerprinting.

Classification of the genus Mycobacterium

Most mycobacteria are free-living saprophytes, and only a small proportion of these bacteria cause chronic diseases in humans and animals [42, 43]. There is a close genetic relationship between the genera Mycobacterium, Corynebacterium, Actinomyces and Nocardia, all of which appear as Gram-positive irregular non-sporing rods.

Mycobacterium leprae, which causes leprosy, still cannot be cultured in vitro [44]. This hampers the development of molecular typing methods to examine the transmission of these bacteria. Therefore, the way M. leprae is transmitted is not fully understood.

The cultivable members of the genus Mycobacterium can be divided into two groups: the M. tuberculosis complex and the non-M. tuberculosis complex (atypical) mycobacteria. The bacteria of the M. tuberculosis complex are genetically so closely related that it is better to refer to ‘subspecies’ within the complex than to ‘species’ [21, 22, 40]. This lack of genetic heterogeneity amongst M. tuberculosis complex isolates was recently confirmed by a new whole-genome fingerprinting technique: genome-sequence-based fluorescent amplified-fragment length polymorphism (FAFLP) typing [45]. The subspecies in the complex (M. tuberculosis, M. bovis, M. africanum, M. microti and M. canettii) and the attenuated M. bovis BCG bacteria that constitute the vaccine against tuberculosis are often difficult to distinguish on the basis of biochemical tests or growth characteristics. DNA fingerprinting is therefore increasingly applied to classify M. tuberculosis complex isolates into subspecies [39–41, 46–62]. Identification to the subspecies level may have important implications for the epidemiology of the respective case, as M. bovis strains are thought not to be transmitted amongst humans. It should be noted that the term ‘strain typing’ should be reserved for the distinction of strains within a (sub)species. The terms ‘strain identification’ and ‘determination’ refer to classification on species or subspecies level.

The non-M. tuberculosis complex constitutes a genetically heterogeneous group of acid-fast bacteria. Based on pigment production and growth rate, atypical mycobacteria can be divided into four groups: I, photochromogens; II, scotochromogens; III, achromogens; and IV, rapid growers [63]. Most of these atypical bacteria can be isolated from environmental sources. In general, the atypical mycobacteria exhibit limited pathogenicity in humans, but bacteria of particular species, such as Mycobacterium kansasii, Mycobacterium malmoense and Mycobacterium avium, are capable of causing tuberculosis-like syndromes, especially in immunocompromised patients [64–68].

Mycobacterium avium complex bacteria are frequently encountered atypical mycobacteria. Most bacteria of this genetically heterogeneous complex are found in environmental sources and are usually not associated with mycobacterial disease. In contrast, bacteria of the subspecies M. avium ssp. avium have been frequently described as the cause of disseminated tuberculosis in immunocompromised patients [66, 69, 70]. Bacteria of another subspecies within the M. avium complex, Mycobacterium paratuberculosis, cause Johne’s disease in ruminants [71–75]. M. paratuberculosis isolates from humans and animals show a remarkable degree of genetic conservation. It has been postulated that this bacterium may also be involved in the aetiology of Crohn’s disease in humans [73, 76–85].

With the aid of molecular identification methods, including 16S rRNA gene sequencing [13, 86–88], many new mycobacterial species have been described in recent years, such as Mycobacterium confluentis [89], Mycobacterium celatum [90], Mycobacterium interjectum [91], Mycobacterium mucogenicum [92], Mycobacterium mageritense [93], Mycobacterium novocastrense [94], Mycobacterium wolinskyi [95] and Mycobacterium goodii [95]. The significance of the isolation of these bacteria is not always clear. Repeated isolation of the same mycobacteria from a patient on successive days suggests disease and a need for anti-mycobacterial treatment.

The standard DNA typing for M. tuberculosis complex isolates

The standardized and most widely applied molecular typing method for M. tuberculosis complex isolates is IS6110 RFLP typing [27–38, 96–107]. This method is based on differences in the IS6110 copy numbers per strain, ranging from 0 to about 25, and variability in the chromosomal positions of these IS6110 insertion sequences [30, 46, 99, 100, 108–113]. To visualize IS6110 RFLP patterns, DNA is extracted and purified from a bacterial culture. Thereafter, the DNA is digested with the restriction enzyme PvuII and the restriction fragments are separated on an agarose gel. The separated restriction fragments are transferred to a DNA membrane. In order to visualize the IS6110-containing restriction fragments, a peroxidase-labelled probe with a DNA sequence complementary to the IS6110 DNA sequence is added to a hybridization buffer, which is poured onto the membrane. The PvuII restriction fragments to which the IS6110 probe hybridizes are visualized by a chemiluminescence reaction initiated by adding two substrates. The membrane is packed in plastic and the RFLP patterns are detected by placing a light-sensitive film on the packed membrane in a light-tight cassette. An example of the heterogeneous IS6110 RFLP patterns of M. tuberculosis complex isolates from the Netherlands is depicted in Fig. 1.

Figure 1.

Representative spoligo and IS6110 RFLP patterns of 47 M. tuberculosis complex isolates from the Netherlands. The dendrogram reflects the combined similarity on the basis of both spoligotyping and IS6110 RFLP. Lanes 1–9 contain M. tuberculosis strains belonging to the Beijing genotype family. Lanes 10 and 11 contain M. microti isolates. Lane 47 contains a M. canettii isolate. Note that the M. tuberculosis isolates in lanes 34 and 35 are identical regarding both genetic markers.

Unfortunately, a proportion of M. tuberculosis isolates contain no, or only a few, copies of IS6110, and this proportion differs significantly by geographical area [99, 108–110, 114–116]. Strain typing based on a low copy number of IS6110 is not sufficiently discriminatory [46, 99, 109, 117–124], so it has been decided to apply an additional genetic typing method when M. tuberculosis isolates contain fewer than five copies of IS6110. There are several genetic markers that can be used in addition to IS6110 RFLP typing. Relatively simple methods are spoligotyping [125] and the recently introduced FAFLP typing [45]. Most laboratories routinely apply the polymorphic GC-rich sequence (PGRS) RFLP typing as a supplementary typing method [46, 99, 103, 118, 121].

To facilitate the interlaboratory comparability of IS6110 RFLP patterns, all aspects of this procedure have been standardized and the most critical issues are discussed here [96]. The restriction enzyme PvuII has been selected because it cleaves only once in the 1,355-bp IS6110 insertion sequence. This implies that, assuming identical orientation of IS6110 elements, only one IS element can be present on a PvuII restriction fragment. A probe target sequence has been determined to ensure that only the right-hand part of IS6110 is detected [30]. Finally, internal and external molecular weight standards have been introduced in the method to facilitate an accurate computer-assisted analysis of IS6110 RFLP patterns. The computer-assisted analysis of DNA fingerprints has been described extensively by Heersma et al. [126] and van Embden et al.[127].

By using a standardized method for IS6110 RFLP typing and computer-assisted analysis, it has proven possible to establish international databases of RFLP patterns from a wide geographical area. This made it possible, for example, to trace the source of infection for a case of multidrug-resistant (MDR) tuberculosis in the Netherlands caused by a M. bovis strain [128, 129]. The patient, who had AIDS, had been hospitalized for a short period in Spain during his holidays. Later on, more nosocomial transmissions in other hospitals in Spain caused by the same MDR M. bovis strain were investigated by DNA fingerprinting [130, 131]. With the aid of DNA fingerprinting, even a case of MDR tuberculosis in Canada proved to be associated with transmission in Spain [132].

Another result of the establishment of an international database of DNA fingerprints is the knowledge that one particular genotype of M. tuberculosis, the ‘Beijing genotype’, has a high impact on the tuberculosis epidemic in Asia, Vietnam and the former USSR republics [133–137] (R. van Crevel, personal communication; G. Pfyffer, personal communication).


Although IS6110 RFLP typing is the most widely applied typing method, several disadvantages in the use of this technique prompted the development of other methods. For RFLP typing, about 2 µg of DNA of the M. tuberculosis strain is needed. This amount of DNA can only be extracted from a large number of bacteria grown from clinical material. This implies a culture delay of a few weeks and requires viable organisms. Another disadvantage of RFLP typing is that this technique is technically demanding and expensive. Furthermore, sophisticated computer software is required to analyse IS6110 RFLP patterns in an accurate way [126].

In recent years, many PCR-based typing methods have been developed [45, 125, 138–151]. These methods have the advantage that, in principle, few bacteria are sufficient as targets for typing. These PCR methods are also easier to perform. Some PCR-based methods, such as the mixed-linker PCR [141, 148], reveal a level of discrimination and reproducibility almost comparable with IS6110 RFLP typing [18]. However, because of its simplicity, spoligotyping is currently a frequently used PCR typing method for M. tuberculosis complex [120, 122, 152–154]. Spoligotyping is also used as an additional typing method for strains with fewer than five copies of IS6110 [120, 122, 154]. This method is based on the visualization of the spacer DNA sequences in between the 36-bp direct repeats (DRs) in the genomic DR region of M. tuberculosis complex strains [125]. This DR region contains a variable number of DRs and also a variety of spacer DNA sequences in between the DRs. On the basis of the known DNA sequences of the spacers present in the DR locus of M. tuberculosis strain H37Rv and M. bovis BCG vaccine strain P3, 43 synthetic oligonucleotides have been designed and applied in lines on a DNA membrane. In order to examine the presence of these 43 spacers in the DR region of an unknown M. tuberculosis complex strain, the whole DR locus of that strain is amplified in a PCR. This is done by using two inversely orientated primers complementary to the sequence of the DRs. By using such primers, DNA in between DRs next to each other and in between DRs more distantly positioned is amplified [125]. The PCR products, which are of multiple sizes, are applied to the membrane perpendicular to the rows of synthetic oligonucleotides. Because one of the DR primers is labelled with biotin, hybridization to the synthetic oligonucleotides can be detected by chemiluminescence through a streptavidin-peroxidase conjugate and a substrate.

Spoligotyping applied to cultures is simple, robust and highly reproducible. Moreover, the results can be read as a digital code, even in a word processor. However, the appropriateness of spoligotyping as a replacement for IS6110 RFLP typing is doubtful [108, 153, 155]. A proportion of M. tuberculosis strains with marked differences in their IS6110 RFLP patterns exhibit identical spoligo patterns [108, 154]. Also, the discriminatory power of spoligotyping to distinguish between M. bovis isolates is less than that of PGRS- or DR-based RFLP typing [118, 119, 123, 156–159]. Nevertheless, spoligotyping can be used for prescreening. If M. tuberculosis complex isolates have different spoligo patterns, they also reveal, without exception, different IS6110 RFLPs [108, 154]. The ligation-mediated PCR [145, 146] has been described as more effective than spoligotyping for large-scale screening [160].
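The prescreening idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each spoligo pattern is written as a 43-character string of '1' (spacer detected) and '0' (spacer absent), and the isolate names and patterns are invented, not taken from any study. The function names are hypothetical.

```python
from collections import defaultdict

N_SPACERS = 43  # number of spacer oligonucleotides in the panel described in the text


def group_by_spoligotype(patterns):
    """Group isolate names by identical spoligo pattern.

    `patterns` maps a (hypothetical) isolate name to a 43-character
    string of '1'/'0', with spacer 1 as the first character.
    """
    groups = defaultdict(list)
    for name, pattern in patterns.items():
        if len(pattern) != N_SPACERS or set(pattern) - {"0", "1"}:
            raise ValueError(f"{name}: expected {N_SPACERS} characters of 0/1")
        groups[pattern].append(name)
    return dict(groups)


def rflp_candidates(patterns):
    """Return the groups that still need IS6110 RFLP comparison.

    Isolates with differing spoligo patterns are reported to differ in
    IS6110 RFLP as well, so only groups of two or more isolates that
    share a spoligo pattern are kept for further typing.
    """
    return [sorted(names)
            for names in group_by_spoligotype(patterns).values()
            if len(names) > 1]


# Invented example data for illustration only.
example = {
    "isolate_A": "1" * 43,
    "isolate_B": "1" * 43,
    "isolate_C": "0" * 34 + "1" * 9,
    "isolate_D": "1" * 38 + "0" * 5,
}
candidates = rflp_candidates(example)
```

In this toy data set only isolates A and B share a spoligo pattern, so only that pair would be passed on to the more laborious RFLP comparison.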

It was expected that the use of spacer sequences from strains other than H37Rv and BCG P3 would enhance the discriminatory power of spoligotyping [125]. Unfortunately this is not the case: the 51 new spacer sequences that were added to the currently used set of 43 in a recent study appeared to contribute little to the optimization of this technique [161].

Because spoligotyping visualizes more conserved genetic information than, for instance, IS6110 RFLP typing, this technique can be used to classify M. tuberculosis complex isolates into taxa or subspecies. A clear example of this phenomenon is that all ‘Beijing’ genotype strains invariably reacted only with the last nine spacers in the panel of 43 [133]. Furthermore, most M. bovis strains, but not all [55], lack spacers 39–43 [118–120, 124, 125]. Most M. bovis and all M. bovis BCG strains lack, in addition, spacers 3, 9 and 16 [120, 125, 154]. M. microti strains can be distinguished in ‘vole’ and ‘llama’ types by their characteristic spoligo patterns [41, 49].
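The spacer signatures mentioned above lend themselves to a simple rule-based check. The sketch below is a hedged illustration, not a validated identification scheme: it assumes a 43-character '1'/'0' string as the pattern encoding, the function names and returned labels are hypothetical, and, as the text notes, not all M. bovis strains follow the rule, so the output is only a hint.

```python
PANEL = 43  # spacers in the panel described in the text


def present_spacers(pattern):
    """Return the set of 1-based spacer numbers marked '1' in the pattern."""
    if len(pattern) != PANEL or set(pattern) - {"0", "1"}:
        raise ValueError("expected 43 characters of 0/1")
    return {i for i, c in enumerate(pattern, start=1) if c == "1"}


def spoligo_hint(pattern):
    """Suggest a genotype family from the signatures quoted in the text."""
    present = present_spacers(pattern)
    if not present:
        return "no spacers detected"
    # Beijing genotype: only the last nine spacers (35-43) react.
    if present == set(range(35, 44)):
        return "Beijing-like"
    # Most M. bovis strains lack spacers 39-43.
    if not present & set(range(39, 44)):
        # Most M. bovis and all BCG strains also lack spacers 3, 9 and 16.
        if not present & {3, 9, 16}:
            return "M. bovis or BCG-like"
        return "M. bovis-like"
    return "no signature matched"


def make_pattern(absent):
    """Helper (hypothetical): build a pattern with the given spacers absent."""
    return "".join("0" if i in absent else "1" for i in range(1, PANEL + 1))


beijing = "0" * 34 + "1" * 9
bovis = make_pattern({39, 40, 41, 42, 43})
bcg_like = make_pattern({3, 9, 16, 39, 40, 41, 42, 43})
```

Calling `spoligo_hint` on these three invented patterns returns the Beijing-like, M. bovis-like and M. bovis or BCG-like labels respectively; any real classification would need confirmation with additional markers.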

One of the clearest advantages of spoligotyping over IS6110 RFLP typing is that, in principle, spoligotyping can be used simultaneously for the detection and typing of M. tuberculosis complex bacteria in one assay [162]. Furthermore, it has proven possible to perform typing on non-viable cultures [163], on slides of Ziehl–Neelsen stainings [164], and on paraffin-embedded material [164, 165]. However, severe problems are faced in obtaining identical spoligo patterns from bacteria in clinical material, because of factors present in a proportion of clinical specimens that inhibit the PCR. Studies are presently being conducted to optimize the pretreatment of clinical material prior to spoligotyping.

DNA fingerprinting to identify M. tuberculosis complex isolates

Tuberculosis in humans is not only caused by M. tuberculosis, but also by bacteria of the other subspecies within the M. tuberculosis complex. Like M. tuberculosis, M. africanum mainly infects primates. M. africanum occupies a biochemically intermediate position between M. tuberculosis and M. bovis and causes the same syndromes in humans as M. tuberculosis. It is doubtful whether M. africanum merits a distinct subspecies status, as the few differential tests give variable results characteristic of M. tuberculosis and M. bovis. DNA fingerprinting techniques such as spoligotyping may provide new insight into the identification of this intermediate group of M. tuberculosis complex bacteria, as recently shown by Niemann et al. [166].

In contrast to M. tuberculosis and M. africanum, M. bovis has a broad host range, causing infections in a large variety of domesticated and wild animals [46, 117, 124, 167–169]. M. bovis can be distinguished from other M. tuberculosis complex subspecies on the basis of biochemical criteria and colony morphology [42]. In spoligotyping, most M. bovis strains exhibit characteristic nonreactive spacers in the spoligo patterns [118–120, 125, 156]. However, some M. tuberculosis complex strains with biochemical features of M. bovis do not show these deletions [55]. Several DNA sequences have been described to distinguish between M. bovis and M. tuberculosis, such as the mtp40 gene sequence [51], a M. tuberculosis-specific sequence encoding a phospholipase; a 500-bp M. bovis-specific sequence of unknown origin [56]; a M. bovis-specific sequence flanking the mpb64 gene [57]; two M. bovis-specific single base-pair mutations in the pseudogene oxyR [58] and the pyrazinamidase gene pncA [59]; and a 12.7-kb fragment 337 bp downstream of the RD2 region [60].

Mycobacterium bovis infections are classically spread from infected cows to humans via raw milk and not usually from human to human. In comparison with M. tuberculosis infections, M. bovis infections are more frequently found at extrapulmonary sites. IS6110 RFLP typing is not the most suitable method for investigating the transmission of M. bovis [46, 118, 119, 124, 158, 159], although this method serves to divide M. bovis isolates into broad groups [46, 119, 159, 170]. Most isolates from cows and, hence, a significant proportion of M. bovis strains isolated from humans, contain only one copy of the IS6110 element at a fixed genomic position [39, 46, 119, 158, 159, 171, 172]. M. bovis isolates from other domesticated and wild animals frequently contain a higher number of IS6110 elements [46, 119, 124]. PGRS-RFLP typing appears to have a higher resolution amongst M. bovis isolates [46, 99, 117, 119, 156, 157, 159]. Recently, other DNA fingerprinting methods have been introduced for typing of M. bovis, such as spoligotyping and variable numbers of tandem repeats (VNTR) typing [118, 123, 125, 140, 156–159]. VNTR typing, a method that exploits the polymorphism of six exact tandem repeat loci and the major polymorphic tandem repeat, is particularly promising [18, 140]. This method has been adapted for automated reading on a DNA sequencer and has provided important information that may help in the control of the disease in cattle in the UK [173].

Confusingly, IS6110 RFLP typing has shown that, in rare cases, M. tuberculosis is involved in zoonotic transmission: between humans and monkeys [167], between humans and elephants [168], and possibly between humans and a parrot [174].

Mycobacterium microti, previously known as the ‘vole bacillus’, was originally found in the 1930s in voles by Wells and Oxon [175] and Wells [176], and later on in larger mammals [177–183]. Recently, with the aid of DNA fingerprinting, it was shown that M. microti is also capable of causing severe forms of tuberculosis in humans [41]. Because the described M. microti strains have specific, characteristic IS6110 RFLP and spoligo DNA fingerprint patterns, it is possible to detect M. microti-derived tuberculosis in humans [41, 49]. In the Netherlands, four cases have been described, of which one was found in an immunocompetent 39-year-old male [41, 184]. Three further cases of M. microti infections in apparently immunocompetent humans have been described in Switzerland [185] and Germany [62]. All M. microti infections caused either regular lung tuberculosis or disseminated forms of this disease. With spoligotyping, M. tuberculosis complex isolates in the UK from another human, eight cats, a badger and a cow have been identified as M. microti [49].

The detection of M. microti is severely hampered by the extremely slow growth of this microorganism. In primary isolation procedures, it takes between 6 and 12 weeks to visualize the growth of this pathogen on solid culture media [176]. The growth of M. microti in liquid medium requires less time. In most laboratories dealing with primary isolation of mycobacteria on solid medium, the cultures will not be incubated long enough to detect the growth of this pathogen. For bacteriologists not aware of the existence of this bacterium, this may be one cause of microscopically positive but culture-negative diagnoses. It may therefore be that the M. microti infections in humans described so far represent only the tip of the iceberg. Especially in areas with a high prevalence of rodents, it is conceivable that there may be many more M. microti infections than currently detected.

Mycobacterium canettii was described as a separate subspecies within the M. tuberculosis complex in 1997 [40]. M. canettii shows the highest degree of evolutionary genetic divergence amongst the subspecies within the complex. This bacterium, with abnormal smooth colony morphology and a shorter multiplication time than other M. tuberculosis complex bacteria, was found for the first time in a young Somali patient [40]. All of the about 2000 M. tuberculosis complex strains tested so far using IS1081 RFLP revealed conserved DNA fingerprint patterns consisting of five to seven bands [39]. In contrast, all three independent isolates of M. canettii described so far exhibited only one band in their IS1081 RFLP. This makes IS1081 RFLP typing a reliable tool to recognize M. canettii strains [40].

In 1998, in Switzerland, the third case of a M. canettii infection was found in a 56-year-old patient who had been working for 20 years in Uganda and for 1 year in Kenya [48]. Recently, M. canettii strains were also isolated from four immunocompetent patients in Djibouti [186]. It therefore seems possible that the East African region is the geographical centre of M. canettii. The question of whether animal sources serve as natural hosts to this subspecies needs further investigation.

Live BCG vaccine is derived from an attenuated M. bovis strain [187]. DNA fingerprinting has made it possible to identify these bacteria with certainty as the cause of tuberculous infections, which was difficult in the pre-fingerprinting era [23, 39, 52, 188, 189]. The combination of the characteristic IS6110 and IS1081 RFLP patterns of BCG isolates is a solid criterion for recognition [39]. Improved recognition of M. bovis BCG has provided the insight that BCG bacteria are able to cause disseminated forms of tuberculosis in immunocompromised patients [188, 190–193]. Previously, it was thought that such BCG-derived tuberculosis infections were due to endogenous reactivations of BCG bacteria a long time after vaccination with these attenuated bacteria. Recently, it has become clear that another possible source of infection has to be taken into account: the iatrogenic route [193]. BCG bacteria are nowadays frequently used to treat bladder carcinoma. The lyophilized BCG preparation, containing very high numbers of viable BCG bacteria, is often redissolved in the same safety cabinet where chemotherapeutics for other patients are also prepared. Comparison of the DNA fingerprints of BCG bacteria isolated from patients with those of BCG bacteria used for carcinoma treatment has demonstrated that some BCG infections appear to occur via cross-contamination in hospital pharmacies (G. Vos, unpublished observations).

From several DNA fingerprinting studies on M. bovis BCG vaccine strains, it can be concluded that there is a high degree of genetic divergence amongst these strains [23, 50, 61, 194–197]. What the implications of this divergence are regarding the immunogenicity and efficacy of the BCG vaccine is presently unclear. These differences may inform the debate on the variable effectiveness of BCG in different settings.

Stability of IS6110 RFLP

The genetic polymorphism associated with the insertion sequence IS6110 and the required stability when using this element as a genetic marker in the epidemiology of tuberculosis are almost contradictory. Strains need to change fast enough that non-epidemiologically related isolates are distinct, and yet slowly enough that isolates from related cases are identical. Soon after the introduction of IS6110 RFLP typing, transpositions of the IS element were recorded in the offspring of particular M. tuberculosis strains and isolates from epidemiologically related cases, leading to band shifts in the DNA fingerprints [100, 198, 199]. This led to concern regarding the reliability of this method for linking tuberculosis cases.

Recently, in two studies, IS6110 RFLP patterns of isolates were compared for patients who had had sequential positive cultures with days to years in between [200, 201]. In a study in San Francisco, 29% of the paired isolates, taken at least 90 days apart, from 49 patients showed only minor alterations in the IS6110 RFLP pattern [200]. Interestingly, in this study a lower degree of instability was found in PGRS-based RFLP, and the rearrangements in the PGRS RFLP appeared to occur independently from the IS6110 RFLP alterations. The instability in IS6110 RFLP patterns amongst 544 serial isolates from Dutch patients was lower than that recorded in the San Francisco study, with a calculated half-life of 3.2 years [201]. One possible explanation for this difference is that, in the San Francisco study, patients were divided into two groups with different time intervals between the serial isolates, whereas the Dutch study included all patients. Another possibility is that in the San Francisco study, serial isolates spanning 90 days or more were re-cultured, which may have induced changes in the DNA, whereas in the Dutch study, original DNA fingerprints were compared and the laboratory practices remained the same during the study period. In the Dutch study, using survival analysis, the half-life of IS6110 RFLP was estimated to be 3–4 years [201]. This means that, on average, half of the strains exhibit a band shift in their IS6110 RFLPs in a 3–4 year period. This interval is suitable for distinguishing epidemiologically related and unrelated isolates and therefore supports the use of IS6110 typing in epidemiological studies of recent transmission of tuberculosis.
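To make the half-life figure more concrete, the short sketch below converts a half-life into the expected fraction of strains showing a band shift after a given interval. It assumes a constant rate of change over time (simple exponential decay of the unchanged fraction), which is an assumption made here for illustration and not necessarily the model used in the survival analysis of the Dutch study; the function name is hypothetical.

```python
def fraction_changed(years, half_life_years):
    """Expected fraction of strains with an altered RFLP pattern after `years`.

    Assumes a constant change rate, so the unchanged fraction halves
    every `half_life_years` (an illustrative simplification).
    """
    if years < 0 or half_life_years <= 0:
        raise ValueError("years must be >= 0 and half_life_years > 0")
    return 1.0 - 0.5 ** (years / half_life_years)


# Using the 3.2-year half-life quoted in the text.
after_one_year = fraction_changed(1.0, 3.2)
after_one_half_life = fraction_changed(3.2, 3.2)
```

Under this assumption roughly one strain in five would show a band shift within a year, and by definition half would have changed after 3.2 years, which is consistent with most recently transmitted cases still sharing an identical pattern.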

Population structure of M. tuberculosis strains in different regions

In most low-incidence areas, such as Denmark [106], the Netherlands [103], Switzerland [107], Norway [202], New York [102] and San Francisco [101], the IS6110 RFLP patterns of M. tuberculosis isolates are highly polymorphic. This is influenced by two factors: (i) the relatively high percentage of cases in low-incidence areas due to endogenous reactivation and, hence, the reflection of genotypes present in the population over a long time period; and (ii) the large proportion of cases in these areas found amongst non-native populations originating from different geographical origins, which introduce exotic strains not known in these areas. In areas with a higher incidence of tuberculosis such as Africa and Asia, the IS6110 RFLP patterns are often significantly less variable than in low-incidence areas [133, 203]. For instance, in Tunisia, 62% of the M. tuberculosis isolates belong to three genotype families that share at least 65% similarity in the IS6110 RFLP patterns, and in Ethiopia 52% of the M. tuberculosis isolates belong to four genotype families [203]. In contrast, Warren et al. [204] found high strain diversity of M. tuberculosis in communities in Cape Town, South Africa, where tuberculosis incidence is high.
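A percentage similarity between two banding patterns, such as the 65% threshold mentioned above, can be computed in several ways. The text does not state which coefficient was used; the sketch below uses the Dice coefficient with a relative size tolerance for band matching, purely as an assumed example. Band sizes, the tolerance value and the function name are all invented for illustration.

```python
def dice_similarity(bands_a, bands_b, tolerance=0.02):
    """Dice coefficient between two lists of band sizes (e.g. in kb).

    Two bands are counted as matching when their sizes differ by at
    most `tolerance` relative to the larger band. Each band can match
    at most once; matching walks both sorted lists in parallel.
    """
    a, b = sorted(bands_a), sorted(bands_b)
    if not a and not b:
        raise ValueError("at least one pattern must contain a band")
    i = j = matches = 0
    while i < len(a) and j < len(b):
        if abs(a[i] - b[j]) <= tolerance * max(a[i], b[j]):
            matches += 1
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return 2 * matches / (len(a) + len(b))


# Invented band sizes: three of four bands are shared.
pattern_1 = [1.2, 2.0, 3.5, 5.0]
pattern_2 = [1.2, 2.0, 3.5, 6.1]
similarity = dice_similarity(pattern_1, pattern_2)
```

With three shared bands out of four in each pattern, the coefficient is 2 × 3 / 8 = 0.75, so these two invented patterns would fall within a family defined by a 65% similarity cut-off.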

In most studies, the population structure of M. tuberculosis in a given area is studied by a single genetic marker. Sola et al. [205] used a combination of four repetitive genetic markers to perform a numerical analysis by the unweighted pair group method with arithmetic averages (UPGMA) on 113 isolates in Guadeloupe and concluded that this may be a better approach to elucidate the intraspecies genetic microevolution.
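For readers unfamiliar with UPGMA, the following is a minimal sketch of the algorithm in plain Python. It repeatedly merges the two closest clusters and defines the distance to a merged cluster as the size-weighted average of the previous distances. The isolate labels and distances are invented for illustration, and the code is not the software used in the cited study.

```python
from itertools import combinations


def upgma(labels, distance):
    """Cluster `labels` by UPGMA (average linkage).

    `distance(a, b)` returns the distance between two labels.
    Returns (tree, merges): a nested-tuple tree and a list of
    (left_members, right_members, merge_distance) in merge order.
    """
    # Active clusters: id -> (subtree, list of member labels)
    active = {i: (lab, [lab]) for i, lab in enumerate(labels)}
    # Distances between active clusters, keyed by (smaller id, larger id)
    dist = {(i, j): distance(labels[i], labels[j])
            for i, j in combinations(range(len(labels)), 2)}
    next_id = len(labels)
    merges = []
    while len(active) > 1:
        # Closest pair; ties are broken by cluster id for determinism
        i, j = min(dist, key=lambda k: (dist[k], k))
        d_ij = dist[(i, j)]
        tree_i, mem_i = active.pop(i)
        tree_j, mem_j = active.pop(j)
        merges.append((sorted(mem_i), sorted(mem_j), d_ij))
        # Distance to the merged cluster: size-weighted mean of old distances
        new_dist = {}
        for k in active:
            d_ik = dist[(min(i, k), max(i, k))]
            d_jk = dist[(min(j, k), max(j, k))]
            new_dist[(k, next_id)] = (
                (len(mem_i) * d_ik + len(mem_j) * d_jk)
                / (len(mem_i) + len(mem_j))
            )
        dist = {k: v for k, v in dist.items() if i not in k and j not in k}
        dist.update(new_dist)
        active[next_id] = ((tree_i, tree_j), mem_i + mem_j)
        next_id += 1
    (tree, _), = active.values()
    return tree, merges


# Hypothetical pairwise distances (1 - similarity) between four isolates
toy = {
    frozenset("AB"): 0.1, frozenset("AC"): 0.5, frozenset("AD"): 0.6,
    frozenset("BC"): 0.5, frozenset("BD"): 0.6, frozenset("CD"): 0.2,
}
tree, merges = upgma(["A", "B", "C", "D"], lambda a, b: toy[frozenset((a, b))])
print(tree)
for left, right, d in merges:
    print(left, right, round(d, 3))
```

When several markers are combined, as in the study described above, one simple option is to compute the distance from the concatenated marker data before clustering; how the markers should be weighted is a design choice that this sketch does not address.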

The emergence of the M. tuberculosis Beijing genotype strains

An extreme example of clonality of M. tuberculosis isolates was found in the Beijing area where more than 85% of the isolates exhibited more than 66% similarity amongst their multibanded IS6110 RFLP patterns [133] (see Fig. 1). Later on, this genotype of M. tuberculosis strains was also observed, with a lower frequency, in other parts of Asia [133, 135, 136] (R. van Crevel, unpublished observations), the former USSR republics [134, 137] (G. Pfyffer, personal communication), and also in other geographical regions [108, 206–208]. Because the highest density of this genotype of M. tuberculosis was found in the Beijing area, this family of strains was designated the ‘Beijing’ genotype. It should be noted that although the IS6110 RFLP patterns of Beijing genotype strains found were highly similar, hardly any of them were mutually completely identical. In contrast, all spoligotyping patterns of Beijing genotype strains have been found to be identical and this pattern is highly specific for this genotype of M. tuberculosis.
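Percentage similarity between two banding patterns can be computed in several ways. The sketch below uses the Dice coefficient on sets of band positions, a common choice for fingerprint comparison; whether the cited studies used this exact coefficient is an assumption, and the band positions are invented.

```python
def dice_similarity(bands_a, bands_b):
    """Dice coefficient between two banding patterns.

    Each pattern is given as a collection of band positions that have
    already been normalised and binned, so that matching bands compare equal.
    """
    a, b = set(bands_a), set(bands_b)
    if not a and not b:
        return 1.0  # two empty patterns are treated as identical
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical patterns: 15 bands each, 13 of them shared
pattern_1 = set(range(1, 16))
pattern_2 = (pattern_1 - {3, 9}) | {20, 21}
print(f"{dice_similarity(pattern_1, pattern_2):.0%}")
```

In practice the result depends strongly on how band positions are normalised and on the tolerance used to decide that two bands match, neither of which is modelled here.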

Because the Beijing genotype strains are highly prevalent in some areas, it is conceivable that the Beijing genotype strains have a selective advantage over other M. tuberculosis genotypes. It has, for instance, previously been postulated that the BCG-induced immunological defence may protect against regular M. tuberculosis genotypes, but not against this ‘escape variant’ of M. tuberculosis [133]. In a recent study in Vietnam, the occurrence of Beijing genotype strains was examined in 563 tuberculosis patients in different age categories, half of them being BCG vaccinated [135]. In about 50% of the cases included in this study, Beijing genotype strains were isolated. Although the Beijing genotype strains were found somewhat more frequently amongst BCG-vaccinated individuals than amongst non-BCG-vaccinated persons, this difference was not significant. However, it should be noted that the coverage of vaccination is very high in this area and therefore the spillover of Beijing strains from the vaccinated to the non-vaccinated population will be a significant factor levelling off the diversity of strains amongst the non-vaccinated subjects.

The occurrence of the Beijing genotype strains in the Vietnam study was significantly correlated with young age, suggesting recent transmission. This indicates that the Beijing genotype strains may be emerging in Vietnam and most likely also in other parts of Asia. Importantly, in this Vietnamese study [135] and in other areas [108, 134, 206], the Beijing genotype strains were significantly more frequently resistant to tuberculostatics. This is consistent with the prolific spread of (multidrug) resistant variants of the ‘W’ strain in the early 1990s in North America [36, 138, 207–211]. These ‘W’ strains represent an evolutionary branch of the Beijing genotype strains and exhibit highly similar IS6110 RFLP patterns and identical, Beijing genotype-characteristic spoligotypes.

Recent transmission of M. tuberculosis

The use of molecular epidemiology to study transmission of M. tuberculosis is based on the fact that strains recently derived from a common ancestor exhibit the same DNA fingerprint. The term ‘cluster’ is used to indicate M. tuberculosis isolates with identical or highly similar DNA fingerprints, and also the respective patients from whom these genetically indistinguishable strains were isolated. In the Netherlands, DNA fingerprinting has been applied routinely to all M. tuberculosis complex isolates since 1993. Figure 2 shows the cumulative percentage of clustering of M. tuberculosis isolates in the period 1993–1999. Over the first 2 years of the application of DNA fingerprinting, the percentage of clustered strains increased sharply. Thereafter, the increase of the clustering percentage was almost negligible. This suggests that the extent to which clustering reflects recent transmission depends strongly on study duration: clustering calculated in a short study, e.g. lasting less than 2 years, is unlikely to identify both the source and secondary case in a chain of transmission, and will probably underestimate the amount of recent transmission. Recent modelling work suggests that the relationship between clustering and the proportion of disease attributable to recent transmission also depends on the ages of the cases (J. Glynn, personal communication).

Figure 2.

Percentage of clustering of M. tuberculosis cases in the Netherlands at different time intervals in the period 1993–1999 (solid line), and number of cases in incremental study periods (bulleted line).
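The dependence of the clustering percentage on study duration can be reproduced with a small calculation. The sketch below counts an isolate as clustered when at least one other isolate in the data set has the same fingerprint, and recomputes the percentage as successive years are added. The isolates and pattern labels are invented, and matching on identical labels is a simplification of real fingerprint comparison.

```python
from collections import Counter


def clustered_fraction(patterns):
    """Fraction of isolates whose fingerprint is shared with another isolate."""
    counts = Counter(patterns)
    clustered = sum(n for n in counts.values() if n >= 2)
    return clustered / len(patterns) if patterns else 0.0


def cumulative_clustering(isolates):
    """Clustered fraction amongst all isolates up to and including each year.

    `isolates` is a list of (year, pattern_label) tuples.
    """
    result = {}
    for year in sorted({y for y, _ in isolates}):
        seen = [p for y, p in isolates if y <= year]
        result[year] = clustered_fraction(seen)
    return result


# Hypothetical isolates: (year of isolation, fingerprint label)
toy_isolates = [
    (1993, "P1"), (1993, "P2"),
    (1994, "P1"), (1994, "P3"),
    (1995, "P2"), (1995, "P4"),
]
for year, fraction in cumulative_clustering(toy_isolates).items():
    print(year, f"{fraction:.0%}")
```

In this toy data set no isolate is clustered after the first year, because the matching cases have not yet been observed; the same effect explains the steep initial rise of the curve in a real study.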

Until recently, it was generally assumed that in low-incidence areas, endogenous reactivation of remote M. tuberculosis infections played a major role in the aetiology of tuberculosis. In the population-based studies in Denmark, New York, San Francisco and the Netherlands, it was shown that, on average, 43% of the tuberculosis cases were found in clusters, suggesting recent transmission [101–103, 106]. This indicates that recent transmission of tuberculosis causes a significant number of tuberculosis cases even in low-incidence areas. Furthermore, in San Francisco, it was shown that a single source case led, directly or indirectly, to 6% of all cases recorded in the 2-year study period over which DNA fingerprinting had been applied [101].

Although in all these studies the proportion of cases which were clustered (suggesting recent transmission) was associated with young age, older persons were also involved in clusters, even in low-incidence areas. In the Netherlands, as expected, the incidence of non-clustered cases (endogenous reactivation) was found to be relatively high above the age of 65. However, although the proportion of clustered cases was low, the incidence of clustered cases amongst the elderly equalled that amongst younger patients [103].

In some low-incidence areas, such as Switzerland and Norway, the percentage of clustered isolates found was low: 17 and 16%, respectively [107, 202]. This seems to indicate that recent transmission of tuberculosis plays a minor role in these countries. In interpreting the proportion of clustered strains found in a study, knowledge of the proportion of tuberculosis cases in the community included in the study is important. No study is able to include all cases, yet the proportion found to be clustered is crucially dependent on the completeness of the sample: incomplete sampling will underestimate the proportion clustered, even if the number of cases included is large [212, 213].
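The effect of incomplete sampling can be illustrated with a simple calculation. The sketch below assumes that every case is included in the study independently with the same probability, which is an idealisation introduced here and not the model used in the cited work. A sampled case from a true cluster of size n then appears clustered only if at least one of the other n − 1 cases is also sampled.

```python
def observed_clustered_fraction(cluster_sizes, sampling_fraction):
    """Approximate proportion of sampled cases that appear clustered.

    `cluster_sizes` lists the true size of every cluster in the population
    (size 1 = unique strain). Each case is assumed to be sampled
    independently with probability `sampling_fraction`. The value returned
    is the ratio of expected clustered cases to expected sampled cases.
    """
    p = sampling_fraction
    if not 0 < p <= 1:
        raise ValueError("sampling_fraction must be in (0, 1]")
    expected_sampled = 0.0
    expected_clustered = 0.0
    for n in cluster_sizes:
        expected_sampled += n * p
        # A sampled case looks clustered if any of the other n - 1 is sampled
        expected_clustered += n * p * (1 - (1 - p) ** (n - 1))
    return expected_clustered / expected_sampled


# Hypothetical population: 40 unique strains, 10 pairs, 5 clusters of size 4
population = [1] * 40 + [2] * 10 + [4] * 5
for p in (1.0, 0.75, 0.5, 0.25):
    print(p, round(observed_clustered_fraction(population, p), 3))
```

The observed proportion falls as the sampling fraction falls, irrespective of how many cases the population contains, which is the point made in the text.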

These molecular population-based studies also allow the determination of risk factors for clustering and hence recent transmission of tuberculosis [101–105]. For instance, in the study in New York, HIV-positivity, Hispanic ethnicity combined with HIV seronegativity, infection with drug-resistant tuberculosis, younger age and lower income were associated with having a clustered M. tuberculosis isolate [102].

Exogenous reinfection of tuberculosis

One of the fundamental questions regarding the natural history of tuberculosis concerns the role of exogenous reinfection. If a patient suffers from a second episode of tuberculosis, it is important to know whether this is due to a reactivation of the previous tuberculosis infection, or to a superinfection with a new M. tuberculosis strain. Godfrey and Stoker [199] reported that pairs of isolates from individual patients in Malawi exhibited different DNA fingerprints in four cases. In two cases, isolates were taken from different sites, and in one case the two isolates were from different episodes of the disease. In the study by Das et al. [109], in 12% of the relapse patients in Hong Kong the DNA fingerprint of the initial isolate was markedly different from that of the relapse isolate, but this was no greater than the proportion of different patterns found from paired isolates from the same episode of disease. DNA analysis of serial MDR M. tuberculosis isolates from 17 patients in New York City showed that in four out of 17 cases during the course of the disease, a superinfection with another MDR strain occurred [214]. An extremely high percentage of exogenous reinfections was reported by van Rie et al. [215] in South Africa. Twelve out of 16 pairs of strains, obtained from patients in the pretreatment period in the first episode of the disease and after the recurrence of tuberculosis after curative treatment, exhibited different DNA fingerprints [215].

The results of these studies indicate that exogenous reinfection after curative treatment occurs. The susceptibility to superinfections will most likely be related to the immune status of the patient. Furthermore, this phenomenon seems related to the risk of infection in a given area. In a recent study in the Netherlands, where there is a very low risk of infection for tuberculosis, reinfection plays a minor role [216]. The rate at which superinfections occur in high-prevalence areas needs to be examined in a large and well-designed study.

Evaluation of the usefulness of DNA fingerprinting by contact tracing

Several studies reported attempts to determine the usefulness of DNA fingerprinting in comparison with conventional contact tracing. This was done by comparing the clustering of cases on the basis of RFLP typing results with the findings in conventional contact tracing on the basis of questionnaires [101, 103, 104, 107, 217]. In population-based studies in San Francisco, Amsterdam and Zurich, only a small proportion (ranging from 5 to 10%) of the cases linked with the aid of DNA fingerprinting were also linked on the basis of conventional contact tracing using questionnaires [101, 104, 107]. This does not imply that the DNA fingerprint links were incorrect, since unsuspected transmission of tuberculosis through short-term contacts cannot be traced by the classical approach of contact tracing [103, 218, 219]. Šebek [220] recently reported the results of contact investigations of 784 established contacts after DNA fingerprinting according to the stone-in-the-pond principle [219]: 31% were found in the first circle, 29% in the second and 40% from the third circle and onward. This means that for most of the 784 established contacts (69%), transmissions detected with DNA fingerprinting are not found amongst clear contacts of source cases, but amongst loose contacts difficult to link to the source case. In a population-based study in the Netherlands during a 5-year period, contact investigations in the five largest clusters of 23–47 patients suggested epidemiological linkage between the cases on the basis of time, place and risk factors. Furthermore, DNA fingerprinting has shown on numerous occasions that transmission has occurred through very short-term contacts that would not normally be traceable [103]. A major contribution of DNA fingerprinting is thus its ability to highlight previously unsuspected transmission in the community and areas in which contact tracing is not working.

However, it is clear that IS6110-based RFLP typing is not always a reliable indicator of epidemiological linkage between tuberculosis cases: identical fingerprints do not prove a close association. For instance, Braden et al. [217] in the rural environment of Arkansas found strains with identical DNA fingerprints, which were isolated from elderly patients from a widespread geographical area who were thought not to have been in contact. A similar observation has been made in the Netherlands in cases amongst the elderly (unpublished observation). Also in a study in Malawi and Kenya, strains from a widespread area were found with identical DNA fingerprints without any apparent epidemiological links [199]. This may indicate that in some areas, particular ancestral, well-conserved strains are found with stable and identical IS6110 RFLP patterns, confusing the epidemiological interpretation.

Another point to consider is the finding that IS6110 copies are not distributed at random in the genome of M. tuberculosis complex strains [209, 221]. At some positions, bands were found significantly more likely to be present than would be expected by chance and this may interfere with the epidemiological interpretation of relatedness of strains.

DNA fingerprinting thus appears to be a valuable tool to study the transmission of tuberculosis, with the proviso that identical strains suggest but do not prove transmission. The use of DNA fingerprinting has shown that transmission of tuberculosis can occur through very short contacts and outside the ring of contacts. It may be time to reconsider the classical principles of contact examination [219, 222–225]. The addition of DNA fingerprinting to conventional contact tracing as a tool to study unsuspected transmission of tuberculosis at different levels seems valuable, and in some situations it could be used to guide the contact tracing. This is especially important in countries with a low prevalence of tuberculosis, where the elimination of this disease comes into sight [225–229].

Transmission of resistant M. tuberculosis strains

Soon after the introduction of INH in the 1950s, resistance to this important drug was found, especially as a result of monotherapy. Middlebrook and Cohn [230] showed that some INH-resistant strains were less pathogenic in guinea pigs and this was attributed to the reduction of the catalase activity in these bacteria. Despite this apparently reduced virulence of some INH-resistant M. tuberculosis strains, many outbreaks of (multidrug) resistant strains were recorded with DNA fingerprinting methods [32, 138, 207, 208, 231–233]. The susceptibility of HIV-positive patients to tuberculosis infections and the accelerated breakdown to disease often result in more rapid transmission of the disease, as indicated in many outbreaks investigated with DNA fingerprinting [31, 33–37, 234, 235]. It is therefore important to recognize the transmission of resistant strains in these populations at an early stage in order to prevent further spread.

In 1992 Zhang et al. [236] showed that, in a significant proportion of INH-resistant M. tuberculosis strains, INH resistance is associated with mutations (or deletions) in the katG gene. The fact that the catalase/peroxidase activity seems to be needed to convert INH to a biologically active form explains this.

In a population-based study in the Netherlands during a 5-year period it was shown that INH resistance is a negative risk factor for being in a DNA fingerprint cluster [103]. Because different types of genetic mutations in M. tuberculosis can cause resistance to INH, the correlation between particular mutations and transmission has recently been examined in this same country (D. van Soolingen, unpublished observations). M. tuberculosis isolates with a mutation at amino acid (AA) position 315 of the katG gene were found significantly more often in IS6110 DNA fingerprint clusters than INH-resistant strains with other mutations as the basis for INH resistance. Furthermore, INH-resistant strains with a mutation at AA position 315 significantly more frequently developed resistance against additional tuberculostatics. Already in 1998 the mutations at the AA315 position of the katG gene were those most frequently found amongst MDR M. tuberculosis isolates from St Petersburg [134]. These studies indicate that INH-resistant strains with different genetic bases have different abilities to generate secondary cases and additional resistance.

Use of DNA fingerprinting to detect laboratory cross-contamination

Cross-contamination of clinical samples during the sampling or culture procedures for mycobacteria was already described in the pre-DNA fingerprinting era [237–239]. This phenomenon involves spillover of bacteria from contaminated bronchoscopes, or from positive clinical samples and control strains to negative samples. These errors can lead to unnecessary visits of falsely diagnosed patients to medical consultants and unnecessary long-term antimicrobial treatment. Since the introduction of accurate strain typing methods, it has become clear that sampling and laboratory errors occur more frequently than previously assumed [240–245]. Small et al. [244] suggested that if in a laboratory mycobacteria with identical DNA fingerprints are cultured within a period of 1 week, then laboratory cross-contamination should be considered. This suggestion has been confirmed in practice in the Netherlands, where DNA fingerprinting has been applied routinely to all positive M. tuberculosis complex cultures since 1993. When peripheral laboratories are notified of possible cross-contamination when multiple isolates within a 1-week period exhibit identical RFLP patterns, in the vast majority of these cases sampling/laboratory mishaps are later confirmed by clinicians. In the Netherlands, about 3% of all positive cultures are derived from confirmed sampling/laboratory errors (unpublished observation). To detect such sampling/laboratory errors, it is of the utmost importance to have optimal communication between the laboratory staff and the clinicians concerned. If only one out of the three successive cultures on consecutive days is positive, then laboratory cross-contamination should be considered. The correlation between clinical manifestations and especially acid-fast smear-negative and/or low-yield culture-positive samples is of the utmost importance. In all cases where the clinical picture of the patient is doubtful, but a positive M. tuberculosis culture is obtained, DNA fingerprinting of all strains isolated in that laboratory in the respective period should be used to examine the possibility of a cross-contamination.
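The 1-week rule described above lends itself to a simple screening routine. The sketch below flags pairs of cultures from different patients that were processed in the same laboratory within 7 days of each other and show identical fingerprints. The record layout, field names and data are hypothetical, and a flag is only a prompt for further investigation with the clinicians, not proof of contamination.

```python
from datetime import date
from itertools import combinations


def possible_cross_contamination(cultures, window_days=7):
    """Return pairs of culture ids that warrant a contamination check.

    Each culture is a dict with the (hypothetical) keys
    'id', 'patient', 'lab', 'date' and 'pattern'.
    """
    flagged = []
    for a, b in combinations(cultures, 2):
        same_lab = a["lab"] == b["lab"]
        same_pattern = a["pattern"] == b["pattern"]
        # Serial samples from one patient are expected to match
        different_patients = a["patient"] != b["patient"]
        close_in_time = abs((a["date"] - b["date"]).days) <= window_days
        if same_lab and same_pattern and different_patients and close_in_time:
            flagged.append((a["id"], b["id"]))
    return flagged


# Invented example records
cultures = [
    {"id": "c1", "patient": "p1", "lab": "A", "date": date(1998, 3, 2), "pattern": "X"},
    {"id": "c2", "patient": "p2", "lab": "A", "date": date(1998, 3, 5), "pattern": "X"},
    {"id": "c3", "patient": "p1", "lab": "A", "date": date(1998, 3, 6), "pattern": "X"},
    {"id": "c4", "patient": "p3", "lab": "B", "date": date(1998, 3, 4), "pattern": "X"},
    {"id": "c5", "patient": "p4", "lab": "A", "date": date(1998, 5, 1), "pattern": "X"},
]
print(possible_cross_contamination(cultures))
```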

Epidemiological inferences from DNA fingerprinting data

The routine application of DNA fingerprinting on all M. tuberculosis isolates in the Netherlands for an extended period of time provided the possibility of analysing the transmission of M. tuberculosis between people of different nationalities. It was assumed that the probability of a patient being the source of a cluster is proportional to the incidence rate of non-clustered cases multiplied by the probability that a potential source would give rise to a cluster [105]. This assumption was used to calculate the ‘transmission index’, defined as the average number of tuberculosis cases resulting from recent transmission from a potential source case. This transmission index (TI) varied strongly by nationality and it was concluded that transmission between subpopulations could be quantified to validate tuberculosis control. In a similar study in San Francisco, the TI was found to be higher in US-born than in foreign-born groups, and was highest in blacks under the age of 35 [246].

Using RFLP typing results on all M. tuberculosis complex isolates from 1993 through 1996, the correlation between the ages of patients in all clusters comprising two Dutch patients was determined [247]. The mean difference in age between two patients in 81 clusters was 13.9 years, whilst the mean age difference between all possible pairs of individuals in this data set was 25.5 years. It was concluded that, in the Netherlands, tuberculosis cases preferentially transmit infections to people of a similar age [247].
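The comparison described above, between the age difference within two-person clusters and the age difference between all possible pairs, can be sketched as follows. The ages are invented and the code only illustrates the arithmetic; it does not reproduce the statistical analysis of the cited study.

```python
from itertools import combinations
from statistics import mean


def mean_age_difference_within_pairs(pairs):
    """Mean absolute age difference between the two patients of each cluster."""
    return mean([abs(a - b) for a, b in pairs])


def mean_age_difference_all_pairs(ages):
    """Mean absolute age difference over all possible pairs of patients."""
    return mean([abs(a - b) for a, b in combinations(ages, 2)])


# Hypothetical clusters of two patients each (ages in years)
clusters = [(25, 30), (60, 70), (40, 41)]
all_ages = [age for pair in clusters for age in pair]

within = mean_age_difference_within_pairs(clusters)
overall = mean_age_difference_all_pairs(all_ages)
print(f"within clusters: {within:.1f} years, all pairs: {overall:.1f} years")
```

A within-cluster difference that is clearly smaller than the all-pairs difference is what would be expected if transmission occurs preferentially between people of similar age.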

The same set of data was used to identify source and secondary cases and to estimate incubation periods and serial intervals (time intervals between successive cases in a chain of transmission). The geometric mean serial interval was found to be 29.5 weeks and the geometric mean incubation period 20.8 weeks [248]. Though constrained by the relatively short study period (which meant that incubation periods and serial intervals of up to only 5 years could be identified), the findings were consistent with those from earlier studies.
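Geometric means are used for such intervals because their distribution is typically skewed, with a few very long values. A minimal sketch of the calculation is given below; the interval values are invented and are not data from the cited study.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive values (exp of the mean log)."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("values must be non-empty and strictly positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical serial intervals in weeks, including one long interval
serial_intervals = [8, 12, 20, 30, 45, 150]
arithmetic = sum(serial_intervals) / len(serial_intervals)
print(f"arithmetic mean: {arithmetic:.1f} weeks")
print(f"geometric mean:  {geometric_mean(serial_intervals):.1f} weeks")
```

The single long interval pulls the arithmetic mean upwards much more than the geometric mean, which is why the latter gives a more representative summary of skewed interval data.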

Discriminatory power and reproducibility of typing methods for M. tuberculosis complex isolates

In a recent study, most available molecular typing methods for M. tuberculosis complex isolates were compared with regard to discriminatory power and reproducibility in an interlaboratory study by Kremer et al. [18]. All five RFLP typing methods using IS6110 [98], IS1081 [39], PGRS [99], the DR [99] and the (GTG)5 repeat [249] as probes were found to be highly reproducible [18]. Amongst the PCR-based typing methods investigated, variable number of tandem repeats (VNTR) typing [140], mixed-linker PCR [141] and spoligotyping [125] were found to be highly reproducible, whereas the double repetitive element PCR (DRE-PCR) [139], IS6110 inverse PCR [142], IS6110 ampliprinting [144] and arbitrarily primed PCR [143] were poorly reproducible. IS6110 RFLP typing and the mixed-linker PCR were the most discriminatory methods [18]. The mixed-linker PCR is a promising method that has been optimized to facilitate automated detection of the DNA fingerprint patterns [148]. The latter method has recently evolved to fast ligation-mediated PCR (FliP), in which typing of isolates can be achieved within 6.5 h from small amounts of cells. The FliP exhibited the same discriminatory power and reproducibility as mixed-linker PCR [250]. An adapted variant of the recently introduced mycobacterial interspersed repetitive unit (MIRU) typing [251] was also evaluated using the set of DNAs of the interlaboratory study. MIRU typing, including multiplex PCR and fluorescence-based DNA sizing on an automated sequencer, with software programs for automated genotyping, allele calling and database construction, was found to be highly discriminatory and reproducible (P. Supply, in preparation). The method of MIRU typing is similar to that of VNTR typing; it is based on the variability in copy numbers of tandem repeats of 40–100 bp in length at 12 different intergenic regions of the M. tuberculosis complex genome. The discriminatory power of MIRU typing was higher in comparison with that of VNTR typing. Also, VNTR typing has in due time been adapted for automated reading on a DNA sequencer and has proven to be useful for discrimination of M. bovis strains with identical spoligotypes [173].
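The principle behind VNTR and MIRU allele calling can be sketched in a few lines: the size of the PCR product at a locus, minus the fixed flanking sequence, divided by the repeat unit length gives the number of repeat copies. The locus names, flank lengths, repeat lengths and amplicon sizes below are invented for illustration and do not correspond to published loci.

```python
def repeat_copies(amplicon_bp, flank_bp, repeat_bp):
    """Number of tandem repeat copies implied by a PCR product size."""
    if repeat_bp <= 0:
        raise ValueError("repeat length must be positive")
    copies = (amplicon_bp - flank_bp) / repeat_bp
    if copies < 0:
        raise ValueError("amplicon is shorter than the flanking sequence")
    # Measured sizes are rarely exact, so round to the nearest whole copy
    return round(copies)


# Hypothetical loci: name -> (flanking length in bp, repeat unit length in bp)
LOCI = {
    "locus_1": (171, 53),
    "locus_2": (120, 77),
    "locus_3": (200, 51),
}


def repeat_profile(amplicon_sizes):
    """Numerical profile (copies per locus) for one isolate."""
    return tuple(
        repeat_copies(amplicon_sizes[name], flank, unit)
        for name, (flank, unit) in LOCI.items()
    )


isolate = {"locus_1": 330, "locus_2": 274, "locus_3": 302}
print(repeat_profile(isolate))
```

Because the result is a short tuple of integers per isolate, profiles from different laboratories can be compared and stored far more easily than banding patterns.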

Other DNA fingerprinting methods have been introduced in the last few years, which were not included in the comparative study described above, such as pulsed-field gel electrophoresis (PFGE) [188], heminested inverse PCR (HIP) [150], enterobacterial repetitive intergenic consensus sequences PCR (ERIC-PCR) [151] and ligation-mediated PCR (LMPCR) [145].

In PFGE, DNA is cleaved with rarely cleaving restriction enzymes [188, 252–256]. The large restriction fragments are separated using sophisticated and expensive electrophoresis equipment. The results of PFGE show a level of discrimination that can be used in the epidemiology of tuberculosis, but it is not clear what the level of discrimination is in comparison with other typing methods. In addition, this method is at least as demanding as RFLP typing.

The heminested inverse PCR targets the insertion sequence IS6110 together with its upstream flanking region and the results obtained with this method seem reproducible [149, 150].

In 1998, a method was introduced for typing of M. tuberculosis complex isolates, based on primers targeting the enterobacterial repetitive intergenic consensus sequences. The authors claimed that the level of discrimination of ERIC-PCR is greater than that of the IS6110 RFLP typing [151].

The LMPCR, a ligation-mediated method in which a flanking sequence located at the 5′ side of IS6110 is amplified [145, 146], was compared with spoligotyping and IS6110 RFLP, and proved to be more discriminatory than spoligotyping and less discriminatory than IS6110 RFLP [160].

IS6110 RFLP typing is not ideal, as it is technically demanding and expensive, so the development of faster and easier-to-perform techniques is welcome. It should, however, not be abandoned lightly since it has proven to give a level of discrimination between M. tuberculosis isolates that allows reliable epidemiological interpretations. Furthermore, considerable information has been collected on the stability of this genetic marker and the variability of IS6110 RFLP patterns of M. tuberculosis isolates in different settings. Also the presence of large databases of IS6110 RFLP patterns established over extended time periods is a significant factor in the consideration of the introduction of new genetic markers.

Mycobacterium avium infections in humans usually not derived from birds

The M. avium complex consists of a group of frequently encountered atypical mycobacteria isolated from humans. In AIDS patients in particular, these bacteria act like opportunistic pathogens [64–70]. The M. avium complex comprises a heterogeneous group of slow-growing bacteria, ubiquitous in nature. It is still uncertain how humans contract M. avium complex infections, although some studies describe the isolation of these bacteria from tap water [257–259], soil [259], cheese [260], cigarettes [261], and animals such as birds and pigs [262–265].

Only a few years ago, the only strain typing method available to examine the transmission of M. avium complex strains was serotyping [266]. Recently, genetic markers such as RFLP typing using the insertion sequences IS1245, IS1311, IS901, IS902 and IS1110, and PFGE, appeared very useful to type M. avium ssp. avium isolates [262, 263, 267–277]. Serotyping and IS1245 RFLP typing correlate only partly [263]. Some strains with identical serotypes exhibit completely different IS1245 RFLP patterns [263].

Other methods for typing M. avium ssp. avium isolates are the rRNA typing method [278], multilocus enzyme electrophoresis [279, 280], restriction enzyme analysis, field-inversion gel electrophoresis, RFLP analysis with random DNA probes, typing on the basis of chemical analysis of cell wall components and by the presence of plasmids [281]. PCR-based methods of typing M. avium are randomly amplified polymorphic DNA (RAPD) analysis [282] and PCR amplification of the chromosomal DNA between IS1245 and IS1311.

With a few exceptions, the presence of IS1245 has been found to be restricted to M. avium ssp. avium [263, 269]. From the molecular epidemiological results obtained so far with the standardized IS1245 RFLP typing [283], some conclusions can be drawn. Most serial M. avium ssp. avium isolates from individual patients exhibit unchanging IS1245 RFLP patterns over time [263, 268]. This indicates that these patients are infected/colonized for periods of months or even years with the same M. avium ssp. avium strains and this may be an important indication that the isolation of M. avium ssp. avium is associated with clinical significance rather than with coincidental presence. The aetiology of M. avium complex infections in humans is not clear, as these bacteria can be isolated from a large number of sources. Bono et al. [262] used IS1245 RFLP typing to compare M. avium ssp. avium isolates recovered from different animal species and humans. The M. avium ssp. avium isolates from humans revealed multibanded IS1245 RFLP patterns, which shared a high degree of similarity with patterns of the isolates from pigs. Bird isolates exhibited a specific three-band IS1245 RFLP pattern. In the Netherlands, examination of a broad spectrum of M. avium complex isolates from different sources confirmed this picture [263, 265]. Almost all 90 M. avium ssp. avium strains isolated from lymph nodes of slaughtered pigs showed multibanded IS1245 RFLP patterns with a high degree of similarity with a large proportion of the multibanded IS1245 RFLP patterns of isolates from humans [265, 284]. In a study in Denmark, samples of peat to be used as potting soil were also found to contain viable M. avium ssp. avium bacteria with the same type of multibanded IS1245 RFLP patterns as isolates from humans [269]. This might be an indication that both humans and pigs are easily exposed to M. avium ssp. avium by contact with soil. In contrast, all 49 M. avium ssp. avium strains isolated in the Netherlands from 29 different bird species exhibited the specific three-band ‘bird type’ IS1245 RFLP pattern. This ‘bird type’ pattern was hardly ever found amongst M. avium ssp. avium isolates from humans [263, 284]. Birds therefore seem to be an unusual source of M. avium ssp. avium infections.

Typing methods for other mycobacteria

Several DNA fingerprinting methods have been developed for other atypical mycobacteria.

Mycobacterium avium ssp. paratuberculosis is the cause of paratuberculosis in ruminants and may be involved in the aetiology of Crohn’s disease in humans [73, 76–85]. To study the epidemiology of M. paratuberculosis infections, IS900 RFLP typing has been developed and standardized [80, 81, 285–287]. Although, in general, the M. avium ssp. paratuberculosis isolates are genetically highly conserved, with IS900 RFLP typing the spread of different M. avium ssp. paratuberculosis IS900 RFLP types in Europe and Latin America was recently demonstrated [288].

Pulsed-field gel electrophoresis also appears useful for typing M. avium ssp. paratuberculosis [254].

Mycobacterium kansasii is ubiquitous in the environment, and frequently found in water [289–293]. Symptoms of M. kansasii infections often resemble tuberculosis caused by M. tuberculosis complex bacteria. The PGRS and MPTR sequences can be used to type M. kansasii isolates with the aid of RFLP typing on the basis of these repetitive elements as probes [97, 294–296]. IS1652 RFLP is only suitable for a limited number of M. kansasii subspecies [296, 297]. Other typing methods are, for example, 16S-23S spacer region typing [298], large restriction fragment analysis [299], amplified fragment length polymorphism analysis [296], PFGE [296] and PCR restriction analysis of the hsp-65 gene [300].

Mycobacterium gordonae is also widespread in nature. These bacteria, which have never been found in severe cases of tuberculosis, can be typed with the same insertion sequences as M. kansasii [97, 294–296]. In addition, IS1511/IS1512 shows a high degree of polymorphism amongst M. gordonae isolates [301].

In rare cases, M. xenopi causes lung tuberculosis. Collins and Stephens [47] described the identification of IS1081 in M. bovis, and Picardeau et al. [302] described the use of IS1395, a closely related element, to study the epidemiology of M. xenopi infections.

Mycobacterium abscessus isolates can be typed by multilocus enzyme electrophoresis and pulsed-field gel electrophoresis [303, 304]. The latter method has also been used for typing of Mycobacterium chelonae [303]. Furthermore, random amplified polymorphic DNA genotyping has been used to type M. abscessus [305, 306] and M. malmoense [307].

For Mycobacterium haemophilum, an RFLP method has been established using a multiple copy genetic element that exhibited a limited degree of polymorphism [308].

Recent developments in the molecular typing of M. tuberculosis

The standard method for molecular typing of M. tuberculosis isolates, IS6110 RFLP typing, is technically demanding and laborious. Furthermore, a proportion of the M. tuberculosis isolates and most M. bovis strains contain no, or only a few, copies of the IS6110 insertion sequence. For these strains, additional typing with other methods, such as PGRS RFLP typing [46, 99, 117–119, 121], DR RFLP typing [46, 109, 116–118] or spoligotyping [120, 122–124], is required. Moreover, all currently used methods involve a lot of work to visualize only the genetic polymorphism associated with particular, usually repetitive, genomic sequences. Because the whole genomic sequence of the M. tuberculosis strain H37Rv has been published [309], it has become possible to develop DNA fingerprinting techniques in which a larger part of the genomic evolutionary divergence is visualized. This is particularly interesting because this may facilitate the study of the genetic relatedness of M. tuberculosis isolates at different levels in one assay. To follow the direction of transmission of tuberculosis and to distinguish between primary, secondary, etc. sources, the average half-life of IS6110 RFLP, which has been estimated as 3–5 years [200, 201], is too long. For instance, large clusters of patients on the basis of IS6110 RFLP typing already span a period of 7 years in the Netherlands. For a proportion of the patients concerned, it cannot be determined whether transmission of tuberculosis occurred from the initial source case, or from secondary or tertiary sources. This problem will grow with the extension of the period in which IS6110 RFLP typing is used. For this reason, genomic markers need to be identified which visualize the most rapid evolutionary development of M. tuberculosis.

In contrast, to study the associations between phenotypic behaviour of bacteria and the position of the respective strain in the phylogenetic tree of M. tuberculosis, methods are required that yield insight into the long-term evolutionary development of M. tuberculosis.

The release of the whole genome sequence of the M. tuberculosis strain H37Rv revealed 56 loci with homology to known insertion sequences [27, 47, 249, 294, 295, 309–312]. Furthermore, other repetitive elements have been identified that encode acidic, glycine-rich proteins: the so-called PE (Pro-Glu) and PPE (Pro-Pro-Glu) multigene families, which account for about 10% of the coding capacity of the genome [309]. These PE and PPE genes contain multiple copies of PGRS and major polymorphic tandem repeat (MPTR) sequences. These regions may make it possible to develop new molecular markers with either a slower or a more rapid turnover than IS6110 RFLP. Ideally, epidemiological markers with different molecular clocks should be combined in one assay.

One recently introduced method that visualizes base substitutions across the whole genome of M. tuberculosis is FAFLP typing [45]. In brief, genomic DNA is digested with the two restriction enzymes EcoRI and MseI. With the aid of DNA ligase, adaptors are linked to the restriction fragments. To visualize only particular restriction fragments after PCR amplification, the primers for the EcoRI adaptor site carry one selective base (A, T, C or G), each labelled with a different fluorescent dye. The amplification products are separated on a denaturing polyacrylamide gel and read on an automated sequencer. In some sets of strains, the FAFLP patterns were more discriminative than the IS6110 RFLP patterns [45]. This may indicate a more rapid turnover in the genomic areas visualized.
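The selection principle of FAFLP can be illustrated in silico. The following is a minimal sketch under strong simplifying assumptions: a linear sequence, complete digestion, adaptors not modelled, and only fragments with one EcoRI end and one MseI end counted. The recognition sites used (EcoRI G^AATTC, MseI T^TAA) are standard; the function names and the toy sequence are hypothetical and are not taken from the cited study.

```python
import re

# Recognition sites and the offset of the cut within the site (top strand).
ENZYMES = {"EcoRI": ("GAATTC", 1), "MseI": ("TTAA", 1)}
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def cut_map(seq):
    """Map every cut coordinate in `seq` to the enzyme that cuts there."""
    cuts = {}
    for name, (site, offset) in ENZYMES.items():
        for match in re.finditer(site, seq):
            cuts[match.start() + offset] = name
    return cuts


def faflp_fragment_lengths(seq, selective_base):
    """Lengths of EcoRI/MseI fragments matching one selective base (toy model)."""
    seq = seq.upper()
    cuts = cut_map(seq)
    ordered = sorted(cuts)
    lengths = []
    for left, right in zip(ordered, ordered[1:]):
        ends = (cuts[left], cuts[right])
        fragment = seq[left:right]
        if ends == ("EcoRI", "MseI"):
            # Fragment starts with the AATTC overhang; the next base is selective.
            base = fragment[5] if len(fragment) > 5 else None
        elif ends == ("MseI", "EcoRI"):
            # EcoRI end on the right: the primer reads the bottom strand, so use
            # the complement of the base just before the final G.
            base = COMPLEMENT.get(fragment[-2]) if len(fragment) > 1 else None
        else:
            continue  # EcoRI/EcoRI and MseI/MseI fragments are ignored here
        if base == selective_base:
            lengths.append(len(fragment))
    return sorted(lengths)


# Made-up toy sequence with two EcoRI sites flanking one MseI site.
toy = "CC" + "GAATTC" + "A" + "CCCC" + "TTAA" + "GGGG" + "G" + "GAATTC" + "CC"
for base in "ACGT":
    print(base, faflp_fragment_lengths(toy, base))
```

For this toy sequence, the selective base A picks out the 11-base fragment and C the 9-base fragment, whereas G and T pick out none. A single base substitution next to an EcoRI site would therefore move a fragment from one dye channel to another, which is the kind of change the method is described as visualizing.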

Other approaches that visualize relatedness on the basis of the whole genome sequence, rather than on polymorphism associated with a limited number of genomic loci of particular repetitive sequences, are DNA microarray and DNA chip technologies [195, 313, 314]. These techniques allow detection of genetic variation at various genomic sites simultaneously by analysis of hybridization of mycobacterial DNA on high-density oligonucleotide arrays containing thousands of DNA oligonucleotides on a limited surface [314]. Such an approach was recently used to examine the genetic variation amongst M. bovis BCG vaccine daughter strains in comparison with H37Rv [195]. The microarray and chip technologies in principle enable identification, drug susceptibility testing and molecular typing at various evolutionary levels to be performed in one assay.
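How array hybridization data can point to genomic differences between a test strain and a reference strain may be sketched as follows. This is a simplified illustration, not the analysis used in the cited studies: it assumes one background-corrected signal per probe, and it flags probes whose test-to-reference log2 ratio falls below a cutoff as candidate deleted regions. The function name, the cutoff of -1.0, the signal floor and all probe names and values are hypothetical.

```python
import math


def candidate_deletions(reference, test, log2_cutoff=-1.0, floor=1.0):
    """Return probe IDs whose test/reference log2 signal ratio is below the cutoff.

    `reference` and `test` map probe IDs to hybridization signals.
    Probes without a usable reference signal are skipped as uninformative.
    """
    flagged = []
    for probe_id, ref_signal in reference.items():
        if probe_id not in test or ref_signal <= floor:
            continue
        # Floor the test signal so that a complete loss of signal stays finite.
        test_signal = max(test[probe_id], floor)
        if math.log2(test_signal / ref_signal) < log2_cutoff:
            flagged.append(probe_id)
    return sorted(flagged)


# Made-up signals for three hypothetical probes.
reference_signals = {"probe_1": 1000.0, "probe_2": 800.0, "probe_3": 1200.0}
test_signals = {"probe_1": 950.0, "probe_2": 40.0, "probe_3": 1100.0}
print(candidate_deletions(reference_signals, test_signals))
```

In this made-up example only probe_2 is flagged, because its signal in the test strain is far below that of the reference. In practice, replicate hybridizations and normalization would be needed before any such call could be trusted.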


The significant contribution of Dr Judith Glynn of the London School of Hygiene and Tropical Medicine in editing the manuscript is gratefully acknowledged. I also thank Mrs Kristin Kremer of my department for her extensive contribution to the organization of the numerous references.

Received 23 August 2000; accepted 3 October 2000.