Spoligotype‐based population structure of Mycobacterium tuberculosis in the Jimma Zone, southwest Ethiopia

Abstract
Background: Defining the population structure of the circulating Mycobacterium tuberculosis strains is important for understanding population dynamics and proposing more effective preventive strategies.
Methods: A total of 177 M. tuberculosis complex isolates from pulmonary tuberculosis (TB) cases in southwest Ethiopia were genotyped by spoligotyping. Of the isolates included in this study, 126 were pan-susceptible, while the remaining 51 were resistant to one or more first-line anti-TB drugs. The genotyping results were compared to the international spoligotyping (SITVIT) database of the Pasteur Institute of Guadeloupe and the newly revised publicly available international multi-marker database (SITVITWEB/SPOLDB4). The online tool Run TB-Lineage was also used to predict the major lineages using a conformal Bayesian network analysis.
Results: Spoligotyping of the 177 isolates yielded 69 different spoligotype patterns. Of the isolates, 127 (71.8%) were grouped into 19 spoligoclusters (clustering rate of 61.02%), each containing 2-29 isolates. Of the isolates with a corresponding SIT in SITVIT/SDB4, the predominant strains were SIT 37 of the T3 subfamily with 29 isolates, followed by SIT 53 of the T1 subfamily with 20 isolates. SIT 777 of the H4 subfamily and SIT 25 of the CAS1_DELHI subfamily, each comprising six isolates, were also identified. Eighty isolates had orphan spoligotype patterns that were not recorded in the SITVIT2/SPDB4 database. Further classification of the isolates by major lineage showed that 82.5% and 14.1% of the isolates belonged to the Euro-American and East African Indian lineages, respectively, while 2.8% belonged to Mycobacterium africanum and 0.6% to the Indo-Oceanic lineage.
Conclusion: The ill-defined T and H clades were predominant around Jimma. The substantial number of orphans recorded in the study area warrants additional studies that use genotyping methods with better resolution and cover the whole of southwest Ethiopia.


| BACKGROUND
Tuberculosis (TB) remains a global public health challenge despite the availability of a vaccine and drugs for treatment. WHO estimates that one third of the world's population is currently infected with bacteria of the Mycobacterium tuberculosis complex and that 10 million cases of active TB disease occur each year, resulting in almost 2 million deaths annually. Ethiopia is one of the 22 high-burden countries in terms of the number of TB cases. In 2016, the estimated TB incidence was 177/100,000 population. The country is also one of the high-burden countries for multidrug-resistant tuberculosis (MDR-TB) and HIV (World Health Organization, 2017).
Ethiopia has succeeded in reducing the burden of TB over the years. However, there is an increasing trend in drug-resistant TB. According to the WHO 2011 report, drug-resistant TB was 1.6% among new cases and 12% among retreatment cases (World Health Organization, 2011). In the WHO 2017 report, the rate had increased to 2.7% for new cases and 14% for retreatment cases (World Health Organization, 2017).
Spoligotyping was developed as a genotyping tool to provide information on the structure of the direct repeat (DR) region in individual M. tuberculosis strains and in the different members of the M. tuberculosis complex (Kremer et al., 1999). It is a valuable tool for studying the population structure and molecular epidemiology of M. tuberculosis.
Understanding the genetic diversity of the circulating M. tuberculosis strains is important for defining the population dynamics and proposing more effective preventive and control strategies. Certain genotypes of M. tuberculosis have been associated with drug resistance (Glynn, Whiteley, Bifani, Kremer, & Soolingen, 2002; Haeili et al., 2013; Pang et al., 2012).
While the genetic diversity of M. tuberculosis lineages in Ethiopia has been investigated (Diriba, Berkessa, Mamo, Tedla, & Ameni, 2013; Garedew et al., 2013), there are few or no data from the southwest part of Ethiopia in general and the Jimma area in particular. The objective of this study was to investigate the population structure of the mycobacterial isolates collected between November 2010 and June 2012 from the Jimma area.

| Source of isolates
The isolates for this study were recovered from TB patients included in the drug resistance surveys among new and retreatment cases around Jimma (Abdella, Abdissa, Kebede, & Abebe, 2015; Abebe et al., 2012). The stored isolates were subcultured, and 179 isolates were recovered: 115 from the study among new cases (Abebe et al., 2012) and 64 from the study on retreatment cases (Abdella et al., 2015). They were inactivated at 80°C for 1 hr and transported to the Aklilu Lemma Institute of Pathobiology, Addis Ababa University, for spoligotyping using the 43-spacer version.
The drug resistance patterns of the isolates were previously reported (Abdella et al., 2015; Abebe et al., 2012). Of the isolates included in this study, 126 were pan-susceptible to the first-line anti-TB drugs. The remaining 51 isolates were resistant to one or more first-line anti-TB drugs. Fifteen isolates were MDR. The isolates that were not MDR included four rifampicin mono-resistant, 22 isoniazid-resistant, 11 ethambutol-resistant, and 16 streptomycin-resistant isolates. Of the 15 MDR-TB isolates, 11 were also resistant to ethambutol and 14 were resistant to streptomycin.

| HIV testing
HIV testing is provided for every presumptive TB case at health facilities in Ethiopia. The test results were recorded from the patients' cards after obtaining consent.

| Spoligotyping
Spoligotyping was performed at the Aklilu Lemma Institute of Pathobiology, Addis Ababa University. One hundred and seventy-seven (177) isolates confirmed as M. tuberculosis using the para-nitrobenzoic acid inhibition test and RD9 typing were further analyzed by spoligotyping according to the standardized protocol (Kamerbeek et al., 1997) and following the manufacturer's instructions (reagents from Ocimum Biosolution, custom Master Mix from ABgene). DNA from two (2) isolates could not be amplified, and these isolates were therefore not spoligotyped. The presence of spacers was visualized on film as black squares after incubation with streptavidin-peroxidase and enhanced chemiluminescence detection reagents (RPN 2105 Amersham, GE Healthcare Bio-Sciences).
The spacer hybridization was read by two independent readers; in case of discrepancy, the opinion of a third reader was considered. The patterns were translated into binary and octal format as previously described (Dale et al., 2001). The 43-digit binary code was converted to 15-digit octal code (base 8, having the digits 0-7).
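The binary-to-octal conversion can be illustrated with a short Python sketch. It assumes the usual convention for the 43-spacer format, in which the first 42 digits are read as 14 triplets (one octal digit each) and the 43rd digit is appended unchanged. The function name and the example pattern are illustrative and are not taken from the study data.

```python
def binary_to_octal(binary: str) -> str:
    """Convert a 43-digit binary spoligotype into its 15-digit octal code."""
    if len(binary) != 43 or set(binary) - {"0", "1"}:
        raise ValueError("expected a string of exactly 43 characters, each 0 or 1")
    # Spacers 1-42: 14 groups of three binary digits, each giving one octal digit.
    triplets = [binary[i:i + 3] for i in range(0, 42, 3)]
    octal = "".join(str(int(triplet, 2)) for triplet in triplets)
    # Spacer 43 has no partners, so it is appended as a single 0 or 1.
    return octal + binary[42]


# Illustrative pattern: all spacers present except spacers 33-36.
example_pattern = "1" * 32 + "0" * 4 + "1" * 7
example_octal = binary_to_octal(example_pattern)
print(example_octal)
```

Validating the length and character set up front helps catch transcription errors from the film reading before the pattern is compared with a database.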

| Genotype database comparison and analysis
The spoligotyping results were compared to the international spoligotyping (SITVIT) database of the Pasteur Institute of Guadeloupe (https://www.pasteur-guadeloupe.fr:8081/SITVITDemo) (Brudey et al., 2006) and the newly revised publicly available international multimarker database (SITVITWEB) (Demay et al., 2012). Orphan strains were further compared with MIRU-VNTRplus, an online database (https://www.miru-vntrplus.org) (Weniger, Krawczyk, Supply, Niemann, & Harmsen, 2010). A SIT (spoligotype international type) was assigned to the isolates that share an identical spoligotype pattern in the database, while spoligotype patterns that had not been registered before were defined as "orphan." Spoligotype patterns that did not cluster with other patient isolates were defined as unique. An online tool Run TB-Lineage (https://tbinsight.cs.rpi.edu/run_tb_lineage.html) was also used to predict the major lineages using the conformal Bayesian network (CBN) analysis.
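The definitions above involve two independent labels: whether a pattern is registered in the database (shared type with a SIT versus orphan) and whether it recurs among the study isolates (clustered versus unique). A minimal Python sketch of that logic follows. The isolate identifiers, octal codes, and SIT labels are hypothetical placeholders, and the dictionary lookup merely stands in for a query against SITVITWEB.

```python
from collections import Counter


def classify_patterns(isolate_patterns, sit_lookup):
    """Label each isolate by database status and by clustering within the study."""
    # Count how many study isolates carry each octal pattern.
    pattern_counts = Counter(isolate_patterns.values())
    labels = {}
    for isolate_id, octal in isolate_patterns.items():
        sit = sit_lookup.get(octal)  # None when the pattern is not registered
        labels[isolate_id] = {
            "sit": sit,
            "database_status": "shared type" if sit is not None else "orphan",
            "cluster_status": "clustered" if pattern_counts[octal] >= 2 else "unique",
        }
    return labels


# Hypothetical data for illustration only.
isolates = {
    "iso1": "777777777777771",
    "iso2": "777777777777771",
    "iso3": "700000000000001",
    "iso4": "077777777777771",
    "iso5": "077777777777771",
}
registered = {"777777777777771": "SIT_X"}  # placeholder SIT label
labels = classify_patterns(isolates, registered)
```

In this example, iso4 and iso5 are both orphan and clustered, which shows that the two labels are not mutually exclusive.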

| Statistical analysis
Statistical analysis was performed using SPSS Statistics 17 (SPSS Inc., USA). Proportions of drug susceptibility test profiles for M. tuberculosis strains were compared using chi-square analysis. The two-sided Pearson's chi-square test was used to assess associations of drug resistance profiles with spoligotype families. Furthermore, the association of first-line anti-TB drug resistance profiles with MTBC genotypes was estimated and expressed as the odds ratio (OR) with its 95% confidence interval (95% CI). A p value of <0.05 was considered statistically significant.
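For a single 2 x 2 table (for example, a given spoligotype family versus all others, by resistant versus susceptible), these quantities can be sketched in Python as follows. The study itself used SPSS; the Woolf (log-based) confidence interval and the uncorrected Pearson statistic used here are assumptions, since the text does not specify the exact options, and the counts are hypothetical rather than taken from Table 4.

```python
import math


def two_by_two_summary(a, b, c, d, z=1.96):
    """Pearson chi-square (1 df, no continuity correction), OR, and Woolf 95% CI.

    a, b = resistant / susceptible isolates in the family of interest
    c, d = resistant / susceptible isolates in all other families
    All four cells must be non-zero for the OR and its CI to be defined.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the upper-tail p value equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ci_lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    ci_upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return {"chi2": chi2, "p_value": p_value, "odds_ratio": odds_ratio,
            "ci_lower": ci_lower, "ci_upper": ci_upper}


# Hypothetical counts for illustration only.
summary_2x2 = two_by_two_summary(a=10, b=5, c=20, d=40)
print(summary_2x2)
```

When any cell count is small, an exact test would usually be preferred over the chi-square approximation.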

| Ethics statement
This study was part of the previous studies that were published (Abdella et al., 2015;Abebe et al., 2012). Both studies were ap-

| Characteristics of the patients from which the isolates were collected
In this study, data from 177 study subjects were used in the demographic, clinical, and mycobacteriological analyses.
Most of the study subjects were male (54.8%) and in the age group of 18-28 years (61.0%). Most (82.5%) of the study participants were new TB cases (Table 1).

| Spoligotype patterns of the Mycobacterium tuberculosis strains
Spoligotyping of the 177 isolates yielded 69 different spoligotype patterns (clustering rate of 61.02%). Of the isolates, 127 (71.8%) were grouped into 19 spoligoclusters containing 2-29 isolates per cluster. The remaining 50 (28.2%) isolates were unique, meaning that they did not cluster with other patient isolates in this study. Over half of the isolates (97 isolates, representing 24 spoligotype patterns) were shared types. Of these shared types, the predominant strains were SIT 37 of the T3 subfamily, comprising 29 isolates, followed by SIT 53 of the T1 subfamily with 20 isolates. In addition to the T family, SIT 777 of the H4 subfamily and SIT 25 of the CAS1_DELHI subfamily, each with six isolates, were identified (Table 2). The remaining eighty isolates had orphan patterns that were not recorded in the SITVIT2/SPDB4 database (Table 3).
Further classification of the isolates' spoligotype patterns using the TB-Insight Run TB-Lineage tool revealed that the Euro-American lineage accounted for 82.5%, East African Indian for 14.1%, Mycobacterium africanum for 2.8%, and Indo-Oceanic for 0.6% (Tables 2 and 3).
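The clustering figures above can be checked with a few lines of Python. The text does not state how the clustering rate was computed; the sketch assumes the commonly used formula (n_c - c) / n, where n_c is the number of clustered isolates, c the number of clusters, and n the total number of isolates. That assumption reproduces the reported 61.02%.

```python
def clustering_summary(n_isolates, n_clustered, n_clusters):
    """Summarize clustering using the assumed formula (n_c - c) / n."""
    n_unique = n_isolates - n_clustered
    return {
        "clustered_pct": round(100 * n_clustered / n_isolates, 1),
        "unique_pct": round(100 * n_unique / n_isolates, 1),
        "clustering_rate_pct": round(100 * (n_clustered - n_clusters) / n_isolates, 2),
        # Every cluster and every unique isolate contributes one distinct pattern.
        "n_patterns": n_clusters + n_unique,
    }


# Figures reported in the text: 177 isolates, 127 clustered, 19 clusters.
cluster_stats = clustering_summary(n_isolates=177, n_clustered=127, n_clusters=19)
print(cluster_stats)
```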

| M. tuberculosis drug resistance by lineage and family
The drug sensitivity patterns of the isolates are presented in Table 4. Compared with other strains, the CAS1_DELHI subfamily was more likely to be resistant to the first-line anti-TB drugs and to be MDR-TB (Table 4).

| DISCUSSION
In the present study, the population structure of M. tuberculosis isolates was investigated in adults around Jimma, southwest Ethiopia.
Our previous report from this area indicated that the prevalence of pulmonary TB was lower than the national prevalence for Ethiopia (Deribew et al., 2012). The current report suggests that there is high genetic diversity of M. tuberculosis around Jimma.
Previous reports have also indicated that high genetic diversity is expected in areas with a low prevalence of TB (Lopez-Avalos et al., 2017; Michel et al., 2008), since the dominance and transmission of a single strain is less likely; rather, TB in such settings results from reactivation of previous infections. This is supported in the current study by the 69 spoligotype patterns found among the 177 M. tuberculosis isolates typed.
On the other hand, our study suggests that the genetic diversity of M. tuberculosis in the southwest part of Ethiopia is not well studied, as a substantial number of "orphan" strains had no corresponding patterns in the available international databases.
Spoligotyping is known to have relatively low discriminatory power, so isolates within a cluster may still differ at the strain level. Thus, to better understand the M. tuberculosis strains in the study area, further study with methods that have higher resolution is recommended.
The majority of the isolates with SITs reported in this study (SIT 37 and SIT 53) belonged to modern clades of the ill-defined T and H families, which have been consistently reported from Ethiopia (Bedewi et al., 2017; Belay et al., 2013; Getahun et al., 2015; Maru, Mariam, Airgecho, Gadissa, & Aseffa, 2015). These clades were also reported in a study on children in the same study area, suggesting transmission from adults (Workalemahu et al., 2013). This result suggests that these specific families may have achieved epidemiological fitness in this geographic area. The ancient clades were also found in this study, with the CAS family and Manu being the main clades. This is in agreement with previous reports that TB originated in Ethiopia with the ancient clades and that the modern clades were later imported following human population movement (Comas et al., 2015). T3_ETH is the most dominant strain circulating in Ethiopia. Though these strains are new to the database (orphans), they have been reported in southwest and other regions of Ethiopia (Tadesse et al., 2017; Tessema et al., 2013; Zewdie et al., 2016).
This study suggests that CAS1_DELHI clades were more likely to be associated with resistance to first-line anti-TB drugs. In agreement with our finding, a study from New Delhi also concluded that CAS1_DELHI isolates have a high frequency of mutations in the rpoB and katG genes, which are markers of resistance to rifampicin and isoniazid, respectively (Stavrum, Myneedu, Arora, Ahmed, & Grewal, 2009).
In this study, a substantial number (28.2%) of isolates exhibited a unique spoligotype pattern, suggesting that these cases could be the result of reactivation of past infection (Murray, 2002).
In conclusion, the present study has shown the dominance of the ill-defined T and H clades in the study area. Moreover, a substantial number of isolates were orphans, warranting additional studies that cover the whole geographic area of southwest Ethiopia and use genotyping methods with better resolution.