Genetic polymorphisms of long noncoding RNA RP11‐37B2.1 associate with susceptibility of tuberculosis and adverse events of antituberculosis drugs in west China

Background Little is currently known about the biological functions of RP11‐37B2.1, a newly defined long noncoding RNA (lncRNA) molecule. Previous studies have shown that rs160441, located in the RP11‐37B2.1 gene, is significantly associated with tuberculosis (TB) in Ghanaian and Gambian populations. Methods We investigated the influence of single‐nucleotide polymorphisms (SNPs) within lncRNA RP11‐37B2.1 on the risk of TB and their possible correlation with adverse drug reactions (ADRs) from TB treatment in a Western Chinese population. Four SNPs within lncRNA RP11‐37B2.1 were genotyped in 554 TB cases and 561 healthy subjects using the improved multiplex ligation detection reaction method, and the patients were followed up monthly to monitor the development of ADRs. Results No significant association between the SNPs of lncRNA RP11‐37B2.1 and TB susceptibility was observed (all P > 0.05). However, a significant association was observed between two SNPs (rs218916 and rs160441) and the development of thrombocytopenia during anti‐TB therapy under the dominant model (P = 0.003 and 0.014, respectively). Conclusions Our findings show for the first time that rs218916 and rs160441 within lncRNA RP11‐37B2.1 are significantly associated with the occurrence of thrombocytopenia and suggest that RP11‐37B2.1 genetic variants are potential biosignatures for thrombocytopenia during anti‐TB treatment.


| INTRODUCTION
Tuberculosis (TB), an ancient infectious disease, infects about a third of the world's population, with a yearly incidence of approximately 10 million cases and approximately 1.57 million deaths worldwide, based on data from the 2018 World Health Organization (WHO) Global TB report. 1 Because susceptibility to the pathogen Mycobacterium tuberculosis (MTB) varies substantially between individuals, 2 individuals infected with MTB have a 5%-10% lifetime risk of developing clinical TB. 3 Considerable evidence suggests that host genetic factors play a crucial role in protecting individuals from developing active TB disease. 4 Twin studies further substantiate the assumption that host genetics contribute greatly to TB susceptibility. 5 Möller et al 6 demonstrated that the contribution of host hereditary factors to the immune response and phenotypic variation in the population infected with TB ranges up to 71%. However, the exact molecular regulatory mechanisms underlying TB remain largely unknown; further study of the host molecular factors involved in TB infection would therefore help in understanding the pathogenesis of TB.
Long noncoding RNA (lncRNA) transcripts, the largest class of nonprotein coding RNAs, have been reported to participate in diverse biological processes, and their abnormal expression has been related to various disease states. 7 In addition, they are increasingly recognized as playing significant roles in the biology of TB infection. For example, Yang et al 8
In addition to the close relationship between genetic polymorphisms and TB susceptibility, a series of variants has been reported in recent years to be related to the clinical response to drug therapy. [10][11][12] For antituberculosis drugs (ATDs), hepatotoxicity is known as the most serious and prevalent adverse drug reaction. A continuously increasing number of polymorphisms have been linked to the occurrence of antituberculosis drug-induced hepatotoxicity (ATDH): for example, genetic variants within the N-acetyltransferase 2 (NAT2), nuclear receptor subfamily 1 group I member 2 (PXR), and solute carrier organic anion transporter family member 1B1 (SLCO1B1) genes. [13][14][15] Several published studies have shown that lncRNAs are significantly associated with drug effects and resistance in various malignant diseases. 16,17 For example, in a breast cancer cell experiment, researchers found that downregulation of lncRNA ROR inhibited resistance to tamoxifen. 17 Moreover, the GTEx Project shows that rs160441 and rs218921 within RP11-37B2.1 are eQTLs for RIPK2. The interaction of RIPK2 with NOD2 enhances NF-κB activity, making RIPK2 an important player in the immune response. 18 Tovophyllin A attenuated acetaminophen (APAP)-induced liver injury by activating Nrf2 and inhibiting the NF-κB signaling pathway. 19 Therefore, it is speculated that these SNPs within RP11-37B2.1 may affect the occurrence of adverse reactions to antituberculosis drugs.
In general, TB is a complex disease, and immune factors affect its occurrence, its development, and even adverse drug reactions. Moreover, no study of the correlation between ATDs and lncRNA molecules has been reported. Thus, a second aim of our study was to evaluate possible correlations between lncRNA RP11-37B2.1 and ATD-induced adverse effects.
For all of these reasons, we first evaluated the possible association between four common variants (rs160441, rs218916, rs218921, and rs218936) and TB susceptibility among 554 people with TB and 561 healthy individuals in a Western Chinese Han population in a retrospective study. Then, we explored whether these SNPs were correlated with multiple ATD-induced adverse reactions (eg, anemia, leukopenia, thrombocytopenia, hepatotoxicity, and kidney damage) in a prospective analysis.

| Subjects
In our retrospective study, 554 cases and 561 healthy controls were consecutively
In our prospective section, TB patients whose ATD regimens included at least rifampin (RIF, daily 450-600 mg) and isoniazid (INH, daily 300-400 mg) for 6 months or more were further selected to monitor the appearance of adverse drug effects from ATDs. This part included the subjects from the previous section who had no history of liver, kidney, and/or hematologic system disorders before ATD treatment; additional exclusion criteria were poor compliance and/or withdrawal during the 6-month treatment course. Finally, 453 eligible cases with TB were included. We performed peripheral complete blood counts, biochemical examinations, and routine urinalysis monthly for all 453 patients for 6 months or until treatment was completed. ATD-induced adverse reactions in this study included anemia, thrombocytopenia, leukopenia, hepatotoxicity, and chronic kidney damage. In terms of hematologic toxicity, a hemoglobin nadir ≤100 g/L, a white blood cell count nadir <3.5 × 10^9/L, and a platelet count nadir <90 × 10^9/L were considered to be anemia, leukopenia, and thrombocytopenia, respectively. 20 Drug-induced hepatotoxicity was diagnosed according to the criteria for drug-induced liver disorders, in which aspartate aminotransferase (AST) and/or alanine aminotransferase (ALT) levels more than three times the upper limit of normal were considered to indicate hepatotoxicity. 21 Chronic kidney injury was diagnosed as persistence of hematuria, proteinuria, and/or casts for more than 90 days. 22 The study enrollment diagram is shown in Figure 1.
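The diagnostic thresholds above can be expressed as a small rule set. The following is a minimal sketch only: the function name, argument names, and the idea of passing laboratory-specific upper limits of normal (ULN) as parameters are illustrative assumptions, not part of the study protocol.

```python
def classify_adverse_reactions(hb_nadir, wbc_nadir, plt_nadir,
                               alt_peak, ast_peak, alt_uln, ast_uln,
                               urine_abnormal_days):
    """Apply the thresholds stated in the text to one patient's follow-up data.

    hb_nadir: lowest hemoglobin during follow-up (g/L)
    wbc_nadir, plt_nadir: lowest white cell / platelet counts (x 10^9/L)
    alt_peak, ast_peak: highest ALT / AST values (same units as the ULNs)
    alt_uln, ast_uln: laboratory upper limits of normal (assumed inputs)
    urine_abnormal_days: days of persistent hematuria, proteinuria, or casts
    """
    return {
        "anemia": hb_nadir <= 100,
        "leukopenia": wbc_nadir < 3.5,
        "thrombocytopenia": plt_nadir < 90,
        # AST and/or ALT more than three times the upper limit of normal
        "hepatotoxicity": alt_peak > 3 * alt_uln or ast_peak > 3 * ast_uln,
        "chronic_kidney_damage": urine_abnormal_days > 90,
    }


# Hypothetical patient, for illustration only.
example = classify_adverse_reactions(
    hb_nadir=100, wbc_nadir=3.5, plt_nadir=89,
    alt_peak=100, ast_peak=30, alt_uln=40, ast_uln=40,
    urine_abnormal_days=0,
)
```

Note that the anemia cutoff is inclusive (≤100 g/L) whereas the leukopenia and thrombocytopenia cutoffs are strict, so boundary values are classified differently.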

| Genetic molecular analyses
We obtained genetic variation data for the entire lncRNA RP11-37B2.1 locus from the dbSNP database (http://www.ncbi.nlm.nih.gov/projects/SNP/) and comprehensively searched for candidate SNPs for this study. We selected SNPs with a minor allele frequency >0.20 according to 1000 Genomes East Asian data and according to their effects on gene expression based on the expression quantitative trait loci (eQTL) information from HaploReg v4.1. Detailed information on the candidate SNPs is shown in Tables S1 and S2. In addition, rs160441 was included in this study because of its promising role in the predisposition to TB based on the genome-wide association study by Thye et al. 9 Finally, a total of four SNPs (rs218921, rs160441, rs218916, and rs218936) were selected for subsequent genotyping. Genotyping of these SNPs was performed using an improved multiplex ligation detection reaction (iMLDR) method with technical support from Shanghai Genesky Biotechnologies Company. 23 In addition, about 10% of the total samples were randomly selected for repeat genotyping, and the concordance rate of the quality control samples was 100%.

| Statistical analysis
Statistical analysis was performed using SPSS version 20.0 (IBM, Chicago, USA). The chi-square test was used for categorical variables, and the Student's t test or Mann-Whitney U test was used for continuous variables, to analyze the differences in clinical data between the two groups. The goodness-of-fit chi-square test was performed to test for deviations from Hardy-Weinberg equilibrium (HWE) in controls. The odds ratios (ORs) and 95% confidence intervals (CIs) were calculated by unconditional logistic regression analysis using PLINK version 1.07 24 ; the linkage disequilibrium (LD) was estimated by calculating the pairwise r^2 coefficient. Haplotype analysis was performed with Haploview software version 4.2, which employs the expectation-maximization clustering algorithm. Prior to data collection, PASS Statistical Software v11 was used to perform power calculations. All tests were two-sided, and P < 0.05 was considered statistically significant; the PhenoSpD tool was adopted to correct for multiple testing. 25
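For readers unfamiliar with two of the calculations named above, the sketch below illustrates a goodness-of-fit chi-square test for HWE and an unadjusted odds ratio with a Wald 95% CI. It is a simplified illustration and not the PLINK or SPSS procedure used in the study: it assumes a biallelic SNP with both alleles observed, no covariate adjustment, and no zero cells, and all counts shown are hypothetical.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Goodness-of-fit chi-square test (1 df) for Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1 - p                         # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def odds_ratio_wald(a, b, c, d, z=1.96):
    """Odds ratio and Wald 95% CI for a 2x2 table.

    a: exposed with outcome, b: exposed without outcome,
    c: unexposed with outcome, d: unexposed without outcome.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical genotype counts in controls and a hypothetical 2x2 table.
chi2, p_hwe = hwe_chi_square(25, 50, 25)
or_est, ci_low, ci_high = odds_ratio_wald(10, 20, 5, 40)
```

Because the Wald interval is symmetric on the log scale, the product of the two confidence limits equals the square of the odds ratio, which gives a quick consistency check on reported intervals.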

| General characteristics of the Western Chinese Han population
We studied 1115 Western Chinese Han individuals, including 554 TB patients and 561 controls (basic data are shown in Table S3). No statistically significant differences in age or gender were observed between the two groups. Significant differences in smoking status, Bacillus Calmette-Guerin (BCG) scar, and body mass index (BMI) were observed between the two groups, with smoking and having a BCG scar being more prevalent in the case group (both P < 0.001). As shown in Table S3, TB patients had significantly higher levels of C-reactive pro-

| Association of lncRNA RP11-37B2.1 genetic polymorphisms with susceptibility to TB
The genotype distributions of the tested SNPs in the control group were consistent with HWE (P > 0.05). TB patients and controls had very similar genotype and allele distributions for all four SNPs (P > 0.05; Table S4). None of the four SNPs was associated with predisposition to TB under any of the three genetic models (P > 0.05; Table S5). Furthermore, we conducted age-subgroup and clinical subtype-subgroup analyses following earlier studies 26,27 and determined that the four candidate SNPs were not correlated with any specific age group or tubercular subtype (data not shown).
All four variants in the lncRNA RP11-37B2.1 gene were analyzed (results shown in Figure S1) to determine whether they were in linkage disequilib-

| Association of lncRNA RP11-37B2.1 genetic polymorphisms with ATD-induced adverse reactions
In this prospective part, we compared the incidence of ATD-induced adverse reactions across genotypes. Although we observed no significant associations between the four lncRNA RP11-37B2.1 genetic polymorphisms and TB risk, several of these loci were associated with the occurrence of drug-induced thrombocytopenia. Under the dominant model, rs218916 was closely correlated with drug-induced thrombocytopenia (P = 0.003). The results suggest that the T allele of rs218916 might be a risk factor for ATD-induced thrombocytopenia (OR = 5.32, 95% CI = 1.54-18.32; Table 1). For rs160441 and rs218936, patients carrying genotypes containing the T allele were more likely than CC genotype carriers to develop thrombocytopenia during anti-TB chemotherapy, with P = 0.014 (OR = 3.18, 95% CI = 1.21-8.37; Table 2) and P = 0.018 (OR = 3.23, 95% CI = 1.16-8.97; Table 3), respectively. A weak association was also found between rs218921 and anti-TB drug-induced hepatotoxicity under the dominant model (P = 0.048; Table 4).
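The dominant model referred to here compares carriers of at least one copy of an allele against non-carriers. A minimal sketch of that recoding follows; the function name and the genotype counts are hypothetical and are not taken from Tables 1-4.

```python
def collapse_dominant(genotype_counts, allele):
    """Collapse genotype counts into (carriers, non_carriers) of `allele`.

    genotype_counts maps two-letter genotype strings (e.g. "CT") to counts.
    """
    carriers = sum(n for g, n in genotype_counts.items() if allele in g)
    non_carriers = sum(n for g, n in genotype_counts.items() if allele not in g)
    return carriers, non_carriers


# Hypothetical counts for patients with and without an adverse reaction.
with_adr = {"CC": 4, "CT": 6, "TT": 2}
without_adr = {"CC": 250, "CT": 150, "TT": 41}

a, c = collapse_dominant(with_adr, "T")      # carriers / non-carriers with ADR
b, d = collapse_dominant(without_adr, "T")   # carriers / non-carriers without ADR
crude_or = (a * d) / (b * c)                 # unadjusted odds ratio for T carriage
```

The resulting 2x2 table (T-allele carriers versus CC homozygotes, by adverse reaction status) is the input to the odds ratio estimate under this model.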
We used the PhenoSpD tool to estimate phenotypic correlation and correct for multiple testing. 25 The effective number of independent variables was 3, and the experiment-wide significance threshold required to keep the type I error rate at 5% was 0.0170 according to the PhenoSpD correction. After PhenoSpD correction, the associations of rs218916 and rs160441 with the occurrence of drug-induced thrombocytopenia remained statistically significant, whereas that of rs218936 did not. The association between rs218921 and the risk of anti-TB drug-induced hepatotoxicity was likewise not statistically significant after PhenoSpD correction.
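The reported threshold of 0.0170 for 3 effective independent variables is numerically consistent with a Šidák-type correction, 1 - (1 - 0.05)^(1/3). The sketch below reproduces that arithmetic and applies it to the P values reported in this section; that PhenoSpD derives its threshold in exactly this way is an assumption here, and the function name is illustrative.

```python
def sidak_threshold(alpha, m_eff):
    """Per-test significance threshold keeping the family-wise error at alpha."""
    return 1 - (1 - alpha) ** (1 / m_eff)


threshold = sidak_threshold(0.05, 3)

# P values as reported in the text (dominant model).
reported_p = {
    "rs218916 (thrombocytopenia)": 0.003,
    "rs160441 (thrombocytopenia)": 0.014,
    "rs218936 (thrombocytopenia)": 0.018,
    "rs218921 (hepatotoxicity)": 0.048,
}
still_significant = {name: p < threshold for name, p in reported_p.items()}
```

For comparison, a plain Bonferroni correction with three tests would give 0.05/3 ≈ 0.0167, which leads to the same conclusions for these four P values.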

| DISCUSSION
Over the last two decades, accumulating evidence has indicated that lncRNAs might modulate the innate immune response. 28,29 A growing number of lncRNAs have been identified, such as lnc-interleukin 7 receptor and nonprotein coding RNA repressor of NFAT (NRON), representing a new class of molecules associated with the gene expression and functions of immune cells. 28
Beyond the incidence of TB, ATD-induced adverse reactions are an important contributor to the discontinuation or failure of anti-TB treatment. Although drug-induced liver injury is the most common and best-studied adverse reaction caused by TB therapy with INH and RFP, 36 thrombocytopenia is a less common but far more likely fatal adverse effect seen with certain ATDs. RFP is the agent most commonly associated with ATD-induced thrombocytopenia. 37 INH-induced thrombocytopenia is rare, and only a few such cases have been reported in the literature. 38
RFP appears again in the plasma, these antibodies are believed to fix complement on the platelets, resulting in platelet destruction. 39 We speculate that these loci in lncRNA RP11-37B2.1 may affect the incidence of thrombocytopenia through immune mechanisms. These results indicate that variants of lncRNA RP11-37B2.1 contribute to the host response to drug treatment; however, the mechanism by which they affect adverse drug reactions remains unclear. Thus, more rigorous research at the molecular genetic level should be conducted.
There are several limitations to this study. First, our sample size is limited, which may increase the false-negative rate. Second, genetic determinants alone are not sufficient to trigger ATD-induced adverse reactions, and a combination of nongenetic and genetic risk factors may be more powerful in predicting them. Third, according to eQTL analysis, rs160441 and rs218921 are eQTLs for both RP11-37B2.1 and RIPK2. Therefore, in the target tissue of TB infection, RIPK2 is as good a candidate as RP11-37B2.1 for future studies. Finally, replication in other large independent populations and ethnicities is urgently needed to conclusively confirm or reject our findings.

| CONCLUSIONS
No significant association between the SNPs of lncRNA RP11-37B2.1 and TB susceptibility was observed in our study. However, our findings show for the first time that rs218916 and rs160441 within lncRNA RP11-37B2.1 are significantly associated with the occurrence of thrombocytopenia and suggest that RP11-37B2.1 genetic variants are potential biosignatures for thrombocytopenia during anti-TB treatment.

ACKNOWLEDGMENTS
This work was supported by grants from the National Natural Science Foundation of China (grant numbers 81472026, 81672095).