Gene silencing of Helicobacter pylori through newly designed siRNA convenes the treatment of gastric cancer

Abstract
Background: Helicobacter pylori is a gastric pathogen that causes chronic inflammation and increases the risk of gastric cancer development. It can persist for decades in the harsh gastric environment because the host is unable to eradicate the infection. Several treatment strategies using different antibiotics have been developed against this bacterium. However, the effectiveness of treating H. pylori has significantly decreased due to widespread antibiotic resistance, along with an increased risk of gastric cancer. Small interfering RNAs (siRNAs), which are capable of sequence‐specific gene silencing, can be used as a new therapeutic approach for the treatment of a variety of such malignancies. In the current study, we rationally designed two siRNA molecules to silence the cytotoxin‐associated gene A (CagA) and vacuolating cytotoxin A (VacA) genes of H. pylori, given their significant involvement in cancer development.
Methods: We selected a region common to all the available CagA and VacA transcripts from different countries to design the siRNA molecules. The final siRNA candidates were selected based on the results from machine learning algorithms, off‐target similarity, and various thermodynamic properties.
Results: We further utilized molecular docking and all‐atom molecular dynamics (MD) simulations to assess the binding interactions of the designed siRNAs with the major components of the RNA‐induced silencing complex (RISC). The results revealed that the designed siRNAs interact with the proteins of the RISC in a manner comparable to that of experimentally reported siRNAs.
Conclusion: These designed siRNAs are expected to effectively silence the CagA and VacA genes of H. pylori during siRNA‐mediated treatment of gastric cancer.


| INTRODUCTION
The gram-negative bacterium Helicobacter pylori specifically colonizes the gastric epithelium. 1 The spiral-shaped bacterium has 3-5 polar flagella that are used for locomotion and is positive for the urease, catalase, and oxidase enzymes. 2 Most H. pylori strains express virulence factors that have evolved to interfere with host cell signalling pathways. The bacterium has a remarkable ability to persist for decades in the harsh stomach environment, primarily because the host cannot effectively eliminate it. 3,4 By utilizing urease to convert urea into ammonia, H. pylori has acquired the ability to thrive in the extremely acidic environment, setting it apart from other bacteria and viruses. 5 For many years, experts disagreed as to whether H. pylori causes stomach cancer. However, other studies, including one involving 1526 Japanese patients, have now clearly demonstrated that H. pylori infection greatly raises the risk of stomach cancer. 6 Uemura et al. reported that about 3% of people with H. pylori infection had gastric cancer, whereas the uninfected patients had no sign of it. 6 Individuals infected with H. pylori but without premalignant lesions had a significantly reduced risk of developing stomach cancer following H. pylori eradication. Randomized prospective studies have shown a substantial decrease in the presence of premalignant lesions after eradication, highlighting the role of this organism in early gastric carcinogenesis. 7,8

Small interfering RNAs (siRNAs), which are capable of regulating gene expression through a process named RNA interference (RNAi), have been regarded as one of the most notable advances. 9 Since their discovery in the 1990s, RNAi therapeutics have shown great potential for reducing the expression of disease-related genes. 10 A significant milestone for RNAi therapy was achieved in 2018 when the first siRNA-based drug, 'Patisiran' (Onpattro®), received approval for the treatment of transthyretin-mediated amyloidosis. 11 In this therapeutic method, double-stranded RNAs (dsRNAs) designed to target disease-causing mRNA sequences are incorporated into a gene regulatory complex known as the RNA-induced silencing complex (RISC), consisting of Dicer, Argonaute-2 (Ago2), and transactivation response RNA-binding protein (TRBP). 12,13 The enzyme Dicer first initiates the RNAi mechanism by cleaving dsRNAs into 21-25 nt long, double-stranded siRNAs. The siRNA guide strand is then loaded onto the RISC and the siRNA passenger strand is unwound. 9 As a result, the target mRNAs can be cleaved by Ago2 when the guide strand sequence pairs with the corresponding mRNA sequence. 9

The vast majority of earlier research on siRNA design is sequence specific to a target gene. Sohrab et al. designed potential siRNAs targeting ORF1ab of MERS-CoV and experimentally validated them in the Vero cell line. 14 Oany et al. designed a most probable siRNA to repress the nucleocapsid gene of several Nipah virus strains. 15 Chowdhury et al. designed several siRNA molecules targeting the nucleocapsid phosphoprotein and surface glycoprotein genes of SARS-CoV-2 and showed the synergy of these siRNAs with the Ago2 protein. 16 To effectively address these maladies, it is essential to employ a proper computational approach for siRNA design and development. 17

Additionally, the issues of stability and early in vivo clearance are resolved by incorporating siRNA molecules into specialized nanocarriers tailored for specific tissues or cells. 18 These nanocarriers enhance the overall effectiveness of naked siRNA molecules by mitigating concerns such as off-target binding and undesired immune reactions. 19

Therefore, in this study, we aimed to utilize several computational algorithms and molecular dynamics (MD) simulations to design potential siRNAs targeting the CagA and VacA genes of H. pylori, which would suppress the translation of these virulence proteins and allow the host to eradicate the infection. siRNAs were designed against both CagA and VacA because both are abundant proteins produced by the bacterium, and targeting them might result in bacterial inhibition. We hope this study will help to develop a new treatment strategy for H. pylori-mediated gastric cancer. The overall workflow of the study, including the methodological steps, is shown in Figure 1.

FIGURE 1 The overall summary of this study.

| Pathophysiology of H. pylori and targeted gene identification
H. pylori is the primary risk factor for gastric cancer. Management should therefore target H. pylori-infected individuals at high risk for stomach cancer. 20 The most well-known genes of H. pylori are cytotoxin-associated gene A (CagA) and vacuolating cytotoxin A (VacA). 21 Worldwide, the severe forms of gastroduodenal disease are caused by the virulence factors cagA and vacA genotypes and their variations. 22 Several studies have demonstrated that CagA and VacA exhibit various signature pathways of initiation of gastric adenocarcinoma (GAC). 23 Besides, H. pylori exhibits varying levels of antibiotic resistance in different geographic regions; this is one of the primary causes of the absence of effective treatment. 24 We explored the CagA and VacA genes as powerful virulence factors for GAC and designed siRNAs against them to lower the infection in humans.

| Selection of cDNA sequences of target gene
We collected CagA cDNA sequences from seven countries (Bangladesh, Sweden, USA, Australia, China, Philippines, and Vietnam) and VacA cDNA sequences from six countries (Bangladesh, Nepal, Japan, Indonesia, India, and China) from the NCBI database (https://www.ncbi.nlm.nih.gov/). As no CagA or VacA cDNA sequences from Bangladesh were available in the NCBI database, we instead collected the available H. pylori whole genome sequences from this country. For the other selected countries, the available CagA and VacA cDNA sequences were collected. The sequences were then subjected to multiple sequence alignment (MSA) using Clustal Omega 25 to find the conserved regions of the two genes.
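As an illustration of what "conserved region" means in this step, the following minimal Python sketch scans an already aligned set of sequences and reports gap-free stretches in which every sequence carries the same nucleotide. It is not the Clustal Omega procedure itself, which produces the alignment. The function name `conserved_regions`, the `min_len` parameter, and the toy alignment are assumptions made for this example only.

```python
def conserved_regions(aligned, min_len=19):
    """Return (start, end, sequence) for gap-free runs where all rows agree.

    `aligned` is a list of equal-length strings taken from a multiple
    sequence alignment ('-' marks a gap). `end` is exclusive.
    """
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(row) != length for row in aligned):
        raise ValueError("aligned sequences must have equal length")

    regions = []
    run_start = None
    # Iterate one column past the end so that a trailing run is closed.
    for col in range(length + 1):
        same = (
            col < length
            and aligned[0][col] != "-"
            and all(row[col] == aligned[0][col] for row in aligned)
        )
        if same and run_start is None:
            run_start = col
        elif not same and run_start is not None:
            if col - run_start >= min_len:
                regions.append((run_start, col, aligned[0][run_start:col]))
            run_start = None
    return regions


# Toy alignment (hypothetical, far shorter than a real CagA/VacA alignment).
toy = ["ATGCCGTA-GGA", "ATGCCGTATGGA", "ATGACGTA-GGA"]
print(conserved_regions(toy, min_len=4))  # [(4, 8, 'CGTA')]
```

In practice the threshold would be set to at least the length of the siRNA target site, so that every reported stretch is long enough to host a candidate.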

| Prediction of computational designed siRNAs
The conserved sequences were used to predict the possible siRNAs against the VacA and CagA genes. For this purpose, the i-SCORE Designer web tool (https://www.med.nagoya-u.ac.jp/neurogenetics/i_Score/i_score.html) was utilized for sequence-based siRNA design. 26 This program analyzes several target mRNA nucleotide preferences to generate nine alternative algorithm scores (Ui-Tei, Amarzguioui, Hsieh, Takasaki, s-Biopredsi, i-Score, Reynolds, Katoh, and DSIR) for siRNA prediction. Depending on how they are calculated, sequence-based algorithms can be classified into two groups: rule-based and machine-learning aided. For this research, rule-based approaches such as the Ui-Tei, Amarzguioui, and Reynolds scoring systems were taken into consideration. For machine-learning aided siRNA prediction, the i-Score (inhibitory score) algorithm was used, which employs a linear regression model to predict siRNAs. This method looks solely at the nucleotide preferences at each position when calculating the i-Score. We retained only the candidates that scored at or above the stated cutoff value for each of the five algorithms.

| Filtration of off-target sites
A filtration approach was used to exclude potential siRNAs that could produce off-target effects. The human RefSeq mRNA database was searched for perfect (19/19) or near-perfect (18/19, 17/19) matches using nucleotide BLAST (https://blast.ncbi.nlm.nih.gov/Blast.cgi). The screening was performed against both the sense and antisense strands of the candidate siRNAs. siRNAs that showed complete or nearly complete complementarity with an off-target mRNA were excluded, on the assumption that they would induce off-target effects.
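The 19/19, 18/19, and 17/19 criteria can be made concrete with a small Python sketch. It is a naive sliding-window comparison, not BLAST, and is meant only to show what a "near-perfect match" is for both strands of a duplex. The function names, the `max_mismatches` parameter, and all sequences are hypothetical and chosen for illustration.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def to_dna(seq):
    """Upper-case a sequence and write U as T so RNA and DNA compare directly."""
    return seq.upper().replace("U", "T")


def reverse_complement(seq):
    """Reverse complement in the DNA alphabet."""
    return to_dna(seq).translate(_COMPLEMENT)[::-1]


def off_target_hits(strand, transcript, max_mismatches=2):
    """Return (position, matches) for every window with few enough mismatches.

    With a 19-nt strand and max_mismatches=2 this reports 19/19, 18/19,
    and 17/19 matches.
    """
    strand = to_dna(strand)
    transcript = to_dna(transcript)
    n = len(strand)
    hits = []
    for start in range(len(transcript) - n + 1):
        window = transcript[start:start + n]
        mismatches = sum(a != b for a, b in zip(strand, window))
        if mismatches <= max_mismatches:
            hits.append((start, n - mismatches))
    return hits


def screen_duplex(sense, transcript, max_mismatches=2):
    """Screen both strands of a duplex against one transcript."""
    antisense = reverse_complement(sense)
    return {
        "sense": off_target_hits(sense, transcript, max_mismatches),
        "antisense": off_target_hits(antisense, transcript, max_mismatches),
    }
```

A candidate would be discarded if `screen_duplex` returned any hit for a transcript other than the intended target. A real screen would rely on an indexed search such as BLAST rather than this quadratic scan.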

| Thermodynamic analysis
The OligoEvaluator analysis tool (http://www.oligoevaluator.com) was used to determine the internal melting temperature (Tm) of the sense strand of each possible siRNA duplex. The MaxExpect tool 27 of the RNAstructure web server was used to predict the siRNA secondary structure along with the corresponding free energy of folding. Higher folding free energy values indicate better candidates because those molecules are less likely to fold. Likewise, a stronger target-guide strand interaction indicates better siRNA efficacy. The DuplexFold tool 28 of the RNAstructure web server was utilized to predict the thermodynamic interaction between the target strand and the siRNA guide strand.

| 3D structural modelling of siRNAs
We generated the 3D structures of the chosen siRNAs using the freely available RNAComposer server 29 (http://rnacomposer.ibch.poznan.pl/Home). This server utilizes the RNA FRABASE database, which acts as a search engine over RNA tertiary structures. A motif template-based technique is also used to predict complex structures such as multi-branched loops and pseudoknotted loops. The RNAalifold web server 30 (http://rna.tbi.univie.ac.at/cgi-bin/RNAWebSuite/RNAalifold.cgi) predicted the secondary structure of the siRNAs in dot-bracket notation (Vienna format), which was used as the input for the RNAComposer web tool. A dot in dot-bracket notation indicates an unpaired nucleotide. Finally, the 3D structures of the siRNAs were downloaded in PDB (Protein Data Bank) file format.
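To make the dot-bracket (Vienna) notation concrete, the short Python sketch below converts a structure string into a list of base pairs and a list of unpaired positions. It handles only the plain `(`, `)`, and `.` symbols and is not part of RNAalifold or RNAComposer. The function name and the example string are assumptions made for illustration.

```python
def parse_dot_bracket(structure):
    """Return (pairs, unpaired) for a dot-bracket string.

    pairs    -- sorted list of (i, j) index pairs, 0-based, with i < j
    unpaired -- list of 0-based indices marked with '.'
    """
    stack = []
    pairs = []
    unpaired = []
    for i, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {i}")
            pairs.append((stack.pop(), i))
        elif symbol == ".":
            unpaired.append(i)
        else:
            raise ValueError(f"unsupported symbol {symbol!r} at position {i}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs), unpaired


pairs, unpaired = parse_dot_bracket("((..)).")
print(pairs)     # [(0, 5), (1, 4)]
print(unpaired)  # [2, 3, 6]
```

Each opening bracket pairs with its matching closing bracket, so the string encodes the secondary structure that the 3D modelling server takes as input.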

| Binding affinity of designed siRNAs with the components of RISC-Loading complex
We performed a series of molecular docking runs of the designed siRNAs with transactivation response element RNA-binding protein (TRBP), Dicer, and Argonaute-2, the major components of the RISC-loading complex. The crystal structures of these three proteins (5N8L, 4NGF, and 6RA4, respectively) were available in the Protein Data Bank 31 (https://www.rcsb.org/). The HDOCK server 32 (http://hdock.phys.hust.edu.cn/) was used for this purpose. The HDOCK server accepts amino acid sequences as input and uses a hybrid docking method that allows experimental information on the protein-protein binding site and small-angle X-ray scattering to be incorporated during the docking and post-docking processes. Moreover, HDOCK has an intrinsic scoring function that enables protein-RNA/DNA docking. The RNA-protein interactions were analyzed with the BIOVIA Discovery Studio tool. 33

| Molecular dynamics (MD) simulation of siRNAs and siRNA-protein complexes
The designed siRNAs, siRNA-protein complexes, and relevant controls were subjected to MD simulation using the CHARMM36 force field; the simulation inputs were produced with the CHARMM-GUI server and run with the GROMACS 2018.3 program. 34 The complexes were solvated in a rectangular box with a padding distance of 1.0 nm. The solvated systems were then neutralized by the addition of counter ions, and the steepest descent algorithm was used to minimize the energy of each system. After energy minimization, a 100 ps equilibration was performed under the NVT ensemble, followed by a 1 ns equilibration under the NPT ensemble. The same molecular dynamics approach was then used for the final simulations, which ran for up to 100 ns. All simulations were performed at a temperature of 300 K and a pressure of 1 atm, to mimic general experimental conditions, with a 2 fs time step. GROMACS functions were used to evaluate the hydrogen bonds, radius of gyration (Rg), and root mean square deviation (RMSD). Molecular dynamics simulations and result evaluations were done on the high-performance computing (HPC) cluster of the Bioinformatics Division at the National Institute of Biotechnology, Bangladesh.

TABLE 1 Two selected siRNAs from two genes including the sequence of sense and antisense strands, percentage of GC content, scores from different rule-based methods, and the i-Scores.
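The phase lengths in the simulation protocol of this section translate directly into integration step counts once the 2 fs time step is fixed. The Python sketch below works through that arithmetic. The phase names and helper functions are hypothetical; the authors' actual GROMACS parameter files are not given in the text, so this shows only how values such as the `.mdp` settings `dt` and `nsteps` would follow from the stated durations.

```python
TIMESTEP_FS = 2  # 2 fs time step, i.e. dt = 0.002 ps in GROMACS units

# Phase durations in picoseconds, taken from the protocol text.
PHASES_PS = {
    "nvt_equilibration": 100,      # 100 ps
    "npt_equilibration": 1_000,    # 1 ns
    "final_simulation": 100_000,   # up to 100 ns
}


def steps_for(duration_ps, timestep_fs=TIMESTEP_FS):
    """Number of integration steps needed to cover `duration_ps`."""
    duration_fs = duration_ps * 1000
    if duration_fs % timestep_fs:
        raise ValueError("duration is not a whole number of time steps")
    return duration_fs // timestep_fs


def step_plan(phases_ps=PHASES_PS, timestep_fs=TIMESTEP_FS):
    """Map each phase name to its step count."""
    return {name: steps_for(ps, timestep_fs) for name, ps in phases_ps.items()}


for name, steps in step_plan().items():
    print(f"{name}: nsteps = {steps}")
# nvt_equilibration: nsteps = 50000
# npt_equilibration: nsteps = 500000
# final_simulation: nsteps = 50000000
```

Working in integer femtoseconds avoids the rounding problems that arise when a duration is divided by a floating-point 0.002 ps.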

| Sequence retrieval of targeted gene and siRNA design
Two potential siRNAs were identified for each gene after submitting the conserved regions to the i-SCORE Designer program; these siRNAs passed the suggested cutoff value for each of the five algorithms listed above. The GC percentage of each siRNA was examined, since a low GC content can lead to poor and non-specific binding, while a high GC content prevents the RNA-induced silencing complex (RISC) and helicase from unwinding the siRNA duplex. Various acceptable GC content thresholds have been proposed in numerous studies. In the current study, we designed two unique siRNAs; Table 1 gives an overview of their i-Scores, Amarzguioui scores, and GC content. Additionally, Figure 2 shows the two-dimensional structures of the selected siRNAs, offering a view of their probable binding nature.
According to a previous report, 35 it is advisable for the GC content of siRNAs to range from 30% to 60%, considering all nucleotide preferences. Notably, the GC content of all the selected siRNAs falls within this range.
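The GC content check is simple enough to state as code. The following Python sketch computes the GC percentage of a strand and tests it against the 30% to 60% window cited above. The function names and example sequences are hypothetical and are not the siRNAs reported in Table 1.

```python
def gc_percent(seq):
    """Percentage of G and C nucleotides in an RNA or DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    gc_count = sum(base in "GC" for base in seq)
    return 100.0 * gc_count / len(seq)


def gc_in_range(seq, low=30.0, high=60.0):
    """True if the GC content lies within the advised window (inclusive)."""
    return low <= gc_percent(seq) <= high


print(gc_percent("GGCCAAUU"))     # 50.0
print(gc_in_range("GGCCAAUU"))    # True
print(gc_in_range("GCGCGCGCAU"))  # False (80% GC)
```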

| Thermodynamics of guide and target-guide strand interaction
The thermodynamic stability of nucleotide base pairing plays a significant role in modulating the silencing mechanism of siRNA (Figure 3). The internal melting temperature (Tm) and the free energy change (ΔG) between the siRNA seed and the mRNA target are reliable markers of the thermodynamic stability of such heteroduplexes. All four siRNAs had internal melting temperatures (Tm) below 65°C. The calculated free energy of folding of the guide strands ranged from 1.6 to 1.8 for the four siRNAs, and the associated secondary structures were also identified. The calculated free energy of binding between the targets and the guide strands ranged from −28 to −32. Based on this analysis, we selected siRNA_CagA-1 and siRNA_VacA-1 for further analysis.

| 3D structure of the selected siRNAs
To visualize the 3D structures of the chosen siRNAs, we utilized the RNAComposer server (Figure 4). The modelling proceeds through a series of steps: secondary structure fragmentation, preparation of 3D structure elements, rigid body transformation, and refinement of their arrangement. The server provided the output models in Protein Data Bank (PDB) file format.

| Molecular docking results
FIGURE 2 The two-dimensional representation of the four siRNAs that have been designed.

For off-target filtration, BLAST analysis was performed with both strands of the candidate siRNAs against the human genome to filter out undesired siRNAs. None of the selected siRNAs shared nearly identical sequence segments with any sequence other than the transcripts of the two target genes.

We performed a series of molecular docking runs of our designed siRNAs with the major components of the RISC-loading complex. First, we performed molecular docking of the two designed siRNAs with human Dicer. The crystal structure of the human Dicer Platform-PAZ-Connector Helix cassette in complex with a 17-mer siRNA (PDB ID: 4NGF) is available in the RCSB PDB database. We identified the binding residues of human Dicer with the control siRNA and docked our designed siRNAs targeting those residues of the structure. We found 10 common interactions between the Dicer-designed siRNA (CagA) complex and the control, and 9 common interactions between the Dicer-designed siRNA (VacA) complex and the control. The residues of Dicer interacting with the control siRNA and with the designed siRNAs against CagA and VacA are shown in Figure 5.
Then, we performed molecular docking of the two designed siRNAs with TRBP. The crystal structure of TRBP dsRBD 1 and 2 in complex with a 19 bp siRNA (PDB ID: 5N8L) is available in the RCSB PDB database. We identified the binding residues of TRBP with the control siRNA and docked our designed siRNAs targeting those residues of the structure. We found 8 common interactions between each of the TRBP-designed siRNA complexes (CagA and VacA) and the control. The residues of TRBP interacting with the control siRNA and with the designed siRNAs against CagA and VacA are shown in Figure 6.
Lastly, we performed molecular docking of the guide strands of the two designed siRNAs with human Argonaute-2. The crystal structure of the human Argonaute-2 PAZ domain (214-347) in complex with CGUGACUCU (PDB ID: 6RA4) is available in the RCSB PDB database. We identified the binding residues of Argonaute-2 with the control strand and docked the guide strands from our designed siRNAs targeting those residues of the structure. We found seven common interactions between the Argonaute-2-guide strand (CagA) complex and the control, and nine common interactions between the Argonaute-2-guide strand (VacA) complex and the control. The residues of Argonaute-2 interacting with the control strand and with the guide strands of the designed siRNAs against CagA and VacA are shown in Figure 7.
In summary, the constructed siRNAs, whether interacting with TRBP, Dicer, or Ago2, consistently exhibited significantly more negative docking scores (Table 3) than their respective control siRNAs, indicating stronger and more stable binding interactions. In particular, the CagA-designed siRNA stands out with highly negative scores for all three proteins (TRBP, Dicer, and Ago2), suggesting a strong binding probability. The VacA-designed siRNA also demonstrated improved binding compared to the controls. Finally, the Ago2 docking scores show that the guide strands of both the CagA- and VacA-designed siRNAs bind more strongly than the control guide strand.

| Molecular dynamics (MD) simulation results
We performed a series of MD simulations of the designed siRNAs and the siRNA-protein complexes, including relevant controls, in order to evaluate their atomic-level movements. The RMSD plot from the simulations of the designed siRNAs shows that the siRNAs are quite stable and behave similarly to the control (Figure 8A). The Rg plot showed that the two designed siRNAs were more compact than the control, as revealed by their lower values (Figure 8B). The number of hydrogen bonds formed during the simulation period was higher than in the control (Figure 8C).
The RMSD plot from the simulations of the designed siRNA-TRBP complexes shows that the siRNA-CagA-TRBP complex was the most stable, while the siRNA-VacA-TRBP complex behaved similarly to the control (Figure 9A). The Rg plot showed that TRBP was more compact when bound to the two designed siRNAs than when bound to the control, as revealed by the lower values (Figure 9B). The number of hydrogen bonds formed during the simulation period was similar to that of the control (Figure 9C).
The RMSD plot from the simulations of the designed siRNA-Dicer complexes showed that the siRNA-CagA-Dicer complex fluctuated slightly throughout the period, whereas the siRNA-VacA-Dicer complex behaved similarly to the control (Figure 10A). The Rg plot showed that Dicer was more compact when bound to the two designed siRNAs than when bound to the control, as revealed by the lower values (Figure 10B). The number of hydrogen bonds formed during the simulation period was slightly lower than in the control (Figure 10C).
The RMSD plot from the simulations of the guide-Ago2 complexes showed that the guide (CagA)-Argonaute-2 complex exhibited a slightly higher value throughout the period, whereas the guide (VacA)-Argonaute-2 complex behaved similarly to the control (Figure 11A). The Rg plot showed that the protein was less compact when bound to the control strand, while the guide strands of the designed siRNAs tended to make the protein more compact, as revealed by the lower values (Figure 11B). The number of hydrogen bonds formed during the simulation period was similar to that of the control (Figure 11C).
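For readers less familiar with the two structural metrics reported here, the Python sketch below gives their textbook definitions for a single frame: a mass-weighted radius of gyration and an unweighted RMSD against a reference structure. It is not the GROMACS implementation used in the study. In particular it omits the least-squares superposition that is normally applied before computing RMSD, and the coordinates and masses are hypothetical.

```python
import math


def radius_of_gyration(coords, masses):
    """Mass-weighted radius of gyration of one frame.

    coords -- list of (x, y, z) tuples
    masses -- list of atomic masses, same length as coords
    """
    total_mass = sum(masses)
    center = [
        sum(m * c[k] for m, c in zip(masses, coords)) / total_mass
        for k in range(3)
    ]
    weighted_sq = sum(
        m * sum((c[k] - center[k]) ** 2 for k in range(3))
        for m, c in zip(masses, coords)
    )
    return math.sqrt(weighted_sq / total_mass)


def rmsd(coords, reference):
    """Root mean square deviation between two frames (no fitting step)."""
    if len(coords) != len(reference):
        raise ValueError("frames must contain the same number of atoms")
    total_sq = sum(
        sum((a[k] - b[k]) ** 2 for k in range(3))
        for a, b in zip(coords, reference)
    )
    return math.sqrt(total_sq / len(coords))


# Two equal-mass atoms 2 units apart: each lies 1 unit from the centre of mass.
frame = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
print(radius_of_gyration(frame, [1.0, 1.0]))  # 1.0
```

A lower Rg over the trajectory corresponds to a more compact structure, and a flat RMSD curve corresponds to a structure that stays close to its reference, which is how Figures 8-11 are read in the text.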

| DISCUSSION
RNA interference (RNAi) is a biological defense mechanism that protects against the invasion of various exogenous genes. Small interfering RNA (siRNA) is a promising therapeutic strategy since it has the potential to silence any disease-related gene in a sequence-specific manner. 35 Through the targeted degradation of mRNA or the suppression of mRNA translation, siRNA can reduce the expression of target genes, and this technology has proved promising as a therapy in a variety of diseases including cancer. 36 Our strategy focused on silencing H. pylori's key virulence genes CagA and VacA, which are known to play crucial roles in gastric cancer development. 37 By designing siRNAs specifically tailored to inhibit the expression of these virulence factors, we aimed to disrupt the pathogenic machinery of H. pylori. Furthermore, using nanoparticles such as lipid nanoparticles, gold nanoparticles, and PEI, along with various other techniques, these siRNAs can be introduced into target cells, enabling precise knockdown of the gene of interest. 38

In our current study, we applied a range of criteria to narrow down the search space and enhance the accuracy of siRNA molecule prediction. For instance, using CagA cDNA sequences collected from seven countries (Bangladesh, Sweden, USA, Australia, China, Philippines, and Vietnam) and VacA cDNA sequences from six countries (Bangladesh, Nepal, Japan, Indonesia, India, and China), we performed multiple sequence alignment (MSA) analysis, and the resulting conserved regions of the two genes were used for designing two siRNAs. The two potential siRNAs identified with i-Score Designer passed the suggested cutoff value for each of the five algorithms considered, out of the nine scores the tool reports (Ui-Tei, Amarzguioui, Hsieh, Takasaki, s-Biopredsi, i-Score, Reynolds, Katoh, and DSIR). 26 The GC content of each siRNA was within 31%-42%, inside the advisable range of 30% to 60%. 35 From the thermodynamics analysis, we found that the siRNAs have internal melting temperatures (Tm) below 65°C for both guide and passenger strands. The computed free energies of folding of the guide strands ranged from 1.6 to 1.8, and the calculated free energies of binding between the targets and the guide strands ranged from −28 to −32 (Table 2). The free energy landscape serves as a valuable indicator of the polycationic binding capability to the nucleic acid and also carries implications for the process of nucleic acid release within the cytosol. 39

In molecular docking analysis of a complex, when the energies of the individual components are subtracted from the energy of the complex, a low, negative score denotes that the complex is more stable than its individual components alone. 40 This implies that a highly negative score indicates strong binding, whereas a less negative or even positive score corresponds to weaker or non-existent binding. We modeled the 3D structures of the two finally selected siRNAs and performed molecular docking with the three major components of the RISC-loading complex: TRBP, Dicer, and Argonaute-2. The control siRNAs in these structures were also removed and docked again to compare and validate the docking results. In most cases, the docking scores of the designed siRNAs were notably more negative than those of the controls (Table 3). Remarkably, the CagA-based siRNA consistently demonstrated promising results in each complex. We also found that each designed siRNA shared at least seven interactions with the control for the respective protein (Figures 5-7).

To gain deeper insights into the structural dynamics of these complexes, we conducted molecular dynamics simulations (MDS), a powerful technique widely employed in structural biology, including for studying the transport of siRNA through cell membranes. 41 As depicted in Figures 8-11, we focused on key parameters indicative of stability and structural behavior during the MDS period. The root mean square deviation (RMSD) analysis revealed a significant level of stability across all the complexes, with notably low fluctuations. This consistent RMSD pattern underscores the integrity and reliability of our modeled complexes. The radius of gyration, a critical metric in structural biology, provided compelling evidence of the compactness of the complexes over the course of the simulation, 42 which highlights the robust and predictable nature of the protein-siRNA interactions. An exploration of hydrogen bond formation throughout the MDS yielded further validation of the constructed siRNAs, signifying strong intermolecular interactions and the maintenance of complex cohesion. Overall, the use of MDS in our study adds rigor to our findings and offers a comprehensive understanding of the dynamic behavior of the protein-siRNA complexes. These results further enhance the credibility and potential utility of our developed siRNAs in targeted gene regulation.
Moreover, the outcomes of our research require further evaluation to assess the efficacy of the chosen siRNAs in diverse cell lines, whether individually or in combination. Additionally, the exploration of alternative delivery methods and transfection reagents is a priority for future investigations. In summary, our findings highlight the potential of in-silico siRNA design, selection, filtration, and assessment in advancing the development of next-generation oligonucleotide-based therapeutic agents aimed at combatting H. pylori infection and protecting populations worldwide.

| CONCLUSION
It is possible to design siRNAs and predict their interactions against a specific target gene using computational methods, with the aim of silencing that gene's expression. Two siRNA molecules were designed in this study to be effective against the CagA and VacA genes of H. pylori, using a computational method that took into account the relevant parameters under ideal conditions together with cutting-edge molecular modelling and simulation analyses. The development of therapeutic siRNA techniques could be a potential alternative for slowing the rise in gastric cancer cases worldwide and for treating affected people.

FIGURE 3 Secondary structures of the designed siRNAs with probable folding and lowest free energy for consensus sequence.

FIGURE 4 3D structures of the designed siRNAs: (A) siRNA_CagA-1 and (B) siRNA_VacA-1.

FIGURE 5 Protein-siRNA interactions between (A) human Dicer and control, (B) siRNA_CagA and (C) siRNA_VacA.

FIGURE 6 Protein-siRNA interactions between (A) TRBP and control, (B) siRNA_CagA and (C) siRNA_VacA.

FIGURE 7 Protein-siRNA interactions between (A) Argonaute-2 and control, (B) siRNA_CagA and (C) siRNA_VacA.

FIGURE 8 MD simulation results of designed siRNAs. The figure shows (A) RMSD, (B) Rg and (C) hydrogen bond analysis.

FIGURE 9 MD simulation results of designed siRNA-TRBP complexes. The figure shows (A) RMSD, (B) Rg and (C) hydrogen bond analysis.

FIGURE 10 MD simulation results of designed siRNA-Dicer complexes. The figure shows (A) RMSD, (B) Rg and (C) hydrogen bond analysis.

FIGURE 11 MD simulation results of designed siRNA-Argonaute-2 complexes. The figure shows (A) RMSD, (B) Rg and (C) hydrogen bond analysis.

TABLE 2 Results of thermodynamic properties of the designed siRNAs.