Parameters Influencing Gene Delivery Efficiency of PEGylated Chitosan Nanoparticles: Experimental and Modeling Approach

Experimentation in nanomedicine is labor-intensive, time-consuming, and requires costly laboratory consumables. Constructing a reliable mathematical model for such systems is also challenging due to the difficulties in gathering a sufficient number of data points. Artificial neural networks (ANNs) are indicated as an efficient approach in nanomedicine to investigate the cause-effect relationships and predict output variables. Herein, an ANN is applied to plasmid DNA (pDNA)-encapsulated, PEGylated chitosan nanoparticles cross-linked with sodium tripolyphosphate (TPP) to investigate the effects of critical parameters on the transfection efficiencies of nanoparticles. The ANN model is developed based on experimental results with three independent input variables: 1) polyethylene glycol (PEG) molecular weight, 2) PEG concentration, and 3) nanoparticle concentration, along with one output variable, the percentage of green fluorescent protein (GFP) expression, which refers to transfection efficiency. The constructed model is further validated with the leave-p-out cross-validation method. The results indicate that the developed model has good prediction capability and is effective in capturing the transfection efficiencies of different nanoparticle groups. Overall, this study reveals that the ANN could be an efficient tool for nanoparticle-mediated gene delivery systems to investigate the impacts of critical parameters in detail with reduced experimental effort and cost.

Introduction
Nanomedicine is a subset of medicine that uses nanoscale materials such as nanoparticles and nanorobots for various purposes, including diagnosis, delivery, prevention, and treatment of diseases, as well as sensing and monitoring a living body. [1] Nanoparticles (NPs), defined as materials with all external dimensions in the nanoscale (under 100 nm), have been extensively used in nanomedicine due to their unique characteristics, such as enhancement in the delivery of therapeutic agents. [1,2] In recent years, NP-based delivery systems have gained growing interest in gene therapy, particularly due to their physicochemical properties, including high gene loading capacity, low therapeutic toxicity, and prolonged half-life of naked plasmid DNA (pDNA). [3] To improve the properties of NP-based delivery systems, optimization of many parameters within both biological and material aspects is greatly needed, such as cell type, cell confluency, transfection media, type and size of the genetic material, as well as fabrication properties and techniques of NPs, such as polymers' molecular weight (MW) and crosslinker type. [4]
However, optimizing such systems is challenging because collecting each data point in NP-based delivery systems is labor-intensive, time-consuming, and requires costly laboratory consumables. [5] As gathering each data point requires significant effort and is expensive, experimental datasets of such systems are relatively small. [6] Recently, mathematical models have gained growing interest in the research of such systems. [6,7] However, modeling such systems is also challenging because the relation between input and output parameters in NP-based systems is nonlinear and complex. The number of data points is usually not sufficient to construct a reliable model for these systems. Artificial neural networks (ANNs) have been considered an efficient approach to examine and understand biological systems for the different variables. [8] ANN models trained with small datasets are prone to overfitting and therefore require a generic and concrete validation technique that ensures the quality of the created model. [9] Overfitting often occurs in small experimental datasets, where the model closely fits the limited set of data points. [9a] Cross-validation is divided into two types: exhaustive and nonexhaustive. [10] Leave-p-out cross-validation is an exhaustive cross-validation type that is more suitable for small datasets and can be applied to detect and prevent overfitting in a developed model. [11] In this study, we utilized NPs as gene delivery systems and aimed to investigate the effects of critical parameters on gene delivery efficiency comprehensively with less experimental effort and cost by adapting the experimental results in an ANN model. To this end, various types of materials were utilized to synthesize NPs, including lipids, dendrimers, cationic polymers, natural proteins, and polysaccharides. Liposomes, spherical-shaped vesicles consisting of one or more phospholipid bilayers that mimic the cellular membrane, are one of the most widely explored nanosized carriers. 
[12] [4a] Dendrimers such as poly(amidoamine) (PAMAM) have large numbers of functional groups that allow entrapping targeted molecules; however, these molecules have poor water solubility and high nonspecific toxicity. [1,13] Cationic polymers have emerged as an alternative class, mainly due to their large encapsulation efficiency and allowance for extensive modifications. [14] [4a] To address the toxicity issue, proteins and polysaccharides isolated from natural sources have been recently studied. Alginate and gelatin are natural polymers with excellent biodegradability and biocompatibility. [15] However, these polymers still have some limitations; the application of alginate NPs is limited due to the burst release of the encapsulated material and poor control over the size of the particles formed. [16] The major disadvantage of gelatin is its lack of solubility at room temperature. [17] Among a wide range of materials, we selected chitosan-based NPs due to their desirable characteristics such as biocompatibility, [18] biodegradability, [19] as well as cationic nature, [20] where highly positively charged chitosan can easily form complexes with negatively charged genetic materials. However, the application of chitosan has been limited due to its low colloidal stability and low solubility at physiological conditions. Dextran, poly(vinyl pyrrolidone), poly(β-malic acid), poly(aspartic acid), and polyethylene glycol (PEG) are frequently used to improve the properties of chitosan. PEG, the most widely used of these and a Food and Drug Administration (FDA)-approved polymer, can enhance the colloidal stability and solubility of chitosan, reduce the size of the produced NPs, and prolong their circulation half-life. 
[21] Here, we first prepared PEGylated chitosan derivatives using different PEG molecular weights (MWs) (2, 5, and 10 kDa) and PEG concentrations (4-20 μmole per 25 mg of chitosan). Then, we synthesized pDNA-encapsulated and PEGylated chitosan NPs (CS-PEG-pDNA NPs) by mixing PEGylated chitosan derivatives with a pDNA and a cross-linker, and utilized this particular NP system to genetically modify human embryonic kidney (HEK293-T) cells. An ANN model was constructed from the obtained experimental results and then validated through the leave-p-out cross-validation method. After creating and validating the model, we investigated the effects of critical parameters on the NP-based delivery system in detail. Overall, this study revealed that an ANN could be an efficient tool to model experimental datasets within the field of nanomedicine and could enable us to accurately predict the impacts of crucial parameters on a particular system with reduced experimental effort and cost.

Results and Discussion
The main steps of the study are represented schematically in Figure 1. First, we synthesized PEGylated chitosan derivatives using PEG with different MWs (2, 5, and 10 kDa) and PEG concentrations (4-20 μmole per 25 mg of chitosan). We selected 2, 5, and 10 kDa PEG MWs for the following reasons: 1) a PEG MW lower than 2 kDa would negatively affect the colloidal stability of NPs and 2) a PEG MW higher than 10 kDa would reduce the number of positive charges of NPs, which are essential for the stability of NPs. We selected 5 kDa PEG as an intermediate value between 2 and 10 kDa PEG MW. The criteria for the selection of the amount of PEG were based on our experimental observations. Because we could not obtain a monodisperse NP distribution for the conditions with lower than 4 μmole and higher than 20 μmole PEG, we selected the concentrations of PEG between these two values (4 and 20 μmole per 25 mg of chitosan).
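To make the PEG amounts more tangible, the molar amounts above can be converted to masses. The following is only a minimal sketch: it assumes that the nominal MW (2, 5, or 10 kDa) equals the actual average molar mass of the PEG reagent, and the function name `peg_mass_mg` is hypothetical.

```python
def peg_mass_mg(amount_umol, mw_kda):
    """Mass of PEG in mg for a given amount (umol) and nominal MW (kDa)."""
    # 1 kDa = 1000 g/mol = 1 mg/umol, so mass [mg] = amount [umol] * MW [kDa].
    # Assumption: the nominal MW is used as the molar mass of the reagent.
    return amount_umol * mw_kda


CHITOSAN_MG = 25.0  # chitosan mass per reaction, as stated in the text

for mw_kda in (2, 5, 10):
    for amount_umol in (4, 20):  # lower and upper bounds of the tested range
        mass = peg_mass_mg(amount_umol, mw_kda)
        print(f"{mw_kda} kDa, {amount_umol} umol: {mass:.0f} mg PEG "
              f"({mass / CHITOSAN_MG:.2f} mg per mg chitosan)")
```

Under this assumption, the tested range spans from 8 mg of PEG (4 μmole of 2 kDa PEG) to 200 mg of PEG (20 μmole of 10 kDa PEG) per 25 mg of chitosan.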
Next, CS-PEG-pDNA NPs were prepared with the ionotropic gelation technique by mixing the synthesized PEGylated chitosan derivatives with tumor necrosis factor-related apoptosis-inducing ligand (TRAIL)-inserted pDNA and sodium tripolyphosphate (TPP) as a cross-linker (Figure 1). After the synthesis, the pH values of all NP solutions were around 4.8. For further experiments, we adjusted the pH of NP solutions to 6.5 because 6.0-7.4 is an applicable pH interval for biomedical applications. [22] Therefore, we characterized the different groups of NPs based on their size and zeta potential values at pH 4.8 and 6.5.
Experimentally measured size and zeta potential values of CS-PEG-pDNA NPs are reported in Table S1, Supporting Information, and raw data are provided in Figures S1-S44, Supporting Information. The sizes of all NP groups were comparable, measuring around 90 nm at both pH 4.8 and pH 6.5. Zeta potential values of NPs in the different groups decreased when we increased the pH of the solutions from 4.8 to 6.5. This result can be explained by the deprotonation of the amino groups of chitosan, which causes a reduction in the zeta potential. [23] From the different NPs, we selected the 5 kDa-20 μmole PEG condition as a representative group and presented the size and zeta potential distributions of these NPs at pH 6.5 in Figure 2A,B. We selected NPs prepared with 2, 5, and 10 kDa-20 μmole PEG as representative groups to compare the diameters and zeta potential values. The diameters of the NPs prepared with 2, 5, and 10 kDa-20 μmole PEG were 93.19 ± 0.32, 96.06 ± 0.14, and 91.84 ± 0.87 nm, respectively. We did not observe any significant differences in the sizes of these three groups of NPs. In addition, the zeta potential values of the NPs prepared with 2, 5, and 10 kDa-20 μmole PEG were measured as 10.77 ± 0.35, 6.34 ± 0.26, and 3.20 ± 0.35 mV, respectively. We observed that higher PEG MWs led to lower zeta potential values for the same PEG concentration (Figure 2C). The shape and morphology of representative groups of NPs were further examined through scanning electron microscopy (SEM). The SEM images also indicated that the sizes of NPs were consistent with the dynamic light scattering (DLS) results (Figure 2D and Figures S45 and S46, Supporting Information).
As a final step, we measured the transfection capabilities of different CS-PEG-pDNA NPs and then created an ANN model with these results to investigate the effects of critical parameters on the transfection efficiency in detail (Figure 1). Transfection is a process in which foreign genetic material is deliberately introduced into eukaryotic cells. [24] Because naked pDNAs are rapidly degraded by nucleases found in biological fluids, a delivery vehicle is needed to carry pDNA across the cellular membrane efficiently. [25] Here, we utilized the CS-PEG-pDNA NP system as a gene delivery vehicle to improve the transfection efficiency of naked pDNAs. HEK293-T was selected as a model cell line, and all transfection experiments were performed with HEK293-T cells. The details of the experimental workflow are given in Figure 2E. The pDNA encapsulated into NPs expressed both green fluorescent protein (GFP) and TRAIL genes. Therefore, we could analyze the transfection efficiency of NPs by 1) measuring the green fluorescence intensity originating from the GFP protein with flow cytometry and 2) measuring the TRAIL protein concentration in the cell culture supernatant with enzyme-linked immunosorbent assay (ELISA) (Figure 2E). We measured the TRAIL protein concentration released into the cell culture medium by transfected cells and compared these results with GFP expression. We observed higher amounts of TRAIL protein in the supernatants of cultures that contained more GFP-expressing cells. The results indicated a very good correlation between flow cytometry and ELISA data, where the correlation coefficient was calculated as 0.87 (Figure 2F and Table S2, Supporting Information). This suggests that measuring either GFP expression or TRAIL protein concentration could be a reliable method to track the transfection efficiency of HEK293-T cells with the NP system developed here. Therefore, we conducted further experiments with GFP expression, where transfection of cells by NPs was tracked and quantified based on GFP expression. We selected 5 kDa-20 μmole NPs as a representative group and captured fluorescence images of HEK293-T cells transfected with this NP group. Here, we concluded that transfection efficiency was higher when we encapsulated TRAIL pDNA into PEGylated chitosan NPs than when we transfected with naked TRAIL pDNA (Figure 2G). Furthermore, we performed cytotoxicity experiments with different CS-PEG-pDNA NPs. HEK293-T cells demonstrated more than 70% viability in all groups after 24 h of incubation, suggesting that CS-PEG-pDNA NPs are safe to use as a gene delivery vehicle at a concentration of 225 μg mL⁻¹ for HEK293-T cells (Figure 2H).
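The comparison between the flow cytometry and ELISA readouts rests on a correlation coefficient between paired measurements. A minimal sketch of such a calculation is given below, assuming the Pearson coefficient is the one meant. The function name `pearson_r` and the numerical values are hypothetical placeholders and are not the measurements reported in Table S2.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of co-deviations and sums of squared deviations.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical paired readouts for five NP groups (placeholder values only).
gfp_percent = [5.0, 12.0, 20.0, 31.0, 44.0]      # % GFP-positive cells
trail_signal = [40.0, 95.0, 150.0, 260.0, 330.0]  # TRAIL level, arbitrary units

print(round(pearson_r(gfp_percent, trail_signal), 3))
```

A value close to 1 means that the two readouts rank and scale the NP groups in nearly the same way, which is the basis for using GFP expression alone in the later experiments.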
In this study, we carried out the synthesis and characterization of the CS-PEG-pDNA NPs and utilized this NP system as a safe gene delivery vehicle to genetically modify HEK293-T cells. However, the synthesis of PEGylated chitosan polymers and the transfection experiments are labor-intensive, time-consuming, and require expensive laboratory consumables to collect each data point. Modeling such systems is also challenging due to the difficulties in collecting sufficient data points. [8a-c,8f] Therefore, we aimed to adapt our NP system into an ANN model to optimize it and to understand the system's dynamics in detail with reduced experimental effort and cost. For this purpose, we created a small experimental dataset of 21 data points, where the PEG MW, PEG concentration, and NP concentration of the particles were varied to test the transfection of HEK293-T cells with different NP groups (Table 1). Next, we adapted the dataset into an ANN model and examined the effects of critical parameters. We created an ANN model using three input variables, 1) PEG MW, 2) PEG concentration, and 3) NP concentration, and one output variable, % GFP expression, which refers to transfection efficiency. Because we kept constant the parameters that may significantly affect the sizes of CS-PEG-pDNA NPs, such as chitosan and TPP concentration, we obtained a narrow NP diameter range between 82.44 ± 0.34 and 101.11 ± 1.57 nm (92.58 ± 4.74 nm on average). 
[26] This situation enabled us to safely investigate the effects of PEG MW, PEG concentration, and NP concentration (Figure S47 and Table S1, Supporting Information). We randomly divided the dataset provided in Table 1 into training and test groups, and we structured the neural network using different layers and nodes. The schematic diagram of the ANN model is given in Figure 3A, and the ANN training parameters that we used for the model are shown in Table 2. After training the neural networks, the selected predictive model resulted in R² values of 0.9878 and 0.9769 for the training (18 samples) and test (3 samples) data, respectively. The ANN model developed here captures the experimental observations of GFP expression with a high correlation, as shown in Figure 3B, where the % GFP expression predicted by the model matched the observed % GFP values along the 45° diagonal line. However, models trained with small datasets tend to be prone to overfitting. Therefore, validation of the model is crucial to prove the robustness and accuracy of the created model. [9] Since our dataset can be considered small, further validation is needed to establish that our model does not retain any overfitting characteristic. To detect and avoid any overfitting, we performed leave-p-out cross-validation and evaluated the quality of the created model with this technique. Leave-p-out is a specific cross-validation technique that is more suitable for small datasets. 
[11] In this technique, the model is trained C(n, p) times, where n indicates the total number of samples, n - p is the number of data points used to train the model, and p refers to the number of data points used for validation (Table 3). We selected p = 3, 4, 5, and 6 to assess the quality of the selected network. For example, for p = 3, C(21, 3) = 1330 neural networks were created, and the average Pearson correlation coefficient (R) was calculated as 0.9471 ± 0.1243 (Tables 4 and 5). Of the created networks, 86.77% had an R value higher than 0.90 for p = 3 (Table 5). The selected network was further validated with p = 4, 5, and 6. R values slightly decreased, as the model's prediction capability might have decreased with fewer training data points. Normalized and mean squared error values showed a similar trend, where the error increased with increasing p values, and the highest R value of 0.9471 ± 0.1243 was obtained with the lowest p condition (Table 4). For all p values, the Pearson correlation coefficient distributions were similar; most networks had an R value higher than 0.9, demonstrating that the selected neural network is competent for modeling this particular NP-based gene delivery system and has a good prediction capacity even with a small experimental dataset (Figure 3C).
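The leave-p-out procedure described above can be sketched in a few lines. The example below is illustrative only: it enumerates every way of holding out p samples and, as a stand-in for retraining the ANN, fits an ordinary least-squares line to a hypothetical one-dimensional toy dataset. The helper names (`leave_p_out_splits`, `fit_line`, `leave_p_out_mse`) and the toy values are assumptions and do not come from the study.

```python
from itertools import combinations
from math import comb


def leave_p_out_splits(n_samples, p):
    """Yield (train_indices, validation_indices) for every held-out set of size p."""
    all_indices = range(n_samples)
    for held_out in combinations(all_indices, p):
        held = set(held_out)
        train = [i for i in all_indices if i not in held]
        yield train, list(held_out)


def fit_line(xs, ys):
    """Least-squares line y = slope * x + intercept (stand-in for ANN training)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def leave_p_out_mse(xs, ys, p):
    """Average validation mean squared error over all C(n, p) splits."""
    errors = []
    for train, val in leave_p_out_splits(len(xs), p):
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        squared = [(ys[i] - (slope * xs[i] + intercept)) ** 2 for i in val]
        errors.append(sum(squared) / len(squared))
    return sum(errors) / len(errors)


# Number of models to train for a 21-sample dataset at each tested p.
split_counts = {p: comb(21, p) for p in (3, 4, 5, 6)}
print(split_counts)

# Hypothetical toy data, used only to exercise the procedure.
toy_x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
toy_y = [0.1, 1.9, 4.2, 5.8, 8.1, 9.9]
print(leave_p_out_mse(toy_x, toy_y, p=2))
```

For n = 21, the number of splits grows from 1330 at p = 3 to 5985, 20349, and 54264 at p = 4, 5, and 6, which is why the exhaustive procedure is practical mainly for small datasets.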
After developing and validating the model, we analyzed the effects of PEG MW, PEG concentration, and NP concentration on the transfection efficiency of CS-PEG-pDNA NPs using the results obtained from the validated ANN model. We plotted the number of GFP-expressing cells with respect to the NP concentration at constant PEG MW or PEG concentration in Figure 4. We obtained a similar trend for all groups in Figure 4, where quantified GFP expression increased with increasing NP concentration. Figure 4A shows the number of GFP-expressing cells with respect to NP and PEG concentrations at constant PEG MW. We observed that the number of GFP-expressing cells did not show significant differences at low NP concentrations for all groups. For the 2 kDa PEG group, we observed increases in GFP expression when we increased the PEG concentration at a high NP concentration. However, for the 5 and 10 kDa PEG groups, we observed that the number of GFP-expressing cells increased substantially with increasing PEG concentration only up to a point. This result suggests a threshold value for PEG concentration in the 5 and 10 kDa groups. Beyond this threshold value of PEG concentration, dramatic decreases in GFP expression were observed with increasing PEG concentration. This could be explained by the hydrophilicity of PEG. It is known that PEG forms a hydrophilic shell around the surface of NPs, which negatively influences the interaction of NPs with cell membranes beyond a certain amount. [27] Therefore, PEG amounts higher than the threshold value reduce the interaction of NPs with the cell membrane and lead to decreased cellular uptake of NPs. Figure 4B displays the number of GFP-expressing cells with respect to NP concentration and PEG MW at constant PEG concentrations. It has been reported that higher PEG concentrations block protein adsorption on NPs and lead to less recognition by the immune system. 
[28] However, a drawback is that the cellular uptake of NPs by cancer cells is reduced at higher PEG concentrations. [29] In contrast, lower PEG concentrations lead to higher protein adsorption on NP surfaces. This situation may either increase the cellular uptake of NPs or result in rapid clearance from circulation due to macrophage detection. [29] Here, we observed minimum gene expression values at the lowest PEG concentration, the 4 μmole PEG group. As excessive PEGylation also has a negative effect on the cellular uptake of NPs, lower gene expression values were commonly obtained in the 20 μmole PEG group compared to 12 μmole PEG for the same NP concentrations and PEG MWs. It has been recently reported that higher PEG MW enhances the internalization of chitosan/siRNA complexes. [30] In this study, we observed increased gene expression at the highest PEG MW (10 kDa) for the 4 and 12 μmole PEG groups. However, gene expression was very low for 10 kDa-20 μmole PEG. This might also be explained by the negative effect of excessive PEG concentration on NP internalization.
In addition, we analyzed the effect of the zeta potential on the transfection efficiencies of NPs. The zeta potential is an important parameter that significantly influences particles' interaction with cell membranes and affects the cellular internalization of complexes. It is commonly known that particles with higher zeta potential values have a superior tendency to interact with negatively charged cell membranes, which results in higher internalization and higher gene expression. [27] Our previous study demonstrated that PEGylated chitosan NPs with a higher amount of positive charges were superior to the other groups in their ability to attach to cell membranes. [22] However, in this study, we observed that NPs with higher positive charges did not result in higher transfection efficiencies; instead, PEG MW, PEG concentration, and NP concentration played a more significant role in the cellular internalization of CS-PEG-pDNA NPs. These results suggest that higher interaction of NPs with cell membranes does not necessarily result in higher cellular internalization of NPs and, in turn, higher gene expression.
To summarize, the ANN results indicated that the NPs that we developed here have the following properties: 1) NP concentration is the most critical parameter affecting the transfection capabilities of NPs. For all groups, GFP expression of cells increased with increasing NP concentration. 2) Higher PEG concentrations enhance the transfection capabilities of the 2 kDa PEG group. 3) The hydrophilicity of NPs negatively influences the transfection efficiencies of NPs beyond a certain amount, in particular for the 5 and 10 kDa PEG groups. 4) NPs with a high amount of positive charges do not necessarily result in high transfection efficiencies.
One of the approaches for improving the transfection efficiency of the CS-PEG-pDNA NPs used in this study could be the conjugation of other biopolymers to the chitosan backbone. [20a] In another study, alginate grafted to chitosan reduced the interaction strength between chitosan and DNA, enhanced the DNA release, and eventually improved the transfection efficiency. [31]

Conclusion
Many factors influence the transfection efficiency of chitosan nanoparticles, including cell type, cell confluency, transfection media, type and size of the genetic material, as well as the fabrication properties and techniques of NPs, such as chitosan molecular weight, degree of chitosan deacetylation, chitosan-crosslinker ratio, nitrogen to phosphate (N/P) ratio, plasmid concentration, and PEGylation degree. In this study, we mainly focused on analyzing the impacts of PEG among all these parameters since PEGylation is one of the most critical factors in improving the use of chitosan in biomedical applications by enhancing the colloidal stability and solubility of chitosan. Here, we used the CS-PEG-pDNA NP system as a tool to indicate that optimal parameters for such an NP-based gene delivery system could be predicted by developing an ANN model with minimal experimental effort, cost, resources, and time. Toward this goal, we first synthesized PEGylated chitosan NPs with different properties and performed transfection experiments. An ANN model was created using the obtained experimental dataset and further validated through the leave-p-out cross-validation method. The results indicated that the developed ANN model could accurately predict the effects of significant parameters on the transfection ability of the CS-PEG-pDNA NP system. In conclusion, this study shows that an ANN could be utilized as a tool in NP-mediated delivery systems to optimize, predict, and analyze the impacts of critical parameters in detail with reduced time and resources. The ANN could be further implemented in many other scientific studies with various input parameters, especially in the field of nanomedicine, where obtaining experimental data requires more effort.

Experimental Section
Cloning: Insertion of TRAIL Gene into Mammalian Expression Vector: The extracellular portion of human TRAIL (amino acids 95-281) was cloned into the pIRES2-AcGFP1 mammalian expression vector, which carries a GFP cassette downstream of the cloning site. The coding region of the TRAIL gene with the cytomegalovirus (CMV) promoter was amplified from a previously produced pLV-TRAIL-GFP vector and inserted into this vector. [32] The designed oligos were polymerase chain reaction (PCR) amplified using the following primer pairs: forward, 5′-ATTAGTCGACGGAGTTCCGCGTTACATAAC-3′; reverse, 5′-CACCGGCCTTATTCCAAG-3′. PCR products were gel purified, digested with SalI and BamHI, and ligated into the 5.3 kb pIRES2-AcGFP1 mammalian expression vector by flanking SalI-BamHI cut sites. Chemically competent Escherichia coli One Shot Stbl3 bacteria (Thermo Fisher Scientific) were transformed with the cloned vector, and kanamycin selection was applied at a final concentration of 50 μg mL⁻¹ to increase the copy number of the plasmid. pDNA was isolated from transformed bacteria using the Macherey Nagel NucleoBond Xtra plasmid purification kit (Bethlehem, PA). Finally, the cloning was verified by sequencing (EZ seq V2.0, Macrogen, Rockville, MD).
Synthesis of PEGylated Chitosan Derivatives: PEGylated chitosan polymers were synthesized as described previously. [22] Low molecular weight chitosan (25 mg) was dissolved overnight in glacial acetic acid solution (2 mg mL⁻¹) at a final chitosan concentration of 0.5 mg mL⁻¹. The pH of the chitosan-acetic acid solution was adjusted to 6.0 using 6 M NaOH. mPEG-SVA (2, 5, and 10 kDa) was dissolved in DMSO at a concentration of 50 mg mL⁻¹. mPEG-SVA solutions were added into the chitosan solution at different concentrations (4-20 μmole per 25 mg of chitosan), and reactions were performed for 2 days. Unreacted molecules were removed from the reaction mixture by dialysis against ultrapure water for 3 days, and the product was then lyophilized. The resulting PEGylated chitosan polymers were dissolved in 1 mg mL⁻¹ glacial acetic acid solution (0.5 mg mL⁻¹ final chitosan concentration). The pH of

Figure 1. Summary of the study.

Figure 4. 3D plots based on the ANN model for GFP expression of HEK293-T cells with respect to different variables. A) GFP expression of HEK293-T cells at constant PEG MW. B) GFP expression of HEK293-T cells at constant PEG concentration.

Table 1. Data used to set the ANN model. Values are presented as mean ± SD (n = 3).

Table 4. Normalized mean square error for the cross-validation. Values are presented as mean ± SD.

Table 5. Summary of leave-p-out cross-validation results.