The generalized predictive control of bacteria concentration in marine lysozyme fermentation process

Abstract Due to the strong coupling and high nonlinearity of the marine lysozyme fermentation process, it is difficult to build an accurate mechanistic model. To achieve real‐time online measurement and effective control of bacterial concentration during fermentation, a generalized predictive control method based on least squares support vector machines is proposed. A particle swarm optimization least squares support vector machine (PSO‐LS‐SVM) model of lysozyme concentration is established by using particle swarm optimization to tune the regularization parameter and the kernel parameter of the least squares support vector machine. To avoid solving nonlinear problems in predictive control, the model is linearized at each sampling point and the generalized predictive control algorithm is used to predict the bacteria concentration. Simulation experiments show that the least squares support vector machine model with particle swarm optimization achieves good prediction performance. Generalized predictive control based on the linearized model increases the total activity of the enzyme from 60% to 80% and improves the yield by 30%.

lag caused by the LS-SVM-based prediction model is longer than that caused by the SVM-based prediction model, it does not affect the bacterial concentration prediction (Huang, Zhai, Sui, & Chai, 2010; Suykens & Vandewalle, 1999; Wang, Zhen, & Zhu, 2013). However, the regularization parameter C and the kernel parameter σ of the LS-SVM model strongly influence its fitting precision and generalization ability.
Particle swarm optimization (PSO) is a population-based stochastic optimization method that searches several regions of the solution space of the objective function simultaneously, which makes it well suited to LS-SVM parameter selection (Li, Tang, & Liu, 2010; Yan & Cui, 2013). Therefore, this paper establishes a nonlinear model of bacteria concentration using an LS-SVM whose parameters are optimized by PSO.
To avoid solving nonlinear problems in predictive control (Liu, Su, & Zhu, 2004; Mahmoodi, Poshtan, Jahed-Motlagh, & Montazeri, 2008; Xi, Li, & Lin, 2013), the obtained LS-SVM nonlinear model is linearized at each sampling point. The generalized predictive control algorithm is then used to compute the multi-step prediction, and process control is performed on the basis of these predictions.

| BACTERIA CONCENTRATION MODELING ANALYSIS
Lysozyme is an important enzyme preparation that can hydrolyze mucopolysaccharide in pathogenic biomass. Owing to its bacteriolytic characteristics, lysozyme can be used in medical treatment, food preservation, and bioengineering. In food preservation in particular, it has been widely used in aquatic products, meat products, cakes, sake, wine, and beverages to replace chemically synthesized food preservatives (Ren et al., 2013; Wang et al., 2000; Zhao, Bai, Zhang, & Wu, 2010). However, a bacteria concentration that is too high or too low makes the fermentation broth too viscous or too dilute, and the resulting poor mass transfer conditions make the product enzyme difficult to synthesize during fermentation. Therefore, reasonable control of bacteria concentration can increase enzyme activity and yield. In-depth analysis of the process mechanism shows that the substrate feed rate has a great influence on the bacteria concentration and that a reasonable feed rate can improve the product activity (Huang, Sun, Sun, Liu, & Nie, 2013; Zhu, He, Sun, & Wang, 2013). The lysozyme concentration model can be expressed in the following nonlinear form: where f(·) represents a complex nonlinear function.

| Establishment of LS-SVM model
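Given a training set $\{x_i, y_i\}_{i=1}^{N}$ with $N$ data, where $x_i \in \mathbb{R}^{n}$ is the input data and $y_i \in \mathbb{R}$ is the output data, the LS-SVM model can use the following function in the feature space:

$$y(x) = w^{T}\phi(x) + \delta \tag{2}$$

where $\phi(\cdot): \mathbb{R}^{n} \rightarrow \mathbb{R}^{n_h}$ is a function that maps the input data of the original space to the higher-dimensional feature space, $w$ is the weight vector, $\delta$ is the constant deviation, $w \in \mathbb{R}^{n_h}$, $\delta \in \mathbb{R}$.
The LS-SVM regression optimization problem is as follows:

$$\min_{w,\,e} J(w, e) = \frac{1}{2} w^{T} w + \frac{C}{2} \sum_{i=1}^{N} e_i^{2} \tag{3}$$

The constraint is as follows:

$$y_i = w^{T}\phi(x_i) + \delta + e_i, \quad i = 1, 2, \cdots, N \tag{4}$$

where $e_i$ is the error variable, $e_i \in \mathbb{R}$, and $C$ is the regularization parameter, $C > 0$.
To solve the above optimization problem, the Lagrangian function is introduced as:

$$L(w, \delta, e, \alpha) = J(w, e) - \sum_{i=1}^{N} \alpha_i \left[ w^{T}\phi(x_i) + \delta + e_i - y_i \right] \tag{5}$$

where $\alpha_i$ is the Lagrange multiplier, $\alpha_i \in \mathbb{R}$.
Applying the KKT conditions to the optimization problem gives the following solution:

$$\begin{bmatrix} 0 & \mathbf{1}^{T} \\ \mathbf{1} & \Omega + C^{-1} I \end{bmatrix} \begin{bmatrix} \delta \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 \\ y \end{bmatrix} \tag{6}$$

where $y = [y_1, y_2, \cdots, y_N]^{T}$, $\mathbf{1} = [1, 1, \cdots, 1]^{T}$, $\alpha = [\alpha_1, \alpha_2, \cdots, \alpha_N]^{T}$, $\Omega_{ij} = k(x_i, x_j)$ for $i, j = 1, 2, \cdots, N$, $k(\cdot, \cdot)$ is a kernel function, and $I$ is the identity matrix.
In this paper, the Gauss radial basis function (RBF) is used as the kernel function (Lu & Yang, 2007; Zhu, Ling, Wang, Hao, & Ding, 2018). After obtaining $\delta$ and $\alpha$ in Equation (6), $w$ can be further calculated, and the nonlinear model obtained by LS-SVM is as follows:

$$y(x) = \sum_{i=1}^{N} \alpha_i k(x, x_i) + \delta \tag{7}$$

When solving the above equation, the kernel parameter σ and the regularization parameter C have a great influence on the model fitting accuracy and generalization ability. To achieve a good prediction effect, both parameters are optimized by PSO.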
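As an illustration of how δ and α in Equation (6) might be obtained numerically, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the standard LS-SVM linear system in δ and α and an RBF kernel scaled by 2σ²; the function names, the toy data, and the values C = 100 and σ = 0.2 are hypothetical.

```python
import numpy as np


def rbf_kernel(A, B, sigma):
    """Gaussian RBF kernel matrix; the 2*sigma**2 scaling is an assumed convention."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))


def lssvm_fit(X, y, C, sigma):
    """Solve [[0, 1^T], [1, K + I/C]] [delta; alpha] = [0; y] for delta and alpha."""
    n = X.shape[0]
    M = np.zeros((n + 1, n + 1))
    M[0, 1:] = 1.0
    M[1:, 0] = 1.0
    M[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(n) / C
    sol = np.linalg.solve(M, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]


def lssvm_predict(X_train, delta, alpha, sigma, X_new):
    """Evaluate y(x) = sum_i alpha_i k(x, x_i) + delta at the rows of X_new."""
    return rbf_kernel(X_new, X_train, sigma) @ alpha + delta


# Toy data (hypothetical, not fermentation measurements)
X = np.linspace(0.0, 1.0, 20).reshape(-1, 1)
y = np.sin(2.0 * np.pi * X[:, 0])
delta, alpha = lssvm_fit(X, y, C=100.0, sigma=0.2)
y_hat = lssvm_predict(X, delta, alpha, 0.2, X)
```

Under these assumptions the training residuals equal α/C, which gives a quick sanity check on the solve.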

| PSO-based parameter optimization
The basic idea of the particle swarm algorithm is to find the optimal solution through information transmission and information sharing among individuals in a group (Gu, Zhao, & Wu, 2010; Yao, Cai, & Zhang, 2009). Assume that in a D-dimensional search space, the population $X = (X_1, X_2, \cdots, X_n)$ consists of n particles, where the i-th particle is represented as a D-dimensional vector $X_i = (x_{i1}, x_{i2}, \cdots, x_{iD})^{T}$ that is the position of the i-th particle in the D-dimensional search space.
According to the objective function, the fitness value corresponding to each particle position $X_i$ can be calculated, which indicates how good the particle is. The optimal position of the i-th particle is $P_i = (P_{i1}, P_{i2}, \cdots, P_{iD})^{T}$, whose corresponding fitness value is called the individual optimal solution $P_{best,i}$; the optimal position of the population is $P_g = (P_{g1}, P_{g2}, \cdots, P_{gD})^{T}$, whose corresponding fitness value is called the global optimal solution $G_{best,i}$. The velocity of the i-th particle is $V_i = (V_{i1}, V_{i2}, \cdots, V_{iD})^{T}$, and the particle updates its velocity and position through the individual extremum and the group extremum during the iterative process as follows:

$$\begin{aligned} V_{id}^{k+1} &= w V_{id}^{k} + c_1 r_1 \left(P_{id}^{k} - X_{id}^{k}\right) + c_2 r_2 \left(P_{gd}^{k} - X_{id}^{k}\right) \\ X_{id}^{k+1} &= X_{id}^{k} + V_{id}^{k+1} \end{aligned} \tag{8}$$

where $w$ is the inertia weight, $d = 1, 2, \cdots, D$, $i = 1, 2, \cdots, n$, $k$ is the current iteration number, $V_{id}$ is the particle velocity, $c_1$ and $c_2$ are acceleration factors, and $r_1$ and $r_2$ are random numbers distributed in the range [0, 1].
To prevent the particles from searching blindly, their positions and velocities are limited to the intervals $[-X_{max}, X_{max}]$ and $[-V_{max}, V_{max}]$, respectively.
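The velocity and position update with interval limits can be sketched as follows. This is a minimal NumPy illustration of one standard PSO iteration, not code from the paper; the function name `pso_step`, the inertia weight w = 0.7, the limits x_max = 5 and v_max = 1, and the toy objective are assumptions, while c1 = 1.5 and c2 = 1.7 follow the values used later in the text.

```python
import numpy as np


def pso_step(X, V, P_best, g_best, w, c1, c2, x_max, v_max, rng):
    """One PSO update for a swarm stored as arrays of shape (n_particles, D)."""
    r1 = rng.random(X.shape)  # random factors in [0, 1)
    r2 = rng.random(X.shape)
    V_new = w * V + c1 * r1 * (P_best - X) + c2 * r2 * (g_best - X)
    V_new = np.clip(V_new, -v_max, v_max)      # limit the velocity
    X_new = np.clip(X + V_new, -x_max, x_max)  # limit the position
    return X_new, V_new


rng = np.random.default_rng(0)
X = rng.uniform(-5.0, 5.0, size=(20, 2))
V = np.zeros_like(X)
P_best = X.copy()                            # each particle's best position so far
best = int(np.argmin(np.sum(X**2, axis=1)))  # toy objective: squared distance to origin
g_best = X[best].copy()
X1, V1 = pso_step(X, V, P_best, g_best, w=0.7, c1=1.5, c2=1.7,
                  x_max=5.0, v_max=1.0, rng=rng)
```

In this first step every particle starts at its own best position with zero velocity, so only the attraction toward the swarm's best position moves it, and the best particle itself stays in place.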

| Establishment of LS-SVM model based on PSO optimization
To sum up, the specific steps of the least squares support vector machine modeling based on PSO are as follows:

1. A set of {C, σ} is randomly generated to establish the LS-SVM regression model. The particle dimension is set to 2, the number of particles in the swarm is 20, the number of iterations is 150, $c_1 = 1.5$, $c_2 = 1.7$, and the regularization parameter C and the kernel parameter σ are selected within the optimization ranges of 0~2,000 and 0.01~100, respectively.

2. The mean absolute percentage error is chosen as the fitness function of the PSO algorithm, whose expression is as follows:

$$F = \frac{1}{N} \sum_{i=1}^{N} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \times 100\% \tag{9}$$

where $y_i$ and $\hat{y}_i$ are the actual value and the model prediction value, respectively, and N is the total number of training data. The {C, σ} value carried by each particle is substituted into the LS-SVM to rebuild the regression model, and the corresponding fitness value of each particle is obtained from Equation (9) using the calculation results for the calibration samples.

3. The fitness value of each particle is compared with the fitness values of the individual optimal solution $P_{best,i}$ and the global optimal solution $G_{best,i}$. If it is better, $P_{best,i}$ and $G_{best,i}$ are updated; otherwise, the original values are kept.

4. According to the PSO update Equation (8), the velocity and position of the particles are adjusted to produce a new generation of particles.

5. Check the end condition. If the condition is satisfied, the optimization ends; otherwise, go to step 3 until the maximum number of iterations is reached.

6. The {C, σ} obtained after the optimization is assigned to the LS-SVM. The prediction model is then established and applied to the test data to obtain the prediction results for the test samples.
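The steps above can be sketched end to end as follows. This is a minimal, self-contained illustration under stated assumptions rather than the authors' code: the inertia weight w = 0.7 is hypothetical, the lower bound of C is set to 0.001 instead of 0 because C must be positive, the RBF kernel is scaled by 2σ², and the toy data stand in for the fermentation batches.

```python
import numpy as np


def rbf(A, B, sigma):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma**2))


def lssvm_fit_predict(X_tr, y_tr, X_te, C, sigma):
    """Fit an LS-SVM on the training set and predict on X_te."""
    n = len(y_tr)
    M = np.zeros((n + 1, n + 1))
    M[0, 1:] = M[1:, 0] = 1.0
    M[1:, 1:] = rbf(X_tr, X_tr, sigma) + np.eye(n) / C
    sol = np.linalg.solve(M, np.concatenate(([0.0], y_tr)))
    return rbf(X_te, X_tr, sigma) @ sol[1:] + sol[0]


def mape(y, y_hat):
    """Mean absolute percentage error in percent (the fitness to minimize)."""
    return float(np.mean(np.abs((y - y_hat) / y)) * 100.0)


def tune(X_tr, y_tr, X_val, y_val, n_particles=20, n_iter=150,
         w=0.7, c1=1.5, c2=1.7, seed=0):
    rng = np.random.default_rng(seed)
    lo = np.array([1e-3, 0.01])     # assumed lower bounds for (C, sigma)
    hi = np.array([2000.0, 100.0])  # upper bounds taken from step 1

    def fit(p):
        return mape(y_val, lssvm_fit_predict(X_tr, y_tr, X_val, p[0], p[1]))

    X = rng.uniform(lo, hi, size=(n_particles, 2))  # step 1: random {C, sigma}
    V = np.zeros_like(X)
    P = X.copy()
    p_fit = np.array([fit(p) for p in X])           # step 2: fitness
    k = int(np.argmin(p_fit))
    g, g_fit = P[k].copy(), p_fit[k]
    for _ in range(n_iter):                         # step 5: fixed iteration budget
        r1, r2 = rng.random(X.shape), rng.random(X.shape)
        V = w * V + c1 * r1 * (P - X) + c2 * r2 * (g - X)  # step 4: update
        X = np.clip(X + V, lo, hi)
        f = np.array([fit(p) for p in X])
        better = f < p_fit                          # step 3: keep improvements
        P[better], p_fit[better] = X[better], f[better]
        k = int(np.argmin(p_fit))
        if p_fit[k] < g_fit:
            g, g_fit = P[k].copy(), p_fit[k]
    return g, g_fit                                 # step 6: tuned {C, sigma}


# Toy data (hypothetical): strictly positive targets so that MAPE is defined
X_tr = np.linspace(0.0, 1.0, 15).reshape(-1, 1)
y_tr = 10.0 + 5.0 * np.sin(2.0 * np.pi * X_tr[:, 0])
X_val = (X_tr[:-1] + X_tr[1:]) / 2.0
y_val = 10.0 + 5.0 * np.sin(2.0 * np.pi * X_val[:, 0])
best, err = tune(X_tr, y_tr, X_val, y_val, n_iter=30)
```

The sketch scores each candidate on a separate validation set, which is one way to read the "calibration sample" in step 2. Velocity limits are omitted for brevity.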
The optimized LS-SVM model is linearized at the sampling point $x_0$ by using the Taylor formula, and the linearized model is obtained as follows:

$$A(z^{-1})\, y(t) = B(z^{-1})\, u(t-1) + \partial \tag{10}$$

where $A(z^{-1}) = 1 + a_1 z^{-1} + \cdots + a_n z^{-n}$, $B(z^{-1}) = 1 + b_1 z^{-1} + \cdots + b_m z^{-m}$, and $\partial$ is a constant.
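A first-order Taylor linearization at an operating point can be sketched numerically as follows. This is an illustration only: the function `f` is a hypothetical stand-in for the trained LS-SVM predictor with regressor x = [y(t-1), u(t-1)], and central finite differences are used in place of the analytic kernel gradient.

```python
import numpy as np


def linearize(f, x0, eps=1e-6):
    """Return (grad, const) such that f(x) ~ grad @ x + const near x0."""
    x0 = np.asarray(x0, dtype=float)
    grad = np.zeros_like(x0)
    for k in range(x0.size):
        step = np.zeros_like(x0)
        step[k] = eps
        grad[k] = (f(x0 + step) - f(x0 - step)) / (2.0 * eps)  # central difference
    const = f(x0) - grad @ x0  # constant term left over by the expansion
    return grad, const


def f(x):
    """Hypothetical nonlinear one-step model, x = [y(t-1), u(t-1)]."""
    return 0.8 * x[0] + 0.1 * x[1] ** 2


grad, const = linearize(f, [14.5, 15.0])
```

Here the entries of `grad` play the role of the local model coefficients and `const` plays the role of the constant term, recomputed at every sampling point.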

| GENERALIZED PREDICTION ALGORITHM FOR BACTERIA CONCENTRATION
After a simple model transformation, the constant ∂ is discretized, and the following controlled autoregressive integrated moving average (CARIMA) model is obtained:

$$A(z^{-1})\, y(t) = B(z^{-1})\, u(t-1) + \frac{\varepsilon(t)}{\Delta} \tag{11}$$

where $\Delta = 1 - z^{-1}$ is the difference operator, $\varepsilon(t)$ is an uncorrelated random sequence that represents the effect of random noise, and the discretized constant ∂ is contained in the random sequence $\varepsilon(t)$.
Following the standard GPC method (Deng, Huang, Fei, Zhen, & Jiang, 2014; Guo, Chen, Zhu, & Hu, 2002), the multi-step prediction vector that gives the predicted output is as follows:

$$\hat{Y} = GU + F \tag{12}$$

where $\hat{Y} = [\hat{y}(t+1|t), \cdots, \hat{y}(t+P|t)]^{T}$ is the forecast output, $U = [\Delta u(t), \cdots, \Delta u(t+L-1)]^{T}$ is the vector of control increments, $F = [f_1(t), \cdots, f_P(t)]^{T}$ is a vector consisting of the free responses in the output prediction sequence, $G$ is the matrix of unit step response coefficients, P is the prediction time domain, and L is the control time domain.
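The unit step coefficients can be generated by simulating the linearized model and arranged into the prediction matrix. The following is a minimal sketch assuming a model of the form A(z⁻¹)y(t) = B(z⁻¹)u(t−1) with a general leading coefficient b_0; the function names and the first-order coefficients are hypothetical, while P = 5 and a control time domain of 3 match the settings used later in the text.

```python
import numpy as np


def step_response(a, b, P):
    """First P unit-step response samples g_1..g_P of A(z^-1) y(t) = B(z^-1) u(t-1).

    a = [1, a_1, ..., a_n] and b = [b_0, b_1, ..., b_m].
    """
    y = np.zeros(P + 1)  # y[0] = 0: the plant is at rest before the step
    for t in range(1, P + 1):
        acc = 0.0
        for i in range(1, len(a)):
            if t - i >= 0:
                acc -= a[i] * y[t - i]
        for j in range(len(b)):
            if t - 1 - j >= 0:  # u = 1 from time 0 onward
                acc += b[j]
        y[t] = acc
    return y[1:]


def dynamic_matrix(g, P, L):
    """Lower-triangular Toeplitz matrix G (P x L) built from the step coefficients."""
    G = np.zeros((P, L))
    for i in range(P):
        for j in range(min(i + 1, L)):
            G[i, j] = g[i - j]
    return G


g = step_response([1.0, -0.8], [0.5], P=5)  # hypothetical first-order model
G = dynamic_matrix(g, P=5, L=3)
```

Because the model is re-linearized at each sampling point, G would be rebuilt at every step in this scheme.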
The moving horizon optimization performance index at time t in GPC takes the following form:

$$J = E\left\{ \sum_{j=N_1}^{N_2} \left[ y(t+j) - w(t+j) \right]^{2} + \sum_{j=1}^{L} \lambda(j) \left[ \Delta u(t+j-1) \right]^{2} \right\} \tag{13}$$

where E is the mathematical expectation, w is the expected reference value of the object output, and $N_1$ and $N_2$ are the initial and final values of the optimization time domain, respectively. $\lambda(j)$ is a control weighting coefficient. In practice it is first set to zero or a very small number; if the control system is stable but the control variable changes too sharply, it can be increased until a satisfactory control effect is obtained (Liu, 2007). The parameter $\lambda(j)$ is generally set as a constant λ.
A reference trajectory is introduced so that the output value tracks the set value smoothly: where β is the adjustment factor in the interval [0, 1), $y_r$ is the reference trajectory, and $y_s$ is the set value for the next moment.
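When $W = [w(t+1), \cdots, w(t+P)]^{T}$ and $Y = [y(t+1), \cdots, y(t+P)]^{T}$, formula (13) can be expressed as:

$$J = E\left\{ (Y - W)^{T} (Y - W) + \lambda U^{T} U \right\} \tag{15}$$

When $\partial J / \partial U = 0$, the control amount can be obtained as follows:

$$u(t) = u(t-1) + d^{T} (W - F) \tag{16}$$

where $d^{T}$ is the first row of $(G^{T} G + \lambda I)^{-1} G^{T}$.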
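The control computation can be sketched as follows, assuming the standard GPC solution in which the first control increment is the first row of (GᵀG + λI)⁻¹Gᵀ applied to W − F. The softened reference uses a common first-order form, w(t+j) = βʲy(t) + (1 − βʲ)y_s, which is an assumption here; the values of G and F, β = 0.7, and λ = 0.1 are hypothetical.

```python
import numpy as np


def gpc_increment(G, F, W, lam):
    """First control increment Delta u(t) = d^T (W - F)."""
    L = G.shape[1]
    K = np.linalg.solve(G.T @ G + lam * np.eye(L), G.T)  # (G^T G + lam I)^-1 G^T
    d = K[0]  # first row
    return float(d @ (W - F))


def reference(y_now, y_set, beta, P):
    """Assumed softened reference: w(t+j) = beta^j y(t) + (1 - beta^j) y_s."""
    j = np.arange(1, P + 1)
    return beta**j * y_now + (1.0 - beta**j) * y_set


# Hypothetical step-response matrix (P = 5, L = 3) and free response
G = np.array([[0.5, 0.0, 0.0],
              [0.9, 0.5, 0.0],
              [1.22, 0.9, 0.5],
              [1.476, 1.22, 0.9],
              [1.6808, 1.476, 1.22]])
F = np.full(5, 14.5)               # output stays at 14.5 if no new move is made
W = reference(14.5, 20.0, 0.7, 5)  # approach the 20 g/L set value gradually
du = gpc_increment(G, F, W, lam=0.1)
u = 15.0 + du                      # u(t) = u(t-1) + Delta u(t)
```

Only the first increment is applied at each step; the optimization is repeated at the next sampling instant with updated G, F, and W.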
The generalized predictive control block diagram of marine lysozyme bacteria concentration based on LS-SVM is shown in Figure 1.

| TEST AND RESULT ANALYSIS
The experimental data are from the fermentation control system platform of Jiangsu University. The fermenter model is RT-100L-Y, and the fermentation product is lysozyme. Batch fermentation experiments are performed according to the medium formulation provided by the fermentation process. After high-temperature steam sterilization of the fermenter, the tank pressure is controlled at 0.04 MPa by adjusting the gas output, the temperature is set at 32°C, the stirring speed is 400 r/min, the dissolved oxygen range is 35%-40%, and the pH is set at 7.3. Under these fermentation conditions, the control system collects the substrate feed rate f from the flow meter every hour and transmits it from the lower computer to the upper computer to form a database (Zhu et al., 2010). Under normal fermentation conditions, the bacteria concentration is measured by the dry weight method. The fermentation broth is centrifuged at 20 ml/hr, washed with distilled water, and centrifuged twice; it is then transferred to a constant-weight measuring flask, dried to constant weight at 105°C, and weighed, from which the bacteria concentration (g/L) is calculated (Sun, Wang, Huang, & Ji, 2010).
From the data collected by the upper computer, one batch of data is taken from each fermentation cycle, and 10 batches of data are extracted in total. The first nine batches are used as the training sample set, and the last batch is used as the test set. The simulation results are shown in Figures 2 and 3.
The comparison of the prediction models in Figures 2 and 3 shows that the LS-SVM prediction model based on PSO optimization is clearly better than the LS-SVM model in fitting degree and prediction precision and has good modeling ability. The parameters obtained by PSO optimization are C = 508.06 and σ = 8.32.
After data preprocessing, the modeling method introduced in this paper is used to train on the data, and the fitting degree and prediction accuracy are verified with the test data. The root mean square error (RMSE) and the maximum absolute error (MAXE) are selected as the evaluation criteria for model prediction accuracy:

$$RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^{2}} \tag{17}$$

$$MAXE = \max_{1 \le i \le N} \left| y_i - \hat{y}_i \right| \tag{18}$$

where $y_i$ and $\hat{y}_i$ are the actual value and the model prediction value, respectively, and N is the total number of data. The simulation results of the two models are shown in Table 1.
The control increment and the control input are constrained to −10 ≤ Δu ≤ 10 and 10 ≤ u ≤ 55.
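The two accuracy criteria, RMSE and MAXE, can be computed as in the following minimal sketch. The function names and the three-point example are illustrative only and are not taken from Table 1.

```python
import numpy as np


def rmse(y, y_hat):
    """Root mean square error between actual and predicted values."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))


def maxe(y, y_hat):
    """Maximum absolute error between actual and predicted values."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return float(np.max(np.abs(y - y_hat)))


# Illustrative values only
y_true = [1.0, 2.0, 3.0]
y_pred = [1.0, 2.0, 5.0]
print(rmse(y_true, y_pred), maxe(y_true, y_pred))
```

RMSE summarizes the average deviation over the test batch, while MAXE exposes the single worst prediction, so the two criteria complement each other.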
During the acceleration and peak periods of enzyme production, the cells grow logarithmically, and the activity and yield of the enzyme can be improved by controlling the bacteria concentration in this period. The predictive control is performed over the first 360 min of the logarithmic growth period, where the prediction time domain is P = 5, the control time domain is $N_u = 3$, the initial control input is u = 15, the initial increment is Δu = 4.5, the initial output is y = 14.5, and the simulation step length is 1 min. In the acceleration period of enzyme production, the bacteria concentration should not be too high, so the set value is 20 g/L; at the peak period of enzyme production, the growth rate of the bacteria slows down due to the rapid consumption of the substrate, so the set value is 35 g/L to accelerate the substrate feeding rate and improve enzyme production. The reference input can be set as a step signal to study and analyze the performance of the control method. In the actual fermentation process, the bacterial concentration does not change rapidly; it is a slow time-varying process, so the reference input also needs to rise relatively slowly. Figures 6 and 7 are control process diagrams of the actual fermentation process.
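One simple way to respect the stated limits on the control signal (−10 ≤ Δu ≤ 10, 10 ≤ u ≤ 55) during simulation is to saturate each computed increment before applying it. The sketch below is an assumption about how such limits could be enforced, not a description of the authors' controller, and the function name is hypothetical.

```python
def apply_increment(u_prev, du, du_lim=10.0, u_min=10.0, u_max=55.0):
    """Clip the increment, then the resulting input; return (u, applied increment)."""
    du = max(-du_lim, min(du_lim, du))        # -10 <= du <= 10
    u = max(u_min, min(u_max, u_prev + du))   # 10 <= u <= 55
    return u, u - u_prev


# Initial input u = 15 and initial increment 4.5, as in the simulation settings
u, du_applied = apply_increment(15.0, 4.5)
```

Clipping after the unconstrained solution is only an approximation of constrained optimization, but it keeps the feed rate within its physical range at every step.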
Under such control, the bacteria concentration grows fast during the growing phase and remains high during the producing phase, which is good for the enzyme productivity. The total activity of the enzyme is increased from 60% to 80%, and the yield is improved by 30% in the actual fermentation process.

| CONCLUSION
In this paper, the generalized predictive control based on least squares support vector machine is proposed. After the regularization parameter C and kernel parameter σ of the model are optimized by using the particle swarm optimization algorithm, the LS-SVM model of the bacterial concentration is established, which has high prediction accuracy and high fitting degree. To avoid solving nonlinear problems, the LS-SVM model is linearized at each sampling point, and the generalized predictive control algorithm is used to solve the multi-step prediction. The experimental results show that the method has good adaptability and robustness to the control of bacterial concentration in the fermentation process. It can be applied to the control of physicochemical parameters and biological indicators in the general fermentation process.

CONFLICT OF INTEREST
The authors declare that they have no conflict of interests.

ETHICAL STATEMENT
This study does not involve any human or animal testing.