Bayesian optimization algorithm‐based Gaussian process regression for in situ state of health prediction of minorly deformed lithium‐ion battery

Accurate on‐board state‐of‐health (SOH) prediction is crucial for lithium‐ion battery applications. This study presents an in situ prediction technique for minorly deformed battery SOH, utilizing a Gaussian process regression (GPR) model tuned by a Bayesian optimization algorithm. Unlike previous methods that interpret voltage–time data as incremental capacitance curves, our approach directly operates on raw voltage–time data. We apply gray relational analysis to select feature variables as inputs and train the Bayesian Gaussian process regression (BGPR) model using experimental data from batteries under different working conditions. To demonstrate the performance of the BGPR model, we compare it with stepwise linear regression, neural network, and Bayesian support vector machine (BSVM) models. The performance of these four models is evaluated using different performance indicators: mean absolute percentage error (MAPE), root‐mean‐squared percentage error (RMSPE), and coefficient of determination (R²). The results demonstrate that the BGPR model exhibits superior prediction performance with the lowest MAPE (0.11%), RMSPE (0.12%), and the highest R² (0.9915) for minorly deformed batteries. Furthermore, the BGPR model exhibits excellent robustness for SOH prediction of normal batteries under different conditions. This study provides an effective and robust method for accurate on‐board SOH prediction in lithium‐ion battery applications.


| INTRODUCTION
[5][6] How to accurately predict the state of health (SOH) of a battery has always been the main problem for battery management systems (BMS). The physical and chemical changes involved in the mechanical response of lithium-ion batteries (LIBs) after a collision are extremely complex, and studies have shown that the safety of a battery pack after a collision is closely related to the SOH of its internal cells [7]. Because the requirements for the passive safety design of battery packs remain unclear, there is an urgent need to study SOH prediction for deformed batteries.
In recent years, researchers have proposed many methods to achieve accurate prediction of battery SOH [10]. The ampere-hour integration method is a commonly used capacity test method under laboratory conditions. Berecibar et al. [11] used the results of the ampere-hour integration method to verify the accuracy of other methods. The open circuit voltage (OCV) curve of a lithium-ion cell can be described as the difference between the half-cell open circuit potential curves of both electrodes. By fitting the reconstructed OCV curves to the OCV curves of aged batteries, the aging pattern of the aged batteries can be identified. Schmitt et al. [12] applied this method to part of the charging curves of commercial batteries with silicon-graphite and NMC-811 as electrode materials. The aging pattern and remaining capacity can be determined from the reconstructed OCV curve. To investigate how the performance of lithium-ion cells changes over time after long-term storage under certain conditions, a series of tests was performed to measure their capacity and internal resistance [13].
Model-based methods mainly include adaptive filtering techniques and data-driven techniques. Xiong et al. [14] simplified the computational electrochemical model, used a genetic algorithm for parameter identification, and extracted five characteristic parameters for estimation by regression. Zhang et al. [15] proposed an equivalent circuit model (ECM) parameter identification method that considers electrochemical performance. Incremental capacity (IC) is a valuable tool for evaluating the health status of LIBs. However, its use is limited by the need for low-rate discharge testing, and it cannot accurately represent real-world conditions. To overcome this limitation, a new feature extraction technique has been applied to a large data set of batteries with different cycling lifetimes. This technique uses three regression models, namely support vector regression (SVR), multilayer perceptron, and random forest, to successfully predict the health status of batteries [16]. Zhang et al. [17] proposed a new method for estimating battery SOH based on an integrated analysis of incremental capacity analysis (ICA) and SVR using a voltage-capacity model. Xu [18] proposed a new method for estimating the SOH of LIBs based on discrete ICA. This method does not require any prior knowledge of the internal details of the battery and can be applied under actual driving conditions, providing a more accurate and reliable estimate of battery SOH.
Compared with other methods, the data-driven prediction method does not require mechanistic knowledge of LIBs and is therefore more practical [19]. Data-driven prediction typically uses a partial IC curve for battery SOH prediction and shows good performance [20,21]. However, this approach has several drawbacks, such as poor prediction accuracy caused by differentiating the voltage-time data, a long measurement duration, and cumbersome preprocessing steps for feature selection. Furthermore, previous studies [22,23] have shown significant differences between minorly deformed and normal batteries, which make it difficult to extract feature variables from the IC curve of a minorly deformed battery as inputs to the SOH prediction model.
To overcome these issues, this work extracts voltage-time data directly from the constant current (CC) charging curve to characterize the decline in battery capacity and to construct a reliable and accurate in situ prediction method for battery SOH. An SOH prediction method for LIBs based on Bayesian Gaussian process regression (BGPR) is proposed to accurately predict the SOH of minorly deformed batteries. The inputs of the proposed method are extracted from the charging curve, and the correlation between the input feature variables and battery SOH is established by gray relational analysis (GRA). These feature variables are easily accessible in practice. The BGPR model is trained on experimental data from batteries under different working conditions, and its performance is compared with that of stepwise linear regression (SLR), neural network (NN), and Bayesian support vector machine (BSVM) models. The accuracy and robustness of the proposed BGPR method are verified using several performance indicators.

| Battery SOH
The SOH of an LIB, also known as the life state of the battery, is the parameter that characterizes the health of the battery and determines its working condition. SOH has several definitions, based on capacity ratio [24,25], remaining useful life (RUL) [26,27], and internal resistance [28]. Among them, the capacity ratio is the most commonly used definition; it is adopted in this study and expressed as Equation (1).
$$\mathrm{SOH} = \frac{Q_c}{Q_i} \times 100\% \tag{1}$$
where $Q_c$ is the current capacity and $Q_i$ is the initial capacity of the battery.
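As a minimal sketch of the capacity-ratio definition in Equation (1), the function below returns SOH as a percentage. The function name and the example capacities are illustrative assumptions, not values from the experiments.

```python
def state_of_health(current_capacity_ah: float, initial_capacity_ah: float) -> float:
    """Return SOH in percent: current capacity divided by initial capacity."""
    if initial_capacity_ah <= 0:
        raise ValueError("initial capacity must be positive")
    return 100.0 * current_capacity_ah / initial_capacity_ah


# Hypothetical capacities in Ah, chosen only for illustration.
example_soh = state_of_health(2.7, 3.0)
```

With these hypothetical inputs the cell retains nine tenths of its initial capacity.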

| Experimental data analysis
Commercial INR 21700-30T LIBs manufactured by Samsung were used in this study for the charge/discharge cycling experiment. The specifications of the cells are shown in Table 1. All batteries were cycled three times to select those with good stability and consistency. The selected batteries were divided into four groups: #1-#4, #5-#8, #9-#12, and #13-#16. An Instron Model LEGEND 2345 compression tester was used to perform quasi-static compression of batteries #1-#8 to create artificial defects, as shown in Figure 1. Batteries #1-#4 and #5-#8 were unloaded when the deformation reached 3 and 6 mm, respectively.
In this work, the initial damage is defined as "minor" when no temperature increase or significant voltage drop occurs during the initial mechanical loading of the battery. The voltage and loading force profiles for the cells deformed by 3 and 6 mm are shown in Figure 2. The surface temperature of these cells remained constant. The 3 and 6 mm cases differ only in the loading force applied, and their main purpose is to represent different degrees of minor damage to the cells. Furthermore, no electrolyte leakage was observed on the battery surface after the initial mechanical loading.
Three hundred charge/discharge cycles (Table 2) were conducted for the four groups of batteries #1-#16. The experiment was performed at room temperature using a NEWARE CT-4008-5V6A-S1 battery tester, as shown in Figure 1B. The room temperature was a winter temperature of approximately 16°C, controlled by an indoor air conditioner. Both charging and discharging used the constant current (CC) and constant voltage (CV) modes. The specific cycling procedure of each battery is as follows.
The charging current of batteries #1-#12 is 1 C, and that of batteries #13-#16 is 0.5 C. The C-rate here measures the charge and discharge current relative to the nominal capacity of the battery. Other charge/discharge conditions are the same. Next, we select batteries with good consistency as representatives for analysis and presentation. The four representative batteries are #2, #8, #12, and #16, and their charge/discharge experimental conditions are shown in Table 3. Figure 3A shows the charge/discharge and temperature curves of the battery at 1 C. The thermocouple used in this work is of K-type, with a temperature accuracy of 1°C and a resolution of 0.1°C. Figure 3B shows the capacity decay of the four batteries with the number of cycles. The battery capacity decreases nonlinearly with the number of cycles, and this nonlinear attenuation is more pronounced for the deformed batteries than for the normal batteries. The nonlinear relationship between battery capacity and the number of cycles has an important influence on the SOH prediction of the deformed batteries.
The 300 charge/discharge cycles provide the current, voltage (V), and capacity (Q) at each second of every cycle. The dQ/dV values can then be calculated to obtain the IC curve, with voltage on the horizontal axis and dQ/dV on the vertical axis [29]. As the calculated IC curve usually contains noise, Gaussian filtering can be used to smooth it [30]. Finally, the filtered IC curves of the 50th, 100th, 150th, 200th, 250th, and 300th cycles can be plotted on one graph. Figure 4A,B shows the IC curves of normal battery #12 and minorly deformed battery #2 at different cycles. Combined with previous studies [29], this shows that the aging of a normal battery is mainly reflected in variations of the peak position, height, and area of the IC curve, while the number of peaks remains unchanged (Figure 4A). However, the most significant variation for minorly deformed batteries (Figure 4B) is the decrease in the number of IC curve peaks, followed by changes in peak position and height. When the capacities of batteries #12 and #2 are equivalent (Figure 4C), the number and position of the IC curve peaks of the minorly deformed battery differ from those of the normal battery (Figure 4D). The precise mechanisms behind these phenomena need further investigation. Nevertheless, using selected features of the IC curves as inputs to the capacity prediction model is clearly inapplicable to minorly deformed batteries.
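The IC curve construction described above (differentiate Q with respect to V, then apply Gaussian filtering) can be sketched as follows. This is an illustrative outline rather than the processing used in the study: the kernel width `sigma`, the edge padding, and the synthetic test data are assumptions, and the voltage samples are assumed to be strictly increasing.

```python
import numpy as np


def incremental_capacity(voltage, capacity, sigma=3.0):
    """Return (V, smoothed dQ/dV) for one charging record.

    sigma is the Gaussian kernel width in samples (an assumed value).
    """
    v = np.asarray(voltage, dtype=float)
    q = np.asarray(capacity, dtype=float)
    # Finite-difference derivative of capacity with respect to voltage.
    dqdv = np.gradient(q, v)
    # Normalized Gaussian kernel truncated at three standard deviations.
    half = int(3 * sigma)
    taps = np.arange(-half, half + 1)
    kernel = np.exp(-0.5 * (taps / sigma) ** 2)
    kernel /= kernel.sum()
    # Pad with edge values so the filtered curve keeps its original length.
    padded = np.pad(dqdv, half, mode="edge")
    smoothed = np.convolve(padded, kernel, mode="valid")
    return v, smoothed


# Synthetic record: capacity rising linearly with voltage, so dQ/dV is constant.
demo_v = np.linspace(3.0, 4.2, 200)
demo_q = 2.5 * (demo_v - 3.0)
_, demo_ic = incremental_capacity(demo_v, demo_q)
```

Measured voltage traces often contain plateaus and repeated values, so in practice the data would need resampling onto a monotonic voltage grid before differentiation.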

| Feature extraction of battery degradation
Figure 5A shows the CC charging curves of LIBs at different cycles. As the number of cycles increases, the time required for CC charging gradually shortens. This trend helps to establish the correlation between the feature variables of the CC charging curve and the SOH. Following Deng et al. [31], the times required for CC charging to reach different voltages (Figure 5B, F1-F6) and the number of cycles (F7) were used as candidate feature variables:
F1: The time required for CC charging to 3.7 V;
F2: The time required for CC charging to 3.8 V;
F3: The time required for CC charging to 3.9 V;
F4: The time required for CC charging to 4.0 V;
F5: The time required for CC charging to 4.1 V;
F6: The time required for CC charging to 4.2 V;
F7: The number of cycles.
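The candidate features F1-F7 can be read off a sampled CC charging curve as sketched below. The function names, the use of linear interpolation between samples, and the assumption that time is measured from the start of CC charging are illustrative choices, not details given in the text.

```python
THRESHOLDS_V = [3.7, 3.8, 3.9, 4.0, 4.1, 4.2]  # voltages for F1..F6


def time_to_voltage(times_s, voltages_v, target_v):
    """Time at which the charging voltage first reaches target_v.

    Uses linear interpolation between neighbouring samples and returns
    None if the target voltage is never reached in the record.
    """
    if not voltages_v:
        return None
    if voltages_v[0] >= target_v:
        return times_s[0]
    for i in range(1, len(voltages_v)):
        v0, v1 = voltages_v[i - 1], voltages_v[i]
        if v0 < target_v <= v1:
            frac = (target_v - v0) / (v1 - v0)
            return times_s[i - 1] + frac * (times_s[i] - times_s[i - 1])
    return None


def extract_features(times_s, voltages_v, cycle_number):
    """Build the candidate feature dictionary F1..F7 for one cycle."""
    features = {
        f"F{k}": time_to_voltage(times_s, voltages_v, v)
        for k, v in enumerate(THRESHOLDS_V, start=1)
    }
    features["F7"] = cycle_number
    return features


# Hypothetical four-sample charging record (seconds, volts).
demo_features = extract_features([0, 100, 200, 300], [3.6, 3.8, 4.0, 4.2], 42)
```

Because each feature is a single timestamp on the charging curve, no differentiation of the voltage-time data is involved.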
The selection of feature variables is key to constructing a machine learning (ML) model. If too few feature variables are used, the prediction accuracy of the model is low. However, the prediction accuracy does not increase monotonically as more feature variables are added. Selecting appropriate feature variables therefore benefits model prediction. In this study, we use the GRA algorithm [25,31,32] to select the feature variables.
Using the GRA algorithm, the relational grade between each candidate feature variable and SOH can be calculated, as shown in Table 4. The results show that the times corresponding to the high voltage range have a higher correlation with battery SOH, which indicates that the higher voltage range carries more information than the lower voltage range. However, a higher voltage also leads to a more pronounced temperature rise and a greater hazard for the deformed battery [22]. In this work, x = {FV2, FV3, FV4, FV5} is therefore selected as the set of feature variables for building the BGPR model and providing accurate SOH prediction. The availability of the feature variables determines the feasibility of the in situ prediction method for battery SOH.
TABLE 4 GRA relational grades between the features and battery SOH [31,33].
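The text does not spell out the GRA computation, so the sketch below uses the commonly cited formulation as an assumption: min-max normalization of each series, absolute differences from the reference series, gray relational coefficients with a distinguishing coefficient `rho = 0.5`, and the mean coefficient as the relational grade. The series names and values are hypothetical.

```python
def min_max(seq):
    """Scale a sequence to [0, 1]; a constant sequence maps to all zeros."""
    lo, hi = min(seq), max(seq)
    if hi == lo:
        return [0.0 for _ in seq]
    return [(x - lo) / (hi - lo) for x in seq]


def gray_relational_grades(reference, candidates, rho=0.5):
    """Return {name: grade} between the reference series and each candidate.

    rho is the distinguishing coefficient (0.5 is a conventional choice,
    assumed here rather than taken from the text).
    """
    ref = min_max(reference)
    deltas = {
        name: [abs(r - c) for r, c in zip(ref, min_max(series))]
        for name, series in candidates.items()
    }
    all_deltas = [d for series in deltas.values() for d in series]
    d_min, d_max = min(all_deltas), max(all_deltas)
    if d_max == 0.0:
        # Every candidate coincides with the reference after normalization.
        return {name: 1.0 for name in deltas}
    grades = {}
    for name, series in deltas.items():
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in series]
        grades[name] = sum(coeffs) / len(coeffs)
    return grades


# Hypothetical data: "a" tracks the SOH trend exactly, "b" does not.
demo_soh = [100, 98, 96, 94]
demo_grades = gray_relational_grades(
    demo_soh, {"a": [3000, 2900, 2800, 2700], "b": [1, 5, 2, 9]}
)
```

A grade closer to 1 indicates that the candidate feature follows the SOH trend more closely, which is the basis for ranking features.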

| GPR
The GPR method analyzes the relationship between feature variables and output values by means of Gaussian probability density functions. A Gaussian process is determined by its mean function $m(x)$ and covariance function $\kappa(x, x')$, which can be expressed as Equation (2).
$$f(x) \sim \mathcal{GP}\big(m(x), \kappa(x, x')\big) \tag{2}$$
In most practical applications, obtaining $f(x)$ directly is difficult. Only observations containing noise $\varepsilon$ are available, as expressed in Equation (3).
$$y = f(x) + \varepsilon \tag{3}$$
where $y$ is the observation value, $x$ is the input vector, $f$ is the function value, the noise follows $\varepsilon \sim \mathcal{N}(0, \sigma_n^2)$, and $\sigma_n$ is the standard deviation of the noise. The observations $y$ and the function values $f_*$ at the test set obey a joint Gaussian distribution, which can be expressed as Equation (4).
$$\begin{bmatrix} y \\ f_* \end{bmatrix} \sim \mathcal{N}\left(0, \begin{bmatrix} K(X, X) + \sigma_n^2 I_n & K(X, x_*) \\ K(x_*, X) & K(x_*, x_*) \end{bmatrix}\right) \tag{4}$$
where $X$ is the input, $x_*$ is the test set, $I_n$ is the n-dimensional identity matrix, $K(X, X)$ represents the covariance matrix of the input itself, $K(x_*, x_*)$ is the covariance matrix of the test set itself, $K(X, x_*)$ represents the covariance matrix between the input and the test points, $K(x_*, X)$ represents the covariance matrix between the test points and the input, and $K(X, x_*) = K(x_*, X)^T$.
The posterior distribution of $f_*$ can be obtained as Equation (5) in accordance with the Bayesian principle and the conditional probability properties of the joint normal distribution.
$$f_* \mid X, y, x_* \sim \mathcal{N}\big(\bar{f}_*, \operatorname{cov}(f_*)\big) \tag{5}$$
where
$$\bar{f}_* = K(x_*, X)\left[K(X, X) + \sigma_n^2 I_n\right]^{-1} y$$
$$\operatorname{cov}(f_*) = K(x_*, x_*) - K(x_*, X)\left[K(X, X) + \sigma_n^2 I_n\right]^{-1} K(X, x_*)$$
The principle of joint Gaussian distribution enables the prediction results of the target to be inferred from the mean $\bar{f}_*$ and the covariance $\operatorname{cov}(f_*)$. When a Gaussian process is used to solve a problem, the choice of covariance function is extremely important because it determines the performance of the fitted model. Several covariance functions are commonly used, such as the exponential, squared exponential, rational quadratic, Matérn 3/2, and Matérn 5/2 functions.
The unknown parameters in the covariance functions are called "hyperparameters." The GPR model is fully determined once the form of the kernel function and its hyperparameters are fixed. In the Bayesian optimization algorithm (BOA), the selection of the kernel function for Gaussian process regression fitting is treated as a hyperparameter optimization problem.
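The posterior mean and covariance of a GPR model can be computed with a few lines of linear algebra. The sketch below assumes a zero prior mean and an isotropic squared exponential kernel with fixed hyperparameters (`length_scale`, `signal_std`, `noise_std` are illustrative names and values); it is not the MATLAB implementation used in the study, which also searches over kernels and basis functions.

```python
import numpy as np


def sq_exp_kernel(a, b, length_scale=1.0, signal_std=1.0):
    """Isotropic squared exponential covariance between the rows of a and b."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    sq_dist = np.maximum(sq_dist, 0.0)  # guard against tiny negative round-off
    return signal_std**2 * np.exp(-0.5 * sq_dist / length_scale**2)


def gpr_posterior(x_train, y_train, x_test, noise_std=0.1, **kernel_args):
    """Posterior mean and covariance of f* under a zero-mean GP prior."""
    k_xx = sq_exp_kernel(x_train, x_train, **kernel_args)
    k_sx = sq_exp_kernel(x_test, x_train, **kernel_args)
    k_ss = sq_exp_kernel(x_test, x_test, **kernel_args)
    # Cholesky factor of K(X, X) + sigma_n^2 I for numerically stable solves.
    chol = np.linalg.cholesky(k_xx + noise_std**2 * np.eye(len(x_train)))
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_sx @ alpha
    v = np.linalg.solve(chol, k_sx.T)
    cov = k_ss - v.T @ v
    return mean, cov


# One hypothetical training point; test points at and far from it.
demo_mean, demo_cov = gpr_posterior(
    np.array([[0.0]]), np.array([1.0]), np.array([[0.0], [100.0]])
)
```

Near the training point the posterior variance shrinks toward the noise level, while far from it the prediction reverts to the prior mean and variance, which is what gives GPR its built-in uncertainty estimate.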

| BOA of model hyperparameter
Bayesian optimization is an approximation method that is highly effective for global optimization. The BOA estimates the posterior distribution of the objective function on the basis of Bayes' theorem (Equation (6)) [34], and then builds a surrogate function from the past evaluation results of the unknown objective function to find the next hyperparameter combination that minimizes the value of the objective function, which is expressed as Equation (7) [35].
$$p(f \mid D) = \frac{p(D \mid f)\, p(f)}{p(D)} \tag{6}$$
$$x^{*} = \arg\min_{x \in \mathcal{X}} f(x) \tag{7}$$
where $f$ represents the unknown objective function, $\mathcal{X}$ is the hyperparameter search space, and $D$ represents the set of observed parameters and observed values, $D = \{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$.
The BOA is used in this study because it is more effective than other available optimization methods and because it is a sequential process for the global optimization of black box functions. The BOA is combined with the GPR algorithm to optimize the hyperparameters. The K-fold cross-validation method is used before applying the BOA to avoid overfitting [35]. The original data are divided into K groups (K folds); each subset serves as the validation set once, with the remaining K − 1 subsets used as the training set. This procedure is repeated K times, with K = 5 in this work.
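The K-fold procedure can be sketched as below. The cross-validated loss returned by `cross_validated_loss` is the kind of objective a Bayesian optimizer would minimize over hyperparameters. The helper names and the use of contiguous, unshuffled folds are assumptions made for illustration.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, validation_indices) for each of the k folds.

    Folds are contiguous and unshuffled; each sample is used for
    validation exactly once.
    """
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, val
        start += size


def cross_validated_loss(fit_and_score, n_samples, k=5):
    """Average the validation loss returned by fit_and_score over k folds.

    fit_and_score(train_idx, val_idx) is a user-supplied callable
    (hypothetical here) that trains on one split and returns a loss.
    """
    losses = [fit_and_score(tr, va) for tr, va in k_fold_indices(n_samples, k)]
    return sum(losses) / len(losses)


demo_folds = list(k_fold_indices(150, k=5))
```

Averaging the loss over all folds gives the optimizer an estimate of generalization error rather than training error, which is how cross-validation limits overfitting during hyperparameter tuning.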

| Performance evaluation criteria
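Three common statistical indicators, namely, mean absolute percentage error (MAPE), root-mean-squared percentage error (RMSPE), and R², were used in this work to analyze the performance of the prediction model. The expressions are as follows:
(1) RMSPE is used to evaluate the deviation between the predicted $\mathrm{SOH}_p$ and the real $\mathrm{SOH}_r$ of the battery.
$$\mathrm{RMSPE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\frac{\mathrm{SOH}_{p,i} - \mathrm{SOH}_{r,i}}{\mathrm{SOH}_{r,i}}\right)^{2}} \times 100\% \tag{8}$$
where n denotes the size of the data set.
(2) MAPE is used to calculate the mean of the relative error between the predicted $\mathrm{SOH}_p$ and the real $\mathrm{SOH}_r$ of the battery.
$$\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{\mathrm{SOH}_{p,i} - \mathrm{SOH}_{r,i}}{\mathrm{SOH}_{r,i}}\right| \times 100\% \tag{9}$$
(3) R² is used to measure the degree to which the predicted values match the real values, reflecting the goodness of fit of the regression equation. $\mathrm{SOH}_a$ is the average value of SOH.
$$R^{2} = 1 - \frac{\sum_{i=1}^{n}\left(\mathrm{SOH}_{r,i} - \mathrm{SOH}_{p,i}\right)^{2}}{\sum_{i=1}^{n}\left(\mathrm{SOH}_{r,i} - \mathrm{SOH}_{a}\right)^{2}} \tag{10}$$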
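The three indicators can be computed directly from paired predicted and real SOH values, as in the sketch below. It assumes the standard definitions of RMSPE, MAPE, and R², with the mean taken over the real SOH values. The sample numbers are hypothetical.

```python
import math


def rmspe(pred, real):
    """Root-mean-squared percentage error, in percent."""
    n = len(real)
    return 100.0 * math.sqrt(sum(((p - r) / r) ** 2 for p, r in zip(pred, real)) / n)


def mape(pred, real):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((p - r) / r) for p, r in zip(pred, real)) / len(real)


def r_squared(pred, real):
    """Coefficient of determination relative to the mean of the real values."""
    mean_real = sum(real) / len(real)
    ss_res = sum((r - p) ** 2 for p, r in zip(pred, real))
    ss_tot = sum((r - mean_real) ** 2 for r in real)
    return 1.0 - ss_res / ss_tot


# Hypothetical values: both predictions are off by 1% of the real value.
demo_real = [100.0, 50.0]
demo_pred = [101.0, 49.5]
demo_scores = (rmspe(demo_pred, demo_real), mape(demo_pred, demo_real),
               r_squared(demo_pred, demo_real))
```

RMSPE and MAPE can be small while R² is noticeably below 1 when the real SOH varies little over the test window, because R² normalizes the error by the spread of the real values.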

| SOH prediction based on BGPR
The SOH prediction process based on the proposed BGPR model is shown in Figure 6 [36]. The test set of each battery consists of the latter 50% of the cycles (i.e., the 151st-300th cycles), (x_r, y_r) = ({FV2, FV3, FV4, FV5}_r, SOH_r). The Bayesian parameter optimization program is implemented in MATLAB 2020a. Bayesian optimization is used to tune the hyperparameters of the Gaussian process regression model by minimizing the generalization error as the optimization objective. The specific process of hyperparameter optimization is shown in Figure 6. Finally, when the x_r of each battery test set is input to the SOH prediction model, the model outputs a predicted value ŷ. The accuracy and reliability of the prediction model are evaluated by analyzing the error between the predicted value ŷ and the corresponding real value y_r in the test set, together with three common statistical evaluation indicators: MAPE, RMSPE, and R².
During model development, the hyperparameter search range is ε ∈ [0.0001, 0.3164]; the basis functions include constant, zero, and linear; and the kernel functions include nonisotropic exponential, nonisotropic Matérn 3/2, nonisotropic Matérn 5/2, nonisotropic rational quadratic, nonisotropic squared exponential, isotropic exponential, isotropic Matérn 3/2, isotropic Matérn 5/2, isotropic rational quadratic, and isotropic squared exponential. The range of the kernel scale is [0.1133, 556.0942]. The training time of the models ranges from 48.46 to 72.49 s. The number of iterations is set to 30. All parameters and material properties used for the model are shown in Table 5.

| RESULTS AND DISCUSSION
As mentioned in Section 3, the proposed method is verified using the test sets of batteries #2, #8, #12, and #16. The prediction performance of the BGPR, BSVM [35], NN [37], and SLR [38] algorithms is compared, and the applicability of the selected features and of the training set design is verified. Figure 7 shows the SOH prediction results of minorly deformed batteries #2 and #8 and normal batteries #12 and #16.
Figure 7A,B shows that, when the four methods are used to predict the SOH of minorly deformed batteries #2 and #8, the BGPR algorithm performs well, with the prediction error shown in Figure 8A,B kept below 0.9%. With the SLR and BSVM methods, the prediction errors are as high as 2.5%. The relative error of the NN algorithm exceeds 6%.
Combined with the quantitative evaluation of the prediction results using RMSPE, MAPE, and R² in Figure 9 and Table 6, the BGPR algorithm yields an RMSPE below 0.47%, an MAPE below 0.38%, and an R² above 0.9595, indicating the best predictive performance. With the BSVM algorithm, the RMSPE and MAPE values for deformed battery #2 are 0.43% and 0.33%, respectively, indicating good performance. However, for deformed battery #8, these values reach 1.47% and 1.23%, indicating poorer predictive performance. With the NN algorithm, the RMSPE and MAPE values for deformed battery #2 are 2.42% and 1.59%, respectively, the worst performance among the four algorithms. However, for deformed battery #8, the values are 0.8% and 0.56%, respectively, indicating relatively good performance. The predictive performance of the SLR algorithm is at an intermediate level. Overall, the evaluation results for deformed battery SOH prediction indicate that the BGPR algorithm has the best predictive performance.
For normal batteries #12 and #16, the BGPR algorithm produced the most accurate SOH predictions (Figure 7C,D), with the relative error kept within 0.3%. The SLR algorithm also provided accurate predictions for battery #16 (Figure 8D), with a relative error of 0.35%. However, for battery #12, the predictions of the SLR algorithm deviated significantly from the real values after the 280th cycle, which degraded the prediction accuracy and produced a relative error of up to 1.1% in the later cycles (Figure 8C). When the BSVM algorithm was used to predict the SOH of normal batteries, the prediction accuracy deteriorated significantly in the later cycles. For battery #12, the relative error reached 0.8%, and for battery #16, it reached 1.6%. The NN algorithm produced the worst predictions among the algorithms tested, making it the least effective for predicting the SOH of normal batteries.
According to the performance evaluation criteria in Figure 9 and Table 6, the predictive performance of the four algorithms for battery #12 is relatively similar. Among them, the BGPR algorithm achieved the best prediction results, with an RMSPE of 0.13%, an MAPE of 0.12%, and an R² as high as 0.9915. For battery #16, the predictive performance of the four algorithms is also not significantly different. Among them, the SLR algorithm obtained the smallest RMSPE and MAPE values, 0.09% and 0.07%, respectively. However, its R² value was only 0.8778, mainly because of a sudden deterioration in prediction accuracy in the later cycles. The BGPR algorithm achieved RMSPE and MAPE values of 0.12% and 0.11%, respectively, which were smaller than those obtained by the BSVM and NN algorithms, and its R² value was 0.9905. Overall, the evaluation results for normal battery SOH prediction indicate that the BGPR algorithm achieved the best predictive performance among the four algorithms.
The above SOH prediction results for deformed batteries #2 and #8 and normal batteries #12 and #16 show that the BGPR algorithm performs well in SOH prediction, and that its goodness of fit is the best in the SOH prediction of the normal battery. The SLR algorithm shows good performance in the SOH prediction of battery #16, but the goodness of fit of its regression equation is poor. The BSVM algorithm has the best performance in the SOH prediction of battery #2 but performs poorly in the SOH prediction of battery #8. The prediction performance of the NN algorithm is generally poor. Therefore, the SOH prediction results for batteries under four different working conditions show that the BGPR algorithm has high accuracy and good robustness.

TABLE 1 Specifications of the tested battery cells.
LiNi0.8Co0.1Mn0.1O2

FIGURE 1 (A) Minorly deformed experiments and (B) Cyclic charging and discharging experiment.

FIGURE 2 Typical mechanical and electrochemical behavior in compression loading, including force-displacement and voltage-displacement: (A) Mechanical deformation for 3 mm, and (B) Mechanical deformation for 6 mm.

TABLE 2 Cycling procedure for each battery.
Steps  Experiment description
1  Rest the battery for 5 min
2  CC-CV charging until the cut-off voltage reaches 4.2 V and the cut-off current is below 0.02 C
3  Rest the battery for 5 min
4  CC-CV discharging until the cut-off voltage reaches 2.75 V and the cut-off current is below 0.02 C
5  Rest the battery for 5 min
6  If cycle ≥ set values, then end the process; otherwise, return to Step 2

TABLE 3 Experiment condition of batteries #2, #8, #12, and #16.

FIGURE 3 Aging cycle schemes and capacity degradation profiles of LIBs: (A) Completed charge/discharge test at 1 C; (B) Capacity degradation curve of the four batteries.

FIGURE 4 (A) IC curves of normal battery #12 with different cycles; (B) IC curves of minorly deformed battery #2 with different cycles; (C) Capacity loss of batteries #12 and #2; (D) IC curves of batteries #12 and #2 at the 21st cycle. IC, incremental capacity.

FIGURE 5 Feature selection in charging curve. (A) Constant current charging curve; (B) Candidate feature selection.

FIGURE 6 Battery SOH prediction procedure based on BGPR.

FIGURE 7 SOH prediction results for different batteries.

FIGURE 8 The relative errors for different batteries.

FIGURE 9 Performance of the SOH prediction results with different algorithms.
TABLE 5 Parameters and material properties used for the model.

TABLE 6 R² of SOH prediction results by different methods.

Verifies the accuracy and robustness of the BGPR model: The superior prediction performance of the BGPR prediction technique is verified by comparing the performance indicators (MAPE, RMSPE, and R²) of the BGPR, BSVM, NN, and SLR prediction models.

1. Wang T, Zhang Y, et al. Dynamic behavior and modeling of prismatic lithium-ion battery. Int J Energy Res. 2020;44(4):2984-2997.
2. Vashisht S, Rakshit D, Panchal S, Fowler M, Fraser R. Thermal behaviour of li-ion battery: an improved electrothermal model considering the effects of depth of discharge and temperature. J Energy Storage. 2023;70:Art no. 107797.
3. Baveja R, Bhattacharya J, Panchal S, Fraser R, Fowler M. Predicting temperature distribution of passively balanced battery module under realistic driving conditions through coupled equivalent circuit method and lumped heat dissipation method. J Energy Storage. 2023;70:Art no. 107967.
4. Kausthubharam, Koorata PK, Panchal S. Thermal management of large-sized LiFePO4 pouch cell using simplified minichannel cold plates. Appl Therm Eng. 2023;234:Art no. 121286.
5. Liu B, Jia Y, Yuan C, et al. Safety issues and mechanisms of lithium-ion battery cell upon mechanical abusive loading: a review. Energy Storage Mater. 2020;24:85-112.
6. Talele V, Moralı U, Patil MS, Panchal S, Mathew K. Optimal battery preheating in critical subzero ambient condition using different preheating arrangement and advance pyro linear thermal insulation. Thermal Sci Eng Prog. 2023;42:Art no. 101908.