State of charge estimation for lithium‐ion battery pack based on real vehicle data and optimized backpropagation method by adaptive cross mutation sparrow search algorithm

In response to the issues of traditional backpropagation (BP) neural networks in state of charge (SOC) estimation, including easy convergence to local optima, slow convergence speed, and low accuracy, this paper proposes a novel adaptive crossover mutation strategy and dynamic sparrow search algorithm to optimize BP networks' initial values and thresholds (ACMSSA‐BP). The proposed method is based on the sparrow search algorithm, where the number of producers and scroungers is adjusted through an adaptive factor. This improvement effectively transitions the search process from extensive full exploration to localized fine‐tuning search. In the position update phase of the producers, crossover mutation and dynamic search strategies are introduced to increase the diversity of good populations, prevent the algorithm from converging to local optima, and maintain its local search capability in the later stage. Using real transportation data from coal mining flame‐proof tracked vehicles, we applied correlation theory to extract model feature parameters and constructed a training data set to estimate the SOC. The results of both static and dynamic validation experiments have indicated that the ACMSSA‐BP method has delivered impressive performance in predicting SOC, as reflected in the mean absolute error, root mean squared error, and mean absolute percentage error values of less than 1.5%, 1.5%, and 1.6%, respectively. Compared with BP, SSA‐BP, CMSSA‐BP, PSO‐BP, and NARX_NN methods, the ACMSSA‐BP approach demonstrates enhanced accuracy in estimation, significant robustness, and impressive generalization capabilities.

Based on the practical needs of the mining industry for safe production, energy conservation, emission reduction, and intelligent construction, the mining lithium-ion battery (MLIB) has been widely applied in various mining equipment, including flame-proof tracked carriers, electric backhoe loaders, standby power supplies, and mobile monitoring equipment.1 By using mine-specific lithium-ion batteries, mining companies can reduce emissions, enhance energy efficiency, and lower operating costs, while also improving the working environment and reducing noise pollution.2,3 To meet the requirements of mining electrical equipment, it is often necessary to connect tens, hundreds, or even more cells in series or parallel to form large-scale mining battery packs. Generally, as the battery pack scale increases, the energy storage capacity also grows, but purchase and maintenance costs rise and potential risks become more prominent.4 With a deepening understanding of safety and reliability in mining electric equipment, operators have recognized the necessity of efficient battery management systems (BMS). Among the parameters managed by a BMS, the state of charge (SOC) plays a crucial role in preventing battery overcharging, avoiding deep discharging, and avoiding irreversible damage. However, the estimation of SOC is challenging due to the complex electrochemical reactions involved with MLIBs, the significant dependence on environmental conditions, and the inconsistent behavior among individual mining cells.5 SOC is a latent parameter of lithium-ion batteries, defined as the ratio of remaining capacity to the rated capacity. The value of SOC is determined by the concentration of lithium ions in the electrode, which cannot be directly measured. Instead, it must be estimated based on various signals such as current, terminal voltage, and temperature.
6,7 Because lithium-ion batteries were adopted late in the mining industry and the sector has unique requirements, the current method for estimating the SOC of MLIBs is the ampere-hour integration method. This method starts from the definition of SOC and, without considering the internal mechanism, integrates current over time to calculate the total amount of charge flowing into or out of the battery, thereby estimating the battery's state of charge.8,9 However, the ampere-hour integration method is an open-loop method. If the current measurement is inaccurate or the given initial SOC is wrong, errors gradually accumulate as the system operates over time.10 In recent years, with the widespread application of 5G technology in the mining industry, the data throughput and computational power of cloud terminals have greatly improved,11,12 leading many scholars to begin studying remote SOC estimation based on cloud terminals. Ohood et al.13 first proposed a 5G-based open-source architecture for BMS, which can manage the SOC, improve the state of health (SOH) of batteries, and maximize the lifespan of battery packs. Fazel et al.14 presented an overview of the technical challenges of real-time monitoring and control of energy storage systems (ESSs) for electric vehicles (EVs) in intelligent cities. It also covers the Internet of Things (IoT) technology that can address these challenges and improve the efficiency of BMS. Li et al.15 proposed a novel cloud-assisted online battery management method based on artificial intelligence and edge computing technologies. Cloud computation and extensive data resources are integrated into real-time vehicle battery management by establishing a novel cloud-edge battery management system (CEBMS). Zhou et al.
16 proposed an architecture that combines the BMS with a cloud big data platform. The multithread online computing capability of cloud terminals addresses the computational issues of data-driven methods for estimating SOC. Backpropagation (BP) neural networks17-19 are mainstream traditional data-driven algorithms that simulate the human brain and neurons to process nonlinear systems. This method does not require in-depth exploration of complex electrochemical reactions; instead, it extracts input and output samples that conform to the operating characteristics from the BMS and trains a network model to achieve SOC estimation. Liu et al. collected over 8 million data sets of titanium battery SOC regions and discharge current rates as samples and proposed a BP neural network efficiency estimation and simulation model for estimating continuous-time energy efficiency and Coulombic efficiency.20 However, there are still issues with the BP method, such as its tendency to converge to local minima, slow convergence speed, difficulty in selecting appropriate learning rates, and low accuracy. To address these issues, many researchers have proposed various improvement measures. Xu et al.21 proposed an improved Drosophila algorithm with individual migration and dynamic step sizes, combined it with BP neural networks, and tested it on the complex dynamic stress test (DST) and the Beijing bus dynamic stress test (BBDST). Gong et al.22 proposed a data-driven SOC estimation method based on deep learning, which consists of a long short-term memory neural network and a BP neural network. Wen et al.23 proposed a battery SOH prediction model based on incremental capacity analysis and a BP neural network for predicting the SOH of batteries under different ambient temperatures. Zhao et al.
24 used an improved firefly algorithm to optimize the weights and thresholds of the BP neural network, which improved the global optimization capability and convergence speed and reduced the fluctuation range of the battery SOH estimation error. Optimizing the parameters of the BP neural network model can improve its prediction performance to a certain extent and achieve good results. Using an optimized BP network on cloud data to estimate battery status is the main research direction for new BMS designs.
The sparrow search algorithm (SSA) is a recently proposed algorithm based on swarm intelligence optimization. It is simple to adapt, easy to use, flexible and scalable, and sound and complete. In addition, compared with other swarm intelligence algorithms, the sparrow algorithm has the characteristics of high search accuracy and strong robustness.25,26 Since 2020, it and its variants have been applied to a wide range of optimization problems in different scientific research topics, including lithium-ion battery model parameter identification and state estimation. Jia et al.27 used an improved SSA to optimize the deep extreme learning machine (DELM) for estimating the SOH of LiBs under random load conditions. Hou et al.28 proposed an improved SSA combining chaotic mapping, a quantum behavior strategy, and Gaussian mutation to adjust the early population quality, enhance its global search ability, and avoid trapping in local optima. This algorithm is used for identifying the model parameters of lithium-ion batteries. Liu et al.29 proposed a new method for predicting the remaining useful life (RUL) of lithium-ion batteries based on an improved SSA-optimized long short-term memory (LSTM) network. However, in the SSA algorithm, the number of producers and scroungers remains constant, and there is no mutation mechanism, which can quickly reduce the diversity of the population and lead to trapping in local optima, thereby reducing the search accuracy. Directly using SSA to optimize the model parameters of BP neural networks may not achieve the expected results.
25,30 Cloud-based remote estimation of SOC is an emerging trend in BMS, providing a fundamental backbone for big data-driven SOC estimation. Unfortunately, existing machine learning methods have yet to gain widespread application in this domain, while the precision of existing optimized BP network techniques falls short of practical requirements. In response to the tendency of BP to converge to local optima, its sluggish convergence rate, and its inadequate accuracy, this paper proposes an SSA enhanced with an adaptive crossover mutation strategy and dynamic search to optimize BP for SOC estimation. The proposed method optimizes the parameters of BP to obtain the SOC estimation model. To demonstrate the effectiveness of our proposed approach, we compare it with the SSA and CMSSA methods using results from both the training and test sets. Additionally, we compare our method with the particle swarm optimization BP (PSO-BP) method and the nonlinear autoregressive exogenous model neural network (NARX_NN) method to analyze the model's performance metrics and verify its accuracy and stability.
The remainder of this article is organized as follows. Section 2 presents the proposed method. Section 3 explains the LiBs data sets and implementation details. Section 4 gives the experimental results and discussion. The conclusions are given in Section 5.

| PROPOSED METHOD
The proposed method, which optimizes the BP algorithm using an adaptive crossover mutation sparrow search algorithm (ACMSSA), consists of four parts: real vehicle data acquisition, traditional BP neural network, ACMSSA optimization, and optimized network prediction. The model process is shown in Figure 1.
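The coupling between the optimizer and the BP network can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes that each sparrow position is a flat vector holding all weights and thresholds of a single-hidden-layer network, and that the fitness is the mean square error on the training set. The function names `position_length`, `decode_position`, and `fitness` are hypothetical.

```python
import numpy as np

def position_length(n, s, m):
    """Number of variables to optimize for n inputs, s hidden neurons, m outputs."""
    return n * s + s + s * m + m

def decode_position(pos, n, s, m):
    """Split a flat sparrow position into weights and thresholds (assumed layout)."""
    pos = np.asarray(pos, dtype=float)
    i = 0
    w_ih = pos[i:i + n * s].reshape(n, s)
    i += n * s
    theta_h = pos[i:i + s]
    i += s
    w_ho = pos[i:i + s * m].reshape(s, m)
    i += s * m
    theta_o = pos[i:i + m]
    return w_ih, theta_h, w_ho, theta_o

def fitness(pos, X, T, n, s, m):
    """Training-set mean square error of the network encoded by pos (assumed fitness)."""
    w_ih, theta_h, w_ho, theta_o = decode_position(pos, n, s, m)
    hidden = np.tanh(X @ w_ih - theta_h)   # tanh hidden layer
    Y = hidden @ w_ho - theta_o            # linear output layer
    return float(np.mean((T - Y) ** 2))
```

With eight input features, five hidden neurons, and one output, the search space has 51 dimensions; the best position found by the optimizer would then seed the BP training.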
The model is described in detail as follows:
1. Load actual data from the cloud system of the coal mining flame-proof tracked vehicles and extract variables related to the SOC. The BMS of the battery pack records 337 columns of data, including system voltage, current, SOC, and temperature, the voltage and temperature of 100 battery cells, and fault information. Feature parameters are extracted using correlation theory to reduce the complexity of the model.
2. Normalization processing. The feature parameters differ in numerical range and measurement scale, and normalization reduces the impact of these differences on the model. It can also effectively eliminate outlier data and avoid invalid convergence of the network. The normalization method used in this article is given by Equation (1), where X represents the feature vector, x is an arbitrary variable in X, and x' is the normalized variable.
3. Parameter setting for the BP network. Based on Equation (2),21 the mean square error of the training set is calculated iteratively under different tuning constants, and the s that yields the minimum mean square error is selected as the number of nodes in the hidden layer:
$$s = \sqrt{n + m} + a, \tag{2}$$
where n and m are the numbers of input and output nodes, respectively, and a is a tuning constant between 1 and 10. Additionally, the transfer function of the hidden layer is set to tansig, the transfer function of the output layer is set to purelin, and the upper limit of the number of training epochs is set to 1000. Based on empirical values, the learning rate is set to 0.01 and the iteration accuracy is set to $10^{-6}$.
4. Multicriteria evaluation. The calculation formulas for mean absolute error (MAE), root mean squared error (RMSE), and mean absolute percentage error (MAPE) are as follows:
$$\mathrm{MAE} = \frac{1}{N_s}\sum_{i=1}^{N_s}\left|y_i - \hat{y}_i\right|,$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{N_s}\sum_{i=1}^{N_s}\left(y_i - \hat{y}_i\right)^2},$$
$$\mathrm{MAPE} = \frac{100\%}{N_s}\sum_{i=1}^{N_s}\left|\frac{y_i - \hat{y}_i}{y_i}\right|,$$
where $y_i$ is the reference SOC, $\hat{y}_i$ is the estimated SOC, and $N_s$ is the number of samples.
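The preprocessing and evaluation steps above can be sketched in a few lines of NumPy. The normalization is assumed here to be column-wise min-max scaling to [0, 1], which is one common reading of the description and may differ from the authors' exact formula; the hidden-layer candidates follow the empirical rule s = sqrt(n + m) + a with a between 1 and 10. The function names are hypothetical.

```python
import math
import numpy as np

def min_max_normalize(X):
    """Scale each column to [0, 1] (assumed form of the normalization step)."""
    X = np.asarray(X, dtype=float)
    col_min = X.min(axis=0)
    col_range = X.max(axis=0) - col_min
    col_range[col_range == 0] = 1.0  # constant columns map to 0, avoiding division by zero
    return (X - col_min) / col_range

def hidden_node_candidates(n_inputs, n_outputs, a_values=range(1, 11)):
    """Candidate hidden-layer sizes s = sqrt(n + m) + a, rounded to integers."""
    base = math.sqrt(n_inputs + n_outputs)
    return sorted({round(base + a) for a in a_values})

def soc_metrics(y_true, y_pred):
    """Return (MAE, RMSE, MAPE in percent); y_true must not contain zeros."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mae = float(np.mean(np.abs(err)))
    rmse = float(np.sqrt(np.mean(err ** 2)))
    mape = float(np.mean(np.abs(err / y_true)) * 100.0)
    return mae, rmse, mape
```

In practice each candidate size would be trained once and the one with the lowest training mean square error kept.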

| Traditional BP network
BP is one of the most well-researched algorithms in neural network studies and is mainly divided into two stages: forward propagation of signals and backward propagation of errors. It sequentially adjusts the weights and biases from the hidden layer to the output layer, as well as the weights and biases from the input layer to the hidden layer.31 A three-layer BP neural network can accurately approximate any nonlinear function by using a simple gradient descent method to minimize the overall error of network computation. As depicted in Figure 2, this network consists of an input layer, an output layer, and one or multiple hidden layers.
Suppose the network has n inputs, m outputs, and s neurons in the hidden layer; the output of the hidden layer is $b_j$, the threshold value of the hidden layer is $\theta_j$, and the weight from the input layer to the hidden layer is $w_{ij}$. With $x_i$ denoting the ith input, the output of the jth neuron of the hidden layer is
$$b_j = f_1\left(\sum_{i=1}^{n} w_{ij} x_i - \theta_j\right), \quad j = 1, 2, \ldots, s. \tag{7}$$
In Equation (7), $f_1$ is the transfer function of the hidden layer. The output $y_k$ of the output layer is
$$y_k = f_2\left(\sum_{j=1}^{s} w_{jk} b_j - \theta_k\right), \quad k = 1, 2, \ldots, m, \tag{8}$$
where $\theta_k$ is the threshold value of the output layer, $f_2$ is the transfer function of the output layer, and $w_{jk}$ is the weight from the hidden layer to the output layer. Then $y_k$ is the output of the network. Suppose the desired output is $t_k$. The error function is defined from the network's actual output as
$$e = \frac{1}{2}\sum_{k=1}^{m}\left(t_k - y_k\right)^2. \tag{9}$$
If the output error of the neural network exceeds the preset threshold, the error is propagated backward from the output layer, continuously correcting the weight coefficients and thresholds during this propagation until the preset estimation accuracy is attained. The forecasting samples are then fed to the trained network to obtain the forecasting results.
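A minimal sketch of the forward pass and error function described above is given below. It assumes that thresholds are subtracted from the weighted sums and that the hidden and output transfer functions are the hyperbolic tangent sigmoid and the identity, respectively; the function names are hypothetical.

```python
import numpy as np

def tansig(z):
    """Hyperbolic tangent sigmoid transfer function (mathematically tanh)."""
    return np.tanh(z)

def purelin(z):
    """Linear (identity) transfer function."""
    return z

def bp_forward(x, w_ih, theta_h, w_ho, theta_o):
    """Forward pass of a single-hidden-layer network.

    x: (n,) inputs, w_ih: (n, s), theta_h: (s,), w_ho: (s, m), theta_o: (m,).
    Returns the hidden outputs b (s,) and the network outputs y (m,).
    """
    b = tansig(x @ w_ih - theta_h)
    y = purelin(b @ w_ho - theta_o)
    return b, y

def bp_error(t, y):
    """Half sum of squared errors between desired output t and actual output y."""
    return 0.5 * float(np.sum((np.asarray(t) - np.asarray(y)) ** 2))

# Example with 8 features, 5 hidden neurons, and 1 output (SOC).
rng = np.random.default_rng(0)
n, s, m = 8, 5, 1
x = rng.random(n)
b, y = bp_forward(x, rng.normal(size=(n, s)), rng.normal(size=s),
                  rng.normal(size=(s, m)), rng.normal(size=m))
```

Backpropagation then adjusts the weights and thresholds along the negative gradient of this error until it falls below the preset accuracy.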
The critical parameters of the neural network are the number of hidden layers and the number of neurons in each hidden layer. Setting an appropriate number of neurons can improve the calculation accuracy and convergence speed and avoid overfitting.21 The number of hidden-layer neurons is often determined using empirical formulas such as Equation (2).

| SSA
SSA is one of the most recent swarm-based algorithms, introduced by Xue and Shen.32 It mainly mimics the foraging and anti-predation behavior of sparrows. While searching for food, the population is divided into two groups: producers and scroungers. Producers are responsible for looking for food and providing feeding areas and directions for the entire sparrow population, while scroungers utilize the efforts of producers to obtain food.33 A sparrow population with a total of N individuals is represented by
$$X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1d} \\ x_{21} & x_{22} & \cdots & x_{2d} \\ \vdots & \vdots & \ddots & \vdots \\ x_{N1} & x_{N2} & \cdots & x_{Nd} \end{bmatrix}, \tag{10}$$
where d is the dimension of the variables, that is, the number of variables to be optimized, and each $x_{ij}$ is assigned a random number between the lower and upper bounds. The fitness value, that is, the quality of the position of each sparrow in the population, is then calculated using the fitness function $f(x_i)$.25 After that, the position of the best sparrow in the population is determined.
Due to producers' strong global search capabilities, they typically account for 10%-20% of the total population. As the iterative process proceeds, the positions of producers are updated as follows:
$$x_{i,j}^{t+1} = \begin{cases} x_{i,j}^{t} \cdot \exp\left(\dfrac{-i}{\alpha G}\right), & R < S, \\ x_{i,j}^{t} + Q \cdot L, & R \ge S. \end{cases} \tag{11}$$
The t and G in Equation (11) represent the current iteration count and the maximum number of iterations, respectively. α is a uniform random number, where α ∈ (0,1]. R ∈ [0,1] and S ∈ [0.5,1] represent the alarm value and the safety threshold, respectively. Q is a random number that follows the normal distribution. L is a 1 × d vector in which each element is 1.
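The producer update of the standard SSA can be sketched as follows. This is an illustrative reading of Equation (11) rather than the authors' code; it assumes the population rows are sorted from best to worst fitness and that a single alarm value R is drawn per iteration. The function name `update_producers` is hypothetical.

```python
import numpy as np

def update_producers(X, n_producers, G, S, rng):
    """Return a copy of X in which the first n_producers rows have been moved.

    X: (N, d) positions sorted best first, G: maximum iterations,
    S: safety threshold, rng: numpy random Generator.
    """
    X_new = X.copy()
    d = X.shape[1]
    R = rng.random()  # alarm value in [0, 1)
    for i in range(n_producers):
        if R < S:
            # No predator detected: contract the position by a random factor.
            alpha = 1.0 - rng.random()  # uniform in (0, 1]
            X_new[i] = X[i] * np.exp(-(i + 1) / (alpha * G))
        else:
            # Predator detected: move by a normally distributed step in all dimensions.
            Q = rng.normal()
            X_new[i] = X[i] + Q * np.ones(d)
    return X_new
```

Because the first branch multiplies each coordinate by a factor in (0, 1), producers are pulled toward the origin over the iterations, which is one reason diversity can collapse in the original algorithm.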
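The positions of the scroungers are updated as follows:
$$x_{i,j}^{t+1} = \begin{cases} Q \cdot \exp\left(\dfrac{x_{\mathrm{worst}}^{t} - x_{i,j}^{t}}{i^{2}}\right), & i > N/2, \\ x_{P}^{t+1} + \left|x_{i,j}^{t} - x_{P}^{t+1}\right| \cdot A^{+} \cdot L, & i \le N/2, \end{cases} \tag{12}$$
where $x_P$ is the best position occupied by the producers, $x_{\mathrm{worst}}^{t}$ is the worst position of the sparrow at the tth iteration, A is a 1 × d vector in which each element is 1 or −1, and $A^{+} = A^{T}(AA^{T})^{-1}$. Moreover, i > N/2 indicates that the ith scrounger is starving and must fly somewhere else to obtain food, whereas i ≤ N/2 means that the ith scrounger searches for food around $x_P$.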
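A sketch of the scrounger update in its matrix form is shown below, with A taken as a 1 × d row vector so that A⁺ = Aᵀ(AAᵀ)⁻¹ is a d × 1 column. It is an illustration under that assumption, not the authors' implementation, and `update_scroungers` is a hypothetical name.

```python
import numpy as np

def update_scroungers(X, n_producers, x_p, x_worst, rng):
    """Return a copy of X in which rows n_producers..N-1 (the scroungers) have moved.

    X: (N, d) positions sorted best first, x_p: best producer position (d,),
    x_worst: current worst position (d,), rng: numpy random Generator.
    """
    N, d = X.shape
    X_new = X.copy()
    L = np.ones((1, d))
    for idx in range(n_producers, N):
        i = idx + 1  # 1-based rank used in the update rule
        if i > N / 2:
            # Starving scrounger: fly elsewhere.
            Q = rng.normal()
            X_new[idx] = Q * np.exp((x_worst - X[idx]) / i ** 2)
        else:
            # Forage around the best producer position.
            A = rng.choice([-1.0, 1.0], size=(1, d))
            A_plus = A.T @ np.linalg.inv(A @ A.T)                    # (d, 1)
            step = np.abs(X[idx] - x_p).reshape(1, d) @ A_plus @ L   # (1, d)
            X_new[idx] = x_p + step.ravel()
    return X_new
```

Note that in this literal matrix form the step is the same in every dimension; some published implementations apply the update element-wise instead.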
FIGURE 2 The topology of the backpropagation neural network.

Due to the constant threat posed by sparrow predators, it is necessary to randomly select 10%-20% of the sparrows from the population to serve as scouters, following each movement by the producers and scroungers. Scouters are responsible for maintaining vigilance and issuing warnings if necessary. The positions of these scouters are modified using the following equation:
$$x_{i,j}^{t+1} = \begin{cases} x_{\mathrm{best}}^{t} + \tau \cdot \left|x_{i,j}^{t} - x_{\mathrm{best}}^{t}\right|, & f_i > f_g, \\ x_{i,j}^{t} + k \cdot \left(\dfrac{\left|x_{i,j}^{t} - x_{\mathrm{worst}}^{t}\right|}{(f_i - f_w) + \varepsilon}\right), & f_i = f_g, \end{cases} \tag{13}$$
where $x_{\mathrm{best}}^{t}$ is the best position of the sparrow at the tth iteration, and τ is the step size control parameter, a random number that follows a normal distribution with a mean of 0 and a variance of 1. k denotes a random value belonging to [−1,1]. $f_i$ is the fitness value of the ith sparrow at the tth iteration, and $f_g$ and $f_w$ are the best and worst fitness values, respectively. When $f_i$ is greater than $f_g$, the sparrow is currently located at the periphery of the population. When $f_i$ equals $f_g$, the sparrow located at the center of the population has suddenly become aware of the risk and needs to approach other sparrows to reduce it. ε is a tiny constant added to avoid the denominator becoming zero.
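The scouter step can be sketched as below for a minimization problem, where the best fitness is the smallest value. The scouter fraction, the value of ε, and the function name `update_scouters` are assumptions made for illustration.

```python
import numpy as np

def update_scouters(X, fitness, scout_fraction, rng, eps=1e-12):
    """Return a copy of X in which a random subset of sparrows reacts to danger.

    X: (N, d) positions, fitness: (N,) values to minimize,
    scout_fraction: share of the population acting as scouters.
    """
    N, _ = X.shape
    X_new = X.copy()
    best, worst = int(np.argmin(fitness)), int(np.argmax(fitness))
    f_g, f_w = fitness[best], fitness[worst]
    n_scouts = max(1, int(round(scout_fraction * N)))
    for idx in rng.choice(N, size=n_scouts, replace=False):
        if fitness[idx] > f_g:
            # Sparrow at the periphery: jump toward the best position.
            tau = rng.normal()
            X_new[idx] = X[best] + tau * np.abs(X[idx] - X[best])
        else:
            # Sparrow at the best position: step relative to the worst position.
            k = rng.uniform(-1.0, 1.0)
            X_new[idx] = X[idx] + k * np.abs(X[idx] - X[worst]) / ((fitness[idx] - f_w) + eps)
    return X_new
```

A typical call would pass `scout_fraction` between 0.1 and 0.2, matching the 10%-20% range given in the text.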

| Adaptive SSA
The number of producers and scroungers is determined initially and remains unchanged during the iterative process in SSA. This characteristic may lead to insufficient producers, preventing them from fully exploring the vast search space in the early stage of the algorithm. Conversely, in the later stage of the algorithm, there may be an excess of producers and insufficient scroungers, preventing fine-grained searches within a small search space. To address this issue, this paper proposes an adaptive adjustment method for the number of producers and scroungers, expressed as Equation (14), where $P_{num}$ and $S_{num}$ represent the numbers of producers and scroungers, respectively, and b is a proportional coefficient used to ensure that the number of producers exceeds the number of scroungers; its value ranges from 0 to 0.5. λ is the perturbation factor, and its value ranges from 0 to 0.1. As shown in Equation (14), as the iteration count increases, the number of producers adaptively decreases while the number of scroungers adaptively increases, so the search transitions from an extensive, complete search to a local, refined search.
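The idea of shifting individuals from producers to scroungers over the iterations can be illustrated with a deliberately simple schedule. The sketch below does not reproduce Equation (14): it assumes a hypothetical linear decay of the producer ratio from `r_start` to `r_end` with a small random perturbation of amplitude `lam`, and all names and default values are illustrative only.

```python
import numpy as np

def adaptive_group_sizes(t, G, N, r_start=0.5, r_end=0.1, lam=0.05, rng=None):
    """Return (producers, scroungers) at iteration t of G for a population of N.

    Hypothetical schedule: the producer ratio decays linearly and is perturbed
    by a uniform random term in [-lam/2, lam/2).
    """
    if rng is None:
        rng = np.random.default_rng()
    ratio = r_start + (r_end - r_start) * t / G
    ratio += lam * (rng.random() - 0.5)
    p_num = int(round(ratio * N))
    p_num = min(max(p_num, 1), N - 1)  # keep at least one sparrow in each group
    return p_num, N - p_num
```

Early iterations therefore have many producers exploring globally, while late iterations leave most of the population refining the search around the best producers.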

| Cross mutation strategy of SSA
As shown in Equations (11) and (12), the position update method for producers and scroungers remains unchanged and unique, and there is no mutation mechanism for the sparrow population, which can easily lead to a lack of diversity within the population and subsequently cause the algorithm to converge to a local optimal solution. To address this issue, we introduce a cross mutation strategy to improve the SSA.34 The crossover mutation strategy is derived from the differential evolution algorithm, and the mutation strategy is represented by the following equation:
$$v_{i,n+1} = x_{r1,n} + F \cdot \left(x_{r2,n} - x_{r3,n}\right), \tag{15}$$
where $v_{i,n+1}$ represents the mutated vector generated for the target vector $x_{i,n}$, and r1, r2, and r3 are randomly selected indices, ensuring that none are equal. F is a real-valued factor that promotes diversity within the population when it is large and the local search ability of the population when it is small. It typically has a value range of 0-2.
A new experimental individual is obtained by executing a crossover operation between the mutated and target individuals. The crossover strategy is represented by the following equation:
$$u_{j,n+1} = \begin{cases} v_{j,n+1}, & a \le CR \ \text{or} \ j = rnbr(i), \\ x_{j,n}, & a > CR \ \text{and} \ j \ne rnbr(i), \end{cases} \tag{16}$$
where $u_{j,n+1}$ represents the jth component of the new experimental individual. The a is a random number with a value ranging from zero to one, while rnbr(i) is a randomly selected dimension index, which guarantees that at least one component is taken from the mutated vector. CR is a crossover operator with a value ranging from 0 to 1. By integrating the position update equation with the crossover mutation strategy, the updated formula for the position of the producers is obtained as Equation (17).
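The mutation and crossover operators borrowed from differential evolution can be sketched as follows. The code uses the classic rand/1 mutation with binomial crossover, which is assumed here to be the variant meant by the text; how the resulting trial vector enters the producer update of Equation (17) is not shown. The function names are hypothetical.

```python
import numpy as np

def de_mutation(pop, i, F, rng):
    """Mutant vector built from three distinct individuals, all different from i."""
    candidates = [r for r in range(len(pop)) if r != i]
    r1, r2, r3 = rng.choice(candidates, size=3, replace=False)
    return pop[r1] + F * (pop[r2] - pop[r3])

def de_crossover(target, mutant, CR, rng):
    """Binomial crossover: take each component from the mutant with probability CR.

    One randomly chosen component always comes from the mutant, so the trial
    vector never equals the target by construction.
    """
    d = target.size
    forced = rng.integers(d)    # plays the role of rnbr(i)
    a = rng.random(d)           # one random number per component
    take_mutant = a <= CR
    take_mutant[forced] = True
    return np.where(take_mutant, mutant, target)

rng = np.random.default_rng(5)
pop = rng.uniform(-1, 1, size=(6, 4))
trial = de_crossover(pop[0], de_mutation(pop, 0, F=0.5, rng=rng), CR=0.9, rng=rng)
```

A large F spreads mutants widely and favors diversity, while a small F keeps them close to existing individuals and favors local search.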
FENG ET AL.

| Dynamic search for SSA
As Equation (17) indicates, when R is less than S, that is, when no predator threat is detected, the producer can conduct a broader search. Otherwise, the position update of the producer may once again reduce population diversity, causing the objective function to converge to a local optimal solution.35 Although the population can be diversified through the crossover and mutation strategies to enhance search ability, this article additionally proposes a dynamic search strategy to address this issue. This strategy uses the current optimal position $x_{\mathrm{best}}^{t}$ and the worst position $x_{\mathrm{worst}}^{t}$ as reference criteria; the individuals combine $x_i^t$, $x_{\mathrm{best}}^{t}$, and $x_{\mathrm{worst}}^{t}$ and choose the best direction for movement based on the locations of the producers. The position of the producers is updated using Equation (18), where β is a random variable that follows a normal distribution with a mean of 0 and a variance of 1.
As Equation (18) shows, the producer can maintain good population diversity and preserve local search ability whether or not it detects danger. The population can locate the best safe areas to operate in based on the global optimal and worst positions, significantly reducing the risk of being trapped in local optima.

| ACMSSA method
The ACMSSA proposed in this paper is a fusion method obtained through the above improvements. The flowchart of its central ideas is shown in Figure 3. The improvement measures can be summarized as follows: an adaptive method is added to the initial value setting of the original SSA method to determine the number of producers and scroungers, and a cross mutation and dynamic search strategy are introduced in the position update method.

… time series data. In this article, 60 days of data, starting from November 24, 2022, were extracted. Figure 5 shows the battery pack's daily current and voltage changes on November 25, 2022.
As shown in Figure 5, a current less than 0 indicates that the vehicle is running and the battery pack is being discharged, while a current equal to 0 indicates that the vehicle is stationary. A current greater than 0 indicates that the vehicle is braking and the battery is being charged through power feedback. According to the November 25, 2022 data, the vehicle operated under complex conditions, with the battery pack constantly alternating between charging and discharging. In this case, the vehicle was used until 7:30 a.m., when charging began at a current of 36 A. Charging continued until the cutoff voltage of 346.3 V was reached at 8:21 a.m. The vehicle was then set aside for half an hour and continued to be used. It was charged again at 12:02 with a current of 36 A for 40 min and stopped at 346.3 V. It was used again at 15:31 until it was taken out of service at 19:36.
To study the effect of individual cells on SOC estimation, we analyze the data from November 25, 2022, extract the maximum and minimum voltage as well as the maximum and minimum temperature of the cells in the No. 1 battery pack, and plot their curves in Figure 6.
As shown in Figure 6A, the voltage difference between the cells is maximized at the end of charging, reaching 0.25 V. During the charging process, the cells with lower voltages do not achieve their maximum capacities. In comparison, cells with higher voltages have relatively higher internal resistance, and some energy is converted to heat energy, resulting in a decrease in actual discharge capacity. As shown in Figure 6B, the difference between the maximum and minimum temperatures can reach up to 6°C, which reflects battery inconsistency and results in the conversion of some electrical energy into heat energy during charging and discharging. It can be inferred that the inconsistent behavior of individual cells seriously impacts the estimation of the system's SOC. Therefore, the voltage and temperature of individual cells must be included as critical characteristic parameters in the SOC prediction model to accurately characterize the impact of cell differences on SOC.
To analyze the vehicle's operating conditions on different dates, data related to the SOC and temperature of the battery system were extracted for November 25, 2022, December 5, 2022, and January 12, 2023, as shown in Figure 7. The SOC in Figure 7A is calculated by the on-board BMS using the ampere-hour integration method.
As is evident from Figure 7, the SOC changes significantly from date to date. The discharge behavior follows no clear pattern and is strongly influenced by the operator's driving habits. During the charging process, the slope of the increase in SOC is consistent, and the charging current strictly follows the requirement of multiples of 0.5. Additionally, there is a significant correlation between the changes in temperature of the battery pack system and the changes in SOC. As the SOC increases, the temperature rises significantly. However, when the SOC decreases, the temperature fluctuates under the influence of the current magnitude. Nevertheless, all recorded temperatures remained within the 12°C to 25°C range, indicating that the battery pack was operating in an optimal condition.
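Since the reference SOC in Figure 7A comes from ampere-hour integration, a minimal sketch of that calculation may help to show why it drifts. The sign convention follows the data (positive current charges the pack, negative current discharges it); the rated capacity of 100 Ah and the 0.5 A sensor offset are hypothetical values chosen only for illustration, and `coulomb_counting` is a hypothetical name.

```python
import numpy as np

def coulomb_counting(current_a, dt_s, soc0, capacity_ah):
    """Integrate current over time to track SOC.

    current_a: sampled pack current in A (positive = charging),
    dt_s: sampling interval in s, soc0: initial SOC in [0, 1],
    capacity_ah: rated capacity in Ah.
    """
    charge_ah = np.cumsum(np.asarray(current_a, dtype=float)) * dt_s / 3600.0
    return np.clip(soc0 + charge_ah / capacity_ah, 0.0, 1.0)

# One hour of charging at 36 A on a hypothetical 100 Ah pack, sampled at 1 s.
soc_charge = coulomb_counting(np.full(3600, 36.0), 1.0, 0.5, 100.0)

# Ten hours at rest with a hypothetical 0.5 A sensor offset: SOC drifts although
# the true current is zero, which illustrates the open-loop error accumulation.
soc_drift = coulomb_counting(np.full(36000, 0.5), 1.0, 0.5, 100.0)
```

In the second case the estimate moves by five percentage points with no real change in charge, which is the kind of accumulated error that motivates a data-driven estimator.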

| Feature selection of data
Based on the above analysis, eight variables were initially selected as characteristic parameters: the system voltage, current, and temperature, and the maximum voltage, minimum voltage, average voltage, maximum temperature, and minimum temperature of the individual cells in the battery system. The partial correlation coefficients (PCC) between each parameter and the SOC are calculated from data on different dates to evaluate the effectiveness of each parameter, and the results are shown in Table 1.
In Table 1, certain characteristic parameters strongly correlate with the SOC, including the system voltage, the maximum voltage, and the average voltage of the individual cells (PCC values exceed 0.7). Due to the influence of different dates and operating conditions, the correlation between the minimum temperature of the individual cells and the SOC fluctuates significantly. In contrast, the correlation between the maximum temperature of the individual cells and the SOC remains relatively stable and is the highest among the temperature features. In the practical operation of the vehicle, current fluctuations are significant, making it difficult to establish a high correlation with the SOC. However, when the correlation between the SOC and current is low, the correlation with temperature increases instead. Selecting both as feature variables therefore makes them complementary.
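One way to compute such coefficients is sketched below. It assumes that the partial correlation between two variables controls for all remaining columns, computed from the inverse of the correlation matrix; whether the paper conditions on all other features in this way is not stated, so the sketch also returns the plain Pearson matrix for comparison. The function name is hypothetical.

```python
import numpy as np

def correlation_tables(data):
    """Return (pearson, partial) correlation matrices for the columns of data.

    partial[i, j] is the correlation between columns i and j after removing the
    linear effect of all other columns.
    """
    pearson = np.corrcoef(np.asarray(data, dtype=float), rowvar=False)
    precision = np.linalg.pinv(pearson)
    scale = np.sqrt(np.diag(precision))
    partial = -precision / np.outer(scale, scale)
    np.fill_diagonal(partial, 1.0)
    return pearson, partial

# Synthetic check: x2 and y both depend on x1 but not on each other.
rng = np.random.default_rng(0)
x1 = rng.normal(size=20000)
x2 = x1 + 0.5 * rng.normal(size=20000)
y = x1 + 0.5 * rng.normal(size=20000)
pearson, partial = correlation_tables(np.column_stack([x1, x2, y]))
```

In the synthetic example, x2 and y are strongly Pearson-correlated only because both follow x1; their partial correlation is close to zero, which is why the two measures can rank features differently.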

| Analysis of training data set for the model
Because the vehicle's operating conditions differ significantly from day to day, the training data set needs to encompass these differences as much as possible to achieve high prediction accuracy. In this paper, the data from 5 days of the No. 1 battery pack were selected as the training set, with the specific dates and data sizes shown in Table 2.
The data presented in Table 2 come from three distinct months, representing the long-term operation of the vehicle. The data are formatted as a two-dimensional matrix, with rows representing the sampling points in the time series and columns representing individual variables. Among these, the first eight columns constitute the input feature variables, while the last column represents the output SOC.
To justify the rationality of the selection of the training data set, an analysis is conducted on the input feature variables, and the curves representing the data variations are plotted in Figure 8.
Figure 8 shows that the training data set includes information from multiple charging, static, and operating conditions of the vehicle. The charge or discharge current ranges from −102 to 80 A, and the temperature ranges from 12°C to 26°C. The current and temperature ranges are broad and contain a large amount of information, so the input variables of this data set can represent the vehicle's long-term operational characteristics.
The SOC serves as the output variable of the training model, and its prediction accuracy is used to evaluate the model's performance directly. To validate the ACMSSA-BP model's effectiveness, traditional BP, SSA-optimized BP,36,37 and CMSSA-optimized BP were selected for comparison, with the results displayed in Figure 9.
The BP model exhibits the poorest prediction performance for the training data set, with a maximum error of 0.2. In contrast, the ACMSSA-BP model demonstrates the best prediction accuracy, with a maximum error of only 0.035. The enlarged results in Figure 9A show that when the SOC is below 0.6 and the vehicle outputs a high current, the proposed method has significantly better prediction accuracy than the other models. As seen from Figure 9B, when the vehicle is in standby mode, the fluctuations of the SOC are extremely small, and the difference in prediction error is not significant. Nevertheless, the ACMSSA-BP model still exhibits the best prediction accuracy among all the models. We have constructed a testing data set to verify the trained model's effectiveness. The data are randomly selected from a cloud-based database and undergo feature extraction processing. The data set information is provided in Table 3.
No. 2 represents the second battery pack, which, together with No. 1, forms the vehicle's propulsion system. Therefore, the operating conditions of No. 1 and No. 2 are essentially the same. The four data sets are denoted as No1-1, No1-2, No2-1, and No2-2, respectively. Figure 10 presents the SOC estimated based on these four test data sets.
As can be seen from the SOC curve of No1-1, the vehicle underwent two charging and discharging cycles on the same day, with the SOC reaching 0.59 and 0.67 after discharge. Each discharge event exceeded 4 h in duration. The complex operating conditions pose challenges to estimating the SOC, resulting in large prediction errors for the BP and SSA-BP methods. However, the ACMSSA-BP method demonstrates the best prediction accuracy. As shown in Figure 10B, the vehicle remained in standby mode throughout the day, and the operation of onboard low-voltage equipment caused a decrease in SOC of 0.02. The prediction errors were relatively similar among the three models, excluding the BP method, and remained around 0.015. The SOC curve of No2-1 shows that the vehicle underwent two charging and discharging cycles on the same day, with the SOC reaching 0.52 and 0.67 at the end of discharge. The first discharge process was noncontinuous and lasted for 13 h. As the operating conditions of the No2-1 battery pack were identical to those of the first group in the training data set, the prediction accuracy of all four methods was relatively high, especially for the method proposed in this paper. As portrayed in Figure 10D, the vehicle was operational for 4 h before initiating charging at an SOC value of 0.77. After completing the charging process, the operator deactivated the BMS, which resulted in the cloud platform only collecting data for 8 h. The SOC prediction curves in Figure 10 show that the ACMSSA-BP model demonstrates high prediction accuracy and good robustness.
To objectively evaluate the accuracy and robustness of the ACMSSA-BP model, the absolute error of the predictions from four models was calculated and plotted in Figure 11.
As shown in Figure 11, the prediction error of the BP model for SOC is generally large, with a maximum of 22%. Additionally, this method has poor anti-interference ability, with a maximum error fluctuation of 50%. Although the prediction errors of the SSA-BP and CMSSA-BP models for SOC are significantly lower, they are still susceptible to changes in operating conditions, with a prediction error fluctuation of about 10%. Moreover, there is evidence of over-fitting, resulting in poor robustness of these models. The ACMSSA-BP model demonstrates the lowest prediction error for SOC, with a maximum fluctuation of only 5%, highlighting its high prediction accuracy, strong resistance to interference, and exceptional generalization performance.
Table 4 presents the evaluation performance metrics of the four models on the different data sets. Figure 12 shows a multivariate histogram of the evaluation indicators. The No1-1 test data show that the vehicle operating conditions are relatively complex, which results in the BP method having the worst SOC prediction indicators. The optimized models improve after incorporating the SSA method, which enhances the search capability and effectively prevents over-fitting. Among them, the ACMSSA-BP method can adaptively adjust the number of producers and scroungers to achieve precise dynamic search within a small range. As a result, its MAE, RMSE, and MAPE are reduced by 69%, 70%, and 68%, respectively, compared with the BP method. For the No1-2 test data, the evaluation indicators obtained by the three optimized methods are essentially the same, with the proposed method performing best among them. The No2-1 test data set faithfully reflects the complex operating conditions of the vehicle, highlighting the advantages of the ACMSSA-BP method. The evaluation indicators of the proposed method are significantly superior to those of the other models, with reductions of 82%, 80%, and 84% in MAE, RMSE, and MAPE, respectively, compared with the BP model. For the No2-2 test data, the ACMSSA-BP method also exhibits the best evaluation indicators for predicting the SOC, with reductions of 60%, 56%, and 59%, respectively, compared with the BP model.
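The evaluation indicators and the percentage reductions quoted above can be sketched as follows. This is a minimal illustration that assumes the standard definitions of MAE, RMSE, and MAPE; the function names `soc_metrics` and `relative_reduction` and the sample values are hypothetical and are not taken from the paper's data.

```python
import math


def soc_metrics(actual, predicted):
    """Return (MAE, RMSE, MAPE) for two equal-length SOC sequences.

    MAPE is returned as a fraction (0.016 means 1.6%) and assumes that
    no actual SOC value is zero.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and equal in length")
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mape = sum(abs(e / a) for e, a in zip(errors, actual)) / n
    return mae, rmse, mape


def relative_reduction(baseline, improved):
    """Fractional reduction of a metric relative to a baseline model."""
    return (baseline - improved) / baseline


# Made-up illustrative values, not measurements from the paper.
actual = [0.80, 0.75, 0.70, 0.65]
baseline_pred = [0.84, 0.70, 0.74, 0.60]   # stands in for a plain BP model
improved_pred = [0.81, 0.74, 0.71, 0.64]   # stands in for an optimized model

mae_base, rmse_base, mape_base = soc_metrics(actual, baseline_pred)
mae_imp, rmse_imp, mape_imp = soc_metrics(actual, improved_pred)
mae_reduction = relative_reduction(mae_base, mae_imp)
```

A reduction of 0.69 in this convention corresponds to the "reduced by 69%" wording used for the MAE on the No1-1 data.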

Comparison with other methods
To further verify the effectiveness of the ACMSSA-BP algorithm, this paper compares it with PSO-BP,38 NARX_NN,39 and SSA-BP. The data of battery packs NO.1-1 and NO.2-1 in Table 3 are selected for the comparative analysis. Figure 13 displays the comparison results and errors.
Figure 13A,B show that when the SOC changes smoothly, all methods except the BP network closely fit the original running data. This observation highlights the high accuracy of methods such as PSO-BP, NARX_NN, and ACMSSA-BP. However, when the SOC changes drastically because of a sudden current change, the algorithm proposed in this paper is significantly better than the other methods, as shown in the local zoomed-in plot in Figure 13B, indicating that the method has strong robustness. Figure 13C,D illustrate the estimation errors under the two working conditions. The estimation errors of the PSO-BP and ACMSSA-BP methods are essentially the same when the SOC fluctuation is smooth. When the SOC fluctuation is large, the ACMSSA-BP method is clearly better than the other methods, which again shows that its accuracy and robustness are superior. Figure 13 also shows that the NARX_NN method performs better at the beginning of SOC estimation, but when the SOC changes suddenly, it shows signs of over-fitting, resulting in large local errors. To further analyze the accuracy and robustness of the BP, PSO-BP, NARX_NN, and ACMSSA-BP methods, the MAE, RMSE, and MAPE of the four methods are calculated according to Equations (4)-(6), and the results are shown in Table 5. To visualize the performance indexes of the four models, the data in Table 5 are plotted as multivariate histograms, as shown in Figure 14.
As evident from Table 5 and Figure 14, the PSO-BP, NARX_NN, and ACMSSA-BP methods exhibit superior performance metrics compared with the BP network, affirming the efficacy of the optimization approach. Furthermore, the ACMSSA-BP method proposed in this study surpasses the PSO-BP and NARX_NN methods in terms of performance indicators, suggesting higher accuracy and stronger robustness in estimating the SOC.

This paper proposes a novel SSA with an adaptive crossover mutation strategy and dynamic search, combined with the BP neural network method, to estimate the state of charge of battery packs effectively. The battery pack data of mining flame-proof trackless vehicles are analyzed in detail, and the model feature variables are extracted to establish a training data set for analysis and prediction. The SOC prediction results and evaluation criteria validate the feasibility of the proposed method.
The main contributions of this paper are as follows:
1. During the SSA iterative process, an adaptive adjustment method is proposed to address the problem of the constant number of producers and scroungers. In the early stage of iteration, the method can search sufficiently within a broad search range, while in the later stage of iteration, it can conduct a fine-grained search within a smaller search range.
2. To address the issue of the unchanged position update method for producers and scroungers, a crossover mutation strategy is proposed to increase the diversity of the population and prevent the algorithm from becoming trapped in local optima. Meanwhile, a dynamic search strategy is added to maintain good population diversity and strong local search ability in the later stages.
3. Based on the application of 5G technology in intelligent mines, a data set for estimating the SOC of mining transportation vehicle power systems was constructed using cloud platform data. The effectiveness and reliability of the ACMSSA-BP method for SOC estimation were verified on this data set, with MAE, RMSE, and MAPE values of less than 1.5%, 1.5%, and 1.6%, respectively. Comparison with other machine learning methods verifies that the method has high accuracy and strong robustness.
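The adaptive adjustment described in the first contribution can be illustrated with a minimal sketch. The linear decay schedule, the function name `split_population`, and the default fractions `p_start` and `p_end` are assumptions made purely for illustration; the adaptive factor actually used by ACMSSA is defined in the method section and may differ. The sketch also assumes that more producers are used early (broad exploration) and fewer later (fine-grained local search).

```python
def split_population(pop_size, iteration, max_iter, p_start=0.4, p_end=0.1):
    """Return (n_producers, n_scroungers) for a given iteration.

    Hypothetical schedule: the producer fraction decays linearly from
    p_start at the first iteration to p_end at the last one. This is an
    illustrative stand-in, not the paper's adaptive factor.
    """
    if max_iter < 2:
        raise ValueError("max_iter must be at least 2")
    if not 0 <= iteration < max_iter:
        raise ValueError("iteration must lie in [0, max_iter)")
    progress = iteration / (max_iter - 1)  # 0.0 at the start, 1.0 at the end
    fraction = p_start + (p_end - p_start) * progress
    n_producers = max(1, round(pop_size * fraction))
    return n_producers, pop_size - n_producers


# Producer and scrounger counts over a 100-iteration run with 50 sparrows.
schedule = [split_population(50, t, 100) for t in range(100)]
first_split = schedule[0]
last_split = schedule[-1]
```

With these assumed defaults the population starts with many producers and ends with few, which mirrors the transition from broad search to fine-grained search; the total population size stays constant throughout.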

FIGURE 1 Flowchart of training and prediction of adaptive crossover mutation sparrow search algorithm optimized BP model based on real vehicle data.

3 | ANALYSIS OF REAL VEHICLE DATA

3.1 | Data analysis of battery pack

The mining flame-proof trackless vehicles studied in this paper have a load capacity of five tons, and they are used in Inner Mongolia, China, for transporting mining workers. The vehicle's battery system consists of two battery packs, No. 1 and No. 2, as shown in Figure 4. Each battery pack is constructed by connecting 100 lithium-iron-phosphate battery cells in series, with the capacity and standard voltage of each battery cell being 72 Ah and 3.2 V, respectively. The vehicle uploads data to the cloud every 5 s, and the decoded data is in the form of

FIGURE 3 Flowchart of the core ideas of the adaptive crossover mutation sparrow search algorithm methodology.

FIGURE 4 Physical drawing of battery system for transportation vehicles.
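As a quick worked example of the pack configuration described above, the nominal figures of one pack follow directly from the cell data. This is a sketch under stated assumptions: the variable names are hypothetical, the energy is a nominal value (standard voltage times rated capacity), and the daily sample count assumes uninterrupted uploading, which the real vehicle data do not always satisfy.

```python
CELLS_IN_SERIES = 100      # cells per pack, from the text
CELL_VOLTAGE_V = 3.2       # standard cell voltage, from the text
CELL_CAPACITY_AH = 72      # cell capacity, from the text
UPLOAD_INTERVAL_S = 5      # cloud upload period, from the text

# In a series string the voltages add while the capacity stays that of one cell.
pack_voltage_v = CELLS_IN_SERIES * CELL_VOLTAGE_V
pack_capacity_ah = CELL_CAPACITY_AH

# Nominal stored energy of one pack in kWh.
pack_energy_kwh = pack_voltage_v * pack_capacity_ah / 1000

# Upper bound on samples per pack per day, assuming continuous uploading.
samples_per_day = 24 * 3600 // UPLOAD_INTERVAL_S

print(pack_voltage_v, pack_energy_kwh, samples_per_day)
```

Under these assumptions each pack has a nominal voltage of 320 V and a nominal energy of about 23 kWh, and a full day of logging yields at most 17,280 samples per pack.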

FIGURE 5 Current and voltage variation curves for battery pack No. 1 on November 25, 2022. (A) Current change curve and (B) voltage change curve.

FIGURE 6 Voltage and temperature curves of battery cells. (A) Maximum and minimum voltage curves of battery cells and (B) maximum and minimum temperature curves of battery cells.

FIGURE 7 The vehicle's operating conditions under different dates. (A) State of charge (SOC) variation curves for different dates and (B) temperature variation curves for different dates.

TABLE 1 Correlation between characteristic variables and state of charge.

TABLE 2 Indicator parameters of the training data set.

FIGURE 8 Input variables to the training data set. (A) System voltage and current profiles for the training data set and (B) voltage and temperature information of the battery cells of the training data set.

FIGURE 9 Prediction results based on the training data set. (A) Prediction results of training data set, (B) prediction error of training data set, and (C) box-plot of prediction error of training data set.

TABLE 3 Indicator parameters of the testing data set.

FIGURE 10 State of charge (SOC) prediction results on testing data sets. (A) SOC prediction results of No1-1, (B) SOC prediction results of No1-2, (C) SOC prediction results of No2-1, and (D) SOC prediction results of No2-2.

FIGURE 11 State of charge (SOC) prediction error on testing data sets. (A) SOC prediction error of No1-1, (B) SOC prediction error of No1-2, (C) SOC prediction error of No2-1, and (D) SOC prediction error of No2-2.

FIGURE 12 Multivariate histogram of evaluation indicators for backpropagation (BP) and improved methods.

FIGURE 13 State of charge (SOC) estimation results of different methods for battery pack NO.1-1 and NO.2-1. (A) SOC estimation results of NO.1-1. (B) SOC estimation results of NO.2-1. (C) SOC estimation error of NO.1-1. (D) SOC estimation error of NO.2-1.

TABLE 5 Comparison of performance metrics with other machine learning methods.

2. ACMSSA optimizes the BP network. Parameters w_1, B_1, w_2, and B_2 are obtained based on the ACMSSA method introduced in Section 2.3. These parameters are then inserted into Equation (3) to optimize the parameters of the BP network. The Reshape() function implements grid reconfiguration. H_num, I_num, and O_num represent the number of nodes in the hidden, input, and output layers, respectively.
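The role of the reshaping step can be sketched as follows: a flat position vector found by the optimizer is cut into the two weight matrices and two threshold vectors of the BP network. The ordering of the segments (w_1, B_1, w_2, B_2), the row-major layout, and the helper names `reshape` and `unpack_parameters` are assumptions for illustration; the exact mapping is given by Equation (3), which is not reproduced here.

```python
def reshape(flat, rows, cols):
    """Row-major reshape of a flat list into a rows x cols nested list."""
    if len(flat) != rows * cols:
        raise ValueError("length does not match the requested shape")
    return [list(flat[r * cols:(r + 1) * cols]) for r in range(rows)]


def unpack_parameters(position, i_num, h_num, o_num):
    """Split a flat position vector into (w1, b1, w2, b2).

    Assumed layout: w1 (h_num x i_num), b1 (h_num),
    w2 (o_num x h_num), b2 (o_num), stored back to back.
    """
    expected = h_num * i_num + h_num + o_num * h_num + o_num
    if len(position) != expected:
        raise ValueError(f"expected {expected} values, got {len(position)}")
    end_w1 = h_num * i_num
    end_b1 = end_w1 + h_num
    end_w2 = end_b1 + o_num * h_num
    w1 = reshape(position[:end_w1], h_num, i_num)
    b1 = list(position[end_w1:end_b1])
    w2 = reshape(position[end_b1:end_w2], o_num, h_num)
    b2 = list(position[end_w2:])
    return w1, b1, w2, b2


# Toy network with 3 inputs, 2 hidden nodes, and 1 output: 11 parameters.
w1, b1, w2, b2 = unpack_parameters(list(range(11)), i_num=3, h_num=2, o_num=1)
```

The length check makes the dimension of the search space explicit: for I_num inputs, H_num hidden nodes, and O_num outputs, each sparrow position holds H_num * I_num + H_num + O_num * H_num + O_num values.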
TABLE 4 Comparison of evaluation indicators for backpropagation (BP) and improved methods.