Accelerated Search for BaTiO3‐Based Ceramics with Large Energy Storage at Low Fields Using Machine Learning and Experimental Design

Abstract The problem considered is that of maximizing the energy storage density of Pb‐free BaTiO3‐based dielectrics at low electric fields. We demonstrate how varying the size of the combinatorial search space influences the efficiency of materials discovery by comparing the performance of two machine‐learning‐based approaches that involve different levels of physical insight. We start with physical intuition, which provides guiding principles for finding better performers in the crossover region between the ferroelectric and relaxor ferroelectric phases in the composition–temperature phase diagram. Such an approach is limiting for multidopant solid solutions, which motivates the use of two data‐driven machine learning and design strategies with a feedback loop to experiments. Strategy I performs learning and property prediction on all the compounds, whereas strategy II first learns to preselect compounds in the crossover region and then carries out prediction only on those. After only two active learning loops via strategy II, the compound (Ba0.86Ca0.14)(Ti0.79Zr0.11Hf0.10)O3 is synthesized, with the largest energy storage density of ≈73 mJ cm−3 at a field of 20 kV cm−1. Insight into the relative performance of the strategies using varying levels of knowledge is also provided.


Section 1: Dielectric and ferroelectric properties of simple systems
Figures S1(a1)-(c1) show the dielectric permittivity (ε) versus temperature (T) curves at different frequencies for the BaTi1−xZrxO3, BaTi1−xHfxO3 and BaTi1−xSnxO3 systems, respectively. The different colors represent different compounds in each system. As the dopant concentration increases, the dielectric permittivity peak broadens over a wider temperature window. Moreover, there is clear frequency dispersion once x exceeds a critical value. A modified Curie-Weiss law has been proposed to describe the diffuse phase transition:
$$\frac{1}{\varepsilon} - \frac{1}{\varepsilon_m} = \frac{(T - T_m)^{\gamma}}{C} \qquad (1)$$
where ε_m is the maximum permittivity, T_m is the temperature at which it occurs, and γ and C are assumed to be constants. The parameter γ characterizes the phase transition: γ = 1 indicates a normal ferroelectric phase transition and γ = 2 represents a complete, or ideal, diffuse phase transition. Figures S1(a2)-(c2) plot ln(1/ε − 1/ε_m) as a function of ln(T − T_m) for the compounds in each system, respectively. The γ value is determined from the slope of the fitted lines using equation 1. A similar tendency is seen in the three simple systems, i.e., γ increases with dopant concentration x. Figures S1(a3)-(c3) show the polarization (P) versus electric field (E) loops at room temperature for the compounds in each system. The P-E loop changes from fat to slim and finally to almost linear as x increases; at the same time, both P_max and P_r decrease monotonically. The energy storage density can be calculated from these curves.
Thirteen features are assembled by casting a wide net over knowledge of quantities that could influence the objective. The definitions of the 13 features are given in Tab. S1. The feature value for a given compound is calculated by weighting the elemental values by mole fraction. For example, the feature P can be calculated by the following equation:
$$P = \sum_i f_i P_i$$
where f_Ba and P_Ba are the mole fraction and polarizability of the Ba element, respectively. We find that the features are not strongly linearly correlated, as shown in Fig. S2. The initial training data was randomly divided into training and test sets with a ratio of 0.8/0.2. A support vector regressor with a radial basis function kernel (SVR.rbf) was built on the training set for each of the 13 features. The trained models were then applied to the test set. The mean squared error (MSE.error), calculated on both the training and test sets for each feature, is shown in Fig. S3. The error bars are obtained from 100 repeats of the random training/test split. 'DB', 'NCT' and 't' are the three best-performing features. The next 10 are quite similar in terms of test error, and we choose P because of its physical appeal in terms of polarization. We also used gradient boosting trees to rank all 13 features by their relative importance. Figure S4 shows that 'NCT', 't' and 'DB' are the best three, in agreement with Fig. S3.
Section 3: Regression model performance evaluation
Figure S5 shows the regression models trained on the whole training data (182 compounds); panels (a)-(d) correspond to the four algorithms listed below, respectively. SVR.rbf and random forest perform best.
• SVR.rbf: support vector regression with a radial basis function kernel.
• Random forest (RF): an ensemble learning method that trains a multitude of decision trees and outputs the mean of the predictions from the individual trees.
• Gradient boosting (GB): builds an additive model in a forward stage-wise fashion; in each stage, a regression tree is fit to the negative gradient of the given loss function.
• KRR: kernel ridge regression, which combines ridge regression (linear least squares with L2-norm regularization) with a radial basis function kernel.
To evaluate the predictive performance of the regression models, we divided the whole data set into two parts: a training set of 152 compounds and a test set of 30 compounds. Figures S6(a)-(d) show the performance of the various models; blue circles are the training data and red points are the test data. The red points suggest that SVR.rbf performs better than the other models, especially at high energy storage density.
Figure S7 shows the predictive errors for the regression models. We evaluate the leave-one-out cross-validation error (CV.error) on the whole set of 182 compounds, and the mean squared error (MSE.error, based on 1000 models from the bootstrap method) on the test set of 30 compounds used in Fig. S6. Considering both errors, SVR.rbf is the better model, and we use it to make predictions on the virtual data. Moreover, Fig. S8 shows that the two errors are very similar, indicating that our model is neither over-fitting nor under-fitting.
Figure S9 shows the predicted energy storage density as a function of the number of iterations, which follows a trend similar to that of the measured values shown in Fig. 3c of the manuscript.
Table S2 lists all 32 new compounds synthesized in strategy I.
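The leave-one-out and bootstrap error estimates used for Fig. S7 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: a k-nearest-neighbour regressor stands in for SVR.rbf so that the example needs only the Python standard library, CV.error is assumed to be a mean squared error, and all function names are hypothetical.

```python
import math
import random
from statistics import mean


def knn_predict(train_X, train_y, x, k=3):
    """Stand-in regressor (assumption): mean target of the k nearest training points."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    return mean([train_y[i] for i in order[:k]])


def loo_cv_error(X, y, k=3):
    """Leave-one-out CV: predict each sample from a model built on all the others."""
    errors = []
    for i in range(len(X)):
        rest_X = X[:i] + X[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        errors.append((knn_predict(rest_X, rest_y, X[i], k) - y[i]) ** 2)
    return mean(errors)


def bootstrap_test_mse(train_X, train_y, test_X, test_y, n_models=1000, k=3, seed=0):
    """Test-set MSE of n_models models, each built on a bootstrap resample of the training set."""
    rng = random.Random(seed)
    n = len(train_X)
    mses = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        boot_X = [train_X[i] for i in idx]
        boot_y = [train_y[i] for i in idx]
        sq = [(knn_predict(boot_X, boot_y, x, k) - t) ** 2 for x, t in zip(test_X, test_y)]
        mses.append(mean(sq))
    return mses


if __name__ == "__main__":
    # Synthetic data for illustration only (not the data of this work).
    rng = random.Random(1)
    X = [(rng.random(), rng.random()) for _ in range(60)]
    y = [a + 2 * b + rng.gauss(0, 0.05) for a, b in X]
    print("CV.error :", loo_cv_error(X, y))
    print("MSE.error:", mean(bootstrap_test_mse(X[:50], y[:50], X[50:], y[50:], n_models=200)))
```

Comparing the two numbers mirrors the check made with Fig. S8: a test error far above the cross-validation error would point to over-fitting.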
Section 4: Dielectric and ferroelectric properties of the (Ba0.79Ca0.21)(Ti1−xSnx)O3 system
Figure S10(a) shows the permittivity as a function of temperature for different Sn4+ concentrations in the (Ba0.79Ca0.21)(Ti1−xSnx)O3 system. The plot shows a tendency similar to that in Fig. S1(a1)-(c1): the product of the phase transition changes from a normal ferroelectric to a relaxor, mediated by a "crossover region". Figure S10(b) shows how the P-E loops change with Sn4+ concentration.
Section 5: Classification model performance evaluation
Figure S11 shows the accuracy (error) of the classification model. We calculated both the training error and the predictive errors.
• Train error (Train.error) evaluates the model performance on the whole training data: all 183 labeled compounds were used to build a classifier, which was then applied to the same 183 compounds.
• Predictive errors include both cross-validation error (CV.error) and error on the test data (Test.error).
CV.error: leave-one-out cross-validation was used to calculate the error on the whole set of 183 compounds.
Test.error: the whole set of 183 compounds was randomly divided into two parts, a training set of 153 compounds and a test set of 30 compounds. The classifier was trained on the 153 compounds and then applied to the other 30 compounds, from which the Test.error is calculated.
We find that all the accuracies are higher than 0.9, indicating that the classifier performs well.
Misclassification of compounds can be an issue, so we used bootstrap sampling to obtain the misclassification rate, or frequency, for every compound in the whole training data. That is, for each prediction, 183 samples were selected with replacement from the original 183 compounds; a classification model was constructed on this sample and applied to the 183 compounds. This process was repeated 1000 times to obtain the misclassification frequency for each compound. The 19 compounds with a high misclassification frequency (> 0.2) are listed in Tab. S3. The 10 compounds with a much higher misclassification frequency (> 0.4) are presented in bold.
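The bootstrap procedure for the per-compound misclassification frequency can be sketched as below. This is an illustrative sketch only: the classifier is not specified in this section, so a simple nearest-centroid classifier is used as a stand-in (an assumption), and the function names, class labels and toy data are hypothetical.

```python
import math
import random


def fit_centroids(X, y):
    """Stand-in classifier (assumption): one mean feature vector per class label."""
    groups = {}
    for features, label in zip(X, y):
        groups.setdefault(label, []).append(features)
    return {label: tuple(sum(col) / len(rows) for col in zip(*rows))
            for label, rows in groups.items()}


def predict(centroids, features):
    """Assign the label of the nearest class centroid."""
    return min(centroids, key=lambda label: math.dist(centroids[label], features))


def misclassification_frequency(X, y, n_rounds=1000, seed=0):
    """Fraction of bootstrap rounds in which each sample is misclassified."""
    rng = random.Random(seed)
    n = len(X)
    wrong = [0] * n
    for _ in range(n_rounds):
        idx = [rng.randrange(n) for _ in range(n)]  # n draws with replacement
        centroids = fit_centroids([X[i] for i in idx], [y[i] for i in idx])
        for i in range(n):  # apply the model to every original sample
            if predict(centroids, X[i]) != y[i]:
                wrong[i] += 1
    return [w / n_rounds for w in wrong]


if __name__ == "__main__":
    # Toy data for illustration only: the last point carries a doubtful label.
    X = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
         (1.0, 1.0), (0.9, 1.0), (1.0, 0.9), (0.95, 0.95)]
    y = ["FE", "FE", "FE", "FE", "RFE", "RFE", "RFE", "FE"]
    freq = misclassification_frequency(X, y)
    print("flagged samples:", [i for i, f in enumerate(freq) if f > 0.2])
```

Samples whose frequency exceeds a chosen threshold are the ones whose labels deserve a second look, which is the role Tab. S3 plays here.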
Section 6: Dielectric and ferroelectric properties of 8 compounds in strategy II
Figures S12(a)-(b) plot the P-E loops of the 8 new compounds synthesized in strategy II.
Section 7: Distribution of the predictions of virtual space used in strategy II
The crossover region consists of four steps, or layers, as described in the main text. Figures S14(a)-(d) show the distribution of predictions for each step, respectively. The inset in each panel magnifies the range with high predicted values. The reason we choose a crossover region of four steps from the boundary towards the relaxor side is illustrated in Supplementary Fig. S15, where the distribution of predictions for each step or layer is presented. For the first step, predictions of large U_re have very low frequencies, with the largest density being 60 mJ cm−3. For the second and third steps, the predicted U_re values move higher and their frequencies also increase. In the fourth step the tendency reverses, i.e., both U_re and the corresponding frequencies decrease. This suggests that the fourth and subsequent steps tend to move into the relaxor region.
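For reference, the energy storage density U_re that is measured and predicted throughout is obtained from a P-E loop. A minimal sketch is given below, assuming the conventional definition U_re = ∫ E dP taken along the discharge branch from P_r to P_max; the text does not spell out the integration details, so the trapezoidal rule and the function name are assumptions.

```python
def recoverable_energy_density(E, P):
    """Trapezoidal estimate of U_re = integral of E dP along the discharge branch.

    E: electric fields in kV/cm; P: polarizations in uC/cm^2, ordered from
    (E = 0, P = P_r) up to (E = E_max, P = P_max).
    Since 1 kV/cm * 1 uC/cm^2 = 1 mJ/cm^3, the result is directly in mJ/cm^3.
    """
    if len(E) != len(P) or len(E) < 2:
        raise ValueError("E and P must have the same length (at least 2 points)")
    u = 0.0
    for i in range(1, len(E)):
        u += 0.5 * (E[i] + E[i - 1]) * (P[i] - P[i - 1])
    return u


if __name__ == "__main__":
    # Sanity check on an ideal linear dielectric (synthetic, not measured data):
    # P = 0.25 * E up to 20 kV/cm gives U = 0.5 * 20 * 5 = 50 mJ/cm^3.
    E = [0.0, 5.0, 10.0, 15.0, 20.0]
    P = [0.25 * e for e in E]
    print(recoverable_energy_density(E, P))  # 50.0
```

For a real loop, a fat hysteresis (large P_r) shrinks the integration range in P, which is why slim loops with a still sizable P_max are favored at low fields.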