Prediction of Operational Lifetime of Perovskite Light Emitting Diodes by Machine Learning

Perovskite light‐emitting diodes (LEDs), with the advantages of high electroluminescence efficiency at high brightness, good color purity, and a tunable bandgap, are considered promising candidates for next-generation display and lighting technologies. Because their degradation process is complex, no mathematical model adequately describes the decay of perovskite LEDs. In this work, it is found that the mathematical fitting methods widely used to describe the decay trends of organic LEDs and quantum-dot LEDs are unable to accurately predict the lifetime of perovskite LEDs. An ensemble machine learning model employing a data augmentation technique is therefore developed to predict the T50 of perovskite LEDs from features collected before T80, achieving an accuracy of 0.995. Furthermore, the model can also accurately predict the T90 lifetime of quantum-dot LEDs (QLEDs) using features collected before T98, suggesting it is a useful tool for efficiently evaluating LED lifetimes.


Introduction
Efficiency and stability are two key evaluation parameters of light-emitting diodes (LEDs). Compared with efficiency testing, stability measurement is usually much more time-consuming.[3][4][5][6][7] In recent years, the stability of perovskite LEDs has been persistently improved, and the half-life (T50, the time for the efficiency to drop to 50% of its initial value) has exceeded 100 h at a high current density of 100 mA cm−2.[2,4] This means that, with further improvement in stability, the traditional experimental testing method will become very time-consuming. In practice, one usually measures only down to T95 experimentally, and the remaining lifetime is predicted with exponential decay models.[17][18] However, although it is generally accepted that the decay of perovskite LEDs is associated with ion migration, their degradation mechanism is much more complex,[19] and it is unclear whether the mathematical fitting methods used for stability prediction in OLEDs and QLEDs can still be applied to perovskite LEDs. Alternatively, it is worth examining whether the lifetime of perovskite LEDs can be predicted with machine learning (ML) algorithms trained on the existing, limited lifetime database.
In this work, based on typical literature data and our own experimental data, we find that mathematical fitting methods perform poorly in predicting the lifetime of perovskite LEDs, with an average prediction accuracy of only 0.280. By developing an ensemble machine learning model that employs a data augmentation technique, we successfully predict the T50 lifetime from the T80 (80% of the initial efficiency) lifetime, with a prediction accuracy of 0.995. Moreover, our ensemble learning model can also be used to predict the lifetime of QLEDs, suggesting it is a useful tool for efficiently evaluating LED lifetimes.

Results and Discussion
[22] We applied these exponential fitting methods to the efficiency decay curves of perovskite LEDs with good stability reported in the literature,[1][2][3][4] as well as to those of our 20 perovskite LEDs. As shown in Figure 1 and Table SI-1, Supporting Information, regardless of the fitting method used, there is a significant deviation between the predicted lifetime and the actual lifetime data. Among these methods, monoexponential fitting performed best, with an R2-score of 0.280 and a root-mean-square error (RMSE) of 38.028 (Table SI-2, Supporting Information). These results clearly indicate that exponential fitting cannot meaningfully predict the lifetime or half-life of perovskite LEDs.
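As a concrete illustration of this extrapolation procedure, the sketch below fits a monoexponential decay to synthetic, normalized luminance data recorded only down to roughly 70% of the initial value, then extrapolates T50 from the fitted decay constant. The functional form is the standard one; the data and parameter values are illustrative, not from our measurements.

```python
import numpy as np
from scipy.optimize import curve_fit

def mono_exp(t, tau):
    # Monoexponential decay model: L(t)/L0 = exp(-t / tau)
    return np.exp(-t / tau)

# Synthetic normalized luminance, measured only over the first 10 h
# (illustrative values; true decay constant is 30 h).
t = np.linspace(0, 10, 50)                         # hours
rng = np.random.default_rng(0)
luminance = np.exp(-t / 30) + rng.normal(0, 0.002, t.size)

(tau,), _ = curve_fit(mono_exp, t, luminance, p0=[10.0])
t50_pred = tau * np.log(2)                         # fit crosses 0.5 at tau*ln(2)
print(f"tau = {tau:.1f} h, extrapolated T50 = {t50_pred:.1f} h")
```

Whether such an extrapolation is trustworthy depends entirely on the decay actually following the assumed functional form, which is the point tested above.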
We then focus on developing ML models to predict the lifetime of perovskite LEDs. We primarily used lifetime data of red and near-infrared perovskite LEDs as a case study because of their relatively long operational lifetimes.[1,3,4,23,24] The device structure is indium tin oxide (ITO)/metallic oxide/perovskite CsaFAbPbIcBrd (0 ≤ a, b, c, d ≤ 3.1)/poly(9,9-dioctyl-fluorene-co-N-(4-butylphenyl)diphenylamine) (TFB)/molybdenum oxide (MoOx)/gold (Au). We collected lifetime data from 210 devices measured under different current densities and temperatures, with half-lives ranging from 0.6 to 137.2 h (details in the Experimental Section). Note that the aforementioned 6 reported and 20 in-house device datasets used for prediction are not part of the training set. Each data sample was measured at a constant temperature and current density, although these constants differ between samples. We attempt to predict the T50 of perovskite LEDs from features extracted before T80 with suitable ML algorithms (Figure 2).
Feature engineering plays a crucial role in enhancing ML performance, so we first performed feature extraction on the lifetime database, where all features are parameters acquired during the device stability testing process. From the database, features such as decay time (T), time variable (ΔT), luminance (L), and voltage variable (ΔV) were extracted. To measure lifetime decay over a specific period, we introduced time variables (ΔTa−b = Ta − Tb, a < b) as features. We then employed a variance threshold and the Pearson correlation coefficient (PCC) to filter out redundant features.[25,26] For the variance threshold, the higher the variance, the more information the feature carries. The variance scores of the features are presented in Table SI-3, Supporting Information, indicating that the voltage variable is an ineffective feature, whereas the variance values of decay time, time variable, and luminance lie between 1 and 14, signifying their importance. The PCC results are shown in Table SI-4, Supporting Information. The correlation coefficients between the decay times (T90, T85, T80, T75, T70, T65, and T60) and T50 are high, ranging from 0.786 to 0.985, implying a strong correlation with T50. In addition, the correlation coefficient between T60 and T50 is higher than that between T90 and T50, indicating that predicting T50 becomes more advantageous as the measurement approaches T50.
Similar to the variance threshold results, the PCC values of voltage and luminance with T50 are also low, indicating that these features are not critical. Therefore, decay time and time variable were selected as the primary features.
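The two-stage filter described above can be sketched as follows. The feature table, column names, threshold value, and synthetic correlations are illustrative assumptions, not the actual database.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import VarianceThreshold

# Hypothetical feature table: decay times, a time variable, luminance,
# and a voltage variable (all values synthetic).
rng = np.random.default_rng(1)
n = 200
t50 = rng.uniform(1, 140, n)                      # target: T50 in hours
t80 = 0.55 * t50 + rng.normal(0, 2, n)
t90 = 0.30 * t50 + rng.normal(0, 2, n)
df = pd.DataFrame({
    "T80": t80,
    "T90": t90,
    "dT_90_80": t80 - t90,                        # time variable ΔT
    "L": rng.normal(100, 1, n),                   # luminance, uncorrelated here
    "dV": rng.normal(0, 0.01, n),                 # nearly constant voltage shift
})

# Step 1: drop near-constant features with a variance threshold.
vt = VarianceThreshold(threshold=0.1).fit(df)
kept = df.columns[vt.get_support()]

# Step 2: rank surviving features by |PCC| with the T50 target.
pcc = df[kept].corrwith(pd.Series(t50)).abs().sort_values(ascending=False)
print(list(kept), pcc.round(3).to_dict())
```

In this toy setup the near-constant voltage column is removed by the variance filter, and the ranking step then mirrors the PCC analysis: a feature can survive the variance filter (like luminance here) yet still be discarded for weak correlation with T50.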
In order to accurately predict the lifetime of perovskite LEDs, we conducted a comprehensive study of multiple ML models, including the least absolute shrinkage and selection operator (LASSO),[26] support vector regressor (SVR),[27] gradient boosting regressor (GBR),[28,29] Gaussian process regressor (GPR),[30] and elastic net (EN).[31] We also created an ensemble learning model (5EML) based on these five algorithms and evaluated them using features from T100 to T80. We compared three types of feature sets (Data A, Data B, and Data C) as input for the algorithms (Table SI-5, Supporting Information). The six algorithms scored lower with Data A as input, consistent with our earlier feature analysis conclusion that voltage and luminance are uninformative features (Table 1 and Figure 3). To obtain high-accuracy predictions with reduced testing durations, we compared the results of predicting T50 using different testing durations. Our feature engineering analysis showed that features closer to T50 correlate more strongly with T50, which may translate into higher prediction accuracy but also implies longer testing. We therefore predicted T50 separately using different testing durations (T90, T80, T70, and T60). With features extracted before T90, the prediction accuracy for T50 was lower than 0.8 (Table SI-6, Supporting Information), indicating that features within these periods lack sufficient information to predict T50 well. In contrast, features extracted up to T70 and T60 yielded predictive accuracies above 0.94, which is not significantly different from the 0.92 obtained with T80. Therefore, considering both high accuracy and short testing time, we mainly adopt T80.
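A minimal sketch of such a five-member voting ensemble, using scikit-learn's VotingRegressor, is shown below. The synthetic data, base-model hyperparameters, and voting weights are illustrative assumptions rather than the tuned values of 5EML.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, VotingRegressor
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import DotProduct, WhiteKernel
from sklearn.linear_model import ElasticNet, Lasso
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

# Synthetic stand-in for decay-time features and the T50 target.
rng = np.random.default_rng(2)
X = rng.uniform(0, 50, (210, 4))
y = 2.0 * X[:, 0] + rng.normal(0, 1.0, 210)

# Weighted voting over the five base regressors (weights sum to 1).
five_eml = VotingRegressor(
    estimators=[
        ("lasso", Lasso(alpha=0.1)),
        ("en", ElasticNet(alpha=0.1)),
        ("svr", SVR(kernel="rbf", C=1e4, gamma=1e-4)),
        ("gbr", GradientBoostingRegressor(max_depth=3)),
        ("gpr", GaussianProcessRegressor(kernel=DotProduct() + WhiteKernel(),
                                         normalize_y=True)),
    ],
    weights=[0.15, 0.15, 0.2, 0.3, 0.2],
)

# 8:2 train/test split, mirroring the evaluation protocol in the text.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
five_eml.fit(X_tr, y_tr)
r2 = five_eml.score(X_te, y_te)
print(round(r2, 3))
```

The voting regressor averages the base predictions, so a single weak member degrades the ensemble only in proportion to its weight, which is why the weights are fixed according to each base model's individual performance.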
Given the significant impact of sample size on prediction outcomes, we incorporated data augmentation into our approach to mitigate the limited sample size of the existing database. Data augmentation increases the diversity of the sample data and enhances model generalization by applying transformations and perturbations to the original data. We employed a deep neural network with the fast gradient sign method to implement the augmentation, expanding the original 210 samples to 420.[32,33] Kernel density estimation shows that the distribution of the newly generated samples is almost identical to that of the original samples, indicating that the new data reproduce the statistical characteristics and distribution of the original data (Figure SI-1, Supporting Information). Both the newly generated and original samples were therefore used to train our models. Comparing the six models, we found that data augmentation yielded higher R2-scores and lower RMSE values, and the 5EML model still achieved the highest R2-score of 0.98. This indicates that data augmentation can further improve the predictive accuracy of the 5EML model.
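The augmentation step can be sketched as follows. For clarity, the network is replaced here by a linear surrogate whose input gradient is analytic; in the paper's implementation the same gradient would come from backpropagation through the deep neural network. Data and model are illustrative.

```python
import numpy as np

# FGSM-style augmentation sketch: perturb each sample along the sign of
# the loss gradient with respect to the input.
rng = np.random.default_rng(3)
X = rng.uniform(0, 50, (210, 4))              # stand-in features
y = 2.0 * X[:, 0] + rng.normal(0, 1.0, 210)   # stand-in T50 target

w, *_ = np.linalg.lstsq(X, y, rcond=None)     # fitted linear surrogate model
eps = 0.08                                     # perturbation size used in the paper

# For the squared error (w.x - y)^2, the input gradient is 2*(w.x - y)*w.
residual = X @ w - y
grad_x = 2.0 * residual[:, None] * w[None, :]
X_adv = X + eps * np.sign(grad_x)             # fast gradient sign step

# Pool original and generated samples: 210 -> 420.
X_aug = np.vstack([X, X_adv])
y_aug = np.concatenate([y, y])
print(X_aug.shape)
```

Because each perturbation is bounded elementwise by ε, the generated samples stay close to the originals, which is consistent with the near-identical kernel density estimates reported above.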
In order to further validate the reliability of our model, we used 5EML to predict the lifetimes of the aforementioned 26 perovskite LEDs, none of which were part of the original dataset (Figure 1, Table SI-1, and Figure SI-2, Supporting Information). The resulting R2-score and RMSE were 0.995 and 3.125 (Table SI-2, Supporting Information), respectively, indicating performance superior to the exponential fitting methods, whether monoexponential, biexponential, or stretched exponential. To verify the universality of our algorithm, we applied the 5EML model to predict the lifetime of quantum-dot LEDs.[34,35] The quantum-dot LED lifetime data samples, provided by Jin's lab at Zhejiang University, included blue, green, and red quantum-dot LEDs. Based on 5EML with data augmentation, we successfully predicted the T90 lifetime using only features before T98. The average R2-score over the 5 experiments was above 0.96, with the highest reaching 0.98 (Table SI-7, Supporting Information). These results demonstrate the strong application potential of our algorithm for predicting the lifetimes of LEDs from different material systems.

Conclusion
In summary, our results show that the mathematical fitting methods used for stability prediction in OLEDs and QLEDs cannot be applied to perovskite LEDs, and we have developed a machine learning approach (5EML) with data augmentation that successfully predicts the T50 lifetime of perovskite LEDs from features before T80. We believe that integrating this algorithm into LED lifetime testing systems will greatly improve the testing efficiency of perovskite LEDs. The method also generalizes well, yielding good predictions of the T90 lifetime of quantum-dot LEDs from features before T98. We therefore believe the strategy developed here can greatly improve stability testing efficiency and facilitate the realization of highly stable perovskite LEDs.

Experimental Section
Data Collection: The data samples originated from red and near-infrared perovskite LEDs with a perovskite composition of CsaFAbPbIcBrd (0 ≤ a, b, c, d ≤ 3.1). The device structure comprised indium tin oxide (ITO)/metallic oxide/perovskite/poly(9,9-dioctyl-fluorene-co-N-(4-butylphenyl)diphenylamine) (TFB)/molybdenum oxide (MoOx)/gold (Au). Stability testing was conducted using a Keithley 2450 source meter, a Keithley 2000 electric meter, a photodetector (Thorlabs PDA100A), and a semiconductor chilling plate (Nanjing Ouyi Optoelectronics Technology). The Keithley 2450 source meter supplied a steady current, while the semiconductor chilling plate accurately controlled the temperature around the perovskite LEDs. The photodetector monitored the light intensity throughout the stability tests. All tests were carried out in a nitrogen-filled glove box. Each set of devices tested yielded data on time, luminance, current density, voltage, and temperature, with the current density and temperature set for each test. By normalizing luminance against time, stability degradation data were obtained, as illustrated in Figure SI-3, Supporting Information. Our collected samples span half-life values from 0.6 to 137.2 h, covering devices from the most stable to those with poorer stability. Additionally, our database includes samples obtained under various current densities and temperature conditions (Table SI-8, Supporting Information).
Linear Model: Two linear regression models, LASSO[26] and elastic net (EN),[31] estimate a sparse coefficient vector for model fitting. Mathematically, the objective function of the linear model is

\[ \min_{\omega}\; \frac{1}{2n}\,\lVert x\omega - y\rVert_2^2 + \alpha P(\omega) \]

where n is the total number of samples, α is a scalar between 0 and 1, ω is the coefficient vector, x is the input data, and y is the measured T50 time.
The first term, \( \frac{1}{2n}\lVert x\omega - y\rVert_2^2 \), is the ordinary least-squares loss. The regularization term \( P(\omega) \) depends on the algorithm. For LASSO,

\[ P(\omega) = \lVert \omega \rVert_1 \]

where \( \lVert \omega \rVert_1 \) is the l1-norm of the coefficient vector. For EN,

\[ P(\omega) = \rho\,\lVert \omega \rVert_1 + \frac{1-\rho}{2}\,\lVert \omega \rVert_2^2 \]

where \( \lVert \omega \rVert_2^2 \) is the squared l2-norm of the coefficient vector and ρ is the mixing ratio. To obtain the best hyperparameter α for both models, we apply an iterative search within the parameter range [0.1, 1.0].
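The α search for both linear models can be sketched as below; the synthetic data and the grid step of 0.1 are our assumptions, while the search range [0.1, 1.0] follows the text.

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(4)
X = rng.uniform(0, 50, (210, 4))              # stand-in features
y = 2.0 * X[:, 0] + rng.normal(0, 1.0, 210)   # stand-in T50 target

# Search alpha over [0.1, 1.0] with 5-fold cross validation.
grid = {"alpha": np.round(np.linspace(0.1, 1.0, 10), 2)}
best = {}
for model in (Lasso(max_iter=10000), ElasticNet(max_iter=10000)):
    search = GridSearchCV(model, grid, cv=5, scoring="r2").fit(X, y)
    best[type(model).__name__] = (search.best_params_["alpha"],
                                  round(search.best_score_, 3))
print(best)
```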
Support Vector Regressor (SVR): SVR[27] is an application of the support vector machine[36] to regression problems. The model is optimized by maximizing the width of the margin and minimizing the total loss. The convex optimization problem of SVR can be defined as

\[ \min_{\omega, b, \xi, \xi^*}\; \frac{1}{2}\lVert \omega \rVert_2^2 + C \sum_{i=1}^{n} \left(\xi_i + \xi_i^*\right) \]

subject to

\[ y_i - \omega^{\mathrm{T}} x_i - b \le \varepsilon + \xi_i, \qquad \omega^{\mathrm{T}} x_i + b - y_i \le \varepsilon + \xi_i^*, \qquad \xi_i, \xi_i^* \ge 0 \]

where minimizing \( \lVert \omega \rVert_2^2 = \omega^{\mathrm{T}}\omega \) is equivalent to maximizing the margin, ω is the coefficient vector, b is the intercept term, x_i is the input data, y_i is the measured T50 time, ξ_i and ξ_i^* are the slack variables of the lower-bound and upper-bound constraints, respectively, ε is the deviation range, and C is the penalty coefficient. We iteratively search for the hyperparameter C over the large range [10^0, 10^6]. The kernel function chosen for SVR is the radial basis function,

\[ K(x, z) = \exp\!\left(-\gamma\,\lVert x - z \rVert^2\right) \]

where x and z are two input samples and γ is specified by the hyperparameter gamma. The best model performance was obtained with gamma = 0.0001.

Gradient Boosting Regressor (GBR): GBR[28,29] is an ensemble machine learning model that combines multiple weak regressors,

\[ F_M(x) = \sum_{m=1}^{M} h_m(x) \]

where M is the total number of boosting iterations and \( h_m\ (m = 1, 2, \ldots, M) \) are the weak regression estimators. Each estimator is fitted greedily,

\[ h_m = \arg\min_{h} \sum_{i=1}^{n} L\big(y_i, F_{m-1}(x_i) + h(x_i)\big) \]

where \( L\big(y_i, F_{m-1}(x_i) + h(x_i)\big) \) is the loss, h(x_i) is the residual fit that minimizes the loss, and y_i is the measured T50 time. We systematically search the maximum depth of the individual weak regression estimators in the range [1, 9], which limits the number of nodes in each tree of the GBR.
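The two hyperparameter searches just described can be sketched together; the data are synthetic, and the decade-stepped C grid is our reading of "iterative search over [10^0, 10^6]".

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR

rng = np.random.default_rng(5)
X = rng.uniform(0, 50, (210, 4))              # stand-in features
y = 2.0 * X[:, 0] + rng.normal(0, 1.0, 210)   # stand-in T50 target

# SVR: search the penalty C over [10^0, 10^6] with gamma fixed at 1e-4.
svr = GridSearchCV(SVR(kernel="rbf", gamma=1e-4),
                   {"C": [10.0**k for k in range(7)]}, cv=5).fit(X, y)

# GBR: search the maximum depth of the weak estimators over [1, 9].
gbr = GridSearchCV(GradientBoostingRegressor(random_state=0),
                   {"max_depth": list(range(1, 10))}, cv=5).fit(X, y)

print(svr.best_params_, gbr.best_params_)
```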
Gaussian Process Regression (GPR): GPR[30] defines a probability distribution over the objective function based on Bayes' theorem:

\[ f \sim \mathcal{GP}\big(m(x), k(x, x')\big) \]

where m(x) is the mean function and k(x, x') is the kernel function. The kernel function is the core of the Gaussian process; we explored the impact of several kernels on the GPR, including the rational quadratic kernel, the dot-product kernel, and the white kernel.
Ensemble Model: The ensemble model is built on the voting regressor.[37] The voting regressor is an ensemble meta-estimator that fits several base regressors, which are LASSO, EN, SVR, GBR, and GPR in our experiment. It then combines the predictions of the individual regressors into the final prediction as a weighted average:

\[ H(x) = \sum_{i} \omega_i\, h_i(x) \]

where h_i(x) is the prediction of base model i, ω_i is the weight of each model, and H(x) is the final prediction. We fix the weight hyperparameters according to the performance of each base regressor; the weights sum to 1.

Data Augmentation: Data augmentation is performed using the fast gradient sign method[32] in association with a deep neural network model.[33] Given the input x, the target y, and the training loss J(θ, x, y), the fast gradient sign method generates adversarial examples as

\[ x_{\mathrm{adv}} = x + \epsilon\, \operatorname{sign}\big(\nabla_x J(\theta, x, y)\big) \]

where ε is a small hyperparameter that we set to 0.08, θ denotes the model parameters, and ∇_x is the gradient with respect to x. In our implementation, the generated adversarial examples X_adv are added to the original data X to establish a comprehensive dataset for comparison across machine learning algorithms.

Kernel Density Estimation (KDE): KDE[38] places a kernel function around each data point; the weighted average of these kernels yields a smooth probability density estimate,

\[ \hat{f}(x) = \frac{1}{nh} \sum_{i=1}^{n} K\!\left(\frac{x - x_i}{h}\right) \]

where x is the location at which the density is estimated, n is the number of data points, h is the bandwidth, and x_i are the data points. K is a Gaussian kernel,

\[ K(u) = \frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{u^2}{2}\right) \]

where u denotes the distance from the center of the kernel function.
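A minimal sketch of the KDE comparison between original and augmented samples follows. The "augmented" set here is a stand-in (the original samples plus a tiny perturbation), not our FGSM output, and the two-sample KS test is an extra quantitative check of our choosing.

```python
import numpy as np
from scipy.stats import gaussian_kde, ks_2samp

# Stand-in half-life samples on the reported 0.6-137.2 h range.
rng = np.random.default_rng(6)
t50_orig = rng.uniform(0.6, 137.2, 210)
t50_aug = t50_orig + rng.normal(0, 0.5, 210)   # near-identical distribution

# Gaussian KDE of both sets evaluated on a common grid.
grid = np.linspace(0, 140, 200)
d_orig = gaussian_kde(t50_orig)(grid)
d_aug = gaussian_kde(t50_aug)(grid)
max_gap = float(np.max(np.abs(d_orig - d_aug)))

# Two-sample Kolmogorov-Smirnov test: a large p-value is consistent with
# the augmented samples following the original distribution.
_, p_value = ks_2samp(t50_orig, t50_aug)
print(max_gap, p_value)
```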
Model Training and Evaluation Metrics: All models are evaluated with the R2-score and the root-mean-square error (RMSE). The R2-score is defined as

\[ R^2 = 1 - \frac{\sum_{i=1}^{N} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2} \]

where y_i is the measured T50 time, \( \hat{y}_i \) is the predicted T50 time, \( \bar{y} \) is the average measured T50 time, and N is the total number of samples. RMSE is defined as

\[ \mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2} \]

with all variables defined as above. All models are trained with 5-fold cross validation, splitting the data into training and test sets in an 8:2 ratio. The average and the best 5-fold cross-validation R2-score and RMSE serve as the evaluation metrics. The optimal hyperparameters at which the models achieve their best performance are listed in Table SI-9, Supporting Information.

Experimental Implementation: Data processing is built on the NumPy, SciPy, Pandas, Seaborn, and Matplotlib packages, while all models are implemented with the Scikit-learn and PyTorch packages.
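The evaluation protocol can be sketched as follows, with a plain linear model standing in for 5EML on synthetic data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import KFold

rng = np.random.default_rng(7)
X = rng.uniform(0, 50, (210, 4))              # stand-in features
y = 2.0 * X[:, 0] + rng.normal(0, 1.0, 210)   # stand-in T50 target

# 5-fold cross validation: each fold is an 8:2 train/test split.
# Report the average and the best R2-score and the average RMSE.
r2s, rmses = [], []
for tr, te in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    pred = LinearRegression().fit(X[tr], y[tr]).predict(X[te])
    r2s.append(r2_score(y[te], pred))
    rmses.append(mean_squared_error(y[te], pred) ** 0.5)   # RMSE

print(round(np.mean(r2s), 3), round(max(r2s), 3), round(np.mean(rmses), 3))
```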

Figure 3 presents the best fitting results of the 5EML model for the three data input methods, Data A, Data B, and Data C, indicating that the Data B and Data C input methods are superior to Data A. Although Data B contains only seven features, its R2-score and RMSE reached 0.92 and 5.28, respectively. In addition, our 5EML model obtained the highest score among the six models.

Figure 2 .
Figure 2. Illustration of lifetime prediction by ML.

Figure 3 .
Figure 3. Observed and predicted operational lifetimes for feature-based inputs with 5EML as the algorithm: a) Data A, b) Data B, c) Data C. The x-axis represents the true T50 time and the y-axis the T50 time predicted by the 5EML model. Points lying close to the diagonal, where true and predicted T50 times coincide, indicate better model performance.

Table 1 .
Prediction results of different inputs and algorithms using features from T100 to T80. a) The average prediction accuracy of 5-fold cross validation (the highest prediction accuracy among the 5 tests). b) The prediction accuracy with data augmentation.