Performance Prediction and Experimental Optimization Assisted by Machine Learning for Organic Photovoltaics

The improvement of organic photovoltaics (OPVs) is mainly achieved through the design of novel materials and the optimization of experimental conditions by extensive trial-and-error experiments based on chemical intuition, which can be tedious and inefficient for exploring a larger chemical space. Over the past five years, data-driven methods using machine learning (ML) algorithms and the knowledge of known materials/experimental parameters have been introduced to OPV studies to help build quantitative structure-property relationship models and accelerate molecular design and parameter optimization. Here, recent promising progress based on experimental OPV datasets is summarized. This review introduces the general workflow (e.g., dataset collection, feature engineering, ML model generation, and evaluation) of ML-OPV projects and discusses the applications of this framework for predicting OPV performance and optimizing experimental conditions in OPVs. Finally, an outlook on future work directions in this exciting and quickly developing field is presented.

DOI: 10.1002/aisy.202100261

Machine learning (ML) approaches have successfully resolved the difficulties of modeling the relationships between materials properties and complex chemical/physical factors. [4c,12] In recent years, ML techniques in conjunction with computational chemistry have been used to construct quantitative structure-property relationship (QSPR) models to shed light on OPVs. [4c,13] Efficiency prediction accuracy has been raised to promising levels (e.g., r > 0.7) using different ML algorithms in both fullerene- and nonfullerene-based OPV devices. [12] Some high-performing donor and/or acceptor OPV molecules suggested by ML-assisted virtual screening have been successfully synthesized and experimentally verified. [14] For example, Wu et al. [14a] synthesized three nonfullerene acceptors (NFAs) for PBDB-T-based OPVs with ≈11% experimental power conversion efficiency (PCE), within 0.6% of the PCE predicted by random forest (RF) models.
Besides designing new molecules, the precise tuning of experimental conditions such as donor:acceptor (D:A) ratio, solvent additives, crystallinity, light type, etc. is equally crucial for maximizing the performance of OPV devices. [15] In addition, experimental verification of the predicted leading candidates is also constrained by these factors, as they are closely related to synthesizability, cost, and stability. Stability and cost, as well as performance, can be weighted through the industrial figure of merit (i-FOM) during manufacturing optimization to facilitate commercialization. [8] Moreover, essential factors such as synthetic complexity and cost of active layer (AL) donor polymers can be described quantitatively. [4a,16] These advances make it possible to use ML for discovering relationships between device performance and all relevant experimental parameters. For example, An et al. [17] built an ML-OPV model that takes the deposition densities (DDs) as additional experimental descriptors based on a high-throughput fabrication dataset, which helped to find the optimal experimental conditions for the roll-to-roll-processed PM6/Y6/IT-4F device, achieving the highest PCE to date, 10.2%.
In this review, first, we briefly present the ML workflow including the dataset generation, feature engineering, model construction, and evaluation in OPVs. Then, classifications and descriptions of ML descriptors are discussed in detail. Subsequently, we show examples of ML studies on predicting the performances of OPV molecules. In addition, how ML can be used for optimizing the device fabrication is discussed. Finally, we provide an outlook on ML perspectives for molecular explorations and experimental optimizations of OPVs.

General Workflow for ML-Assisted OPV Studies
The general workflow of ML studies to accelerate the discovery of new OPV molecules and the optimization of experimental conditions is shown in Figure 1, and the main steps are briefly introduced.

Dataset Generation
The definition of a valid dataset is a prerequisite for any data-driven methodology. To predict the macroscopic performance of OPV devices, ML-OPV studies usually rely on a dataset with sufficient experimental data, and, accordingly, the quality and size of this dataset affect the ML performance significantly. [12c,18] Generally, experimental datasets are somewhat biased (toward reporting the best combination of materials and toward improving over the current best), which makes it difficult to evaluate objectively how well an ML model will perform. [19] Furthermore, the OPV dataset obtained from experimental results (Table 1) is usually much smaller than in other ML applications, so data-driven models in OPV cannot be constructed with routine methodologies but require careful validation.
In the earlier stage of OPV research, most devices used fullerene acceptors (FAs) due to their advantages of high electron mobility and isotropic charge transport. [20] Therefore, earlier ML works for OPV were usually based on FA-based datasets. Padula et al. [9a] and Sahu et al. [21] built two FA datasets composed of 249 and 280 D/A pairs of small donor molecules, respectively. Later, Sahu et al. further extended the small-molecule OPV dataset to around 300 data points. [9b,22] Experimental datasets for FA-based OPVs with polymer donors were also constructed by Nagasawa et al. with over 1,000 D/A pairs collected. [23] In the same period, a ternary OPV dataset with 124 data points was constructed by Lee. [24] In recent years, NFAs have become a major focus of research in the development of OPVs. [25] In contrast to the widely used FAs, the optical properties and electronic energy levels of NFAs can be easily tuned, [26] leading to a rapid increase in PCEs for NFA-based OPVs, with values now exceeding 19%. [3] Since 2019, a few NFA-based OPV datasets with around 100-600 data points have been built by several groups. [13a,14a,b,27] Furthermore, Lee et al. [27c] and Hao et al. [27d] collected 135 and 157 experimental data points, respectively, to construct ternary NFA-OPV datasets.
To make the ML prediction applicable for general OPV systems, the diversity of FA- and NFA-based D/A pairs has to be accounted for in one dataset. Lopez et al. [28] reported the Harvard Photovoltaic Dataset (HOPV15) containing 350 experimental data points with mixed FA- and NFA-based pairs. Later, Padula et al. [29] and Zhao et al. [30] constructed other mixed datasets with ≈320 and 566 D/A pairs, respectively, proposing a set of standard criteria for literature searches in an effort to ensure the reproducibility of the data collection and remove any possible bias in the definition of the dataset. Sun et al. [14c,31] constructed donor molecule-based experimental datasets with around 1,700-1,800 data points collected from the literature, in which the acceptor effect was neglected. These searches have been performed manually. Automatic searches attempted in different materials discovery areas [32] proved difficult in this context. The difficulty in accessing experimental data is also reflected in the digitalization of chemical compounds; for example, generating simplified molecular-input line-entry system (SMILES) strings from literature images is imperfect. [33] In a given D/A system, performances at different experimental conditions can also be collected to build datasets for ML training to optimize the experimental parameters. For instance, Du et al. [34] considered ten processing parameters (D:A ratio, concentration, spin speed, solvent additives and their volume, annealing temperature and time for the AL, material in the electron transport layer [ETL], ETL annealing temperature/time) for over 100 processing parameter variations during automated PM6/Y6 device fabrication.

Feature Selection
Features (descriptors) are the variables used in ML models, usually measurable via computational properties or characteristics of an observable phenomenon. The widely used features in ML-OPV studies include structural, electronic, and device parameters, which will be introduced in detail in Section 3. Feature selection methods have been adopted in OPV applications [21,22,27a,30] owing to several reasons such as reducing the dimension and training time of the explored space, simplifying models, and improving accuracy.
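As one illustration of how feature selection can shrink the descriptor space, the sketch below ranks descriptors by the absolute Pearson correlation with the target and skips descriptors that are nearly collinear with one already kept. This is a minimal filter-style sketch, not the procedure of any cited study; the function names, the descriptor names, the redundancy threshold of 0.95, and all numbers are made-up assumptions.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant column carries no linear information
    return cov / (spread_x * spread_y)


def filter_select(features, target, k, max_redundancy=0.95):
    """Keep up to k descriptors ranked by |r| with the target, skipping any
    descriptor that is almost collinear with one already kept."""
    ranked = sorted(
        features,
        key=lambda name: abs(pearson_r(features[name], target)),
        reverse=True,
    )
    kept = []
    for name in ranked:
        if len(kept) == k:
            break
        if all(abs(pearson_r(features[name], features[other])) < max_redundancy
               for other in kept):
            kept.append(name)
    return kept


# Hypothetical descriptor columns and target values (illustrative only).
toy_features = {
    "lumo_offset": [1.0, 2.0, 3.0, 4.0, 6.0],
    "lumo_offset_dup": [1.0, 2.0, 3.0, 4.0, 6.0],  # redundant duplicate column
    "n_unsat_atoms": [2.0, 1.0, 4.0, 3.0, 5.0],
    "constant": [1.0, 1.0, 1.0, 1.0, 1.0],
}
toy_pce = [1.0, 2.0, 3.0, 4.0, 5.0]
selected = filter_select(toy_features, toy_pce, k=2)
print(selected)
```

In this toy case the duplicate column is dropped as redundant, so the second slot goes to the next most informative descriptor.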
The selection of the most informative, discriminating, and independent features, which minimizes the error rate of the model, is crucial for efficient ML tasks. In general, wrapper, [35] filter, [36] and embedded [37] methods are the three main categories of feature selection algorithms; they either score a feature subset or propose new feature subsets as part of the model construction process, using various evaluation metrics. One of the most common approaches used in the OPV field is the recursive feature elimination algorithm, [38] which removes features with low weights and computes feature importance [39] to select the most informative ones. Some authors prefer not to perform any feature selection, but instead use intuition and the size of the dataset to set a fixed number of descriptors known to be important factors related to OPV performance. [14b,29]

Building QSPR Models by ML Algorithms
QSPR models can be built for predicting specific targets (e.g., OPV performance, stability, and synthesizability) in OPVs on the basis of a reasonable dataset, descriptors, and suitable ML algorithms. A variety of ML methodologies have been applied to OPV. They have been presented in greater detail in numerous reviews [4c,12,13b,40] and textbooks, [41] and therefore their mathematical details are not repeated in this work. A group of methodologies including support vector regression (SVR), [42] support vector machine (SVM), [43] and kernel ridge regression (KRR) [44] can be described as an enhancement of conventional linear regression (LR) [45] methods to include nonlinear dependency between the descriptors and observables. Popular algorithms based on decision trees that have been used in OPV include RF [46] and gradient boosting (GB). [47] The simplest possible algorithm, k-nearest neighbor (k-NN), [48] based on the distance between neighboring configurations, has been shown to be surprisingly effective in ML-OPV models. Conversely, artificial neural networks (ANN), [49] which contributed to the popularity of ML in big-data projects, are often less powerful when the dataset is of limited size, as is common in OPV.
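Because k-NN is the simplest of the algorithms listed above, it can be sketched in a few lines. The example below is a minimal illustration that assumes descriptors are already scaled to comparable ranges and uses plain Euclidean distance; the function name `knn_predict` and the toy numbers are hypothetical and do not reproduce any published model.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def knn_predict(train_x, train_y, query, k=3):
    """Predict the target of `query` as the mean target of its k nearest
    training points (Euclidean distance in descriptor space)."""
    order = sorted(range(len(train_x)), key=lambda i: dist(train_x[i], query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


# Made-up, already-scaled descriptor vectors and PCE values (illustrative only).
toy_x = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
toy_y = [2.0, 3.0, 4.0, 9.0, 10.0]

low_cluster = knn_predict(toy_x, toy_y, (0.2, 0.1), k=3)
high_cluster = knn_predict(toy_x, toy_y, (5.5, 5.0), k=2)
print(low_cluster, high_cluster)
```

The prediction depends entirely on the chosen distance, which is one reason the descriptor choice matters so much for small datasets.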

Assessment of Prediction Accuracy
To test the prediction accuracy of ML algorithms, the dataset can be divided into a training set and a testing set; some studies hold out an additional external set for further validation. The optimal hyperparameters of each algorithm can be searched with leave-one-out (LOO), k-fold, or leave-one-cluster-out (LOCO) [50] cross-validation. The prediction accuracy on the training/testing sets is used to evaluate whether the built ML model performs satisfactorily. Different papers report different measures and, sometimes, the algorithms are tuned to optimize different measures.
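The k-fold and LOO procedures mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions: folds are assigned deterministically without shuffling (in practice the data would usually be shuffled first), the model is passed in as a hypothetical `fit_predict` callable, and a trivial mean-value baseline stands in for a real ML algorithm. LOCO is not shown, as it additionally needs a cluster label per sample.

```python
def k_fold_indices(n_samples, n_folds):
    """Yield (train_idx, test_idx) pairs; sample i goes to fold i % n_folds."""
    for fold in range(n_folds):
        test = [i for i in range(n_samples) if i % n_folds == fold]
        train = [i for i in range(n_samples) if i % n_folds != fold]
        yield train, test


def cross_val_rmse(xs, ys, fit_predict, n_folds):
    """Pool the held-out predictions of all folds and return their RMSE.
    Setting n_folds == len(xs) gives leave-one-out (LOO) cross-validation."""
    squared_errors = []
    for train, test in k_fold_indices(len(xs), n_folds):
        preds = fit_predict(
            [xs[i] for i in train], [ys[i] for i in train], [xs[i] for i in test]
        )
        squared_errors += [(p - ys[i]) ** 2 for p, i in zip(preds, test)]
    return (sum(squared_errors) / len(squared_errors)) ** 0.5


def mean_baseline(train_x, train_y, test_x):
    """Trivial stand-in model: predict the training-set mean everywhere."""
    mean = sum(train_y) / len(train_y)
    return [mean] * len(test_x)


# Made-up data (illustrative only).
toy_x = [[0.0], [1.0], [2.0], [3.0]]
toy_y = [1.0, 2.0, 3.0, 4.0]
loo_rmse = cross_val_rmse(toy_x, toy_y, mean_baseline, n_folds=len(toy_x))
two_fold_rmse = cross_val_rmse(toy_x, toy_y, mean_baseline, n_folds=2)
print(loo_rmse, two_fold_rmse)
```

Comparing a candidate model against such a baseline under the same splits gives a quick check that the model has learned more than the dataset mean.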
Model performance metrics such as the Pearson correlation coefficient (r), root-mean-square error (RMSE), mean square error (MSE), coefficient of determination (R^2), mean absolute percentage error (MAPE), and mean absolute error (MAE) can be adopted for regression models, which are defined as follows:

\[ r = \frac{\sum_{i=1}^{N} (R_i - \bar{R})(P_i - \bar{P})}{\sqrt{\sum_{i=1}^{N} (R_i - \bar{R})^2}\,\sqrt{\sum_{i=1}^{N} (P_i - \bar{P})^2}} \]

\[ \mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (R_i - P_i)^2} \]

\[ \mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N} (R_i - P_i)^2 \]

\[ R^2 = 1 - \frac{\sum_{i=1}^{N} (R_i - P_i)^2}{\sum_{i=1}^{N} (R_i - \bar{R})^2} = 1 - \frac{\mathrm{MSE}}{\mathrm{var}(R_i)} \]

\[ \mathrm{MAPE} = \frac{100\%}{N}\sum_{i=1}^{N} \left| \frac{R_i - P_i}{R_i} \right| \]

\[ \mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N} \left| R_i - P_i \right| \]

Here N is the number of data points in the dataset; R_i and P_i represent the real and predicted values; \bar{R} and \bar{P} are the mean values of the real and predicted values, respectively; and var(R_i) is the variance of the sample data.

In the ML-OPV studies, some authors are interested in classifying candidates into well- or poorly performing sets instead of improving the accuracy of quantitative predictions. The classification accuracy is used to evaluate the performance of a classification model. [12a] Accuracy (A) is the fraction of correct classifications, which can be described as

\[ A = \frac{TP + TN}{TP + TN + FP + FN} \]

where T and F represent True and False, respectively, and N and P represent negative and positive, respectively. Hence, TP is the number of positive samples properly classified. Similarly, TN is the number of negative samples that are classified accurately. The term FP represents the number of negative samples that are classified as positive, while FN is the number of positive samples that are classified as negative. In OPV research, an FP could be tolerated (it takes an experiment to discover), while an FN is likely to remain undetected and is a lost opportunity. Conversely, the error rate (ER = 1 - A) represents the fraction of incorrect classifications. Moreover, the precision (P = TP/(TP + FP)) is the fraction of samples classified as positive that are actually positive, and the recall (R = TP/(TP + FN)) is the fraction of actually positive samples that are classified as positive; both complement accuracy in binary classification. They are often combined in a single metric, the F_1 score (the harmonic mean of precision and recall, F_1 = 2PR/(P + R)), which is another indicator of the classification accuracy. Furthermore, the false positive rate (FPR = FP/(FP + TN)) is the fraction of negative samples that are classified as positive.
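The regression metrics named earlier in this subsection (r, MSE, RMSE, R^2, MAPE, and MAE) can be computed directly from paired lists of real and predicted values. The sketch below is a minimal illustration that assumes r is the Pearson correlation coefficient, that R^2 is taken as 1 - MSE/var(R_i) with the population variance, and that no real value is zero (otherwise MAPE is undefined); the function name and the numbers are made up.

```python
from math import sqrt


def regression_metrics(real, pred):
    """Return r, MSE, RMSE, R2, MAPE (in percent), and MAE for paired values."""
    n = len(real)
    mean_r, mean_p = sum(real) / n, sum(pred) / n
    errors = [p - r for r, p in zip(real, pred)]
    mse = sum(e * e for e in errors) / n
    ss_real = sum((r - mean_r) ** 2 for r in real)  # N * var(R_i)
    ss_pred = sum((p - mean_p) ** 2 for p in pred)
    cov = sum((r - mean_r) * (p - mean_p) for r, p in zip(real, pred))
    return {
        "r": cov / sqrt(ss_real * ss_pred),
        "MSE": mse,
        "RMSE": sqrt(mse),
        "R2": 1.0 - mse / (ss_real / n),
        "MAPE": 100.0 / n * sum(abs(e / r) for e, r in zip(errors, real)),
        "MAE": sum(abs(e) for e in errors) / n,
    }


# Made-up real and predicted PCE values (illustrative only).
metrics = regression_metrics([2.0, 4.0, 6.0, 8.0], [3.0, 4.0, 5.0, 8.0])
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Note that r only measures linear association, so a model with a systematic offset can still reach a high r while having a large RMSE; reporting both avoids that blind spot.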
In addition, the receiver operating characteristic (ROC) curve [51] plots the recall R (the true positive rate) against the FPR as the classification threshold is varied. The area under the ROC curve (AUC) can be interpreted as the probability that the classifier ranks a randomly chosen positive sample above a randomly chosen negative one. Hence, the closer the AUC value is to one, the better the classification model is considered to be.
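The classification measures discussed in this subsection can be checked on a small example. The sketch below is a minimal illustration that assumes binary labels encoded as 1 (positive) and 0 (negative) and computes the AUC by direct pairwise comparison of scores, with ties counted as one half; the function names and the data are hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, F1, and FPR from binary labels (1 or 0)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "F1": f1,
        "FPR": fp / (fp + tn) if fp + tn else 0.0,
    }


def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs in which the positive
    sample receives the higher score; ties count as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels, hard predictions, and classifier scores (illustrative only).
labels = [1, 1, 1, 0, 0, 0, 1, 0]
hard_predictions = [1, 1, 0, 0, 0, 1, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]

summary = classification_metrics(labels, hard_predictions)
auc = roc_auc(labels, scores)
print(summary, auc)
```

The pairwise form is quadratic in the number of samples, which is acceptable for datasets of a few hundred devices.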

Description of Main Features
The features that are used to build ML methods are vastly different in terms of accuracy, cost, and availability, and their choice influences both the accuracy of the method and its ability to be applied to a very large number of instances. We present them in this section grouped into three classes corresponding to the different information they relay (Table 2).

Topological Information
The topological information from the molecule itself (molecular weight (W)/volume (V), number of π-electrons in donors (N_elec), number of unsaturated atoms in the main conjugation path (N_atom), etc.) is a kind of feature that requires no experimental measurements or computational evaluations and can therefore be obtained quickly and cheaply. It thus plays an important role in exploring large chemical spaces.
In many cases, the SMILES of molecules are adopted as structural descriptors, which describe the structural species by short ASCII strings with letters and symbols, containing information on atoms, bonds, rings, aromaticity, and branches of a molecule. [52] The development of cheminformatics, which has shown great success in drug discovery using molecular fingerprints, has also greatly promoted ML-assisted material discovery. [53] The fingerprinting procedures that associate a structural fingerprint with each compound have been developed and tested in drug discovery. Similarity measures have been widely adopted in drug discovery with very low-cost methods (e.g., similarity-based regressions). [9a] These accurate and easily accessible machine-readable representations make it possible to build a high-accuracy ML model without computational data that require complicated calculations.
There is comparatively less variation in functional groups in organic electronic materials than in drug-like molecules; hence, there is an opportunity to develop more specific fingerprints in the future. For a specific ML-OPV analysis, the ASCII strings can be converted to fingerprints, each an array of bits of the same length. [14c] It has been reported that fingerprints with lengths above 1,000 bits contain substantial chemical information for ML-OPV predictions. [14c] Simple "1-hot" binary vectors have also been used in the literature as indicator variables, encoding each acceptor molecule as 1 if present and 0 if absent in the ML model. Signature descriptors (a fingerprint whose elements indicate the number of each type of fragment that exists in each molecule) adopted in the ML studies can be generated for numerous materials within a short time and could generate fingerprints containing numerous fragments. [54] By pointing out the contributions of certain fragments to concrete properties, the synthesis of new molecules or targeted modifications of existing materials becomes more approachable and fruitful for material scientists.
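The two simplest encodings described above, a "1-hot" indicator vector for the acceptor and a count-based signature vector for the fragments of a donor, can be sketched as follows. This is an illustrative sketch only: it assumes the fragment list of each molecule has already been produced by a cheminformatics toolkit (no SMILES parsing is attempted here), and the vocabularies, fragment labels, and function names are hypothetical.

```python
from collections import Counter


def one_hot(label, vocabulary):
    """Indicator vector: 1 at the position of `label`, 0 everywhere else."""
    return [1 if label == entry else 0 for entry in vocabulary]


def signature_vector(fragments, fragment_vocabulary):
    """Count how many times each vocabulary fragment occurs in one molecule."""
    counts = Counter(fragments)
    return [counts[fragment] for fragment in fragment_vocabulary]


# Hypothetical vocabularies; a real study would derive them from its dataset.
acceptor_vocabulary = ["PC71BM", "IT-4F", "Y6"]
fragment_vocabulary = ["thiophene", "benzodithiophene", "fluorine", "alkyl_chain"]

# Hypothetical fragment list for one donor molecule.
donor_fragments = ["thiophene", "thiophene", "benzodithiophene", "alkyl_chain",
                   "alkyl_chain"]

acceptor_code = one_hot("Y6", acceptor_vocabulary)
donor_code = signature_vector(donor_fragments, fragment_vocabulary)
device_descriptor = donor_code + acceptor_code  # concatenated input for a D/A pair
print(device_descriptor)
```

Because every element maps to a named fragment or acceptor, a feature importance obtained from such a vector can be read back as a statement about a specific building block.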

Electronic Structure Descriptors
More descriptors such as the electronic structure properties, which do not depend only on the "shape" and (local) nonbonded interactions, are considered in ML-OPV analysis going beyond topological descriptors. These electronic structure descriptors directly influence the microscopic optoelectronic processes in OPV and accordingly their quantitative performances. For instance, the simplest electronic structure parameters include the energy levels of the highest occupied molecular orbital (HOMO) and lowest unoccupied molecular orbital (LUMO), dipole moment (μ), and vertical ionization potential (IP(v)), which can be easily obtained from a ground-state quantum chemical (QC) calculation. [21,30] It is well known that the ability of photoabsorption is affected by the wavelengths (λ_nm) and the associated oscillator strength (f) of the photosensitizer molecules' excited states. To further consider the effect of detailed excited-state processes, excited-state QC calculations can be carried out to provide other descriptors such as the parameters λ_nm and f just mentioned and also the properties of other dark intermediate states (charge transfer ones and triplets). [9b,21] Parameters that affect the charge generation and charge dynamics, such as the hole-electron binding energy (E_bind), the energetic difference between the LUMOs of donor and acceptor (E_LL^DA), the reorganization energy (λ), etc., can be widely used as additional descriptors, which could be obtained from QC calculations. [9c] Polarizability (P) is strongly correlated with the molecular V [55] and there is a significant correlation between λ and molecular size. [56] The corresponding calculations normally require a few central processing unit (CPU) hours, so they are suitable for datasets of ≈10^3-10^4 molecules but cannot be extended to explore a very large number of hypothetical molecules.
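Once frontier orbital energies are available from a QC calculation, pair-level descriptors such as the donor-acceptor LUMO difference follow from simple arithmetic. The sketch below is a minimal illustration; the sign convention (donor minus acceptor), the dictionary layout, the function name, and the energy values in eV are all assumptions made for the example and are not taken from any cited dataset.

```python
def electronic_descriptors(donor, acceptor):
    """Derive simple pair descriptors from HOMO/LUMO energies given in eV.

    Assumed convention: offsets are donor level minus acceptor level.
    """
    return {
        "donor_gap": donor["lumo"] - donor["homo"],
        "acceptor_gap": acceptor["lumo"] - acceptor["homo"],
        "lumo_offset": donor["lumo"] - acceptor["lumo"],
        "homo_offset": donor["homo"] - acceptor["homo"],
    }


# Made-up orbital energies in eV (illustrative only).
toy_donor = {"homo": -5.4, "lumo": -3.6}
toy_acceptor = {"homo": -5.9, "lumo": -4.1}

pair_descriptors = electronic_descriptors(toy_donor, toy_acceptor)
for name, value in pair_descriptors.items():
    print(f"{name}: {value:.2f} eV")
```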

Macroscopic Properties
Experimental parameters can also be used to develop data-driven models for OPV. Examples of parameters available from the experimental setup are concentration, solvents, additives, D:A ratio, annealing temperature/time, thickness, spin speed, and root-mean-square (RMS) roughness of atomic force microscope images. In addition, synthetic complexity [8] and scalability factor [4a] can quantitatively characterize the synthetic cost and synthesizability of novel organic compounds, which is useful in screening high-performing materials with low cost and easy synthesis. However, some important physical quantities in OPVs, such as solubility/miscibility and domain sizes, which play key roles in the film-forming process, are challenging to characterize directly by experiments. It has been found that topological fingerprints encode very well the properties related to lipophilicity (a feature related to miscibility, defined as the partition coefficient between n-octanol and water [57]), which are also crucial for OPV. [30] Also, recent works indicated that solubility can be predicted by ML methods with cheap and easily accessible factors [58] (e.g., Hansen solubility parameters [HSP] could be predicted through ML models by molecular size, electrostatics, charge density, and structural information [27a,59]). The aforementioned experimental parameters/properties, obtained either from direct measurements or from theoretical modeling, have all been adopted as descriptors for building reliable QSPR models in the field of OPV.

ML-Assisted Performance Predictions of OPV Molecules
In this section, recent research advances in ML-assisted prediction of OPV performance are briefly overviewed. As mentioned in Section 2, the experimental OPV datasets are usually sorted into two main categories according to the types of the acceptor molecules. Therefore, here we first summarize different ML works based on FA- and NFA-based datasets, respectively; then, recent ML predictions based on unified FA-/NFA-based datasets are introduced; subsequently, we discuss high-throughput virtual screening (HTVS) works based on the above ML models; finally, several experimental verifications are summarized.

FA-Based Dataset
Sahu et al. [21] identified 13 important structural and electronic structure descriptors (Figure 2) to describe an FA-based dataset composed of 280 small donor molecules by an in-depth understanding of the microscopic mechanism of OPVs. Among them, one is the structural descriptor (number of unsaturated atoms), while others, such as polarizability, vertical ionization potential, and hole-electron binding energy, are related to the ground- and excited-state properties obtained by QC calculations. A range of ML algorithms including k-NN, RF, GB, and ANN were used to construct regression models for predicting PCE. Both tree-based models obtained remarkable prediction power with r at 0.79 and 0.76, respectively, as shown in Figure 2a,b. The importance of descriptors shown in Figure 2c,d suggests that the hole-electron binding energies are the most informative descriptors for GB and RF models. In addition to improving the overall performance of OPV, it is also necessary to meet other requirements for purpose-specific devices, such as high open-circuit voltage (V_OC) for solar-fuel energy conversion and high short-circuit current density (J_SC) for solar window applications. A follow-up work [22] constructed ML models to predict three other important device parameters, V_OC, J_SC, and fill factor (FF). On the basis of the extended experimental dataset of 300 donor molecules, the prediction accuracy (r) of the GB model for V_OC, J_SC, and FF reached 0.67, 0.66, and 0.71, respectively. Padula et al. [9a] explored the possibility of using fewer physical descriptors but including more structural fingerprint descriptors in ML models based on an FA-based dataset composed of 249 D/A pairs with small donor molecules. Utilizing the Tanimoto similarity index between donors and the Euclidean distance of electronic properties, they achieved high accuracies for the ML predictions of PCE.
KRR and k-NN algorithms yield good prediction results (Figure 3) using both topological and electronic descriptors (r = 0.68 and 0.61, respectively). At the same time, Morgan fingerprints (r = 0.68) were found to perform better than Daylight fingerprints (r = 0.59) in KRR models.
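The Tanimoto similarity index used in such similarity-based models is the ratio of shared to total "on" bits of two fingerprints. The sketch below shows the index together with a similarity-weighted nearest-neighbor prediction. It is a simplified illustration, not the model of the cited work (which also combines electronic-property distances); fingerprints are assumed to be given as sets of on-bit positions, and all names and numbers are made up.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto index of two fingerprints given as sets of 'on' bit positions."""
    union = len(bits_a | bits_b)
    return len(bits_a & bits_b) / union if union else 1.0


def similarity_knn(train_fps, train_pce, query_fp, k=2):
    """Similarity-weighted mean PCE of the k most similar training donors."""
    scored = sorted(
        ((tanimoto(fp, query_fp), pce) for fp, pce in zip(train_fps, train_pce)),
        reverse=True,
    )[:k]
    total = sum(similarity for similarity, _ in scored)
    if total == 0:
        # No structural overlap at all: fall back to the unweighted mean.
        return sum(pce for _, pce in scored) / len(scored)
    return sum(similarity * pce for similarity, pce in scored) / total


# Made-up fingerprints (on-bit positions) and PCE values (illustrative only).
toy_fps = [{1, 2, 3, 4}, {1, 2, 3, 5}, {7, 8, 9}]
toy_pce = [6.0, 8.0, 2.0]

pair_similarity = tanimoto(toy_fps[0], toy_fps[1])
predicted_pce = similarity_knn(toy_fps, toy_pce, {1, 2, 3}, k=2)
print(pair_similarity, predicted_pce)
```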
Chen [60] reported an ML study based on the FA-based dataset composed of 1,000 experimental parameters for polymer-fullerene-based OPVs [23] using SVM and RF algorithms with cheap descriptors. The results indicated that the PCE values of OPV devices can be predicted with r > 0.60 using the chemical information of the polymer donors as the only input. In addition, ternary blends (incorporating a third component in the D/A blend) [61] are a promising technology for OPV, but the huge combination space of three components is very difficult to explore. Lee [24] therefore reported an idealized application of ML methods to ternary OPVs that neglects the synergetic effect. A dataset of 124 experimental devices and molecular energy levels was constructed to build ML models for ternary OPVs by adopting RF, GB, k-NN, LR, and SVR algorithms. It was demonstrated that RF models can also perform well for these ternary OPVs, with r ≈ 0.77 between the predicted and real PCEs.

NFA-Based Dataset
A major challenge in the construction of data-driven models for NFA is that they are often tested in combination with a variety of donors, thus complicating the analysis. A convenient approach is that attempted by Mahmood et al., [27a] who built a dataset of 283 experimental solar cells containing only the most common donor poly(3-hexylthiophene) (P3HT). As shown in Figure 4, both classification and regression ML algorithms were implemented with 3,000 molecular descriptors generated from the online chemical dataset. The classification accuracy of k-NN, SVM, RF, and ANN reaches 0.86, 0.88, 0.87, and 0.89, respectively. The regression analysis for PCE predictions suggests that SVM is the best model independent of the number of selected descriptors, whereas LR model is the best one for HOMO and LUMO energy-level predictions.
Wu et al. [14a] proposed a new approach to encode the chemical structure of polymers; hence, they were able to predict the PCE for polymer/NFA heterojunctions. The RF and boosted regression trees (BRT) models showed better prediction accuracy, with r at 0.70 and 0.71 and RMSE at 1.17 and 2.42, respectively. Recently, Wen et al. [13a] constructed QSPR models with molecular descriptors and parameters related to morphology (D:A weight ratio and RMS roughness of atomic force microscope images) for NFA-based OPVs with more polymers chosen as donors. As shown in Figure 5, the constructed ML voting model (a linear combination of the predictions from several regression models), built on structural, electronic, and device descriptors for a dataset containing 351 D/A pairs, showed the best prediction accuracy (r > 0.8). The experimental descriptors adopted in this study improve the accuracy; however, they make it harder to use ML for designing new materials, because additional optimization work is needed to obtain reasonable values of these experimental descriptors for new materials. For example, the authors conducted theoretical inverse optimizations of device specifications (RMS and D:A weight ratio) for screened D/A pairs, which provide feedback to experiments.
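A voting model of the kind described above, a linear combination of the predictions from several regression models, can be sketched as a weighted average. In this minimal illustration the base models are hypothetical stand-in functions with made-up coefficients, and normalizing by the sum of the weights is an assumption of the sketch rather than a detail taken from the cited study.

```python
def voting_predict(models, weights, x):
    """Weighted average of the predictions of several regression models."""
    total_weight = sum(weights)
    return sum(w * model(x) for model, w in zip(models, weights)) / total_weight


# Hypothetical stand-ins for trained regressors; x is a one-element descriptor list.
def model_a(x):
    return 4.0 + 1.0 * x[0]


def model_b(x):
    return 2.0 + 3.0 * x[0]


def model_c(x):
    return 5.0 + 1.0 * x[0]


base_models = [model_a, model_b, model_c]
descriptor = [2.0]

equal_vote = voting_predict(base_models, [1.0, 1.0, 1.0], descriptor)
weighted_vote = voting_predict(base_models, [1.0, 1.0, 2.0], descriptor)
print(equal_vote, weighted_vote)
```

Averaging tends to cancel the uncorrelated errors of the individual models, which is the usual motivation for such ensembles on small datasets.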
Some authors have successfully applied the previously developed model for FA to NFA, such as Lee, [27c] who used the RF approach to predict the PCE for ternary blends, achieving R^2 > 0.80. Another possibility is to focus on predicting a more specialized device characteristic, which is likely to be easier to model. In this vein, Malhotra et al. [27b] used RF, GB, SVR, and ANN algorithms to predict the nonradiative voltage loss. The GB algorithm with HOMO + LUMO + E_g + RDKit/MACCS descriptors delivers the best overall results (r ≈ 0.86). Using only molecular fingerprints as descriptors could produce good results as well (r = 0.78 and 0.73 for GB and SVR).

Mixing of FA- and NFA-Based Datasets
The most well-known OPV dataset containing both FAs and NFAs is the HOPV15 dataset containing data for 350 experimental solar cells, mostly (273) fullerene based, with a greater proportion of polymeric donors (220) and bulk heterojunction experiments (270). [28] In 2019, Paul et al. [62] trained extremely randomized tree models on HOPV15 to predict HOMO energy levels for donor molecules with structural information as inputs, and achieved high prediction accuracy (%MAE of 1.91% and 1.97%). Meftahi et al. [54] used the Bayesian regularized ANN with Laplacian prior (BRANNLP) algorithm to train on the HOPV15 dataset with signature descriptors for donors and 1-hot descriptors (1 if the motif of interest is present in the molecule, 0 if absent) for acceptors. The suggested descriptors have the advantage of being chemically interpretable building blocks that are easily accessible. Their results suggest reasonable prediction accuracy for photovoltaic properties. Moreover, the key motifs that contribute to potential properties have been identified by sparse feature selection, which is consistent with microscopic simulations. It was concluded that the choice of descriptors is much more important than the selection of ML algorithms.
Padula et al. [29] and Zhao et al. [30] also constructed unified FA-/NFA-based datasets with ≈320 and 566 D/A pairs, respectively, considering only bulk heterojunctions and molecular donors and acceptors. The latter work in particular contains a greater portion of NFAs, about 82% of the total, reflecting the greater importance of NFAs in recent years. Padula et al. [29] used several ML models (KRR, Gaussian processes regression [GPR], [63] SVR, and k-NN) for the OPV performance predictions considering electronic properties and structural information, which produced impressive accuracy (r = 0.78). The reliability of the proposed model was verified by recently reported high-performance D/A pairs. Some of the screened promising D/A candidates in the dataset have been reported, again proving the reliability of the predictions. The results suggest that more precise models could be obtained by widening the variety of experimental data. Zhao et al. [30] explored the effect of increasing the number of descriptors and the size of datasets on ML studies for predicting efficiencies of all-small-molecule OPVs. Descriptors used in this work could be classified into structural (fingerprints) and physical ones (energy levels, molecular size, light absorption, and mixing properties). The results indicate that a larger dataset can enhance the ML prediction power to achieve a high prediction accuracy (r = 0.73). Moreover, the addition of some descriptors (excited-state and miscibility properties) cannot significantly improve the prediction accuracy, as they are already encoded in the structural information and thus make a minimal/redundant contribution.
According to the discussions of the above three subsections, it can be found that, once the dataset is sufficiently large, diverse, and homogeneous (of the order of a few hundred experimental data points), a variety of ML algorithms can produce QSPR models with predictive accuracies within a broadly similar range (0.60 < r < 0.85) despite the very different selections of the descriptors.

Screening Novel Chemical Materials/Combinations
To identify new promising OPV materials, it is desirable to explore the vast chemical space for appropriate organic semiconductor molecules, combining existing and new fragments with the almost limitless possibility of further functionalization. To this end, HTVS provides an effective approach to search for potential candidates for a given target property. Through screening a large number of compounds using computer-driven  , and the oscillator strength were investigated for top candidates with predicted PCE larger than 9.0%. [65] If descriptors in ML models are only topological, the number of compounds that can be computed is almost limitless because the evaluation is extremely cheap and the problem is only the generation of the sample materials. However, an accurate ML model requires QC calculations and so the number of molecules that can be tested is of the order of thousands. In this way, the field of HTVS contributes to create a dataset of hypothetical molecules to be evaluated by ML methods.
Recently, well-trained ML models combined with HTVS have been used to accelerate the discovery of OPV molecules. Meanwhile, generative models, which are trained on molecular structures and their corresponding properties to generate new molecules with desired properties, have been adopted to propose novel chemical compounds. [66] This is a form of inverse design, [67] which starts from the desired functionality and ends in chemical space. In ML-OPV studies, researchers have generated novel polymer-based electron donors [68] and NFAs [9c,69] with generative models. The following examples, such as the work reported by Sahu et al., [9b] are more combinatorial (combining potential building blocks in many possible ways). They reported HTVS of 10,170 donor molecules built from 32 fragments and 10 possible arrangements of donors, π-spacers, end-capping units, and acceptors. A gradient boosting regression tree (GBRT) model screened out 1,000 lead candidates with predicted PCE larger than 7.5%. The importance of the building blocks was ranked by Z-scores; as a result, benzodithiophene (BDT), dithieno-benzodithiophene, and naphtho-dithiophene were identified as promising building blocks. Moreover, 126 promising candidates with predicted PCE larger than 8% were screened out by the ANN and GBRT models. Sun et al. [31] identified 15 important building blocks from 1,758 different donor molecules using the RF model with in-house-designed fingerprint descriptors, as depicted in Figure 6. Through screening 18,960 donor candidates, they discovered that 20 molecules have good OPV performance when paired with the Y6 acceptor. In the virtual material library, 6,337 molecules were identified by the GBRT model as promising donor candidates with predicted PCE > 8%, among which 20 candidates were selected to construct OPV devices with the Y6 acceptor.
They built a Y6-based OPV dataset containing 44 OPV devices, aimed at producing a suitable ML model for Y6-based OPVs. The ML results generated by GBRT yield acceptable prediction accuracy, with an RMSE of 2.20 and r of 0.74. Five of the 20 donor/Y6 pairs have a predicted PCE larger than 15% (the largest PCE in the training dataset was 15.58%).
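The combinatorial screening described above can be sketched as two steps: enumerate every arrangement of building blocks, then keep the candidates whose predicted PCE exceeds a threshold. In the sketch below, the fragment names, the per-fragment scores, and the additive `predict_pce` function are all hypothetical stand-ins; the cited studies used trained GBRT, ANN, or RF models on molecular descriptors instead.

```python
from itertools import product

# Hypothetical building blocks with toy per-fragment scores (not real data).
DONORS = {"D1": 3.0, "D2": 4.0}
SPACERS = {"S1": 1.0, "S2": 2.0, "S3": 2.5}
ACCEPTORS = {"A1": 1.5, "A2": 2.0}

def enumerate_candidates():
    """Yield every donor/spacer/acceptor combination in a fixed arrangement."""
    yield from product(DONORS, SPACERS, ACCEPTORS)

def predict_pce(candidate):
    """Toy additive stand-in for a trained regression model (assumption)."""
    donor, spacer, acceptor = candidate
    return DONORS[donor] + SPACERS[spacer] + ACCEPTORS[acceptor]

def screen(threshold):
    """Return candidates whose predicted PCE exceeds the threshold, best first."""
    scored = [(predict_pce(c), c) for c in enumerate_candidates()]
    return sorted((s, c) for s, c in scored if s > threshold)[::-1]

library = list(enumerate_candidates())
leads = screen(threshold=7.5)
print(len(library), "candidates enumerated;", len(leads), "pass the threshold")
```

The size of the virtual library grows as the product of the fragment counts, which is why cheap descriptors and a fast surrogate model matter at this stage.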
As seen in Figure 4, Mahmood et al. [27a] designed >3,000 small-molecule NFAs from various building blocks, with 87 acceptors having predicted PCE > 7.5%. These molecules, built from a number of fragments known in OPVs, were screened first by LR for HOMO and LUMO values and second by SVM regression for predicted PCE values. RF was chosen among alternative ML models to predict HSPs for the screened NFA molecules and to identify the best "green" solvent for them. This example illustrates how ML can contribute in different ways to distinct aspects of OPV design.
HTVS studies that consider the synergistic effect of D/A pairs have also been reported recently. Wen et al. [13a] generated almost 2 million potential D/A pairs constructed from 9,963 donors and 194 experimental NFAs (Figure 5). It should be noted that the descriptors are computed separately for D and A; thus, the computational cost scales as the sum of the numbers of donor and acceptor molecules rather than their product. Among the 1,501 D/A pairs whose predicted efficiencies exceed 14%, only three acceptors, all structurally similar to the Y6 molecule, were found. The performance could be further optimized by tuning experimental conditions. For instance, the predicted PCEs of one of the pairs range from 13.75% to 16.25%, with the best D:A ratio <0.8 and RMS ranging from 0.8 to 1.7 nm.
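The cost argument can be made concrete with the numbers quoted above. The sketch below counts descriptor evaluations when descriptors are computed once per molecule and then concatenated for each pair; the descriptor vectors and molecule names in the small example are hypothetical placeholders.

```python
from itertools import product

def descriptor_evaluations(n_donors, n_acceptors):
    """Expensive per-molecule evaluations versus the number of D/A pairs."""
    return n_donors + n_acceptors, n_donors * n_acceptors

def build_pair_features(donor_desc, acceptor_desc):
    """Concatenate cached per-molecule descriptors for every D/A pair."""
    return {
        (d, a): donor_desc[d] + acceptor_desc[a]
        for d, a in product(donor_desc, acceptor_desc)
    }

# Counts quoted in the text for the study of Wen et al.
evaluations, pairs = descriptor_evaluations(9963, 194)
print(evaluations, "descriptor evaluations cover", pairs, "pairs")

# Hypothetical descriptor vectors for a tiny example.
donor_desc = {"donor_1": [-5.4, -3.5], "donor_2": [-5.2, -3.3]}
acceptor_desc = {"acceptor_1": [-5.7, -4.1]}
pair_features = build_pair_features(donor_desc, acceptor_desc)
```

Only the cheap concatenation and the ML prediction are repeated for every pair, so the expensive step stays at about 10^4 evaluations for almost 2 million pairs.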

Experimental Verifications for Novel D/A Pairs
In recent years, there have been a few successful experimental corroborations of new OPV molecules suggested by ML models, demonstrating the predictive power of ML-guided rational molecular design.
Nagasawa et al. [23] preliminarily selected from the CEP website 1,000 molecules with PCE > 10% predicted by Scharber's model, which could be used as donors in OPVs. Then, with the help of supervised ANN and RF algorithms, 149 molecules with predicted PCEs in the highest region were generated from the dataset of 1,000 molecules. Subsequently, they chose one molecule for experimental validation after considering synthesizability and π-conjugation. However, the experimental PCE (0.52%) fell far short of the predicted values (5.0-5.8% for different side chains); the failure may be caused in part by the poor performance of their ML model and the neglect of processing parameters.
Lin et al. [70] predicted the efficiency of the BO2FC8/m-ITIC-OR-4Cl pair with the RF algorithm, and the predicted result (11.2%) almost perfectly matched the experimental one (11.0%). At the same time, Sun [...] Figure 7a) through images and ASCII strings on a dataset with 1,719 OPV donor materials to classify the materials into "low" and "high" performance (PCE < 3% was regarded as "low" and PCE > 3% as "high"). Ten designed donor molecules were chosen for sandwiched OPV devices as experimental validation (Figure 7b,c). As shown in Figure 7d, eight molecules were assigned to the correct class, meaning that their predicted PCEs are consistent with the experimental values with minor differences. This indicates that, with proper methodologies and solid datasets, favorable ML-assisted predictions are feasible.
In 2020, Wu et al. [14a] reported ML-assisted screening of polymer/NFA combinations with experimental validation. As shown in Figure 8, over 32 million new D/A combinations were generated from the structural fragments of the polymers and NFAs in the dataset. Six new D/A combinations (with PM6 and PBDB-T as donor materials) that combined high PCEs predicted by the RF and BRT models with easy synthesis were selected for experimental validation. Encouragingly, five of the six measured PCEs are close to the predicted ones. Specifically, for the PM6-based systems, the RF-predicted efficiencies of two of the three devices (13.18% and 15.71%) are relatively close to the experimental ones (10.52% and 13.33%), whereas the experimental and RF-predicted PCEs of PM6/Y-ThCH3 differ markedly (6.67% vs. 10.41%). For the PBDB-T-based systems, the RF-predicted PCEs (11.49%, 11.64%, and 11.32%) are all close to the experimental ones (11.02%, 11.08%, and 11.19%). Some deviation between predicted and experimental values is inevitable because the PCEs of OPV materials are sensitive to processing conditions, material purity, the external environment, etc.
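The agreement between prediction and experiment can be quantified directly from the six values quoted above. The sketch below computes the mean absolute error for each donor family; it assumes that predicted and experimental PCEs are listed in the same device order in the text, which the source does not state explicitly.

```python
def mean_absolute_error(predicted, measured):
    """Mean absolute difference between predicted and measured values."""
    return sum(abs(p - m) for p, m in zip(predicted, measured)) / len(predicted)

# RF-predicted and experimental PCEs (%) as quoted in the text; the pairing
# by list position is an assumption.
pm6_predicted = [13.18, 15.71, 10.41]
pm6_measured = [10.52, 13.33, 6.67]
pbdbt_predicted = [11.49, 11.64, 11.32]
pbdbt_measured = [11.02, 11.08, 11.19]

pm6_mae = mean_absolute_error(pm6_predicted, pm6_measured)
pbdbt_mae = mean_absolute_error(pbdbt_predicted, pbdbt_measured)
pbdbt_max = max(abs(p - m) for p, m in zip(pbdbt_predicted, pbdbt_measured))

print(f"PM6 MAE: {pm6_mae:.2f}%, PBDB-T MAE: {pbdbt_mae:.2f}%")
```

Under this pairing, the PBDB-T devices deviate by less than 0.6 percentage points each, whereas the RF model overestimates all three PM6 devices by more than 2 percentage points.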
Hao et al. [27d] tested the reliability of ML models by experimental characterization for the ternary OPV dataset. Their RF model correctly classifies PM6/Y6/IT-M combinations into the high class with PCE > 16% but fails to distinguish high-performance PM6/Y6/IT-4C combinations. Kranthiraja et al. [14b] trained ML models on 566 polymer/NFA pairs with improved accuracy (r = 0.85 for RF). The polymer donors in both datasets (1,203

ML-Assisted OPV Device Optimizations
The complexity of OPVs comes not only from molecular design, including the choice of donor and acceptor, but also from device technologies such as chemical synthesis, photocurrent composition, film forming, and photostability. [71] Although novel materials could in theory be generated by ML models, it is challenging to characterize their synthesizability and experimental processing. Hence, some of the latest ML approaches tackle experimental process optimization, and a brief summary of this trend (Table 3) is included in this review. Exploring synthesizability and optimum experimental conditions for novel materials has attracted increasing interest. [72] Computer-aided synthesis planning (CASP), which combines AI with organic chemistry to improve the likelihood of experimental success, [73] has been applied to drugs and drug-like substances. [74] AI-driven automated synthesis of novel functional materials is attractive but challenging. [75] ML methods are used to explore the performance maximum and stability in the experimental process to ease and speed up the validation of novel chemical candidates. In these automated works, the data are more homogeneous (typically obtained in the same lab by modifying some experimental conditions) and denser. While ML models have made these predictions quite accurate, it should be noted that tuning experimental parameters is conventional engineering practice, well grounded in the field of optimization. [76] All the examples in this section involve only one (or very few) combinations of D/A compounds; automated experiments are much more challenging when the chemical components change. This review focuses only on several applications regarding efficiency, stability, and photocurrent. The fundamental definitions and concepts involved in high-throughput experimental processes can be found in the study by Rodríguez-Martínez et al. [15]
The PCE of a given OPV system is influenced by many processing parameters. Cao et al. [77] applied SVM methods with a radial basis function kernel to fit PCE, J_SC, and V_OC as functions of total concentration, spin speed, and three different donor concentrations, which increased the probability of finding a true optimum when optimizing OPV devices. This approach can be used to locate regions of interest in processing-parameter space for further optimization. Kirkey et al. [78] used the same method to find optimal processing parameters for all-small-molecule OPV devices consisting of the donor molecule DRCN5T and a different NFA (ITIC, IT-M, or IT-4F) in each device. In their analysis, topographical maps relating the processing parameters to PCE were produced to visualize the relationships between parameters and PCE without prior knowledge of the novel materials. When the device parameters are continuous variables, the model can be represented graphically as a cross section of the multidimensional function, and the results are more easily interpreted.
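The idea of reading a fitted model through cross sections can be illustrated with a minimal sketch. The `surrogate_pce` function below is a hypothetical smooth function with an arbitrary optimum, standing in for a fitted regression model; it is not the SVM model of the cited studies, and the parameter values are invented for illustration.

```python
def surrogate_pce(spin_speed_rpm, concentration_mg_ml):
    """Hypothetical smooth surrogate for PCE (%) with a single optimum."""
    speed_term = ((spin_speed_rpm - 2000.0) / 1000.0) ** 2
    conc_term = ((concentration_mg_ml - 20.0) / 10.0) ** 2
    return 10.0 - speed_term - conc_term

def cross_section(model, spin_speeds, fixed_concentration):
    """Evaluate the model along spin speed at one fixed concentration."""
    return [(speed, model(speed, fixed_concentration)) for speed in spin_speeds]

# One 1D slice of the 2D response surface at a fixed total concentration.
section = cross_section(surrogate_pce, range(1000, 3001, 500), 20.0)
best_speed, best_pce = max(section, key=lambda point: point[1])
print("best spin speed on this slice:", best_speed, "rpm")
```

Repeating the slice at several fixed concentrations gives the family of curves from which a topographical map can be assembled.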
An et al. [17] trained an RF regression model on the PCEs of 2,218 devices (fabricated with various compositions and thicknesses), using as descriptors the experimental DD (a feature that contains the thickness and the ratio between components) of each material in the PM6/Y6/IT-4F ternary blends. The predicted PCEs of all combinations of the limited DDs for PM6/Y6/IT-4F are shown in Figure 10a. The 2D graphs converted from the 3D graph clearly present different optimum compositions for different thicknesses (total DD [TDD] values of ≈20, ≈40, and ≈60 μg cm⁻² can be regarded as thin, middle, and thick films, respectively), which is useful for OPV devices with a specific thickness. The predicted PCEs were then computed at higher resolution, so that more datasets were considered (8,000,000 datasets varying in experimental parameters without changing the molecules). Datasets with PCE > 8% were selected, and the absolute DD values were converted to relative compositions (PM6 and Y6 fractions), as shown in Figure 10b. Consequently, the data points at each xy coordinate are shown in Figure 10c, with the best printable formulation (BPF) at PM6/Y6/IT-4F = 1:1.22:0.17. The best efficiency formulation (BEF), at PM6/Y6/IT-4F = 1:1.08:0.27 and 28.25 μg cm⁻² TDD, was also found. In experimental validation with roll-to-roll-produced OPVs, the BPF solution gave a PCE of 8.85% and the BEF solution a higher PCE of 10.2%.
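The sizes of the prediction grids and the conversion between a formulation ratio and per-material DDs follow from simple arithmetic, sketched below. The grid limits come from the Figure 10 caption (up to 50 μg cm⁻² per material at 1 μg cm⁻² resolution); the assumption that each axis starts at one step rather than at zero, and the 0.25 μg cm⁻² step used for the finer grid, are inferences that reproduce the quoted counts and are not stated in the source.

```python
def grid_size(max_dd, step, n_components=3):
    """Grid points when each material's DD runs from `step` up to `max_dd`."""
    levels = int(round(max_dd / step))
    return levels ** n_components

def split_total_dd(total_dd, ratio):
    """Split a total deposition density into per-material DDs for a ratio."""
    parts = sum(ratio)
    return [total_dd * part / parts for part in ratio]

coarse_grid = grid_size(50, 1)     # 50 levels per material
fine_grid = grid_size(50, 0.25)    # 200 levels per material (assumed step)

# BEF quoted in the text: PM6/Y6/IT-4F = 1:1.08:0.27 at 28.25 ug cm^-2 TDD.
pm6_dd, y6_dd, it4f_dd = split_total_dd(28.25, (1.0, 1.08, 0.27))
print(coarse_grid, fine_grid)
print(f"PM6 {pm6_dd:.2f}, Y6 {y6_dd:.2f}, IT-4F {it4f_dd:.2f} ug cm^-2")
```

With these assumptions, the coarse grid matches the 125,000 deposition parameters quoted in the Figure 10 caption and the fine grid matches the 8 million datasets mentioned in the text.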
Figure 10. Predicted PCEs of 125,000 deposition parameters created by ML and extracted prediction data to find the performance trend and thickness tolerance depending on the composition. a) Predicted PCEs of all possible combinations of the PM6/Y6/IT-4F blend (up to 50 μg cm⁻² DD of each material, 1 μg cm⁻² resolution) created by ML (represented in the top-left illustration) and PCE variations in thin (TDD ≈20 μg cm⁻²), middle (≈40 μg cm⁻²), and thick (≈60 μg cm⁻²) films depending on the composition. The PM6 fraction represents the DD of PM6. b) Deposition parameters predicted to be >8% PCE in a 3D parameter space. c) Counts of the devices in a 2D composition map. Reproduced with permission. [17] Copyright 2021, the Royal Society of Chemistry.
The relationships uncovered by ML methods are useful not only for optimizing device performance but also for understanding the physical roles of the parameters, which is crucial for building an interpretable QSPR model. Recently, Rodríguez-Martínez et al. [79] used Bayesian and random decision forest algorithms combined with high-throughput experimental screening to rationalize the photocurrent-composition (J_SC-vol%) space of multicomponent binary NFA-based OPV systems. A combinatorial library of 24,000 data points covering a large number of combinations of thickness and composition (D:A ratio) was generated by blade coating and subsequent imaging of the heterogeneous film features, providing ideal seeds for ML algorithms. As shown in Figure 11, 23 descriptors, including optical and electronic descriptors of the donor and acceptor materials, and nine descriptors built from fundamental magnitudes were selected for 8,000 data points to predict J_SC-vol% of eight D/A blends (each binary contains 1,000 data points in the first run), achieving an impressive accuracy with an MAE of 0.09. Despite the strong predictive power of this ML method, it has some limitations, including month-scale training times, poor extrapolation, and uninformative utilization of descriptors. Thus, they adopted the RF algorithm, trained on eight D/A blends with 23 descriptors, to improve the prediction capability. By extrapolating a single trained RF model to unseen D/A blends, they obtained MAE < 0.20. Nevertheless, the proposed model fails to predict J_SC-vol% of the P3HT/PC60BM binary because highly semicrystalline systems are absent from the training set.
In fact, besides PCE, other device properties such as V_OC, FF, J_SC, and stability [80] can also be modeled and optimized by ML methods. David et al. [71] used supervised learning with a sequential minimal optimization regression model (SMOreg) trained on a dataset containing 1,850 entries of device properties (e.g., substrate type, environmental conditions, light type, temperature, and relative humidity) to predict the stability and initial PCE of OPV devices with r > 0.70. They provide methods for identifying materials with improved stability and top performance. Langner et al. [81] studied the photostability of four-component AL blends (P3HT/PCBM/IDTBR/PTB7-Th or PBQ-QF) using an extended ChemOS software platform, [82] which supports ML-assisted experiment planning with Bayesian neural networks (BNNs) for automated experimentation. The BNNs were trained on 1,041 experiments for the PBQ-QF and PTB7-Th blend systems to predict the photodegradation measured for the individual polymer blends from the compositional distances of the weight fractions of the four components. The resulting correlation coefficients of 0.88 (0.87) for the PBQ-QF (PTB7-Th) blend system indicate good predictive ability of the BNNs. In addition, the results suggest that PBQ-QF blends are more stable than PTB7-Th blends.
a) Bayesian random decision forest; b) Bayesian neural network; c) spin speed, total concentration, donor fraction, concentration of additive; d) solution concentration, donor fraction, temperature and annealing duration; e) D:A ratio, concentration, spin speed, solvent additives and their volume, annealing temperature and time for AL, material in the ETL, ETL annealing temperature and time; f) D:A ratio, thickness; g) weight fraction of materials in AL.
Figure 11. The photocurrent-composition prediction workflow for binary OPV blends is divided into three main blocks. First, the generation of parametric libraries by blade coating on functional devices in the form of lateral gradients in the AL thickness and the D:A ratio. Second, the high-throughput photovoltaic characterization by means of colocal Raman spectroscopy and photocurrent imaging, which serves to correlate the local device performance with the variation of the target features (thickness and D:A ratio). Third, AI algorithms are trained on the experimental datasets using intrinsic fundamental descriptors of the blended materials. In the last step, the AI models are exploited to make predictions of the photocurrent-composition dependence for materials in and outside of the training dataset. Reproduced under the terms and conditions of the CC BY license. [79] Copyright 2021, The Authors. Published by the Royal Society of Chemistry.
Recently, Du et al. [34] presented a high-throughput automated platform, AMANDA Line One, which can reliably screen the photostability and efficiency of PM6/Y6 OPV materials and devices within 70 h (Figure 12a,b). Processing parameters such as D:A ratio, spin speed, annealing temperature, additives, and ETL material (Figure 12c) can be investigated automatically on the platform. Layer deposition (Figure 12d), camera imaging of the layers, UV-vis absorption, current density-versus-voltage (J-V) measurements, and offline degradation testing (Figure 12e) are all automated. The device reproducibility (Figure 12f) in terms of performance and optical properties is relatively good and consistent. As presented in Figure 12g, GPR analysis was performed on ≈100 variations of PM6/Y6 OPV devices (including variations in D:A ratio, concentration, spin speed, annealing temperature/time for AL and ETL, additives, additive volume, and ETL material), with measured optical features as descriptors for predicting the photovoltaic parameters (PCE, V_OC, FF, and J_SC). The results show that the ratio RMSE_train/RMSE_test was between 0.9 and 1.2, implying good generalization. In addition, an RMSE of 10 meV was obtained for V_OC, implying the reliability of AMANDA Line One. GPR analysis was also conducted to predict photostability from optical features for 48 processing conditions with 84 OPV devices. They found that all devices with an initial PCE > 12% and burn-in losses <10% after 50 h in the stability screening featured a high spin speed and an AL annealing temperature <100 °C. Owing to its generality, this kind of automated research line can potentially be extended to other research fields such as PV technologies, photodetectors, and transistors, making self-driven experimentation feasible.
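The generalization check mentioned above compares the error on the training data with the error on held-out data. A minimal sketch of that ratio is given below; the PCE values are hypothetical, and the 0.9 to 1.2 window is simply the range reported in the text rather than a general acceptance criterion.

```python
import math

def rmse(measured, predicted):
    """Root-mean-square error between measured and predicted values."""
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)

def train_test_rmse_ratio(train_true, train_pred, test_true, test_pred):
    """RMSE_train / RMSE_test; values near 1 suggest little overfitting."""
    return rmse(train_true, train_pred) / rmse(test_true, test_pred)

# Hypothetical PCE values (%), used only to exercise the functions.
train_true, train_pred = [12.0, 13.0, 14.0], [12.3, 12.7, 14.3]
test_true, test_pred = [12.5, 13.5], [12.8, 13.2]

ratio = train_test_rmse_ratio(train_true, train_pred, test_true, test_pred)
within_reported_window = 0.9 <= ratio <= 1.2
print(f"RMSE_train/RMSE_test = {ratio:.2f}")
```

A ratio far below 1 would indicate that the model fits the training devices much better than unseen ones, which is the usual signature of overfitting on small experimental datasets.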

Conclusion and Outlook
ML models, as discussed in this review, have been increasingly integrated into the workflow of predicting properties, screening candidates, and optimizing devices for OPV systems. In general, many kinds of ML models are trained on a large dataset (Table 1), collected with considerable effort from the experimental literature, to probe the best regression/classification prediction accuracy. Simple structural information has shown great potential in OPV performance predictions. However, microscopic electronic-structure properties such as energy levels, absorption ability, and molecular size are important for explaining photovoltaic phenomena and, accordingly, for building interpretable ML models. Moreover, a dataset of sufficient size, diversity, and homogeneity is necessary for achieving satisfactory prediction accuracy. Many of the reported works have achieved reasonable prediction accuracies, and a few virtual screening works based on ML models with experimental validation have been conducted with promising outputs. In addition, ML methods have also been applied successfully to the optimization of experimental process parameters. To further improve the performance and stability of new OPV devices, a plethora of works could be conducted with data-driven methods. First, more attention should be paid to the extrapolation ability of ML models for OPVs. Most of the currently proposed models can predict molecules built from units or blocks already present in the dataset. However, extrapolation to completely novel donors, acceptors, and D/A pairs during performance prediction and device optimization is highly challenging but very attractive.
Figure 12. [...] run, fabricated by the robot using the same parameters, with 36 normalized absorption spectra of AL films taken in the center area of each solar cell. g) Workflow for evaluating OPV materials in terms of efficiency and photostability with GPR-based data analysis. The spectra were deconvoluted into different components that were quantified by spectral peak center energy (C), peak width (W), and peak area (A) of PM6 or Y6 ordered (o) and amorphous (a) domain contributions. Reproduced with permission. [34] Copyright 2021, Elsevier.
Moreover, because the small size of the experimental OPV datasets prevents the application of more powerful ML algorithms such as deep learning, the choice of descriptors is crucial in ML-OPV studies for building an ML model with better generalizability and interpretability. However, some computational or experimental parameters directly related to OPV performance (such as solvent type, additives, crystallinity, D:A ratio, processing temperature, etc.) are nontrivial to obtain. Increasing the size of the dataset (especially broadening the variety of NFA structures) may improve the prediction accuracy, but the data are collected by many laboratories, and it is not possible to collect data uniformly on the miscibility/solubility of new materials, which are often available only in small quantities in a single lab. For these reasons, it can be convenient to use computed properties that describe complex properties of OPVs.
Furthermore, inverse design (combining ML and HTVS) of OPV materials with targeted properties in both the donor and the acceptor can be further explored to search a large chemical space. The development of more technical fingerprints for OPV materials could accelerate this process, increasing the chances of finding completely new materials with purpose-specific properties.
In addition, experimental verification of novel D/A pairs screened by HTVS with the help of ML models is highly desirable. The challenging problems in designing novel materials are the estimation of synthesizability and the choice of fabrication protocols. Several reported successful examples already indicate the tremendous potential of data-driven methods in OPV performance predictions. The recent progress in self-driven laboratories that combine ML-assisted virtual screening, automated synthesis, and high-throughput characterization opens a fascinating new window for the future revolution of the material development framework.