Computational functionality-driven design of semiconductors for optoelectronic applications

The rapid development of the semiconductor industry has motivated researchers to accelerate the discovery of advanced optoelectronic materials. Computational functionality-driven design is an emerging branch of materials science that has become effective at predicting materials. By combining solid-state theory and high-throughput first-principles calculations with intelligent algorithms and database development, researchers can now efficiently explore many novel materials on modern supercomputers. Here, we discuss three typical design strategies that can accelerate the discovery of inorganic optoelectronic materials from computer simulations: in silico computational screening, knowledge-based inverse design, and algorithm-based searching. A few representative examples in optoelectronic materials design are discussed to illustrate these computational functionality-driven modalities. Challenges and prospects for the computational functionality-driven design of materials are highlighted at the end of the review.


| The development of optoelectronic materials
Optoelectronic materials are those that convert light into electrical energy, in the form of current flow through a semiconductor, and vice versa. In typical optoelectronic semiconductors, the electrical conductivity is relatively easy to modify, and electrons can be excited from the valence band to the conduction band thermally or optically; research on these materials has therefore progressed rapidly as a very active field, both academically and industrially. 1,2 Semiconductor-based optoelectronic devices have not only advanced information and electronic engineering but also provided a highly effective means of alleviating the environmental crisis through clean energy technologies. [3][4][5][6][7] For example, the massive consumption of non-renewable fuels releases a variety of greenhouse gases into the atmosphere and damages the surrounding environment. 8,9 One promising solution to these issues is to use alternative green energy resources, including solar power, which minimize ecological harm. [6][7][8][9][10] Numerous methods of harvesting solar energy have been proposed, including photovoltaics (PVs) 6,11 and photoelectrochemical (PEC) techniques that generate hydrogen or other fuels. 7,8,10,[12][13][14] Efficient PV or PEC processes rely on semiconductor materials that absorb sunlight to generate electrons and holes and then conduct them to surfaces, where they can participate in chemical reactions or be collected by electrodes to generate an electric current. 10,11,[15][16][17] For the whole process to operate efficiently, the materials used for each step must be optimized. 
Currently, PV techniques depend primarily on silicon as the light absorber, along with several other compound semiconductors, including II-VI and III-V materials, such as CZTS(Se), CdTe, CIGS, GaInP, GaAs, and InP. 11,15,[18][19][20] For PEC technologies, numerous metal oxides and chalcogenides have also been explored, including TiO2, ZnO, Fe2O3, WO3, Bi2S3, Sb2S3, CdS, and Cu2ZnSnS4. 12,13,16,[21][22][23] PEC technology has not yet been commercialized, mostly because of a lack of components that can efficiently harvest visible light and use the photogenerated electrons to reduce chemicals to fuels, for example by splitting water to produce H2. To alleviate the challenges hampering PEC technology, strategies have been devised to integrate active PV devices into PEC units, normally at the expense of high cost and device complexity. 24,25 These hybrid devices, which integrate semiconductors with different band edge levels, are used as photoanodes or photocathodes for water splitting. 24,26,27 The excellent optoelectronic properties of oxides have also been explored extensively in other industrial applications, such as transparent conductors (TCs). [28][29][30][31] Transparent conductive oxides (TCOs) exhibit high electrical conductivity (σ > 10^4 S cm^-1) and transparency to visible light (transmission T > 80%), which are required in many applications, including transparent electrodes for panel displays and PV cell components, low-emissivity windows, and transparent transistors. 30,31 The main reason for the excellent conductivity and transparency is the large optical bandgap, which allows oxides to be doped with varying concentrations of charge carriers, that is, holes and electrons. 28,32 n-type TCOs are already present in many modern devices (eg, indium tin oxide), but p-type TCOs have never been commercialized. 
This is because their carrier mobilities lag an order of magnitude behind those of their n-type counterparts as a result of the localized oxygen 2p character of the valence band. 30,33 This impedes many critical technological innovations, including more efficient organic and thin-film PV cell designs. 28,31,33 In addition to their application in energy harvesting, optoelectronic semiconductors are also essential for information and electronic technologies. For example, many significant breakthroughs have recently been achieved in a range of novel optoelectronic devices, including photodetectors, [34][35][36] light-emitting diodes (LEDs), [37][38][39] and lasers. [40][41][42][43][44] Hybrid halide perovskites, an example of emerging optoelectronic materials, have demonstrated excellent performance in many optoelectronic devices since their use in a high-efficiency PV device was reported in 2012. [45][46][47] The certified power conversion efficiency on the National Renewable Energy Laboratory (NREL) chart reached 25.2% in 2019. 48 The remarkable progress of perovskite solar cells is primarily a result of their unique inherent properties: a high optical absorption coefficient, a bandgap tunable over the complete visible spectrum, a low defect density in solution-processed films, and high carrier mobility with ambipolar charge transport and diffusion lengths in the micrometer range. [49][50][51] Similarly, perovskite LEDs have demonstrated outstanding performance. For instance, a perovskite LED with an external quantum efficiency of ≈21% and a brightness of 0.6 Mcd m^-2 has been reported. 37,38,52 Meanwhile, perovskite-based photodetectors have demonstrated high detectivity, greater than 10^15 Jones, and switching times of a few microseconds, which are similar to those of commercial GaAs devices. 
53,54 In particular, among the heavy-atom compounds designed for X-ray detectors, CH3NH3PbI3 shows a lowest detectable X-ray dose rate of 0.5 μGy_air s^-1 and a sensitivity of 80 μC Gy_air^-1 cm^-2. This sensitivity is four times higher than that achieved in previously commercialized selenium-based X-ray detectors. [55][56][57] The inherent optoelectronic properties of perovskites can also be exploited for lasing: their high absorption cross section and efficient radiative recombination give a large optical gain, particularly at high electron densities. For example, microring lasers built from layered perovskites showed amplified spontaneous emission with a quality factor as high as 2600 and exhibited a net gain four times higher than that of bulk single-crystalline perovskite. 40,58,59 Despite these efforts, the semiconductors explored so far for high-performance optoelectronics still struggle to meet the increasing requirements of specific applications, particularly the need for low-cost, nontoxic, and long-term stable materials. Therefore, it is necessary to accelerate the emergence of a wide range of alternative optoelectronic materials to move future societies toward a renewable, environmentally friendly, and highly information-oriented era.

| Background of optoelectronic materials exploited by first-principles methods
The development of materials science follows the way scientific disciplines and techniques tend to evolve over time. 60,61 The earliest science was purely empirical, as in early metallurgy, advancing through intuition-driven trial and error and lucky accidents. Subsequently came the paradigm of theoretical models and generalizations, in which the development of materials science is guided by various "laws" in equation form, including physical and chemical principles. For the majority of scientific problems, however, the underlying models are too complex to have an analytical solution. The famous physicist Paul Dirac stated in 1929, "The underlying physical laws necessary for the mathematical theory of a large part of physics and the whole of chemistry are thus completely known, and the difficulty is only that the exact application of these laws leads to equations much too complicated to be soluble. It therefore becomes desirable that approximate practical methods of applying quantum mechanics should be developed, which can lead to an explanation of the main features of complex atomic systems without too much computation." 62 With the involvement of computers in recent decades, the next paradigm, based on computational simulation, has become popular. This has allowed numerical solutions of some sophisticated physical and chemical problems using supercomputers. 61,63,64 The quantum mechanics governing the behavior of a material is formulated in the Schrödinger equation, which is intractable if solved directly for a material because the problem involves on the order of 10^23 electrons. Density functional theory (DFT) can be used to describe the behavior of electrons in a material, making it possible to solve the Schrödinger equation through approximate treatments. 
This theory was first proposed in 1964 by Hohenberg and Kohn, who showed that every ground-state property of a many-body system can be treated as a functional of the charge density, which is found by minimizing the energy. 65,66 Without directly solving the many-body Schrödinger equation, these approaches refine an initial guess of the charge density by simple iteration. In the Kohn-Sham formulation, a single-particle potential defines an equivalent non-interacting system whose electron density matches that of the real interacting system. 66 Thus, the ground-state energy can be formally written as a functional of the electron density. Aside from the external potential, the Hartree potential can also be treated exactly, and the remaining complex electron interactions are collected in an approximate "exchange-correlation functional." 67 Although the exact form of this functional remains unknown, the local (spin) density approximation (L(S)DA) and the generalized gradient approximation (GGA), which are based on the electron gas model, and their extensions have yielded reasonably accurate predictions for most classes of materials. 66,68,69 The Perdew-Burke-Ernzerhof and Becke-88 functionals are two of the most widely used types. 68,69 These functionals further improve the agreement of basic physical parameters with experiment. Further corrections model van der Waals interactions, which are particularly important for predicting the geometry of weakly bonded systems, such as low-dimensional layered materials. 70,71 DFT calculations can predict many important physical properties of various optoelectronic materials. Figure 1 outlines some of the key properties (eg, band structure, carrier effective mass, light absorption coefficient, dielectric constant, and exciton binding energy) in semiconductors that have been exploited for various optoelectronic applications. 
[72][73][74][75][76]

FIGURE 1 Key physical properties in semiconductors that have been exploited for various optoelectronic applications

The fundamental optoelectronic property of a semiconductor is its band structure, which usually exhibits wave-vector-dependent band energies. The bandgaps determined from calculated one-electron band structures are usually seriously underestimated. 77,78 Predicting bandgaps accurately remains challenging and often requires additional, more complicated approximations, 79,80 such as GW diagrammatic approaches (expansion of the self-energy Σ in terms of the single-particle Green's function G and the screened Coulomb interaction W) using vertex corrections of the self-energy. 79,81,82 In some cases, the band edge extrema are not located at the sampled high-symmetry points. 83 Thus, it is necessary to analyze the band edge energies over the entire first Brillouin zone. In systems containing heavy elements, spin-orbit coupling remarkably reduces the bandgap if the heavy atoms contribute to the band edge states. [84][85][86] Electron-phonon coupling effects, such as the bandgap reduction induced by zero-point renormalization, have also been considered in predicting the bandgaps of semiconductors. 87,88 High carrier mobility is essential for efficient charge transport, which is one of the basic requirements for PV/PEC and TCO materials. 33,63,89 The carrier mobility is inversely related to the effective mass of the charge carrier. Thus, the general approach to calculating effective masses is to analyze the energy dispersion near the band extrema. Typically, the effective mass tensors calculated by GGA are acceptable for most simple band structures, while serious errors occur for systems with localized states near the band edges. 90 Especially in correlated materials, the localization affects the band dispersions, so the deviations must be corrected using LDA+U and other functionals. 
82,[90][91][92][93] To model the effective masses accurately for some special materials, the splitting of heavy and light bands, as well as spin-orbit coupling, also needs to be taken into account. 90 Dielectric constants are also key factors in reducing carrier scattering and trapping, which are usually induced by point defects in typical insulators. 94,95 The ionic and electronic parts of the dielectric constant can be estimated separately using density functional perturbation theory. 96,97 A finite electric field can be used to calculate the electronic part of the dielectric constant. Usually, including "local field effects" corrects the underestimated values. They can also be used in combination with hybrid functionals or the random phase approximation to improve the theoretical dielectric constants. 98 Optical absorption and emission arise from allowed optical transitions between energy bands in k-space; otherwise, phonon assistance is required. In an optical transition, the spin of the electron is conserved if the spin-orbit coupling is weak. Meanwhile, the magnetic quantum number should change by one or remain unchanged during the transition, while the angular momentum quantum number should change by one. 99 Under these selection rules, the allowed transitions for optical absorption can be estimated from the complex dielectric function. Because the basic Kohn-Sham equations describe a fictitious non-interacting electronic system, further corrections are required to estimate exciton properties accurately. GW approaches combined with Bethe-Salpeter equation (BSE) calculations are usually used to calculate the optical emission and absorption spectra, as well as the exciton binding energies. 79,85,99,100
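The effective-mass analysis described above, fitting the energy dispersion near a band extremum, can be sketched in a few lines. This is a minimal illustration assuming a single parabolic band along one direction; the function name `effective_mass` and the synthetic band data are hypothetical and stand in for energies that would come from a first-principles band-structure calculation.

```python
import numpy as np

# Physical constants (SI)
HBAR = 1.054571817e-34   # J s
M_E = 9.1093837015e-31   # kg
EV = 1.602176634e-19     # J
ANGSTROM = 1.0e-10       # m

# hbar^2 / (2 m_e) expressed in eV * Angstrom^2 (about 3.81)
HBAR2_OVER_2ME = HBAR**2 / (2.0 * M_E) / EV / ANGSTROM**2


def effective_mass(k, energy):
    """Return m*/m_e from a parabolic fit E(k) = c0 + c1*k + c2*k^2.

    k      : wave vectors along one direction, in 1/Angstrom,
             measured relative to the band extremum
    energy : band energies at those k-points, in eV
    """
    c2 = np.polyfit(np.asarray(k, dtype=float),
                    np.asarray(energy, dtype=float), 2)[0]
    # E = hbar^2 k^2 / (2 m*)  =>  m*/m_e = (hbar^2 / 2 m_e) / c2
    return HBAR2_OVER_2ME / c2


# Hypothetical conduction-band data: an exact parabola with m* = 0.25 m_e
k_points = np.linspace(-0.05, 0.05, 11)
band = 1.5 + HBAR2_OVER_2ME / 0.25 * k_points**2
m_star = effective_mass(k_points, band)
```

A downward-curving (valence-like) band gives a negative value with this sign convention, so hole masses are usually reported as the magnitude. Anisotropic or non-parabolic bands need a full tensor or a wider fit, which this sketch does not attempt.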

| Materials exploration strategies using computation
Before going further, we explain the specific concept of computational functionality-driven materials discovery.
Materials science is concerned with three basic attributes: constituents (as encoded by the elements), structures (the crystallographic arrangement of atoms), and properties. The properties needed to design a given optoelectronic device are well known, but a candidate material that has all of these properties is generally hard to identify. Specifically, it is difficult to find a material that meets every property requirement for the desired functionality, especially when two or more of the required properties conflict. For instance, the best PV materials call for strong optical absorption (favored by parallel flat bands with a high joint density of states) and high carrier mobility (requiring dispersive energy bands with light effective carrier masses). 101,102 Analogously, TCOs must balance optical transparency (which needs a large bandgap) against carrier conductivity (which typically requires a small bandgap). 30,31,89 These examples imply that the electronic structure features needed for optimal performance are already understood, but finding a material that has all of these ideal attributes remains difficult. Therefore, the current challenge is to use a highly efficient computational approach to systematically identify a large set of target candidate materials, after which we can further uncover the "discrete material gene" of constituent-structure-property relations in the material attribute space. 60,64,[103][104][105] Using high-throughput first-principles calculations, many candidate materials with diverse properties can now be considered efficiently. We generalize a set of typical design strategies that can be used to speed up materials discovery from computations. 
A direct method of exploring target materials is to calculate the relevant properties by exhaustively screening candidate material databases, provided that these calculations are sufficiently accurate and fast. 63,[106][107][108][109] To identify missing compounds, a supplemental procedure usually takes a given structural framework and lets all possible species occupy the available lattice sites. 74,110,111 A screening funnel with multistep descriptor filters is designed to guide the high-throughput screening toward structures that meet the combined property requirements. 75,[112][113][114] The second approach is knowledge-based material discovery, in which only those compounds close to the maximal functionality are investigated. 101,102,115,116 The exploration of candidate materials typically considers specific chemical constituents and crystal structures, selected on the basis of a set of overarching design principles derived from past theoretical research. This approach, which derives a structure or material from the desired properties, is inverted relative to the approach of starting with a given material space and then calculating its functionalities. The third strategy combines algorithm-based searching with first-principles calculations of material structures. [117][118][119][120][121] Searching for ground-state structures from the compound composition is much harder than a standard simulation, as it demands global rather than local optimization to find a crystal structure that satisfies the given stoichiometry. For a material with a target functionality in particular, finding the structure and composition that provide multiple properties, or a collection of descriptors, is a multi-objective optimization problem. 
Genetic and evolutionary algorithms, Monte Carlo approaches, and particle-swarm methods enable efficient global searching of structural and compositional spaces. [117][118][119][120][121] In the following sections, we detail case studies in optoelectronic materials that use these design strategies. Our goal is not to be exhaustive, but rather to select current examples for discussion of the modalities of computational functionality-driven optoelectronic materials design.

| In silico computational screening
In silico computational screening uses a brute-force method to explore materials with desired functionalities when the elemental constituents and atomic structures are defined and the relevant properties can be estimated with sufficient accuracy and speed. 122 This design approach can be widely used to select semiconductors with established properties and optimal functionality. High-throughput computational strategies combined with hierarchical filtering are often used and can accelerate the testing of candidates within material databases. The filters correspond to the main physical properties required for maximum functionality, for example, effective masses, bandgaps, absorption coefficients, dielectric constants, band edge levels, and thermodynamic stability. When new chemical constituents are explored, candidate structures often have to be built by substituting atoms in known crystal structures. For unreported materials, thermodynamic stability against decomposition into competing phases and dynamic stability against lattice vibration should be further assessed. To illustrate how this technique works, we take several examples in which in silico computational screening was used to identify promising semiconductors for optoelectronic applications.
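The hierarchical filtering described above can be pictured as a funnel that applies one descriptor filter after another and records how many candidates survive each stage. The sketch below is only illustrative: the `Candidate` fields, the candidate names, their property values, and every threshold are hypothetical placeholders, not values taken from any of the studies discussed here.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Candidate:
    formula: str
    e_above_hull: float    # eV/atom, thermodynamic stability descriptor
    band_gap: float        # eV
    electron_mass: float   # in units of the free-electron mass
    hole_mass: float       # in units of the free-electron mass


Filter = Tuple[str, Callable[[Candidate], bool]]


def run_funnel(candidates: List[Candidate], filters: List[Filter]):
    """Apply the filters in order; return survivors and per-stage counts."""
    survivors = list(candidates)
    history = [("initial", len(survivors))]
    for name, keep in filters:
        survivors = [c for c in survivors if keep(c)]
        history.append((name, len(survivors)))
    return survivors, history


# Illustrative thresholds only (assumed, not from the source).
filters: List[Filter] = [
    ("stability", lambda c: c.e_above_hull <= 0.05),
    ("band gap", lambda c: 1.0 <= c.band_gap <= 1.8),
    ("effective mass", lambda c: max(c.electron_mass, c.hole_mass) <= 1.0),
]

# Hypothetical candidates with made-up property values.
pool = [
    Candidate("X1", 0.00, 1.4, 0.2, 0.3),
    Candidate("X2", 0.20, 1.5, 0.2, 0.3),   # fails stability
    Candidate("X3", 0.01, 3.2, 0.3, 0.4),   # fails band gap
    Candidate("X4", 0.02, 1.2, 0.4, 2.5),   # fails effective mass
]

selected, history = run_funnel(pool, filters)
```

Ordering the filters from cheapest to most expensive descriptor means the costly calculations are only run on the few candidates that reach the narrow end of the funnel.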

| Pb-free halide perovskites for solar absorbers
Hybrid halide perovskites have recently attracted wide interest as low-cost and efficient solar cell materials. 45,51 Serious issues with their use are their toxicity and poor long-term stability, which may hinder eventual practical applications. Zhang and coworkers have used an in silico screening approach for the detailed design of Pb-free halide perovskites. 50,74,123,124 They considered both single atomic replacement in normal AMX3 perovskites, modifying A (Cs and nine organic cations), M (Pb/Sn/Ge), and X (Cl/Br/I) individually, and double substitution of 2M by an M + M′ pair in A2MM′X6 double perovskites, where Cs was chosen for the A site, eight metal ions (Na/K/Rb/Cu/Ag/Au/In/Tl) were chosen for the M site, two group VA metals (Sb/Bi) were chosen for the M′ site, and four halogen anions (F/Cl/Br/I) were chosen for the X site. These types of double perovskites are well known as "elpasolites." Figure 2 depicts the screening procedure for A2MM′X6 double perovskites via the cation transmutation approach, which is inspired by the emergent chalcogenide solar absorbers involving transmutations from Cu(In,Ga)Se2 to Cu2M(II)SnSe4 (M(II) = Zn, Ba, etc.). The screening criteria include thermodynamic and crystal stability, band structure, bandgap, strong optical absorption, light electron and hole effective masses, low exciton binding energy, defect tolerance, and toxicity. Several predicted stable compounds with potentially good PV performance have been experimentally synthesized and later characterized with respect to their optoelectronic properties. 126,127 Another computational screening for the hierarchical identification of Pb-free halide compounds for solar absorbers was reported later by Nakajima and coworkers. 
128 They computed compounds of the ABX3 and A2BCX6 forms, where A = MA, FA, or Cs; the metals B/C = Be, B, C, N, Mg, Al, Si, P, Ca, Sc, Ti, V, Cr, Mn, Fe, Co, Ni, Cu, Zn, Ga, Ge, As, Sr, Y, Zr, Nb, Mo, Tc, Ru, Rh, Pd, Ag, Cd, In, Sn, Sb, Ba, Hf, Ta, W, Re, Os, Ir, Pt, Au, Hg, Tl, Pb, or Bi; and the halogen X = Cl, Br, or I. The total number of chemical compositions in the computational database was initially 11025. High-throughput first-principles screening of the candidate perovskites was performed in accordance with screening criteria such as optical absorption, carrier effective mass, and band alignment. They discovered 51 novel low-toxicity halide single and double perovskites that could be used as solar cell absorbers. Yang and coworkers later reported a high-throughput screening of hybrid halide semiconductors with robust stability and desired optoelectronic properties, even beyond the perovskite structure. 129 By selecting 24 crystal structures that include perovskites and non-perovskites, along with several standard organic cations, a large computational materials collection containing 4507 hypothetical hybrid halide compounds was constructed for further calculations. After calculations on this collection, they identified 23 candidates for LEDs and 13 candidates for solar absorbers.

FIGURE 2 A, Constituents of candidates for A2MM′X6 double perovskites (where Cs was chosen for the A site, eight metal ions were chosen for the M site, two VA metals were chosen for the M′ site, and four halogen anions were chosen for the X site). The blue squares mark materials that passed the screening (selected), and the dark ones mark materials that did not pass (abandoned). The optimal nontoxic A2MM′X6 perovskites satisfying all criteria are marked with red checkmarks. Reprinted from Ref. 74 Copyright © 2017 American Chemical Society. B, Elements forming halide double perovskites with the composition A2B+B3+X6. The light blue elements are those for which at least one compound with a double perovskite structure has been produced. The triangular shade tag indicates the site occupied by the element, according to the top index. Reprinted from Ref. 125 Copyright © 2016 American Chemical Society
From experimental studies, recent reviews have reported that at least 350 different double perovskites have been synthesized. 125,130,131 Elements of the periodic table that occur in known halide double perovskites are highlighted in Figure 2, 125 which indicates that seven elements are found as A-site cations, eight elements can occupy the B+ sites, 34 elements are found as B3+ cations, and five elements can occupy the X sites. Multiplying these numbers yields 9520 possible combinations. Although the majority of these element combinations give unstable halide double perovskites, there is still a chance of discovering many novel compounds with various optoelectronic properties. For PV applications, it is necessary to determine whether the optical gaps of these new compounds are suitable for a solar cell. For instance, Cs2AgBiX6 has already been characterized by many groups. 57,126,[132][133][134] However, these materials have wide indirect bandgaps with weak optical absorption, which makes them undesirable for solar cell applications. 135 Another successfully synthesized compound is Cs2InAgCl6, obtained by replacing Bi3+ with In3+. Even though the compound is stable and shows a direct bandgap of 3.3 eV, the lowest direct optical transition from the valence band maximum (VBM) to the conduction band minimum (CBM) is parity-forbidden; thus, the material is still unsuitable for solar cell use. 136 Unfortunately, to date no double perovskite has been experimentally proven to be promising as a highly efficient solar absorber. 135,137 Nevertheless, these synthesized double perovskites have demonstrated excellent performance in X-ray detection and photodetection. 
40,57,127 Many other challenges need to be addressed before these new materials become competitive alternatives, though some optimism persists that lead-free perovskites might be a promising research direction for PV or other optoelectronics.
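The sizes of the composition spaces quoted in this section can be reproduced by direct enumeration. The sketch below is an illustration rather than the procedure of the cited studies. For the 11025-composition database, it assumes that the two metal sites are filled with unordered pairs of the 49 listed elements and that equal pairs correspond to the single perovskites; this counting convention is an assumption that happens to reproduce the quoted total, not something stated in the text.

```python
from itertools import combinations_with_replacement, product

# A2MM'X6 space used in the cation-transmutation screening
a_sites = ["Cs"]
m_sites = ["Na", "K", "Rb", "Cu", "Ag", "Au", "In", "Tl"]
m_prime_sites = ["Sb", "Bi"]
halogens = ["F", "Cl", "Br", "I"]
double_perovskites = [
    f"{a}2{m}{mp}{x}6"
    for a, m, mp, x in product(a_sites, m_sites, m_prime_sites, halogens)
]

# Larger single/double perovskite space (A = MA, FA, Cs; X = Cl, Br, I)
cations = ["MA", "FA", "Cs"]
metals = (
    "Be B C N Mg Al Si P Ca Sc Ti V Cr Mn Fe Co Ni Cu Zn Ga Ge As "
    "Sr Y Zr Nb Mo Tc Ru Rh Pd Ag Cd In Sn Sb Ba Hf Ta W Re Os Ir "
    "Pt Au Hg Tl Pb Bi"
).split()
halides = ["Cl", "Br", "I"]
# Assumed convention: unordered metal pairs, equal pairs = single perovskite
metal_pairs = list(combinations_with_replacement(metals, 2))
n_large_space = len(cations) * len(metal_pairs) * len(halides)

# Site counts read from the experimentally known double perovskites:
# 7 A-site, 8 B+ site, 34 B3+ site, and 5 X-site elements
n_experimental_space = 7 * 8 * 34 * 5

print(len(double_perovskites), n_large_space, n_experimental_space)
```

Enumerating the space explicitly, rather than only counting it, gives the list of formulas that a subsequent screening funnel would iterate over.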

| Semiconductors for PEC applications
Since the discovery of TiO2 as a stable large-bandgap photocatalyst in 1972 by Fujishima and coworkers, considerable effort has been devoted to identifying water-splitting photocatalysts. 12,22,23,25 A total of 16 metal-oxide photoanode compounds within the desirable bandgap range (1.2-2.8 eV) have been reported. 75,138 Nevertheless, the performance of these metal oxides remains far from satisfactory. The vast majority of these low-bandgap photoanodes, which include Fe2O3, WO3, and BiVO4, exhibit well-known intrinsic limitations related to rapid carrier recombination or poor PEC stability. A breakthrough in this area requires the efficient discovery of stable, photoactive, low-bandgap oxide compounds that can be used in practical PEC applications.
Recent high-throughput computational screening studies in this field have predicted a large number of materials for photocatalysis in two specific chemical subspaces: metal oxide perovskites and oxynitrides. 72,113,139 Castelli and coworkers demonstrated efficient screening of metal oxide perovskites using the Gritsenko-Leeuwen-Lenthe-Baerends potential (the so-called DFT-GLLB-SC functional) for bandgap calculations. 72,139 After the stabilities and bandgaps of the oxides and oxynitrides were determined, the band edge positions were estimated empirically by placing the middle of the gap according to the electronegativity. 140 Their work reduced a vast space of 5400 oxide/oxynitride materials to only 15 promising compounds for solar water splitting. The screening recovered known materials among these compounds (eg, AgNbO3, BaSnO3, BaTaO2N, SrTaO2N, CaTaO2N, and LaTiO2N) and predicted several novel ones, including nine oxides and oxynitrides that require further experimental verification. Wu and coworkers presented another screening approach that directly computed the CBM and VBM from first-principles calculations in an aqueous environment for nitrides and oxynitrides with diverse crystal structures. 113 As shown in Figure 3, 68 nitrides, 1503 ternary oxynitrides, and 1377 quaternary oxynitrides were selected as screening candidates and filtered by phase stability, bandgap, and band edge positions. After the multistep screening, many of the known photocatalytic materials in this chemical space were reproduced. A total of 16 novel materials, including two ternary and eleven quaternary oxynitrides, were suggested by this approach as promising photocatalysts.
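The empirical band-edge estimate mentioned above, which places the gap center according to the electronegativity, can be sketched with one widely used form of the rule: the compound electronegativity is taken as the geometric mean of the atomic values, and the gap center sits at that value minus a reference energy of about 4.5 eV on the hydrogen scale. Whether the cited screening used exactly this form is an assumption here. The function names are hypothetical, and the atomic electronegativities and the TiO2 gap in the example are illustrative inputs that should be checked against a tabulated source.

```python
from math import prod

E_FREE_ELECTRON_NHE = 4.5   # eV, free-electron energy on the hydrogen scale (assumed)
WATER_OXIDATION = 1.23      # V vs NHE, O2/H2O potential at pH 0


def compound_electronegativity(atom_chis):
    """Geometric mean of atomic electronegativities (eV), one entry per atom."""
    return prod(atom_chis) ** (1.0 / len(atom_chis))


def band_edges(chi, gap, e_ref=E_FREE_ELECTRON_NHE):
    """Center the gap at (chi - e_ref); return (CBM, VBM) in V vs NHE."""
    cbm = chi - e_ref - 0.5 * gap
    vbm = cbm + gap
    return cbm, vbm


def straddles_water_redox(cbm, vbm):
    """True if the edges bracket H+/H2 (0 V) and O2/H2O (1.23 V) vs NHE."""
    return cbm < 0.0 and vbm > WATER_OXIDATION


# Illustrative inputs for TiO2: one Ti and two O atoms, gap of 3.2 eV
chi_tio2 = compound_electronegativity([3.45, 7.54, 7.54])
cbm, vbm = band_edges(chi_tio2, gap=3.2)
ok_for_water_splitting = straddles_water_redox(cbm, vbm)
```

Because this estimate ignores surface termination, pH, and solvation, it is best treated as a coarse pre-filter, which is consistent with the more direct aqueous-environment calculations described for the second screening study.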
Yan and coworkers designed a multiple-tier screening pipeline to find photoanodes, screening 174 VO4-based ternary vanadates from the Materials Project (MP). 75 The tier-2 screen used the DFT-calculated formation energy above the convex hull in the phase diagram and a coarse estimate of the bandgap to remove unstable materials and wide-gap insulators. These quantities are stored in the MP database and were computed using DFT with Hubbard U corrections for metal cation d-states with the VASP code (Vienna Ab initio Simulation Package). As shown in Figure 3, the bandgaps and band edge energies were evaluated in the tier-3 screen using a hybrid functional (HSE) with a modified mixing parameter and a surface slab model with PBE+U calculations. This step determines whether the VBM fulfills the requirement for the oxygen evolution reaction (OER). To validate the computationally screened photoanode materials, combinatorial sputtering and annealing were carried out, with thin-film synthesis attempted for each desired phase. This joint computational-experimental approach led to the successful discovery of 12 water-splitting catalysts with an appropriate bandgap range, including four recently reported copper-based vanadates and eight additional metal vanadates, which demonstrates that in silico high-throughput computational screening is a prolific technique for broad functional materials discovery.

| Knowledge-based inverse design
Knowledge-based inverse screening does not require the determination of many properties of varied compounds. The design of candidate materials usually takes into account particular chemical constituents and crystal structures that are selected using a set of overarching design principles. These physical or chemical principles are based on empirical rules or theoretical knowledge. 
The key step in the screening procedure is the use of relevant knowledge-based descriptors that guide the needed functionality, which may be defined by the electronic or atomic structural characteristics that can lead to the desired properties.

| Inverse design of highly efficient solar absorption material
The classic Shockley-Queisser limit predicts PV cell efficiency from the bandgap of the material alone, without even distinguishing the type of gap, and therefore provides an unreliable selection criterion for good PV materials beyond an optimal gap of ~1.3 eV. 141 This descriptor has proven to be insufficient, as many materials with this bandgap have low absorption coefficients. 142,143 These failures have been explained as follows: "some direct-gap materials might have a dipole-forbidden (DF) direct transition, which is much lower than the corresponding dipole allowed (DA) direct transition energy." Therefore, direct-gap materials may not always be suitable absorbers. Conversely, indirect-gap materials with a higher DA transition energy might also be efficient. Yu and coworkers partitioned compounds into different optical types and then formulated a generalized selection metric, the "spectroscopic limited maximum efficiency (SLME)," which considers the various optical transitions as well as non-radiative recombination loss. 144 As shown in Figure 4, by applying the SLME method to the I-III-VI chalcopyrite group materials documented in the Inorganic Crystal Structure Database (ICSD), they predicted that high-SLME materials include not only the established thin-film absorbers such as CuInSe2, CuGaSe2, and CuInS2 but also less-studied, experimentally documented materials that are promising PV absorbers. They further used the SLME to identify the general underlying physical principles (ie, "absorption design principles") for high-absorption materials. 101 For example: "in typical binary semiconductors such as GaAs, the absorption at threshold contributed by the transition from the anion p-like VBM to the cation s-like CBM."
However, in the ternary chalcopyrite: "that is, Cu-III-VI2 (III = Ga, In and VI = S, Se), the absorption at threshold occurs mostly from the transitions from (Cu-d + VI-p) near the VBM to (III-s + VI-s) near the CBM." Because the III-s band is rather dispersive, it gives a broad density of states (DOS) and a slow rise in absorption above the threshold. The energy dispersion close to the CBM of chalcopyrites could be decreased, which would also lead to higher absorption, if the III s-levels were localized in a compound with a higher Cu/III ratio. Such localization would increase the DOS at the CBM, leading to a correspondingly stronger optical absorption. Similar s-band narrowing strategies have been demonstrated in low-valence ionic compounds with an ns^2 electron configuration. The SLME study of I-III-VI compounds suggests that: "when containing element III is low-valence Tl1+, the compound exhibits higher SLME values than those containing high-valence Tl3+ because low-valence Tl exhibits a considerable Tl s-orbital density near the VBM and relatively flat p-like bands near the CBM." Because the Cu d-orbitals largely form states close to the VBM, the d-p to s-p transitions result in increased absorption in those Cu-Tl1+-VI materials.
Based on the absorption design principles, the authors chose a "Cu-V-VI (V = P, As, Sb, Bi; VI = S, Se)" system with 30 compounds for further theoretical examination. Group V elements commonly take two oxidation states, 5+ and 3+. It was shown that an equivalent or even higher SLME should be obtained in the material at a low valence state. For high-valence V atoms, high absorption can also be obtained through an increased joint DOS associated with structural localization, which gives rise to comparatively flat bands of V s-orbitals. These localization-induced strong absorbers are well represented by Cu3-V-VI4 compounds, which include many candidates with SLMEs greater than 23% at thicknesses of 200 nm. A recent research update on the promising chalcostibite family of absorber materials for thin-film PV solar cells notes that CuSbCh2 (Ch = S, Se) has a layered chalcostibite crystal structure. 145 The electronic and optoelectronic properties of these absorber materials feature a high DOS as well as strong absorption. In comparison with CuInGaSe2, these compounds have heavier effective masses and serious charge transport problems; it has been suggested that: "the moderate hole density is set by compensation between copper vacancies and interstitials, whereas the charge recombination and carrier dynamics are determined by chalcogen vacancies and Cu-on-Sb antisites". 145,146 The device efficiencies of these chalcostibite PV devices are currently 1%-3% for CuSbS2 and 3%-5% for CuSbSe2 because of the short diffusion lengths of the generated carriers and the additional voltage or current losses at the contact band offset.
FIGURE 4 "Spectroscopic limited maximum efficiency" (SLME) η versus the minimum bandgap Eg for I-III-VI chalcopyrite group materials at a thickness of 0.5 μm. Reprinted from Ref. 144 Copyright © 2012 American Physical Society
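Two ingredients distinguish the SLME metric discussed in this section from a bandgap-only criterion: a penalty when the lowest gap is not dipole allowed, and an absorptivity that depends on film thickness. The sketch below is a hedged illustration only. It assumes the commonly quoted forms f_r = exp(-Δ/kT), with Δ the difference between the dipole-allowed direct gap and the minimum gap, and a(E) = 1 - exp(-2αL) for a film of thickness L with a reflecting back surface. Neither formula is spelled out in the text above, and the function and parameter names are hypothetical.

```python
import math

K_B = 8.617333e-5  # Boltzmann constant in eV/K


def radiative_fraction(e_gap, e_gap_da, temperature=300.0):
    """Assumed fraction of radiative recombination, exp(-delta / kT).

    e_gap    : minimum bandgap in eV
    e_gap_da : lowest dipole-allowed direct gap in eV (>= e_gap)
    """
    delta = e_gap_da - e_gap
    return math.exp(-delta / (K_B * temperature))


def absorptivity(alpha_per_cm, thickness_um):
    """Assumed thin-film absorptivity 1 - exp(-2 * alpha * L)."""
    thickness_cm = thickness_um * 1e-4
    return 1.0 - math.exp(-2.0 * alpha_per_cm * thickness_cm)
```

Under these assumptions, a material whose dipole-allowed gap lies 0.2 eV above its minimum gap has a radiative fraction below 10^-3 at 300 K, whereas a material with a dipole-allowed minimum gap has a fraction of 1. This is the sense in which two materials with the same bandgap can rank very differently.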

| Inverse design of p-type TC materials
p-type TCs require both optical transparency and hole conductivity and are more difficult to design than their n-type counterparts. 31,33,147 The difficulty arises for the following reasons: holes have heavier effective masses than electrons, p-type doping tends to be self-compensated, and localized small polarons can form, which prevents good hole conductivity. 148,149 Heuristic principles governing p-type conductivity in wide-gap materials have been established. 33,148 To obtain a substantial hole density for p-type conductivity, the host phase should be heavily doped with shallow acceptors that have a low ionization energy, and donor compensation should be suppressed. The acceptor concentration is determined by the solubility of the acceptors in the crystal, and the acceptor ionization energy depends predominantly on the nature of the dopant. Hence, several design principles have been developed to address the above problems: 148,150,151 "(a) by non-equilibrium growth or co-doping techniques, it can raise the doping solubility and decrease the formation energy of acceptors; (b), reduce the ionization energy by design of shallow acceptors based on the valence orbital coupling; (c) raise the host VBM to reduce the acceptor ionization, such as present less electronegative anion alloy or make use of metal cations by using closed-shell orbitals regarding anti-bonding hybridization; (d) enhance the formation energy of compensating donors to shift the actual pinned Fermi level toward the host VBM, for example, find appropriate dopants and ambient in order to enlarge the formation energy of donor and overcome the particular doping limit." Peng and coworkers applied the inverse design approach (see the design flowchart in Figure 5) to search for ternary Mn-based p-type TCOs. 29 They considered p-d coupling in compounds that contain high-spin Mn(II) or Fe(III) cations.
Typically, the VBM is an anti-bonding state formed between the cation d-state and the oxygen p-state. This p-d repulsion enhances the valence band dispersion and is hence considered to result in a small hole effective mass, "that is, the antibonding hybridization is t2 for tetrahedral and e for octahedral coordination environments, whereas the nonbonding (heavier-mass) is e for tetrahedral and t2 for octahedral." Meanwhile, the energy level of the VBM is raised toward the vacuum level, which facilitates p-type doping. Additionally, in the high-spin state of Mn(II), the large exchange splitting induces a large bandgap and the internal d-d transitions are spin forbidden, which ensures acceptable transparency in a sufficiently thin film. Following the initial design principles, compounds containing "V(III), V(IV), Ti(III), Mn(III), Fe(III), and Cu(II)" ions were excluded because their bandgaps would mostly be limited by unfilled d-states with relatively low energy. Thus, the initial candidates consist of only 13 ternary Mn-based compounds from the ICSD. Then, the authors used the following metrics for down-selecting p-type TCOs from the initial list (see Figure 5): "(a) thermodynamic stability; (b) a sufficiently wide bandgap, ideally above 3 eV for transparency, or at least a sufficiently low absorption coefficient to allow acceptable transparency of a thin film typically on the order of 100 nm thick; (c) light hole effective mass; (d) absence of spontaneously formed hole-killer defects or hole self-trapping; (e) presence of spontaneously formed hole-producer defects." After the screening step, inverse spinel Mn2SnO4 and spinel Cr2MnO4 were the two remaining materials, which were then subjected to in-depth experimental and theoretical examination. Finally, they identified Li as a suitable acceptor dopant for Cr2MnO4, making it a thermodynamically stable, wide-gap, p-type TCO without hole trapping.
FIGURE 5 A, Flowchart of the "inverse design approach"; B, a checkmark indicates that the property of the respective material is suitable for the specified functionality of a p-type TCO, circles indicate non-optimal but still acceptable properties, and cross-marks indicate that the property is prohibitive. Reprinted from Ref. 29 Copyright © 2013 Wiley-VCH. C, ABX compounds in the V-IX-IV group. The compounds with checkmarks have been documented, those marked by a plus (+) are predicted to be stable, and those with a minus (−) are predicted to be unstable. D, Crystal structure model of TaIrGe. The small black sphere indicates the interstitial vacancy site, which is equivalent to the Ir site. E, Orbital interaction diagram of TaIrGe, representing 18-electron ABX compounds. F and G, The intrinsic defect formation energies in TaIrGe calculated as a function of the Fermi level under the best p-type and n-type doping conditions in the phase triangle. Reprinted from Ref. 152 Copyright © 2015 Springer Nature
In addition to studies centered on doping metal oxides, Yan and coworkers searched for alternatives among the 18-valence-electron ternary ABX compounds, which contain A, B, and X elements in 1:1:1 stoichiometry. 152 As shown in Figure 5, one such set is the 18-electron V-IX-IV compounds, with checkmarks indicating previously known compounds, plus signs (+) indicating previously missing compounds now predicted to be stable, and minus signs (−) denoting previously missing compounds now predicted to be unstable. To set the design principles for finding p-type TCs among 18-electron ABX compounds, the authors identified a number of new structure-property relationships in this family: "first, it was found that all 18-electron cubic half-Heusler ABX compounds containing two transition metal atoms (eg, TaIrSn and ZrIrSb), are semiconductors, whereas the cubic 18-electron half-Heusler compounds with one transition metal atom (eg, AlNiP) are metals; second, it was found that the ABX compounds with heavy X atom tend to be stable in cubic half-Heusler structures, hence can potentially have wide band gaps, for example, the predicted ABX compounds in groups IV-X-IV, IV-IX-V, and V-IX-IV with heavy X elements Sb, Bi, Sn, or Pb are all stable in a cubic structure, (see the crystal structure in Figure 5)." In contrast, they find that: "ABX with light X atoms (ie, O, S, Se, N, P, As, C, Si, and Ge) tends to have non-cubic structure and thus are often metallic, because for light X carry ionic charges and thus strongly repel each other." Thus, the heavy X atom compounds are the targeted choices. Based on the design principles, the authors identified the following sets of wide-gap compounds among the cubic, two-transition-element half-Heusler ABX compounds: "from the IV-IX-V group containing TiCoSb, TiRhSb, TiIrSb, ZrRhSb, ZrIrSb, and HfCoSb; from the IV-X-IV group containing HfPtSn, whereas from the V-IX-IV group containing TaIrGe and TaIrSn." These all-metal-atom half-Heusler insulators are in sharp contrast to many other heavy-atom compounds, which are generally narrow-gap semiconductors or semimetals. As illustrated in Figure 5, the T2(Ta, d) state couples with the T2(Ir, d) state, whereas the E(Ta, d) state interacts with the E(Ir, d) state. This covalent interaction between the two transition metal atoms gives rise to the bandgap opening. After identifying the candidate p-type TCs, the authors further examined the defect formation energies and the transition energy levels of the donors and acceptors.
As shown in Figure 5, the Ge-on-Ta antisite defect in the (-1) charge state is the main acceptor, with a transition level only 0.21 eV above the VBM. Further experimental characterization demonstrated TaIrGe to be a stable p-type TC with a characterized bandgap and remarkable hole mobility at room temperature.
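The 18-electron rule behind this ABX family is easy to verify by counting valence electrons. The following minimal sketch uses a small lookup table of valence electron counts by periodic group, restricted to the elements named in this section; the table and the function name are illustrative and not part of the cited work.

```python
# Valence electrons (s + d for transition metals, s + p for main-group
# elements) for the elements mentioned in this section.
VALENCE = {
    "Al": 3,
    "Ti": 4, "Zr": 4, "Hf": 4, "Ge": 4, "Sn": 4,
    "Ta": 5, "P": 5, "Sb": 5,
    "Co": 9, "Rh": 9, "Ir": 9,
    "Ni": 10, "Pt": 10,
}


def valence_count(a, b, x):
    """Total valence electrons of a 1:1:1 ABX compound."""
    return VALENCE[a] + VALENCE[b] + VALENCE[x]


def is_18_electron(a, b, x):
    return valence_count(a, b, x) == 18
```

For example, TaIrGe (V-IX-IV) gives 5 + 9 + 4 = 18, TiCoSb (IV-IX-V) gives 4 + 9 + 5 = 18, and HfPtSn (IV-X-IV) gives 4 + 10 + 4 = 18, whereas a hypothetical TiNiSb would carry 19 electrons and fall outside the family.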

| Algorithm-based optoelectronic materials searching
Predicting the crystal structure from the chemical composition is more complicated than standard material simulation. A typical first-principles simulation directly predicts properties from the input structure. 153,154 The structural search, however, is an inverse prediction problem that requires global rather than local optimization to find a stable crystal structure that meets the design target. 119,155 Crystal data mining and cluster expansion methods therefore fall outside the scope of structural search techniques. The data mining approach allows extremely fast predictions of structures and properties but relies on databases of known crystal structures; thus, it cannot predict new crystal structures. 110,111,156 Cluster expansion begins with a known structure and predicts the ordering of the atoms at varying temperatures. 157,158 Both approaches require considerable pre-existing structural information. In particular, exploring materials with multiple functionalities is a multi-objective optimization process that seeks the structure and composition providing multiple properties or a collection of descriptors. Usually, this can be done by combining a structural search, such as a genetic algorithm or particle swarm optimization (PSO), with high-throughput property calculations. 120,121,159 The most common procedure using an evolutionary algorithm includes five steps: [118][119][120][121]155,158,160,161 "(1) generation a large number of initial structures; (2) local structural optimization; (3) evaluation of fitness; (4) generation of new structures; (5) check convergence." The candidate structures of each generation are collected into groups after optimization, and the optimization proceeds by producing successive generations.
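The five-step procedure can be illustrated on a toy problem. In the sketch below, a "structure" is a single coordinate x and the "total energy" is an assumed one-dimensional double-well function standing in for a first-principles calculation; the population size, the mutation width, and the stopping rule are arbitrary illustrative choices rather than settings from any structure-prediction code.

```python
import random


def energy(x):
    """Toy double-well landscape (stand-in for a DFT total energy)."""
    return x**4 - 3.0 * x**2 + x


def relax(x, step=0.01, iters=500):
    """Step 2: local optimization by plain gradient descent."""
    for _ in range(iters):
        x -= step * (4.0 * x**3 - 6.0 * x + 1.0)
    return x


def evolutionary_search(pop_size=12, max_gen=30, patience=5, seed=0):
    rng = random.Random(seed)
    # Step 1: generate random initial structures.
    population = [rng.uniform(-2.5, 2.5) for _ in range(pop_size)]
    best_x, best_e, stale = None, float("inf"), 0
    for _ in range(max_gen):
        population = [relax(x) for x in population]       # step 2
        ranked = sorted(population, key=energy)            # step 3: fitness
        if energy(ranked[0]) < best_e - 1e-9:
            best_x, best_e, stale = ranked[0], energy(ranked[0]), 0
        else:
            stale += 1
        if stale >= patience:                              # step 5: convergence
            break
        # Step 4: keep the fitter half, build children by blending two
        # parents and adding a random mutation.
        parents = ranked[: pop_size // 2]
        children = [
            0.5 * (rng.choice(parents) + rng.choice(parents)) + rng.gauss(0.0, 0.5)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return best_x, best_e
```

Calling `evolutionary_search()` returns the deeper of the two wells (near x = -1.30), whereas a single local relaxation started on the right-hand side of the barrier would end in the shallower well near x = 1.13. That difference between local and global optimization is the point made above.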
In the genetic algorithm (GA), offspring structures are typically generated by a mating operator, which cuts a section from each of two parent structures and joins the sections to create an offspring structure. 117,122 The mating operator is effective because it passes local structural motifs from parents to offspring, and the total energy is largely determined by the local structure. As a result, structural traits associated with low energy are passed on in the population, while high-energy traits die out. GAs are not restricted to a search space of preset dimensionality. This is important because the number of atoms in the lowest-energy structure is often not known in advance, and the thermodynamic ground states of particular compositions cannot be presupposed. 162 In comparison, the PSO algorithm begins with a set (swarm) of randomly generated structures (particles) and then moves the particles of the swarm through the solution space. 120 Specifically, each particle is moved by a vector that is the weighted sum of three parts: "the particle's previous shift, the difference between the current position and the best position previously seen by the particle, and the difference between the current position and best position seen by the entire swarm of particles," where the weights of the latter two parts are drawn from a uniform distribution. In this way, the swarm of particles progressively converges toward the global minimum. Oganov et al combined GA with PSO in the form of two variation operators. 163 A particle is either mutated (to mimic a random move) or participates in heredity with its best-known position (to mimic the PSO shift toward that position).
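The three-part PSO move described above can be written down compactly. The sketch below is a generic textbook-style PSO applied to an arbitrary function of a few real variables, not the structure-space implementation of any particular code; the inertia weight `w` and the coefficients `c1` and `c2` are conventional illustrative values chosen for this example.

```python
import random


def pso_minimize(f, dim=2, n_particles=20, n_steps=200,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize f over R^dim with a basic particle swarm."""
    rng = random.Random(seed)
    xs = [[rng.uniform(-5.0, 5.0) for _ in range(dim)] for _ in range(n_particles)]
    vs = [[0.0] * dim for _ in range(n_particles)]
    pbest = [x[:] for x in xs]          # best position seen by each particle
    gbest = min(pbest, key=f)[:]        # best position seen by the swarm
    for _ in range(n_steps):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()   # uniform random weights
                vs[i][d] = (w * vs[i][d]                          # previous shift
                            + c1 * r1 * (pbest[i][d] - xs[i][d])  # pull to own best
                            + c2 * r2 * (gbest[d] - xs[i][d]))    # pull to swarm best
                xs[i][d] += vs[i][d]
            if f(xs[i]) < f(pbest[i]):
                pbest[i] = xs[i][:]
                if f(pbest[i]) < f(gbest):
                    gbest = pbest[i][:]
    return gbest
```

As a usage example, `pso_minimize(lambda p: (p[0] - 1.0)**2 + (p[1] + 2.0)**2)` drives the swarm toward the point (1, -2). In a structure search, the position vector would instead encode lattice parameters and atomic coordinates, and each evaluation of `f` would be a locally relaxed total-energy calculation.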
However, one drawback is that the minimum structure found by these evolutionary algorithms can hardly be proven to be the true global minimum. 119,163 Many types of evolutionary algorithms have been successfully applied in materials design; the most popular codes are "USPEX (Universal Structure Predictor: Evolutionary Xtallography)" by Oganov and Glass and Wang's "CALYPSO (Crystal structure AnaLYsis by PSO)." 118,121,164

2.3.1 | GA discovery of optically allowed direct-gap Si/Ge semiconductors
Despite many years of research, Si-based materials with a strong optical transition at the band minimum remain scarce. 165,166 There has been some interest in combining optical and electronic functions in the same wafer while retaining the extraordinary know-how developed for Si. The limitation of Si is that it is an indirect-gap material. The optical transition at the band edge involves a phonon to conserve crystal momentum; hence, photon absorption remains a fairly unlikely event, requiring rather thick films. 167 An effective approach for converting Si-based materials into direct-gap optical materials with strong light-absorbing/emitting properties may involve combining the band-structure effects of nanostructuring and alloying. 166,168 Zunger and coworkers used a combination of GA and band-structure calculations to explore Si/Ge materials with the explicit target of a direct-gap, optically active material. [168][169][170] The GA emulates biological evolution by constructing and improving a population of superlattices according to their relative fitness for light emission at the band edges.
117,159 New candidate superlattice "offspring" are generated from the preceding population by exchanging random sets of layers between two parent superlattices ("crossover") and by randomly changing Ge layers into Si layers, or the reverse, in a parent ("mutation"). In each generation, the least fit individuals of the previous population are replaced by the offspring, thereby driving the population as a whole toward a global optimum through survival of the fittest. To determine the fitness, that is, the optical transition probability, the authors computed the dipole matrix element for the transition between the VBM and CBM at Γ for each superlattice. The GA search revealed a number of layer sequences with high dipole matrix elements, for example, "(SiGe2Si2Ge2SiGen) grown on (001) Ge rich substrates with Si1-xGex (x ≥ 0.4)," which shows a strong dipole-allowed direct transition at the band minimum. To further understand the enhanced inter-band coupling, the CBM orbital character of the superlattice was unfolded in the Brillouin zone, as analyzed in Figure 6. This revealed that the superlattice CBM is a hybridization of both Δ and Γ states. The CBM is localized over the Si-rich part, whereas the VBM is delocalized across the Ge part.
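The crossover and mutation operators described for the Si/Ge superlattices act naturally on a list of layer labels. The sketch below shows one plausible encoding, assuming each superlattice is a Python list of "Si" and "Ge" entries of fixed length; the fitness evaluation (the dipole matrix element in the cited work) is deliberately left out because it requires a band-structure calculation.

```python
import random

FLIP = {"Si": "Ge", "Ge": "Si"}


def crossover(parent_a, parent_b, rng):
    """Copy parent_a and replace a random contiguous block of layers
    with the corresponding layers of parent_b."""
    cut1, cut2 = sorted(rng.sample(range(len(parent_a) + 1), 2))
    return parent_a[:cut1] + parent_b[cut1:cut2] + parent_a[cut2:]


def mutate(sequence, rng, rate=0.1):
    """Flip each layer between Si and Ge with probability `rate`."""
    return [FLIP[layer] if rng.random() < rate else layer for layer in sequence]
```

A generation step would then draw parents from the current population, apply `crossover` and `mutate`, score each child, and replace the least fit individuals, as in the loop described in the text.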
Further studies were carried out by Zhang and coworkers, who combined GA with a semiempirical pseudopotential Hamiltonian, designed to describe the electronic structure, to explore thousands of coaxial core/multishell Si/Ge nanowires (NWs) with [001], [110], and [111] orientations. 171 They searched for the "magic sequences" of the core and of the Si/Ge multishells that offer a direct bandgap and a high oscillator strength. The preliminary calculations show that [001]- and [110]-oriented NWs have direct bandgaps, in contrast to [111]-oriented NWs, where the bandgap is indirect. Typically, the VBM is folded from the Γ8 state of bulk Ge; thus, it is always located at Γh of the NW Brillouin zone. In contrast, the CBM is folded from the six Δe-valley states of bulk Si, so its position in the NW Brillouin zone depends on the NW orientation. Thus, "For [001]-oriented or [110]-oriented NW, four or two of the six Δe-valley states are involved in the confinement plane and folded to Γe forming a direct bandgap; in contrast, no Δe-valley state is involved in the confinement plane of [111]-oriented NW, and thus all the Δe-valley states are folded to the off-Γe forming an indirect bandgap." Figure 6 illustrates the evolution of the fitness during the GA search for [001]-oriented Ge core/multishell NWs, which shows that the best individual candidates are found within fewer than 50 generations and remain the best up to the 250th generation, even though new individuals continue to appear.
The best GA results suggest that: "the [001]-oriented configuration has a composition formula [Ge5][Ge1Si2Ge1Si2Gen] (n is the number of rest shells), which exhibits two orders of magnitude enhancement of oscillator strength (1.9 × 10^-1) by comparison with same-size single-shell (Ge-core)(Si-shell) NW (4.5 × 10^-3), and three orders of magnitude enhancement by comparison with same-size pure Si (4.1 × 10^-5) and Ge (2.2 × 10^-4) NW; while for [110]-oriented NWs, it was found two best configurations with comparable significantly enhanced oscillator strength: that is, [Ge5][Si1Ge3Si2Gen] and [Ge5][Ge4Si2Gen]." As a consequence, the calculated absorption spectra of the best GA NWs are more than 1 order of magnitude stronger than those of the other NWs.

| Evolutionary algorithm search for Sn-based oxides optoelectronic materials
Sn-based oxide materials, such as SnO2 and SnO, have attracted interest because of their wide range of possible applications in optoelectronics, PVs, and photocatalysis. 28,31,147,172 The tin oxides differ in both crystal structure and oxidation state, which leads to divergent electronic properties: "SnO2 is an n-type semiconductor (bandgap of 3.6 eV) which can be used as a TCO, whereas SnO is a p-type semiconductor with indirect (0.7 eV) band gaps (direct band gap of 2.7 eV), and thus has been applied as thin-film transistors." 30,31,147 These tin oxides can easily be transformed into each other by changing the oxidizing or reducing conditions, which gives rise to many intermediate phases between SnO2 and SnO. 173 Several metastable structures with compositions SnxOy have been reported, and it is unclear whether those previously suggested hypothetical and experimental structures are global-minimum ground states.
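The intermediate SnxOy phases can be indexed by the fraction of tin present as Sn2+. Assuming that only Sn2+, Sn4+, and O2- ions occur, charge balance fixes the number of each tin species, and the fraction R = [Sn2+]/([Sn2+]+[Sn4+]) follows directly from the formula. The short sketch below works this out; the function name is illustrative.

```python
from fractions import Fraction


def sn2_fraction(x, y):
    """Fraction of Sn2+ among all Sn in SnxOy, assuming Sn2+/Sn4+/O2- only.

    With a Sn2+ and b Sn4+ ions: a + b = x and 2a + 4b = 2y,
    so b = y - x and a = 2x - y.
    """
    n_sn4 = y - x
    n_sn2 = 2 * x - y
    if n_sn2 < 0 or n_sn4 < 0:
        raise ValueError("composition lies outside the SnO-SnO2 range")
    return Fraction(n_sn2, x)
```

This gives R = 1 for SnO, R = 2/3 for Sn3O4, R = 1/2 for Sn2O3, and R = 0 for SnO2, so the compositions between SnO and Sn2O3 span R from 1 down to 0.5.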
Wang and coworkers carried out a theoretical search for stable Sn-O compounds using the evolutionary algorithm for global optimization implemented in USPEX. 164 They started their search from SnO, that is, the ratio R = [Sn2+]/([Sn2+]+[Sn4+]) = 1, and then steadily reduced the concentration of Sn2+ to R = 0.5, corresponding to Sn2O3. The predicted ground-state intermediate compounds are layered and are built from symmetric SnO6 octahedra together with distorted litharge-like SnO units; see Figure 7. Each predicted SnxOy structure is built from a combination of three basic layer types: distorted SnO, Sn3O4, and Sn2O3. Single SnO sheets of Sn5O6 and Sn7O8 could be peeled off and reconstructed into a new bulk layered phase. These observations suggest that the new SnxOy crystal frameworks arise from stacking the basic building blocks in an arbitrary order. The formation enthalpies calculated relative to the energies of α-SnO and rutile SnO2 imply that Sn2O3, α-Sn3O4, Sn4O5, Sn5O6, Sn7O8, Sn9O10, and Sn11O12 are thermodynamically more stable than physical mixtures of the parent phases. Theoretical X-ray diffraction (XRD) spectra are in agreement with the available experimental powder XRD data, which supports the structures proposed by the evolutionary algorithm.
Inspired by the small ionization potential of Sn(II), Xu and coworkers studied a few stoichiometric Sn(II) phosphates, because the phosphates contain very stable phosphate polyanions that might provide good transport along the anion backbone. [175][176][177] Zintl-type compounds, which can enable the realization of a large number of materials, were selected, and they additionally offer sufficient freedom for doping. The low valence state of Sn suggests that Sn s-states may contribute to the VBM, which could provide additional connectivity between the phosphate groups and improve p-type conduction. In addition, metal vacancies would act as p-type dopants in these compounds, and the high stability of the polyanions may prevent the formation of compensating defects.
The search for new compounds was based on first-principles DFT calculations combined with the PSO algorithm implemented in the CALYPSO code. The PSO technique avoids the use of complicated variation operators and thus differs from the GA technique. 118,120 It requires only the chemical composition at a given stoichiometry to predict stable or metastable structures at given external conditions, based on an efficient global minimization of the free energy surface coupled with total-energy calculations. A specially devised geometrical parameter that allows similar structures to be eliminated during the evolution has been implemented to increase the efficiency of the structure search. Additionally, the symmetry constraints imposed during structure generation enable diverse structures to be realized and appreciably reduce the search space and the number of optimization variables, thereby accelerating the global convergence. The PSO algorithm has been successfully applied to many known systems: "for example, elemental, binary, and ternary compounds with various chemical bonding settings such as metallic, ionic, and covalent." This success demonstrates the reliability of the methodology and illustrates the promise of PSO as a major method of crystal structure prediction.
The authors initially investigated a group of compounds, SnnP2O5+n (n = 2, 3, 4, 5). Except for a known metastable α-phase Sn2P2O7, all of these predicted intermediate compounds are stable with respect to decomposition into the binary components SnO and P2O5 in the convex hull plot; see Figure 7. The sizeable negative formation enthalpies arise primarily from the PO4 tetrahedra (connected through a corner-shared O in the pyrophosphates) and the interstitial Sn bonded to the O of the phosphate anions, indicating their substantial chemical stability. The authors then searched for the ground-state structure of a better-connected Sn(II) phosphate (n = 1), that is, SnP2O6. Choosing a stoichiometry that is expected to favor longer phosphate chains allows the search to target compounds with better connectivity. Electronic structure calculations reveal that these phosphates have large bandgaps above 3.2 eV, which means that they are transparent to visible light. As shown in Figure 7, several phosphates exhibit relatively low hole effective masses (2-3 m0), comparable with the electron masses. This suggests promising bipolar conductivity, depending on doping. The combination of a relatively large bandgap, low carrier masses, and high chemical stability suggests feasible optoelectronic applications for these Sn(II) phosphates, for example, as p-type TCs.
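Stability "in the convex hull plot" can be made concrete for a pseudo-binary system such as SnO-P2O5. The sketch below assumes that each phase is described by a composition fraction between 0 and 1 and a formation enthalpy per atom measured relative to the two end members, builds the lower convex hull, and reports how far a given phase lies above it. The numbers in the usage note are invented for illustration and are not the computed values of the cited study.

```python
def lower_hull(points):
    """Lower convex hull of (composition, enthalpy) points.

    Compositions are assumed to be distinct. Uses the monotone-chain
    construction: a point is dropped when it lies on or above the line
    joining its neighbours.
    """
    hull = []
    for p in sorted(points):
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)
            if cross <= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull


def energy_above_hull(x, enthalpy, hull):
    """Vertical distance of a phase from the hull at composition x."""
    for (x1, y1), (x2, y2) in zip(hull, hull[1:]):
        if x1 <= x <= x2:
            on_hull = y1 + (y2 - y1) * (x - x1) / (x2 - x1)
            return enthalpy - on_hull
    raise ValueError("composition outside the range spanned by the hull")
```

With the hypothetical points (0, 0), (0.25, -0.10), (0.5, -0.30), (0.75, -0.05), and (1, 0), only the end members and the phase at 0.5 lie on the hull; the phase at 0.25 sits 0.05 above it and would be classed as metastable, in the same sense as the α-phase mentioned above.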

| Bridging the gap between computational prediction and experimental synthesis
The preceding discussion covers the overall design infrastructure and basic workflow for the computational functionality-driven design of optoelectronic materials. In some cases, such as the in silico screening of semiconductors, standard DFT may be used to predict many of the important properties of a material before it is selected for synthesis. 75,114,129,139 Accurate and efficient computational methods for diverse classes of compounds, especially transition metal oxides with strong correlation effects and self-interaction errors, are urgently needed to increase the reliability of the high-throughput computations that form the vital step in the screening. 66,79,92,100,178 This kind of simulation procedure is closest to a "virtual screening" approach, in which the screening calculations are a cheap and scalable proxy for testing. In other scenarios, however, a regrettable outcome may be the theoretical prediction of new materials that are impossible for experimentalists to make. 125,137,179 It is therefore important to evaluate the decomposition energy of new compounds with respect to the known phases and to assess their dynamic stability from the phonon spectrum. There nonetheless remains a great need to emphasize reliability and to develop better ways to bridge the gap between computational prediction and experimental synthesis. Hence, computational materials design has to apply strict filters to prevent the prediction of unstable compounds. 179 For instance, stability can be examined by constructing hypothetical competing phases derived from known structure types in materials databases. These extra calculations can narrow down the list of interesting materials to those that are most likely to be stable and realizable. Another solution is to use a global minimization method.
Rather than confining the minimization to a fixed structure type, one can optimize candidate structures that are not constrained by symmetry or by a computational cell with a fixed number of atoms. 180 This approach seems reliable, but it is very costly to search for the ground state among all possible competing phases. It is important to be aware of the bias in the calculated results and to judge what level of accuracy can be relied on. Sometimes metastable materials with a negative decomposition enthalpy of small magnitude (<100 meV atom^-1) can still be made under off-equilibrium conditions. 153,158,181 In such cases, one can still use DFT qualitatively to guide experiments toward practical results, especially for illustrating exotic physical or chemical trends.
We have to point out that even with a computational screen step, the possible final results are bounded by basic decisions regarding the chemical space and structure type delimited by past research studies. This method for exploring new materials with special functionalities may sometimes lead researchers away from the promising direction. For example, the hybrid halide perovskite technique as a major recent progression in PVs has quickly developed from an initial 10.9% efficacy to 25.2% in 7 years. 45,48 Such compounds were more or less neglected in previous computational studies, which in turn had focused separately on inorganic or simple organic materials. The preliminary slow improvement in these perovskites was to some extent due to the early analysis direction, which actually followed semiconductors with tetrahedral skeletons and even gradually increased its efficiency through a time-consuming feedback contour, rather than checking a broad variety of compounds to locate those that fulfill at least the expected properties. It is thus imperative to improve the traditional approaches for the comprehensive collection and recognition of materials.
Promising, well-designed materials depend on close, iterative interaction between theory and experiment. For instance, experiment often has to tell theory which of its original assumptions, for example, the measured parameters or the types of possible defects, hold under processing conditions. Theory can alert experimentalists to previously unsuspected structures or compositions, as well as to promising applications. This cooperation in materials discovery forms a mutual feedback loop. Such integrated computational and experimental approaches are required for high-throughput materials design and for interpreting discoveries made by trial and error. For example, combining algorithm-based structure searching with experimental diffraction data has proved a successful technique for solving crystal structures. [182][183][184] In addition, known crystal structure information is crucial for understanding the performance of materials, especially when experimental evidence is insufficient. Therefore, combining computational prediction with experimental determination of structure and properties can be a mutually beneficial approach.

| Computational design of semiconductors beyond crystal structure
Computational functionality-driven material design can be used to determine not only the stability and properties of perfect crystals, but also those of realistic complex structures such as lattice defects, reconstructed surfaces, and interfaces. 185,186 The band energy levels near a particular surface and the band offsets at interfaces are crucial for the design of various devices. 181,186,187 This is particularly true for the rapidly growing family of two-dimensional (2D) materials, which can appear in different polymorphs that are sometimes sufficiently close in energy to be experimentally observable. [188][189][190] For these 2D materials, the structural differences between polymorphs can critically affect the properties of the compound. Computational tools based on DFT provide the predictive power to enable the computational discovery, characterization, and design of 2D materials and also provide the necessary input and guidance for experimental studies. [191][192][193][194] For example, based on the structures and electronic properties of monolayers, Zhang et al proposed a procedure for designing type-II heterojunctions from a large-scale screening of layered compounds, effectively providing spatial separation of holes and electrons. 195 These computational strategies allow the exploration of possible routes to enhance the photocatalytic activity of 2D materials through mechanical stress, bias potential, carrier doping, and pH. 195,196 The practical simulation of surfaces or interfaces can be complicated because a multitude of configurations is possible in terms of surface orientation and reconstruction. Exploratory studies that search these atomistic configurations without prior experimental knowledge require global structure optimization techniques. 
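The idea of screening layered compounds for type-II heterojunctions can be sketched as a simple band-edge comparison. The snippet below is only an illustration under stated assumptions: band edges are given relative to a common vacuum level, interface effects (charge transfer, strain, hybridization) are ignored, and the monolayer names and energies are hypothetical placeholders rather than values from the cited work.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import List, Tuple


@dataclass(frozen=True)
class BandEdges:
    name: str
    vbm: float  # valence band maximum, eV relative to vacuum
    cbm: float  # conduction band minimum, eV relative to vacuum


def classify_alignment(a: BandEdges, b: BandEdges) -> str:
    """Classify the alignment of two isolated layers from their band edges."""
    # 'lo' has the lower VBM; on a tie, the wider-gap layer is taken as 'lo'.
    lo, hi = sorted((a, b), key=lambda m: (m.vbm, -m.cbm))
    if hi.vbm >= lo.cbm:
        return "type-III"  # broken gap
    if hi.cbm <= lo.cbm:
        return "type-I"    # straddling: both edges of 'hi' lie inside the gap of 'lo'
    return "type-II"       # staggered: electrons and holes prefer different layers


def type_ii_pairs(layers: List[BandEdges]) -> List[Tuple[str, str]]:
    """Return (electron layer, hole layer) names for every type-II pair."""
    pairs = []
    for a, b in combinations(layers, 2):
        if classify_alignment(a, b) == "type-II":
            lo, hi = sorted((a, b), key=lambda m: (m.vbm, -m.cbm))
            # Electrons relax to the lower CBM (lo); holes to the higher VBM (hi).
            pairs.append((lo.name, hi.name))
    return pairs


# Hypothetical monolayers; the numbers are placeholders, not computed data.
layers = [
    BandEdges("layer_A", vbm=-6.0, cbm=-4.2),
    BandEdges("layer_B", vbm=-5.4, cbm=-3.6),
    BandEdges("layer_C", vbm=-5.8, cbm=-4.4),
]
print(type_ii_pairs(layers))
```

A pairwise comparison like this is cheap enough to run over a whole database of monolayers, after which the surviving pairs would still need explicit interface calculations.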
The formation of point defects and dislocations is also critical to material performance, for example, by inducing conductivity in insulators or introducing recombination centers within crystals. 148,150,185 The formation of deep-level defects is usually considered undesirable, whereas shallow defects are generally preferred for efficient transport. This is why a fundamental part of semiconductor technology involves elaborate purification to eliminate unwanted impurities, together with the deliberate introduction of desired ones. Such defects are usually difficult to probe experimentally at the atomistic and electronic levels, so computation can complement experimental characterization. With progress in electronic structure methods and the associated high-throughput computational procedures, defect characteristics are starting to make important contributions to semiconductor screening. For example, Zakutayev et al have generalized the electronic characteristics of defect-tolerant compounds, which exhibit unusual antibonding states at the VBM. 102,197 They explain that defect-induced dangling bonds are likely to appear as resonances within the bands if the states at the VBM are antibonding, leaving the bandgap free of serious traps. These defect-tolerant semiconductors, for example, partially oxidized post-transition metal compounds containing a lone pair of electrons (which do not participate in bonding), have excellent optoelectronic properties, such as a large dielectric constant and low carrier masses, and are suggested to be among the promising PV materials.

| Data-driven acceleration of materials discovery
In silico material screening is already well under way; it contributes to a better understanding of the composition-structure-property relationship and allows researchers to optimize the properties of materials. In addition, a large number of theoretically predicted materials are collected in computational materials property databases, including AFLOW, 198 Novel Materials Discovery (NOMAD), 105 the Automated Interactive Infrastructure and Database for Computational Science (AiiDA) project, 199 the Open Quantum Materials Database (OQMD), and the Materials Project. 106,200 These computational databases contain various calculated properties and physical parameters, including enthalpies, band structures, elastic moduli, piezoelectric coefficients, and even phase diagrams. Compared with building a materials database by computation, a more challenging task is to develop effective material descriptors that can rapidly screen for target compounds. In principle, descriptors are derived from a deep abstraction of the accessible physical and chemical properties of the desired material. 76,201,202 Simply put, this is a way to discover advanced materials with the help of big-data strategies. Within the last few decades, data-driven approaches have used surrogate machine learning models that enable rapid predictions based solely on previous data, rather than being guided by experimentation or by computations in which the fundamental equations are explicitly solved. 103,[203][204][205] Because they are cheaper and more efficient, statistical data analysis and machine learning methods have proven to be useful tools for determining materials properties that are difficult to measure or calculate using traditional methods.
Data-driven approaches that address these issues consist of two steps aimed at quantitative estimation. [205][206][207] The first step is to represent each item in the data set numerically. After this step, each input case has been transformed into a string of numbers, that is, a "fingerprint." This critical step requires expertise in and knowledge of the material in question. The appropriate choice of fingerprint depends on the physical level of the problem under investigation and on the accuracy required. For example, if the objective is to understand a macroscopic phenomenon (eg, mechanical modulus) that demands less accuracy, a coarse-grained fingerprint can be used, for example, the average attributes of the atomic species in the material, or higher-level structural features such as grain boundaries and sizes. However, if the goal is to predict specific properties at a sub-atomic level of precision across an entire chemical compound space, such as the dielectric properties of insulators or the bandgap of semiconductors, the fingerprint should include atomic-level detail that encodes the relevant electronic properties.
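A coarse-grained fingerprint of the kind described above, built from average attributes of the atomic species, might look like the following minimal sketch. The species labels, the attribute table, and the choice of statistics (composition-weighted mean and range) are all illustrative assumptions; real fingerprints would draw on tabulated elemental data and problem-specific features.

```python
from typing import Dict, List, Sequence

# Placeholder per-species attributes; illustrative numbers, not reference data.
ELEMENT_ATTRS: Dict[str, Dict[str, float]] = {
    "A": {"radius": 1.70, "electronegativity": 0.8},
    "B": {"radius": 1.20, "electronegativity": 2.0},
    "X": {"radius": 2.00, "electronegativity": 2.8},
}


def fingerprint(composition: Dict[str, int],
                attrs: Sequence[str] = ("radius", "electronegativity")) -> List[float]:
    """Map a composition such as {"A": 1, "B": 1, "X": 3} to a fixed-length vector.

    For each attribute the vector holds the composition-weighted mean followed by
    the range (max - min) over the species present.
    """
    total = sum(composition.values())
    vector: List[float] = []
    for attr in attrs:
        values = [ELEMENT_ATTRS[el][attr] for el in composition]
        mean = sum(ELEMENT_ATTRS[el][attr] * n for el, n in composition.items()) / total
        vector.extend([mean, max(values) - min(values)])
    return vector


# A hypothetical ABX3 compound becomes a 4-component fingerprint.
print(fingerprint({"A": 1, "B": 1, "X": 3}))
```

Because the vector length is fixed by the attribute list rather than by the number of atoms, compounds with different stoichiometries can be fed to the same learning algorithm.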
The second step constructs a mapping between the fingerprinted input and the target property, and it requires little domain knowledge. Numerous algorithms can establish this mapping and generate surrogate models, such as linear regression, kernel ridge regression (KRR), decision trees, and deep neural networks. [208][209][210] The choice of learning algorithm depends on the type of available data. For instance, if the target property is continuous (eg, bulk modulus), a functional mapping such as regression is required. Targets can also be discrete (eg, crystal structure), in which case the problem is one of classification. One representative study was carried out by Lu et al, who employed six different regression algorithms, namely, gradient boosting regression (GBR), KRR, support vector regression, Gaussian process regression, decision tree regression, and multilayer perceptron regression, to learn the relationship between features and the target property in the prediction of hybrid perovskites for PV applications. 211,212 They evaluated the performance of each model by three prediction indexes and found that the GBR algorithm gives more reliable results than the other five algorithms. In mapping features to bandgaps, they evaluated 30 initial features with the GBR algorithm. The 14 most important features were selected to form an optimal feature set. This feature set contains structural features (tolerance factor, octahedral factor) and elemental properties of the A-, B-, and X-site ions: total number of ionic charge, p orbital electron, ionization energy, electronegativity, electron affinity, ionic polarizability, sum of the s and p orbital radii, ion radii, and the highest occupied molecular orbital and the lowest unoccupied molecular orbital of the A-site cations. 
This model-building scheme uses a "last-place elimination" feature selection procedure based on the GBR model; it obtains predictions with DFT accuracy very quickly and also works on a small dataset. 211 It is essential to conduct a rigorous statistical analysis of the learning process. A central concern is cross-validation, which tests whether a model built on an existing data set can actually handle a new case. 213 Indeed, the reliability of a new prediction, whether it falls inside or just outside the domain of the original dataset, can be quantified by its uncertainty. Bayesian methods, for example, Gaussian process regression, provide a natural route to estimating the uncertainty of a prediction. [214][215][216][217] The mean and variance of the Bayesian predictive distribution are the most likely predicted value and the predicted uncertainty, respectively. 218,219 The uncertainty can be used to improve the surrogate model progressively. At any given point of an iterative fitting procedure, a number of new candidates are predicted to have certain properties, each with an uncertainty. The tradeoff is then between exploitation, that is, performing the next computation (or experiment) on the material predicted to have the optimal target property, and exploration, that is, performing it on a material whose prediction has the largest uncertainty. 210,[218][219][220][221] This adaptive design procedure allows the search to move progressively away from exploration and toward exploitation as the predictions improve and the associated uncertainties shrink. 
Representative studies have employed this adaptive learning design framework to accelerate the search for NiTi-based shape memory alloys, 210 lead-free piezoelectrics, 206 and high-efficiency LEDs at high current densities. 222
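The exploitation/exploration tradeoff in such an adaptive design loop can be made concrete with a small Gaussian process sketch. This is a minimal illustration under stated assumptions, not the procedure of the cited studies: the fingerprint is a single number, the kernel is a squared-exponential with fixed length scale, and candidates are ranked by an upper-confidence-bound rule (mean plus `kappa` times the standard deviation), which is one common acquisition choice among several. All names and data are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def rbf(a: float, b: float, length: float = 1.0) -> float:
    """Squared-exponential kernel between two scalar fingerprints."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)


def solve(A: List[List[float]], b: Sequence[float]) -> List[float]:
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def gp_predict(x_train: Sequence[float], y_train: Sequence[float], x_new: float,
               noise: float = 1e-6) -> Tuple[float, float]:
    """Posterior mean and variance of a zero-mean GP at x_new."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(x_train)]
         for i, a in enumerate(x_train)]
    k_star = [rbf(a, x_new) for a in x_train]
    alpha = solve(K, y_train)   # K^-1 y
    v = solve(K, k_star)        # K^-1 k*
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = max(rbf(x_new, x_new) - sum(k * w for k, w in zip(k_star, v)), 0.0)
    return mean, var


def next_candidate(x_train: Sequence[float], y_train: Sequence[float],
                   pool: Sequence[float], kappa: float) -> float:
    """Pick the pool member maximizing mean + kappa * std.

    kappa = 0 is pure exploitation; a large kappa favors uncertain candidates.
    """
    def score(x: float) -> float:
        mean, var = gp_predict(x_train, y_train, x)
        return mean + kappa * math.sqrt(var)
    return max(pool, key=score)


x_known, y_known = [0.0, 1.0, 2.0], [0.0, 1.0, 0.5]  # hypothetical measurements
pool = [1.2, 5.0]                                    # untested candidates
print(next_candidate(x_known, y_known, pool, kappa=0.0))
print(next_candidate(x_known, y_known, pool, kappa=2.0))
```

In a full loop, the selected candidate would be computed or measured, appended to the training data, and the model refitted; lowering `kappa` over iterations is one simple way to shift from exploration toward exploitation.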

| CONCLUSIONS
We critically discussed developments in the exploration of optoelectronic materials by computational functionality-driven approaches and extracted three general computational design strategies, that is, direct in silico computational screening, knowledge-based inverse design, and algorithm-based materials searching. We surveyed a few representative examples of the design of novel PV/PEC and TCO materials using these methods integrated with high-throughput computation. Although not all case studies could be featured, these computational studies embody the overall design principles and efficient computational schemes for the successful identification of functional semiconductors. Similar workflows and validation procedures are desirable for many other classes of materials, which usually require different levels of theory and different screening descriptors. In the foreseeable future, it would be ideal not only to predict novel materials but also to derive guidelines for synthesis conditions from computation, coupled with an experimental feedback loop. For realistic materials in particular, disorder and extended defects remain challenging and expensive for first-principles modeling. With the development of high-throughput approaches (eg, computation, experimental synthesis, and characterization) together with available databases, we expect a growing number of novel applications of these integrated material design ideas to accelerate materials development.

How to cite this article: Liu Z, Na G, Tian F, Yu L, Li J, Zhang L. Computational functionality-driven design of semiconductors for optoelectronic applications. InfoMat. 2020;1-26. https://doi.org/10.1002/inf2.12099