Explainable Artificial Intelligence Approach to Identify the Origin of Phonon‐Assisted Emission in WSe2 Monolayer

The application of explainable artificial intelligence in nanomaterial research has emerged in the past few years, which has facilitated the discovery of novel physical findings. However, a fundamental question arises concerning the physical insights presented by deep neural networks; the model interpretation results have not been carefully evaluated. Herein, explainable artificial intelligence and quantum mechanical calculations is bridged to investigate the correlation between light scattering and emission in a WSe2 monolayer. Convolutional neural networks using light scattering and emission data are first trained, while expecting the networks to determine the relationships between them. The trained models are interpreted and the specific phonon contribution during the exciton relaxation process is derived. Finally, the findings are independently evaluated through quantum mechanical calculations, such as the Born–Oppenheimer molecular dynamics simulation and density functional perturbation theory. The study provides reliable fundamental physical insight by evaluating the results of neural networks and suggests a novel methodology that can be applied in materials science.


Introduction
The tungsten diselenide (WSe 2 ) monolayer is a promising candidate for future semiconductor applications [1][2][3] due to its outstanding optoelectronic properties. [4,5] The optical properties of WSe 2 are dominated by comprehensive phonon-exciton interactions, particularly the subsequent energy relaxation pathways, which play an important role in charge-carrier transport. [5,6] Phonon-induced relaxation processes in WSe 2 monolayers have been investigated using various spectroscopic methods, [7,8] and it has been suggested that the LA(M) phonon acts as a carrier relaxation assistant. The zone-edge acoustic phonon facilitates efficient energy relaxation (2p/2s ! 1s) in the WSe 2 monolayer with minimal loss of coherence, [9] and it appears in the form of periodic Raman peaks that have energy distances of 15 meV. [10,11] In contrast, the recent theoretical calculations indicate that the out-of-plane A 0 1 mode promotes chargecarrier recombination in the pristine WSe 2 monolayer, [12] and pump-probe experiments revealed that the phonon mode of the periodic Raman peaks can be generated for A 0 1 (Γ) phonon mode through an anharmonic process. [13][14][15] In particular, because the early stage of charge carrier relaxation has an ultrafast timescale (100 fs), [16] it is difficult to directly observe the optical signatures using a specific analytical technique. Therefore, to reveal which phonons dominantly contribute to carrier relaxation, a comprehensive analysis of the hidden phonon-exciton correlation while considering a large amount of correlative spectroscopic data (e.g., one-to-one corresponding Raman and photoluminescence images) is required.
The objective of deep learning (DL) is to extract knowledge from data by training deep neural networks without explicit human instruction. [17] DL enables high-throughput analysis in materials science using large amounts of experimental data. [18,19] In many cases, the output of the DL model provides scientifically The application of explainable artificial intelligence in nanomaterial research has emerged in the past few years, which has facilitated the discovery of novel physical findings. However, a fundamental question arises concerning the physical insights presented by deep neural networks; the model interpretation results have not been carefully evaluated. Herein, explainable artificial intelligence and quantum mechanical calculations is bridged to investigate the correlation between light scattering and emission in a WSe 2 monolayer. Convolutional neural networks using light scattering and emission data are first trained, while expecting the networks to determine the relationships between them. The trained models are interpreted and the specific phonon contribution during the exciton relaxation process is derived. Finally, the findings are independently evaluated through quantum mechanical calculations, such as the Born-Oppenheimer molecular dynamics simulation and density functional perturbation theory. The study provides reliable fundamental physical insight by evaluating the results of neural networks and suggests a novel methodology that can be applied in materials science.
intuitive results that can be readily applied to practical tasks. For example, DL models can quickly and accurately determine the number of layers of 2D materials and efficiently detect abnormal atomic configurations using transmission electron microscopy images. [20,21] In this case, the expected output of the model can be compared with the actual experimental data obtained via analytical measurements, and researchers are more interested in the practical aspects of DL models. The advantage of this approach is that the application of DL models can facilitate (or accelerate) the experiment-oriented process, which can otherwise be laborious and time-consuming, as it requires a routine laboratory workflow for the collection of preliminary data.
In addition, DL can be used to obtain scientific insights from experimental data. For example, Yang et al. reported that the concentration of valence electrons is the most important component for improving the hardness of as-cast high-entropy alloys. [22] Recently, Lu et al. reported that lattice deformation is a more important factor than the doping effect in the emission process by interpreting a machine learning model. [23] In these cases, the models are trained to comprehensively consider the input and output of the data and are expected to discover hidden correlations inherent in the data. Also, the interpretation of the trained model provides evidence of new physical findings in materials science by evaluating the contribution of the input data during model prediction or by observing the changes in the output data while the input data varies. These methods are also called explainable artificial intelligence (XAI), and they include a class activation map, layer-wise relevance propagation, and Shapley additive explanations. [24][25][26] However, a challenge has arisen in the application of current XAI; for the reliable discovery of physical insights, a careful evaluation of the XAI results is required to be followed. [27] In other words, it is crucial to evaluate the XAI results using reliable external methods although its evaluation is difficult.
Herein, we introduce a new methodology by bridging XAI and the first principles, that is, a combination of inductive and deductive analyses. To the best of our knowledge, our study is the first to use quantum mechanical calculations as an external evaluation process in addition to XAI to discover novel physical insights in the field of materials science. In this work, we utilize XAI and density functional theory (DFT) to investigate the origin of the phonon-assisted relaxation pathway of hot excitons. We first measured the Raman scattering (RS) and photoluminescence (PL) spectra and then preprocessed the data using an in-house fitting algorithm, which automates the labor-intensive fitting process. Subsequently, convolutional neural networks (CNNs) that predict PL features from the corresponding RS spectrum were trained to comprehensively consider the phonon and carrier interaction. We then interpreted the trained DL models using intuitive XAI methods, partial dependence plot (PDP), [28] and individual conditional expectation (ICE), [29] which can directly describe the correlation between the RS and PL data. By interpreting the DL model, we discovered that A 0 1 phonon is strongly related to the charge-carrier emission in the pristine region. Finally, we confirmed the coherence of the correlation between the pristine A 0 1 phonon and carrier emission using DFT molecular dynamics and linear response theory calculations. Figure 1 illustrates the overall process of our XAI-applied spectroscopic study. First, we obtained the RS and PL mapping data from several monolayer WSe 2 flakes. Each mapping data contains thousands of spectra, and we treated each pair of RS and PL spectra corresponding to the same location as a singledata instance. Furthermore, each RS and PL spectrum is a mixture of Lorentzian or Gaussian distributions, that is, a combination of multiple peaks, which are known to be related to specific physical properties. To analyze the physical properties of WSe 2 , it is necessary to separate the spectrum into multiple peaks and examine the peak parameter values. The process of separating the peaks from the original spectrum is known as fitting or deconvolution. Because a Lorentzian or Gaussian distribution can be represented by three parameters, that is, position (photon energy) p ∈ ℝ, full width at half maximum (FWHM) w ∈ ℝ, and height h ∈ ℝ, the result of the deconvolution is a set of these three parameters fðp i , w i , h i Þg n i¼1 , which can fully represent the original spectrum using a Lorentzian or Gaussian mixture with n components.

Pipeline of XAI-Applied Correlative Spectroscopy
However, realizing the deconvolution of a large number of spectra is challenging for human researchers because it is a laborious task. To consider the RS and PL spectra comprehensively, we developed an automated deconvolution algorithm based on the gradient descent method. [30] Our algorithm accurately deconvolutes thousands of RS and PL spectra in a few minutes, thereby enabling a high-throughput analysis of the WSe 2 monolayer. In this study, we considered eight Lorentzian and one Gaussian to fit the RS and PL spectra, respectively, following previous studies; [31,32] our algorithm outputs a vector of RS parameters r ¼ ½p i ; w i ; h i ∈ ℝ 24 (three parameters per peak) and a vector of PL parameters p ¼ ½p; w; h ∈ ℝ 3 . Figure 1a presents an example of the fitted RS and PL spectra.
In the next stage, the deconvoluted spectra were used to train and evaluate the CNN models, as shown in Figure 1b. First, we randomly split the data into a training and a validation set in the ratio of 7:3. We then trained the CNN models using the training set to predict the PL p when the corresponding RS r was provided. Here, r is converted into the length-l spectrum R ∈ ℝ 8Âl , which independently describes eight Lorentzian and is fed to the model. After the training process, we evaluated the trained model using the validation set. If the prediction accuracy of the validation set is significantly lower than that of the training set, we can conclude that the model memorizes the training set instead of learning physical insights from it. We repeated the training process by varying the model architectures with different hyperparameters (e.g., the number of training epochs) and observed the same tendencies in the training results. Examples of the model interpretation using PDP and ICE are presented in Figure 1c. The fundamental idea of both interpretation methods is to observe the changes in the prediction while perturbing the input data. The degree of change that occurs during the perturbation indicates the relationship between the input www.advancedsciencenews.com www.advintellsyst.com variable (e.g., a specific RS peak parameter) and the output variable (e.g., the PL peak parameter). Thus, the PDP and ICE evaluate the contribution of each RS peak when the model predicts the PL. For example, if an output variable (e.g., a PL feature) increases, while an input variable (e.g., A 0 1 phonon mode) increases on average during the perturbation, we can interpret that there exists a proportional relationship between these two variables. The PDP shows the average contribution of the input variable to the output variable for the entire dataset (i.e., the entire measurement of the WSe 2 flake). In contrast, ICE exhibits a local (i.e., per-spectrum) correlation between the input and output variables.
We repeated the above procedure for several WSe 2 flakes and conditions and derived the results described in the following sections. Furthermore, we evaluated our XAI findings through a DFT-based external analysis, which is independent of the XAI. Detailed explanations of our fitting algorithm, CNN model architecture and training procedure, and interpretation methods (PDP and ICE) are available in Section 4.

Result of XAI-Applied Correlative Spectroscopy
In this study, we trained CNN models using the experimental results obtained from resonance (2.33 eV) and nonresonance (1.96 eV) conditions to consider the exciton-phonon coupling differences, which vary depending on the excitation energy. In the case of the resonance condition, two different chemically grown WSe 2 monolayers (star-and triangle-shaped, as shown in Figure 2a,d, respectively) were used to train the models. For each experimental data, we split the data into two subsets: training and validation sets. We trained the models using the training set and tested them with both the training and validation sets to determine whether the models were overfitted. On comparing the mean average percentage error (MAPE) and R 2 of the two subsets, we assumed that the models trained on the resonant WSe 2 monolayer found the physical relationship between RS and PL ( Figure S2a,c, Supporting Information). In addition, the images of the PL prediction (Figure 2b,e) accurately describe the nonuniform PL properties derived from various defects on the surface of the chemically grown WSe 2 monolayer. In contrast, the nonresonant WSe 2 monolayer tends to be overfitted ( Figure S2b,d, Supporting Information). We assume that the models simply memorized the training sets because the exciton relaxation pathway is mediated by many long-wavelength phonons instead of specific optical phonons in the nonresonance condition. Therefore, we assumed that the models trained on resonant WSe 2 monolayer data learnt physical insights about the RS and PL and selected this configuration for further analysis.
Subsequently, we applied the XAI to the selected DL models (resonance condition) to determine the hidden correlations between the RS and PL. To examine the contribution of each phonon mode, we drew the PDP using the DL model on all phonon modes of the WSe 2 monolayer as parameters (shown in Figure 3a,b). As briefly introduced in the previous section, the   Figure 1. Overview of XAI-applied correlative spectroscopy. a) Experimental results of the RS and PL of the WSe 2 monolayer are presented in contour images, and both spectra were calibrated and deconvoluted using an in-house automatic fitting algorithm. b) Resulting fitted RS and PL spectra were used to train the CNN models. The various phonon modes in the WSe 2 monolayer were considered to analyze their independent effects. c) Trained CNN model was interpreted by using XAI methods, PDP, and ICE.
www.advancedsciencenews.com www.advintellsyst.com PDP shows changes in the PL prediction when the specific phonon mode varies. The gray region of each spectrum represents the range of 95% of the ground truth PL distribution (y-axis), and the A 0 1 phonon mode describes the widest region of the PL intensity data. This indicates that the A 0 1 phonon mode is the dominant factor for the phonon-assisted carrier relaxation pathway among the various phonon modes in the WSe 2 monolayer. The slightly increasing PDP curve of the A 0 1 phonon mode indicates a positive relationship between the RS and PL intensities. Because the RS and PL intensities imply the frequency of exciton-phonon interaction and the number of emitted photons, the aforementioned proportional relationship is strong evidence that A 0 1 phonon dominantly contributes to the relaxation process of the hot exciton.
The vacancy of the chalcogenide atom (V se ) and its oxygen substitution (S O ) are the most frequently observed in chemically grown WSe 2 , and they suppress or increase the RS and PL processes due to lattice distortion and doping effects. These defectinduced physical properties appeared in the inhomogeneous RS and PL images of the star-and triangle-shaped WSe 2 monolayer flakes (Figure 3c,f,g,j). In general, the presence of V se suppresses the first-order optical phonon modes, and PL quenching effects occur due to nonradiative recombination from localized energy states in the vicinity of the conduction band edge. In the case of S O , the intensity of the RS also decreases due to structural deformation, as in the case of V se , but the exciton peak increases because the intrinsic n-type charge is compensated. We plotted the spatially resolved PL ICE slope (Figure 3e,i) to evaluate the degree of proportionality of the A 0 1 phonon according to the type of defects. The inhomogeneous ICE slope of the A 0 1 phonon mode image means that the DL model strongly considers the specific regions of the WSe 2 monolayer because the value of the PL ICE slope indicates the average change in the PL prediction. Thus, the correlation between the PL intensity and A 0 1 phonon is stronger in the bright region of the ICE image (i.e., large ICE slope). To examine the reason for the inhomogeneous slope of the ICE, we performed a comprehensive comparison using not only the A 0 1 phonon (Figure 3d,h) but also the PL (Figure 3f,j) images with the slope of the ICE. The A 0 1 phonon mode exhibits a high ICE slope in the region with strong intensity (pristine dominant), whereas the ICE slope is relatively low in areas with strong PL (S O rich) due to the charge compensation effect. Consequently, we confirmed that the DL model strongly considers the pristine A 0 1 phonon mode due to the higher ICE slope value than the defective regions such as V se and S O . These pristine-dependent properties provide additional insights that the A 0 1 phonon mainly contributes to the charge-carrier relaxation process instead of the LA(M) phonon because the in-plane acoustic phonon is a representative defect-related indicator with a long recombination rate caused by the defect-induced zone-edge renormalization effect. [33,34] Thus, the pristine A 0 1 phonon dominantly contributes to the efficient charge-carrier relaxation process under resonance conditions as compared to longitudinal acoustic phonons.

Evaluation of XAI-Applied Correlative Spectroscopy via Quantum Mechanical Calculations
Quantum mechanical calculations provide a theoretical basis without empirical elements and are used to explain physical properties by modeling actual circumstances. [35,36] Several recent studies have reported that DFT-based molecular dynamics and www.advancedsciencenews.com www.advintellsyst.com linear response theory facilitate the comprehensive study of charge-carrier dynamics and phonon fluctuations for various defective structures. [37,38] In this work, to efficiently evaluate the contribution of the previous pristine A 0 1 phonon, we performed Born-Oppenheimer molecular dynamics (BOMD) and density functional perturbation theory (DFPT) calculations on three different structures (Pristine, Se vacancies, and oxygen substitution structures, which are shown in Figure 4b; detailed calculation information is described in Section 4.6). The normalized velocity autocorrelation functions are obtained using the BOMD calculation as its Fourier-transformed functions (also known as spectral intensity) enable the identification of the phonons that promote the nonradiative relaxation process of excited carriers, resulting in the energy loss as heat. Figure 4a illustrates the spectral intensity of pristine (shaded gray) and defective WSe 2 with a V se (green) and S O (red). Two strong phonon modes (220 and 270 cm À1 ) exist in the spectral intensity of the pristine structure, and the frequency of 270 cm À1 corresponds to the A 0 1 phonon mode with Raman shift of 250 cm À1 . [39] From the spectral intensity of the V se and S O -containing structure, the A 0 1 À related peak located at the 270 cm À1 peak is reduced, which indicates that the A 0 1 phonon mode induces a excited carrier relaxation in the pristine WSe 2 . Subsequently, we performed the DFPT on pristine and defective structures to confirm the defect-dependent variation of the RS intensity. The presence of the defect induces a structural deformation, which reduces the intensity of the pristine Raman modes (e.g., E 0 , A 0 1 ) and produces abnormal phonon modes (shown in Figure 4c). The noticeable decrease in RS intensities within the defective structures (V se , S O ) corresponds to the correlation described earlier, which is associated with the interpreted DL model and A 0 1 phonon mode. This not only proves that the previous PL ICE is related to the strong A 0 1 phonon mode but also that the pristine A 0 1 phonon contributes to efficient carrier relaxation processes because the defect interferes with the electron-phonon interaction. Throughout the theoretical calculations, we Figure 3. Interpretation of the trained models using XAI methods. The PDPs are shown in a,b). The grey regions in the PDPs represent 95% of the real data (PL intensity). Among all the phonon modes, the A 0 1 phonon mode best describes the PL feature. c,d,f,g,h,j) RS and PL mapping images of chemically grown WSe 2 monolayer are shown, and all colour bar units are divided by 10 3 values. e,i) To confirm the spatially resolved gradients, the ICE curves are visualized in an image form. On comparing the RS and PL images with the ICE, we can observe a positive /inverse correlation between them.
www.advancedsciencenews.com www.advintellsyst.com evaluated evidence of pristine A 0 1 phonon contributions for the hot exciton relaxation process, which is consistent with the results of the XAI method.

Conclusion
In this study, we investigated the correlation between light scattering and emission by a WSe 2 monolayer using XAI and quantum mechanical calculations. The measured spectroscopic data obtained from the monolayer WSe 2 were used to train the DL models. To obtain comprehensive insights into the physical mechanisms of the phonon-assisted charge-carrier relaxation pathway, we interpreted the DL model using PDP and ICE. As a result, we revealed that the pristine A 0 1 phonon mode is strongly related to the PL features, which indicates that the A 0 1 phonon mainly governs the hot exciton relaxation as compared to other phonons. We further evaluated the defect dependence of the optical phonon mode using quantum mechanical calculations, such as DFT-based molecular dynamics and linear response perturbation theory. Through the application of BOMD calculation, we were able to establish the role of the A 0 1 phonon in facilitating charge-carrier relaxation, while the utilization of DFPT calculation confirmed the notable reduction in the A 0 1 phonon mode within defective structures. Our work not only provides a fundamental understanding of the early stages of carrier relaxation in WSe 2 monolayers but also demonstrates an advanced methodology for nanomaterial research.
Substrate Preparation: A 300 nm-thick oxide layer deposited on a Si substrate was treated with a 0.5 M NaOH aqueous solution for 30 min, followed by rinsing with deionized water. The treated SiO 2 /Si substrate was coated with the W-precursor solution at 3000 rpm for 1 min.
Growth of Monolayer WSe 2 Flakes: A two-inch quartz-tube-equipped twozone furnace system was used to grow monolayer WSe 2 flakes. A quartz boat containing Se pellets and a W-precursor-coated SiO 2 /Si substrate were loaded into the center of the upstream zone (zone 1) and downstream zone (zone 2), respectively. The temperature of each zone was increased to 385 and 765°C for 7 min and maintained for 10 min. Subsequently, the quartz tube was cooled naturally to room temperature. The entire growth process was conducted under Ar and H 2 atmospheres with flow rates of 550 and 5 sccm, respectively, at atmospheric pressure.
Measurement: The synthesized monolayer WSe 2 was transferred to a SiO 2 /Si (300 nm) substrate using the wet transfer method before performing the RS and PL measurements. Both the RS and confocal PL mapping data were measured using a multifunctional microscope system (NTEGRA, NT-MDT, Zelenograd Co., Russia & WITec Alpha 300, Oxford Instruments, Germany). The excitation sources were 532 and 633 nm, and high-magnification objective lenses (100x, NA = 0.9) were used. Throughout the experiments, gratings with 150 and 1800 grooves per mm were used for the PL and RS, respectively.
Fitting Algorithm: The fitting process was started by classifying the background noise and signal of the actual WSe 2 flake. We assumed that the flake was not placed at the corner of the mapping image and computed the mean  www.advancedsciencenews.com www.advintellsyst.com μ and variance σ of the spectra measured from the corner. Subsequently, we removed the signal if the mean of the signal was less than μ þ k Â σ. Thus, if the average of an arbitrary signal was equal to or greater than μ þ k Â σ, it was considered to originate from the actual WSe 2 flake. Next, we eliminated the noise and conducted baseline corrections for each spectrum. To detect noise, we computed the first-order difference in the spectrum. If the difference was larger than the threshold, the corresponding signal was removed and the removed values were linearly interpolated. We divided the spectrum into several areas by a specific wavelength and Raman shift and applied different thresholds to each area. Finally, we performed baseline correction by subtracting a piecewise linear function from the original spectrum. Similar to noise removal, we set several areas along the spectrum and defined the piecewise linear function by connecting the average signal values near the boundary of each area. It should be noted that k and the boundaries of the areas were empirically selected and were different for each type of spectroscopy and WSe 2 flake.
After the preprocessing, deconvolution was conducted based on the RAdam optimizer. [30] Starting from the initial peak parameters, the RAdam optimizer searches for the optimal parameters by minimizing an objective function. The objective function was the mean-squared error between the original and fitted spectra. We also introduced two regularization techniques to increase the accuracy of the deconvolution. We enforced that the height values should be positive numbers and regularized the position of each peak such that it did not shift beyond the initial value. The initial peak parameters were selected based on previous studies. [31,32] In the case of RS, fitting was done separately for the first-order and second-order Raman regions (183-300 and 314-410 cm À1 , respectively) because the intensity of the former region was much higher than that of the latter region. Hyperparameters, such as the number of iterations and the learning rate, were also empirically selected and were different for each flake and area. Finally, we skimmed the fitted results and removed the abnormal results or refitted them with other hyperparameters.
Training the Convolutional Neural Network Model: Both the deconvolution algorithm and training code were written in Python and PyTorch. [40] Figure S1, Supporting Information, shows the model architectures used in this study. Basically, the models consisted of ten 1D convolutional layer blocks and three linear (i.e., fully-connected) layers. Each convolutional layer block ('conv block' shown in Figure S1a, Supporting Information) is a stack of a convolutional layer, leaky rectified linear unit (LeakyReLU; α = 0.01), and instance-batch normalization. [41] We also tested a residual block ( Figure S1b, Supporting Information), [42] which consisted of three convolutional layer blocks and one LeakyReLU (α = 0.01) for evaluating the effect of model architectures. After two or three blocks, we used average pooling with kernel size and stride of two to reduce the size of intermediate features. The average pooling layer after the last convolutional block output a feature whose shape was (b, c, l), where b is the batch size, c is the channel size of the last convolutional layer, and l is the feature length. To guarantee a fixed-size feature vector for the linear layers, we used an additional average pooling layer that makes l equal 4; therefore, the shape of the resulting feature was (b, c, 4). The feature was flattened before the first linear layer. The linear layers, except for the latest one, used ReLU activation and dropout [43] with a probability of 0.5.
We used the Adam optimizer [44] with a learning rate of 0.001. Each training and interpretation were conducted on a single NVIDIA GeForce RTX 3090 GPU with a batch size of 128. We set the number of training epochs to range from 1000 to 10 000, and the training results showed the same tendency (see Figure S2, Supporting Information). We split the entire dataset into training and validation sets, and the ratio of the training to the validation set was 7:3. Subsequently, we trained the models using the training set and tested the model on both the training and validation sets. By comprehensively considering the qualitative (scatterplot) and quantitative (mean absolute error and R 2 ) results obtained from the trained models, we determined whether the model was overfitted and selected configurations (e.g., resonance condition) to be interpreted. For the selected configurations, we trained each model again with the entire dataset (both the training and validation sets) and interpreted the resulting model. We used the model using the convolutional layer blocks (not residual blocks) with training epochs of 10 000 for drawing the figures in the manuscript except for Figure S2c,d, Supporting Information.
Model Interpretation: One of the mandatory criteria for interpreting the DL model is high prediction accuracy. When we trained DL models that predicted PL features with RS features for several WSe 2 flakes, the accuracy of PL intensity prediction was satisfied with the criterion, while that of PL position and FWHM did not. Thus, we focused on explaining the correlation between RS features and PL intensity, where the DL model showed superior accuracy in this work. However, it would be too naive to conclude that there was no correlation between RS features and PL position or FWHM. Nonetheless, our observation was that the correlation between RS features and PL intensity was significant.
In order to analyze the correlation between RS features and PL intensity, we used PDP and ICE. The basic idea of PDP [28] and ICE [29] is to analyze how the model prediction changes when a portion of the input changes in order to evaluate the contribution of each input variable to the model prediction. The analysis of both methods began with the selection of the input and output features to be evaluated. We assumed that f θ was the model parameterized by θ, X ¼ fx d g D d¼1 is the set of input features where x ¼ ½x 1 , x 2 , : : : , x n ∈ ℝ n is n-dimensional input feature, and D is the number of data. To observe the changes in the model prediction when varying the i-th input variable of j-th data x ðjÞ i , we computed the PDP as follows: In this work, the input X is the RS data obtained from the WSe 2 flakes. Under the same condition, the ICE is computed using Here, ε is a small number for perturbation. Because the ICE is computed for each spectrum, we obtained Figure 3e,i by assigning the average slope of the ICE curve to each pixel value of the corresponding location. In detail, we computed two ICE values by setting ε to AEσ i , the standard deviation of x i in X and computed the pixel value as follows: Theoretical Calculation: Quantum mechanical simulations were performed using the plane-wave method (CASTEP) as implemented in the BIOVIA Materials Studio platform. [45] During the calculation, the WSe 2 monolayer was modeled and geometry optimized with a local density approximation (LDA) functional. To achieve sufficient computational precision, the self-consistent function, the convergence tolerance of the force criteria, the sampling of k points, and the cut-off radius were implemented as 1.0 Â 10 À10 eV atom À1 , 0.01 eV Å À1 an actual spacing of 0.02 Å À1 , and 500 eV, respectively. The norm-conserving pseudopotential and Koelling-Harmon relativistic treatment were used for the entire calculation process. [46,47] Subsequently, we added a 20 Å vacuum slab, representing an isolated layered structure in a 3D periodic system. The proposed optimized pristine WSe 2 monolayer matched well with a previous report (our result: a = 3.268 Å, d WÀSe ¼ 2.518 Å; reference: a = 3.34 Å, d WÀSe ¼ 2.55 Å). [48] To identify the effects induced by the Se vacancy (V se ) and oxygen substitution (S O ), we used the size of the supercell structure (4 Â 4), and quantum mechanical calculations were performed on the V se -, S O -implied WSe 2 monolayer structure.
We performed Born-Oppenheimer molecular dynamics simulations [49] using a microcanonical ensemble (constant number of atoms N, volume V and energy E) at 273.0 K with a timestep of 10 fs. In these simulations, we used the LDA functional for electron-exchange correlation, a plane-wave www.advancedsciencenews.com www.advintellsyst.com basis kinetic energy cut-off of 500 eV, and only the Γ-point in the Brillouin zone to enhance the simulation speed. Linear response theory was used to calculate the phonon and Raman tensors of each structure. [50] The electric eigenvalue and phonon energy tolerances were set as 1.0 Â 10 À10 eV atom À1 and 1.0 Â 10 À5 Å 3 , respectively. After determining the Raman activity of all the normal modes, the Raman intensities were obtained by calculating the tensors based on the exposed light wavelength (2.33 eV) and temperature (300 K) with a Lorentzian smearing of 5 cm À1 .

Supporting Information
Supporting Information is available from the Wiley Online Library or from the author.