Physics‐Informed Machine Learning for Inverse Design of Optical Metamaterials

Optical metamaterials manipulate light through various confinement and scattering processes, offering unique advantages like high performance, small form factor and easy integration with semiconductor devices. However, designing metasurfaces with suitable optical responses for complex metamaterial systems remains challenging due to the exponentially growing computation cost and the ill‐posed nature of inverse problems. To expedite the computation for the inverse design of metasurfaces, a physics‐informed deep learning (DL) framework is used. A tandem DL architecture with physics‐based learning is used to select designs that are scientifically consistent, have low error in design prediction, and accurate reconstruction of optical responses. The authors focus on the inverse design of a representative plasmonic device and consider the prediction of design for the optical response of a single wavelength incident or a spectrum of wavelength in the visible light range. The physics‐based constraint is derived from solving the electromagnetic wave equations for a simplified homogenized model. The model converges with an accuracy up to 97% for inverse design prediction with the optical response for the visible light spectrum as input, and up to 96% for optical response of single wavelength of light as input, with optical response reconstruction accuracy of 99%.

These systems usually comprise an ultrathin array of periodic subwavelength structures, called meta-atoms, that mimic atoms in ordinary materials. [10,11] These meta-atoms have multidimensional design parameters that are challenging to determine through inverse design.
The inverse design of metamaterials using semi-analytical methods such as rigorous coupled-wave analysis (RCWA), or numerical methods such as the finite-difference time-domain (FDTD) method and the finite-element method (FEM), is computationally expensive due to the immense size of the design space. [8,9] These methods often involve iterative searches and parameter sweeps through a multidimensional space, making the design process time-consuming. Additionally, when these approaches are used, the design geometry is often limited to simple design parameters, which does not capture the full potential of optical metamaterial design. A few studies have also used evolutionary methods such as genetic algorithms to facilitate metamaterial design. [12,15,16] Furthermore, these algorithms require repeated execution of physics-based simulations to evaluate the objective function, which adds to their computational demands and makes them less efficient for designing complex metamaterial systems. [17,18] Recent advancements in computer architectures, algorithms, and hardware, together with the availability of large datasets, have enabled the use of deep learning (DL) methods to solve various mechanics problems, including the inverse design of materials. [19,20] DL algorithms utilize multiple layers of nonlinear transformations to learn complex functions from data. [21] However, despite its success in solving complex problems, DL still faces several issues, such as a high data burden, lack of robustness, and difficulty of interpretation due to learning high-dimensional complex functions. [22]
Moreover, DL solutions often face convergence issues when multiple valid solutions exist, because the models are deterministic in nature. The EM wave equations governing the relationship between design and optical response are highly nonlinear, and even a relatively small error in the design prediction can result in considerable deviation in the optical response. Additionally, the one-to-many mapping from response to design means that nearly identical optical responses can be produced by different design structures, which further complicates the solution of the inverse problem using DL methods and renders it unstable and highly sensitive to initial conditions. [23] Thus, solving the inverse problem by DL methods requires adoption of either 1) a probabilistic approach instead of a deterministic model or 2) a regularizer to constrain the network parameters to a specific domain.
DL-powered inverse designs for metamaterials have adopted generative models such as the generative adversarial network (GAN) and the variational autoencoder (VAE). [24] GANs were employed by Liu et al. [25] and Jiang et al. [26] to predict structural images for a given transmission spectrum. These networks can generate novel structural patterns, which can provide insight into new structures beyond human intuition built on experience and knowledge. VAE, a semi-supervised strategy, has been utilized by Ma et al. [27] and Liu et al. [28]. VAE models have a decoder that reconstructs the structure geometry from compressed latent variables. The encoder learns the parameters of the distribution of the latent variables. The decoder generates multiple candidate designs for a spectrum, from which a single design is chosen based on fabrication requirements.
The tandem model architecture developed by Liu et al. for optical metamaterial design aims to predict a structure design that leads to proper reconstruction of the optical response. [29] The model concatenates an inverse DL model, which predicts the structural design from response data, with a pretrained forward DL model that predicts the optical response from the predicted design. The model then minimizes the difference between the input response spectrum and the corresponding spectrum produced by the predicted design parameters. This method functions like a pseudo-autoencoder model, requiring fewer parameters to train and achieving stable convergence.
Another approach uses gradients from the trained forward model for inverse design. Starting with an initial design, the loss is calculated as the difference between the chosen design's reconstructed spectrum and the actual spectrum. Gradients of the loss function with respect to the design parameters, obtained from the already trained forward model, are used to guide design updates through iterations that minimize the loss. Such gradient-based methods enable exploration of a large number of designs simultaneously and thus can yield multiple designs for the inverse problem. [32] A comparison of the performance of the tandem neural network (tandem-NN) with conditional GANs and conditional VAEs for inverse design of metamaterials was performed by Ma et al. [33] In this study, the tandem model slightly outperforms the generative models in terms of target reconstruction for structures with few degrees of freedom. Since tandem networks have a relatively simple architecture, they are able to capture the response-design relationship with a smaller data requirement [17,34] and easier hyper-parameter tuning [35] than the generative models.
However, the tandem model introduced by Liu et al. [29] lacks a constraint on the design parameters in the loss function. Consequently, the inverse model is learned by minimizing only the optical reconstruction cost function, which can result in design parameters that are far from the ground truth. [38] To address this limitation, researchers have used a hyper-parameter-weighted design loss, serving as a penalty term, along with the reconstruction loss term in the model's loss function. This addition enhances network robustness and ensures that the retrieved parameters closely align with the dataset. [36,38] Moreover, including the design loss makes it possible to account for fabrication prerequisites when applying inverse design in practical applications. This approach introduces a small regularization loss that guides predictions toward at least one potential design parameter set within the training dataset. Although this may slightly impact convergence due to a marginally higher loss for alternate candidate design parameters, the concurrent presence of the reconstruction loss helps maintain algorithmic stability.
Furthermore, the tandem model is a data-driven DL model whose internal workings are often difficult to comprehend. DL models have multiple hidden layers with intricate activation functions, making it challenging to fully understand the learned relationships. Therefore, DL models are often referred to as "black box" models. Moreover, DL models require a large amount of data and have poor generalization, meaning they may not perform well on inputs outside the range of the training set. [23,24] A naive solution, such as simplifying the DL model or using simple linear or nonlinear ML algorithms with fewer parameters, would not work in practice. Even though the simplified model is easier to explain, the approximation power of DL is lost, which decreases the accuracy of the model. Another approach involves expanding the sampled design space through data augmentation strategies. However, this too requires producing new labeled data, which is resource intensive.
Recently, researchers have adopted methods that improve the generalization power of DL models while reducing the large data requirement and producing scientifically consistent predictions. Domain knowledge, such as the governing physics, is integrated into a DL model by a variety of methods, including incorporation in the loss function, residual modeling, and initialization of the model parameters or design of the model architecture. [2,41,42,43] Research conducted by Lu et al. [44] and Pestourie et al. [45] employs neural networks in conjunction with a low-fidelity physics solver (i.e., a simplified physics model) to alleviate data requirements and enhance computational efficiency. The incorporation of physics principles ensures the preservation of the conservation laws and symmetry requirements. In our work, we aim to leverage this multi-fidelity model, which combines a low-fidelity physics-based model with a data-driven neural network, for the inverse design of optical metamaterials. This model has the capability to predict unique design parameters that align with the physics principles integrated by the simplified physics model. This integration enhances the interpretability of the predictions, offering an alternative to "black box" DL models. Moreover, the combined output of the neural network and the simplified physics model can closely match the results obtained from high-fidelity or full-feed physics simulations. [45]
To integrate physics into the DL model, we utilize two approaches: 1) a physics-informed loss (PIL) function, which includes a constraint based on the governing physics in the model's loss function, and 2) a physics-informed design of architecture, where physics-based features are embedded into the neural network design via intermediate layers. Specifically, we solve Maxwell's EM wave equations for a simplified homogenized structure to obtain the physics knowledge. We either penalize final design predictions that are not physics consistent in the loss function, or guide the design toward physics consistency through the architecture. This physics knowledge acts as a regularizer during the training of the DL model, which reduces the search space of the model parameters. Hence, we predict design parameters that are explainable, with less labeled data and without a decrease in prediction accuracy. Furthermore, the DL models have more generalization power for out-of-sample scenarios. Thus, the addition of physics makes the DL models more attractive to domain scientists. Our study demonstrates that the physics-informed DL models outperform the purely data-based DL method in terms of design prediction and reconstruction accuracy.
The rest of the paper is organized as follows. Section 2 discusses the representative structure of an optical metamaterial and the underlying mechanics that this structure follows. Section 3 describes the specific DL components used for inverse design as well as the integration of physics in the DL model. Section 4 evaluates and characterizes the model performance, and concluding remarks are given in Section 5.

Physics-Based Model
In this section, the geometry of a unit optical metamaterial cell, the governing physics of optical metamaterials, and the simplification of the structure by homogenization are discussed.

Description of Optical Metamaterial Structure
Metasurfaces have various types of structure, including spherical shell meta-atoms, [46,47] stratified media of metal and dielectric, [48,49] and square or circular split-ring resonators. [52] These structures effectively modulate the incident light, such that the outgoing light waves have the desired amplitude and phase. [50] The gratings can have a variety of designs, ranging from simple shapes such as cylinders or rectangles to complicated shapes such as gyroids inspired by butterfly scales, bow, H, or cross shapes. [51,53,54] In this study, we analyze rectangular gratings that are periodic in the $x_1$-direction and homogeneous in the $x_2$-direction. These gratings are stacked on an insulating polymer film on a substrate, as depicted in Figure 1a, with the stacking taking place along the $x_3$-direction. The incoming light propagates along the $x_3$-direction with normal incidence. The design parameters of this metamaterial system consist of the width of the grating, $w$, the period of the grating, $p$, the thickness of the grating, $t_1$, and the thickness of the polymer film, $t_2$. The substrate and grating are made of metal, specifically gold (Au) due to its high absorption coefficient, [55] and the dielectric film consists of an insulating substance, such as liquid crystal elastomer (LCE), with tunable properties. Such structures are easy to manufacture while being able to produce a wide range of optical responses. [51] This plasmonic device is representative of a structure that manipulates light through various light confinement and scattering processes. [10,11] The quantities that vary during optical metamaterial response modeling include the parameters of the incident light, such as the wavelength, angle of incidence, and polarization, as well as the design parameters, material properties (e.g., refractive index), and optical response.

Governing Physics
Optical metamaterials follow Maxwell's EM equations, Equations (1) and (2), where $E$, $H$, $\omega$, $\mu_0$, $\varepsilon_0$, and $\varepsilon$ are the electric field, magnetic field, angular frequency of the wave, vacuum permeability, vacuum permittivity, and relative permittivity, respectively. [56] These equations are solved for a particular design to obtain the response, in the form of the electric and magnetic fields, for an incident EM wave of wavelength $\lambda_0$.
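For reference, one common time-harmonic form of the source-free curl equations that matches the symbols defined above is sketched here. The $e^{-i\omega t}$ time convention and a non-magnetic medium are assumptions made for this sketch; Equations (1) and (2) of the paper may be written with a different sign convention.

```latex
\nabla \times \mathbf{E} = i \omega \mu_0 \mathbf{H},
\qquad
\nabla \times \mathbf{H} = -i \omega \varepsilon_0 \varepsilon \mathbf{E}
```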
The optical response is computed as the ratio of the resultant field intensity to the incident field intensity, which determines the reflection, transmission, or absorption of the incident wave. To solve the inverse design problem using a physics-informed machine-learning approach, the machine-learning algorithm is guided by the consistency of the response obtained by solving the EM wave equations for the predicted design. However, solving these equations for a complex structure with high dielectric contrast, such as the one shown in Figure 1a, is computationally expensive and requires numerical or semi-analytical methods.
To enable inexpensive computation and incorporate physics as a guide for the machine-learning model, we use a simplified structure that homogenizes the top layer. This approach reduces the design parameter space and makes it easier and faster to calculate the response by solving the EM equations. The homogenization results in an effective material property that approximates or "averages" the property of the original metal gratings in the top layer, in accordance with effective medium theory (discussed in Section 2.3). This simplification allows the response to be calculated using analytical solutions and seamlessly incorporated as guiding physics for the DL algorithm.

Simplified Physics Model Through Homogenization
Effective medium theory simplifies complicated structures, which provides computational efficiency and a straightforward calculation of the forward equation. When the conditions for the effective medium theory hold, the resulting simplified structure has a response equivalent to that of the original complex structure. The effective medium theory in conjunction with analytical calculation of the optical response has been used in many studies of photonics, [57][58][59][60][61] elasticity, [62] and acoustics. [63] We incorporate the physics by considering a stratified medium in which the top layer, containing the gratings, is homogenized in accordance with effective medium theory. We utilize this simplified model to introduce a scientific consistency penalty in the DL algorithm, which reduces the search space to design predictions that are consistent with the governing physics. A homogeneous layer is considered with an effective refractive index $n_{\mathrm{eff}}$ that is intermediate between $n_{\mathrm{au}}$ and $n_{\mathrm{air}}$.
The effective index depends on the polarization of the incident light. We consider normal incidence, with the incident light travelling in the $x_3$-direction in the transverse electric (TE) mode of wave propagation ($E \perp K$), where $K$ is the incident light's wave vector and $E$ is the electric field of the EM wave. The effective refractive index takes into consideration the fill factor ($f$), which is the volume fraction of metal present in the top layer. In TE mode, assuming $\lambda_0 \gg p$ and treating $E$ as approximately continuous across the boundary, the discontinuity of $H$ across the boundary [60,61] yields the effective refractive index in Equation (3).
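As an illustration of the homogenization step, the sketch below uses the widely quoted zeroth-order effective-medium expression for TE polarization, in which the permittivities of the two constituents are averaged by volume fraction. This is assumed to be the form of Equation (3); the function name, the relation $f = w/p$ for rectangular gratings, and the gold index in the usage line are illustrative placeholders, not values from the paper.

```python
import cmath

def n_eff_te_zeroth(n_metal, n_air, fill_factor):
    """Zeroth-order TE effective index: volume-weighted average of permittivities.

    Assumed form: n_eff^2 = f * n_metal^2 + (1 - f) * n_air^2.
    """
    eps_eff = fill_factor * n_metal ** 2 + (1.0 - fill_factor) * n_air ** 2
    # Principal square root keeps the real part of the index non-negative.
    return cmath.sqrt(eps_eff)

# Usage with placeholder numbers (illustrative only, not measured data):
n_gold = 0.2 + 3.3j          # hypothetical complex index for the metal
width_nm, period_nm = 100.0, 200.0
f = width_nm / period_nm     # fill factor of a rectangular grating, f = w / p
n_eff = n_eff_te_zeroth(n_gold, 1.0, f)
```

The two limits behave as expected: an empty layer ($f = 0$) returns the index of air and a fully filled layer ($f = 1$) returns the index of the metal.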
When the dimensions of the structure are comparable to the wavelength of the incident light, diffraction effects become significant and impact the optical properties. In this regime, the aforementioned homogenization cannot be used to calculate the effective optical response. [60] Rytov et al. [64] derived transcendental wave equations for the TE mode of propagation of plane waves in an infinite periodic layered medium, given by Equation (4). The solution of this equation involves higher-order refractive indices. For $\lambda_0 \gg p$, the tangent term in Equation (4) can be truncated, and the homogeneous effective index in Equation (3) is recovered. However, for design parameters that are mostly subwavelength, i.e., $\lambda_0 > p$ but not $\lambda_0 \gg p$, the second-order solution given by Rytov et al., [64] which truncates the $\tan x$ series at the cubic term, is more appropriate, as shown in Equation (5).
After homogenizing the top grating layer of our representative structure in Figure 1a, the metamaterial structure is converted to a stratified medium with effective properties in the top layer. This homogenization is depicted in Figure 1b.
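The second-order correction can be sketched as follows, using the commonly cited closed form for TE polarization in which a term proportional to $(p/\lambda_0)^2$ is added to the zeroth-order permittivity. This is assumed to correspond to Equation (5); the function names are hypothetical.

```python
import cmath
import math

def eps_te_zeroth(eps_a, eps_b, fill_factor):
    """Zeroth-order TE effective permittivity (volume average)."""
    return fill_factor * eps_a + (1.0 - fill_factor) * eps_b

def n_eff_te_second(n_a, n_b, fill_factor, period_nm, wavelength_nm):
    """Second-order TE effective index (assumed closed form).

    eps_2 = eps_0 + (pi^2 / 3) * (p / lambda_0)^2 * f^2 * (1 - f)^2 * (eps_a - eps_b)^2
    """
    eps_a, eps_b = n_a ** 2, n_b ** 2
    eps_0 = eps_te_zeroth(eps_a, eps_b, fill_factor)
    correction = (
        (math.pi ** 2 / 3.0)
        * (period_nm / wavelength_nm) ** 2
        * fill_factor ** 2
        * (1.0 - fill_factor) ** 2
        * (eps_a - eps_b) ** 2
    )
    return cmath.sqrt(eps_0 + correction)

# As p / lambda_0 shrinks, the second-order index approaches the zeroth-order one.
n_long = n_eff_te_second(2.0, 1.0, 0.5, period_nm=1.0, wavelength_nm=600.0)
n_short = n_eff_te_second(2.0, 1.0, 0.5, period_nm=300.0, wavelength_nm=600.0)
```

The correction vanishes for $f = 0$ and $f = 1$, where the layer is already homogeneous, and grows quadratically with $p/\lambda_0$.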
The transfer-matrix method (TMM) is used to solve the EM equations in a multilayer system subject to a uniform incident field. [65,66] The field in each medium is divided into two components, the forward (transmitted) component and the backward (reflected) component. The amplitudes of the field across an interface (say, from material A to B) are related by the Fresnel transmission ($t_{AB}$) and reflection ($r_{AB}$) coefficients. The phase shift across a medium (say, B) is controlled by a factor composed of the wave number ($k$), the refractive index of the medium ($n_B$), and the thickness of the layer ($t_B$). The system transfer matrix is defined by combining these factors for each interface and medium, and it determines the amplitude of the field in each layer. The reflection, absorption, and transmission coefficients are computed from the elements of the transfer matrix.
Our physics-informed machine-learning algorithm examines the consistency between the predicted design's reflection coefficient, calculated from the aforementioned simplified method, and the reflection coefficient corresponding to the true physics, which is the input to the DL model. This guides the algorithm to predict design parameters that are consistent with the governing physics.
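A minimal normal-incidence sketch of this calculation is given below. It uses the characteristic-matrix form of TMM, which is an equivalent formulation to the interface and propagation matrices described above, and assumes the $e^{-i\omega t}$ convention so that absorbing media have a positive imaginary index. All function names and the indices in the usage lines are illustrative placeholders.

```python
import cmath
import math

def layer_matrix(n, thickness_nm, wavelength_nm):
    """Characteristic matrix of one homogeneous layer at normal incidence."""
    delta = 2.0 * math.pi * n * thickness_nm / wavelength_nm  # phase thickness
    c, s = cmath.cos(delta), cmath.sin(delta)
    return ((c, -1j * s / n), (-1j * n * s, c))

def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )

def reflection_coefficient(n_incident, layers, n_substrate, wavelength_nm):
    """Complex reflection coefficient of a stack on a semi-infinite substrate.

    layers: list of (refractive_index, thickness_nm), ordered from the
    incident side toward the substrate.
    """
    m = ((1.0, 0.0), (0.0, 1.0))
    for n, thickness_nm in layers:
        m = matmul2(m, layer_matrix(n, thickness_nm, wavelength_nm))
    b = m[0][0] + m[0][1] * n_substrate
    c = m[1][0] + m[1][1] * n_substrate
    return (n_incident * b - c) / (n_incident * b + c)

# Usage: homogenized top layer (t1) on a polymer film (t2) on a metal substrate.
# All indices below are placeholders, not material data from the paper.
n_eff, n_polymer, n_gold = 0.9 + 1.8j, 1.5, 0.2 + 3.3j
r = reflection_coefficient(1.0, [(n_eff, 30.0), (n_polymer, 50.0)], n_gold, 600.0)
absorption = 1.0 - abs(r) ** 2  # no transmission through the thick metal substrate
```

With no layers the function reduces to the Fresnel reflection of a single interface, and a quarter-wave layer with index $\sqrt{n_0 n_s}$ gives zero reflection, which are two quick checks of the sign conventions.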

Physics-Informed DL for Inverse Design
In this section, we discuss the physics-informed DL framework that we have developed to solve the inverse design problem. The DL framework consists of a tandem architecture that has an inverse model in conjunction with a pretrained forward model. [29] Training this network to predict design parameters ensures that the predicted design reconstructs the optical response. Furthermore, we introduce the simplified physics, which is used as a constraint to guide the DL model to produce scientifically consistent results. This simplified physics is applied to an effective structure obtained by homogenizing the complicated grating layer of the predicted metamaterial design. The physics equations are then solved to obtain the optical response, which is compared to the initial input optical response to guide the DL model. The simplified physics is incorporated either as a part of the loss function of the DL model or as one of the layers in the DL model architecture. The overall method is shown schematically in Figure 2. We use the physics-informed DL approach to facilitate inverse design for a 1D periodic grating metamaterial structure, as illustrated in Figure 1a. Nevertheless, this methodology can also be extended to 2D periodic grating, multilayer optical metamaterial structures.

Components of the Physics-Informed DL Model's Architecture
In this subsection, we introduce the physics-informed DL architecture (PIA) model. We formulate the inverse problem and introduce the reconstruction constraints and the physics information that enable physics-consistent prediction and proper optical reconstruction of the predicted design.
For our study, we consider the inverse of the EM equations as the target function. Given input and output pairs of response and design, $\{R_i, D_i\}_{i=1}^{n}$, we train a DL model to learn the relationship between $R_i$ and $D_i$. For the forward EM equation, $R = f(D)$, the DL model is trained to approximate the inverse function, $f^{-1}$, using training data that follow $D_i = f^{-1}(R_i)$. As discussed in Section 1, due to the ill-posed nature of inverse problems and the high nonlinearity of the EM equations, a plain inverse model cannot ensure prediction of design parameters with accurate reconstruction of the optical response. Hence, we develop a tandem architecture wherein a DL model approximating the forward EM equation is appended to the inverse DL model, such that the output of the inverse model, $\hat{D}$, is fed into the forward DL model.
The forward DL model approximates the function that calculates $R$ from $D$. The loss of the forward model is defined as the difference between the predicted response from the forward DL model, $R_{\mathrm{recon}}$, and the response produced by the true design, $R$. The forward model is trained such that the difference between $R$ and $R_{\mathrm{recon}}$ is minimized.
The tandem model architecture consists of the inverse DL model in conjunction with the pretrained forward DL model.
After training the forward model, the weights of the model are frozen, i.e., they are treated as non-trainable parameters during the training of the tandem model. The inverse model predicts the design parameters, and this output is fed into the pretrained forward DL model, as depicted in Figure 3, to ensure proper reconstruction of the optical response. The entire structure is trained to minimize the design prediction error (deviation of the design predicted by the inverse model, $\hat{D}$, from the ground-truth design, $D$) and the reconstruction error (deviation of the response reconstructed from the predicted design, $R_{\mathrm{recon}}$, from the input response, $R_{\mathrm{input}}$).
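The interplay between the two error terms can be illustrated with a deliberately tiny toy problem, sketched below under strong simplifying assumptions: the frozen forward model is replaced by the fixed function $d \mapsto d^2$, the inverse model has a single trainable weight, and the gradient is obtained by finite differences rather than backpropagation. None of these choices come from the paper; they only mimic the one-to-many structure of the inverse problem.

```python
import math

def forward_model(design):
    """Stand-in for the frozen, pretrained forward network (design -> response)."""
    return design * design  # +d and -d give the same response

def inverse_model(response, a):
    """Toy inverse network with one trainable weight a (response -> design)."""
    return a * math.sqrt(response)

def tandem_loss(a, data, w_design, w_recon):
    """Mean of weighted design error plus reconstruction error over the data."""
    total = 0.0
    for r_input, d_true in data:
        d_pred = inverse_model(r_input, a)
        r_recon = forward_model(d_pred)  # forward model is never updated
        total += w_design * (d_pred - d_true) ** 2
        total += w_recon * (r_recon - r_input) ** 2
    return total / len(data)

def train(a, data, w_design, w_recon, lr=0.01, steps=2000, h=1e-6):
    """Gradient descent on a, using a central finite difference for the gradient."""
    for _ in range(steps):
        grad = (tandem_loss(a + h, data, w_design, w_recon)
                - tandem_loss(a - h, data, w_design, w_recon)) / (2.0 * h)
        a -= lr * grad
    return a

# Ground-truth designs are positive; both runs start from the same poor guess.
data = [(forward_model(d), d) for d in (0.5, 1.0, 1.5)]
a_recon_only = train(-0.5, data, w_design=0.0, w_recon=1.0)
a_with_design = train(-0.5, data, w_design=1.0, w_recon=1.0)
```

In this toy setup, the reconstruction-only run settles near $a = -1$, which reproduces the response perfectly but returns designs with the wrong sign, whereas adding the design term steers the weight to $a = +1$ and recovers the ground-truth designs.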
We introduce the physics-based constraint to guide the tandem DL architecture toward better design prediction. The physics-based constraint is calculated by solving the physics equations introduced in Section 2 for the optical response of the predicted design parameters. The calculated optical response is then compared with the true optical response. This model is named PIL. However, the simplified physics model is an approximation of the true model, and the effective homogenization principles are only followed by design parameters that have $\lambda_0 / p > 2$. Thus, we cannot incorporate the physics-based constraints for every design parameter in our domain.
To tackle this issue and to ensure that the physics knowledge is utilized for all observations, we assign the design parameters that follow the simplified physics model to an intermediate layer in the neural network (NN) model. The intermediate layer outputs the design parameters that produce the response according to the simplified physics-based model. The intermediate layer predicts the thickness parameters ($t_1$, $t_2$) and the fill factor ($f$), from which the effective refractive index is calculated according to Equation (3). The optical response of the simplified structure is calculated from the physics equations and compared to the true response for the physics-based constraint term. It is important to emphasize that these design parameters are not explicitly constrained; rather, they are trained to yield the true optical response under the 1D TMM equations. As previously highlighted, the true response varies with $p$, but different values of $p$ can yield an identical $f$. Through the simplified physics model, an identical $f$ would yield the same response for different $p$, which could deviate greatly from the true response. Consequently, the predicted $f$

Training of the Physics-Informed DL Model
Having discussed the architecture of the physics-informed model, in this section we describe the training of the DL model: 1) the objective function, which consists of data-based and physics-based losses, and 2) the training of the DL model by a non-convex optimization technique and backpropagation.

Loss Function of DL Model
To train our physics-informed DL model, we designed an objective function, Equation (6), that is minimized to learn the optimal model parameters. The objective function of the inverse model minimizes the loss associated with the design parameters, which is the first term in Equation (6). The tandem model architecture adds a reconstruction loss, which is the second term in Equation (6). This term calculates the deviation of the optical response produced by the predicted design, $\hat{D}$, from the desired response, i.e., the difference between the reconstructed response output by the pretrained forward model, $R_{\mathrm{recon}}$, and the input optical response, $R_{\mathrm{input}}$.
The physics is incorporated in the third term in Equation (6) as a penalty on the final output layer for PIL or on the intermediate layer for PIA. The penalty is the difference between the input optical response, $R_{\mathrm{input}}$, and the response calculated by solving the EM equations for a simplified design structure, $R_{\mathrm{physics}}$. The simplified design corresponds to homogenizing the top grating layer of the representative structure. For PIA, the final-layer design output is used only for the data-based loss term, and the intermediate-layer design output is used for the physics-based loss term. For PIL, in contrast, the final-layer design output is used for both the data-based and physics-based loss terms.
The weight of each loss term is a hyper-parameter. $w_{\mathrm{data1}}$ and $w_{\mathrm{data2}}$ are chosen by grid-search hyper-parameter tuning to give the best model performance on the validation set. $w_{\mathrm{phy}}$ is proportional to $\lambda_0 / p$, which ensures that more weight is given to the physics-based loss term when $\lambda_0 \gg p$ and the effective medium theory underlying the homogenization holds.
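The three-term objective described above can be sketched as a plain function. The use of mean squared error for each term and the proportionality constant `scale` are assumptions of this sketch, since the exact form of Equation (6) is not reproduced here; responses are treated as flat lists of real numbers (for example, real and imaginary parts listed separately).

```python
def mse(predicted, target):
    """Mean squared error between two equal-length sequences of real numbers."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)

def physics_weight(wavelength_nm, period_nm, scale=1.0):
    """Physics weight proportional to lambda_0 / p; scale is a hypothetical constant."""
    return scale * wavelength_nm / period_nm

def total_loss(d_true, d_pred, r_input, r_recon, r_physics,
               w_data1, w_data2, w_phy):
    """Weighted sum of design, reconstruction, and physics-consistency losses."""
    design_term = mse(d_pred, d_true)        # first term: design prediction error
    recon_term = mse(r_recon, r_input)       # second term: tandem reconstruction error
    physics_term = mse(r_physics, r_input)   # third term: simplified-physics penalty
    return w_data1 * design_term + w_data2 * recon_term + w_phy * physics_term

# Usage: the physics term is weighted more heavily for deeply subwavelength periods.
w_small_period = physics_weight(600.0, 100.0)   # lambda_0 / p = 6
w_large_period = physics_weight(600.0, 400.0)   # lambda_0 / p = 1.5
```

For PIL the design passed to the physics solver would be the final-layer output, while for PIA it would be the intermediate-layer output; the loss function itself has the same shape in both cases.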

Training the DL Model
DL models typically consist of multiple layers of neurons, with each layer consisting of hundreds of neurons. The number of layers and neurons depends on the type and complexity of the problem to be solved. Since we are dealing with a regression problem with continuous input, we chose fewer than 10 layers, each with a few hundred neurons. [21,67] Each neuron applies a nonlinear transformation, or activation function, to its input to approximate the relationship between the input and output variables. The most common activation functions used for this type of problem are the rectified linear unit (ReLU), [21,68] sigmoid, and hyperbolic tangent (Tanh). [13,26] However, the ReLU activation function, $\max(0, z_k)$, often encounters the "dying ReLU" problem, where the associated neuron only outputs zero. [69] This can be solved by using variants of ReLU, e.g., the smooth and continuous sigmoid-weighted linear unit (SiLU), [70] which removes the kink at zero and has a nonzero slope for negative inputs. The SiLU activation function has the functional form of the sigmoid function multiplied by its input, $z_k \sigma(z_k)$. [70] We use SiLU as the activation function for most layers, as it has the advantage of a non-saturating gradient while being smooth and continuous at all points. We also use Tanh as the activation function for some layers, as it preserves negative inputs and has strong gradients, which lead to large learning steps and faster convergence.
The model parameters of the physics-informed DL model are determined by minimizing the objective function defined in Section 3.2.1. The objective function is optimized by backpropagation with the "AdamW" optimizer, which uses adaptive moment estimation along with weight decay. This optimizer is chosen because momentum helps in faster convergence and weight decay provides additional regularization to prevent overfitting. The physics principles are leveraged as additional regularization terms that steer the learning of the model parameters such that the model predictions are consistent with the governing physics.
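The difference between the two activations can be checked numerically with the short sketch below, which implements ReLU, SiLU, and their derivatives for scalar inputs. The function names are illustrative; deep-learning frameworks provide their own vectorized versions.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def relu(z):
    """Rectified linear unit, max(0, z)."""
    return max(0.0, z)

def relu_grad(z):
    """Derivative of ReLU: exactly zero for all negative inputs."""
    return 1.0 if z > 0 else 0.0

def silu(z):
    """Sigmoid-weighted linear unit, z * sigmoid(z)."""
    return z * sigmoid(z)

def silu_grad(z):
    """Derivative of SiLU: sigmoid(z) * (1 + z * (1 - sigmoid(z)))."""
    s = sigmoid(z)
    return s * (1.0 + z * (1.0 - s))

# At z = -1 a ReLU neuron passes no gradient, while SiLU still does.
grad_relu_neg = relu_grad(-1.0)
grad_silu_neg = silu_grad(-1.0)
```

A neuron whose pre-activation is negative for every training sample receives no update under ReLU, which is the "dying ReLU" situation; under SiLU the gradient at the same point is small but nonzero, so the neuron can still recover.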
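For readers unfamiliar with the optimizer, a single-parameter sketch of one AdamW update is shown below, following the usual decoupled weight decay formulation. The default hyper-parameter values are common library defaults and are not the settings used in the paper.

```python
import math

def adamw_step(theta, grad, m, v, step, lr=1e-3, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=1e-2):
    """One AdamW update for a scalar parameter; step counts from 1."""
    m = beta1 * m + (1.0 - beta1) * grad           # first-moment (momentum) estimate
    v = beta2 * v + (1.0 - beta2) * grad * grad    # second-moment estimate
    m_hat = m / (1.0 - beta1 ** step)              # bias correction
    v_hat = v / (1.0 - beta2 ** step)
    # Weight decay acts on the parameter directly, not through the gradient.
    theta = theta - lr * (m_hat / (math.sqrt(v_hat) + eps) + weight_decay * theta)
    return theta, m, v

# With a zero gradient, only the weight decay shrinks the parameter.
theta_decay_only, _, _ = adamw_step(1.0, 0.0, 0.0, 0.0, step=1)
# On the first step the adaptive term has magnitude close to lr, whatever the gradient scale.
theta_first, m1, v1 = adamw_step(1.0, 2.0, 0.0, 0.0, step=1)
```

Because the decay is applied outside the adaptive rescaling, its strength does not depend on the gradient history, which is the property that distinguishes AdamW from Adam with an L2 penalty added to the loss.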

Evaluation
In this section, we discuss the evaluation of our developed algorithm and describe the dataset used. To model the complicated relationship between the design parameters and the EM response, a number of variables must be considered. They can be grouped into the following categories: parameters of the incident light: wavelength of the incident light; design parameters: width ($w$), period ($p$), and thickness ($t_1$) of the metal grating, and thickness ($t_2$) of the polymer; and optical response: reflection and absorption of the incident light.
Since we used an absorbing material, gold, for the substrate and the metal grating, the metamaterial structure only absorbs and reflects the incident light. Since the transmission of light is not considered, the complex reflection coefficient and the absorption of light are the input optical responses to the DL model. Absorption of light is the intensity attenuation as light passes through the material. The reflection coefficient has real and imaginary components. The complex reflection coefficient captures the phase shift between the incident and reflected EM waves. Due to the conservation of energy, the amplitude of the reflection coefficient and the absorption coefficient follow Equation (7).
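With transmission neglected, energy conservation takes the simple form sketched below, where $r$ is the complex reflection coefficient and $A$ is the absorption. This is assumed to be the content of Equation (7); the symbols $r$ and $A$ are notation chosen for this sketch.

```latex
|r|^2 + A = 1
\quad\Longleftrightarrow\quad
A = 1 - \left[ (\operatorname{Re} r)^2 + (\operatorname{Im} r)^2 \right]
```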
The wavelength of light is considered in the visible range. For this study, we have developed our model for three input cases: fixed single wavelength, variable single wavelength, and multiple wavelengths. For the fixed single wavelength model, the optical response (reflection and absorption) for a fixed wavelength value is considered for inverse design prediction. For the variable single wavelength model, the optical response (reflection and absorption), the particular wavelength, and material information are provided as input to the inverse DL model. For the multiple wavelengths model, the reflection and absorption for 40 equidistant wavelength points in the wavelength range 500-700 nm are input to the inverse DL model. The standardized optical response for a particular set of design parameters is given in Figure 4. The design parameters span the ranges of 50-500 nm for the period of the grating, 30-200 nm for the width of the grating, 10-60 nm for the thickness of the grating, and 10-100 nm for the thickness of the polymer film.
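The sampling grid and parameter ranges above can be written down as a short sketch. It assumes that both endpoints of the 500-700 nm range are included in the 40 points and that the nominal ranges quoted above are used as the bounds for rescaling to [0, 1]; both are assumptions, and the helper names are hypothetical.

```python
# Nominal design ranges in nm, taken from the text above.
DESIGN_RANGES_NM = {
    "p": (50.0, 500.0),   # period of grating
    "w": (30.0, 200.0),   # width of grating
    "t1": (10.0, 60.0),   # thickness of grating
    "t2": (10.0, 100.0),  # thickness of polymer film
}


def wavelength_grid(start_nm=500.0, stop_nm=700.0, n_points=40):
    """Equidistant wavelengths, endpoints included (assumed)."""
    step = (stop_nm - start_nm) / (n_points - 1)
    return [start_nm + i * step for i in range(n_points)]


def min_max_scale(design):
    """Map each design parameter to [0, 1] using its nominal range."""
    return {k: (v - DESIGN_RANGES_NM[k][0]) /
               (DESIGN_RANGES_NM[k][1] - DESIGN_RANGES_NM[k][0])
            for k, v in design.items()}


def inverse_scale(scaled):
    """Map scaled parameters back to nanometres."""
    return {k: DESIGN_RANGES_NM[k][0] +
               v * (DESIGN_RANGES_NM[k][1] - DESIGN_RANGES_NM[k][0])
            for k, v in scaled.items()}


grid = wavelength_grid()
scaled = min_max_scale({"p": 275.0, "w": 30.0, "t1": 60.0, "t2": 55.0})
print(len(grid), scaled)
```

Rescaling matters here because the period spans 450 nm while the grating thickness spans only 50 nm; without it, an absolute-error loss would be dominated by the period.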

Dataset Preparation
The dataset is prepared by RCWA, a semi-analytical approach that solves the EM equations. RCWA offers a computationally efficient and numerically stable method to provide exact solutions to Maxwell's equations for multilayer periodic structures. [37,71,72] In this algorithm, infinite periodic structures are calculated in a Fourier harmonic basis. For the purpose of this study, ten Fourier harmonic orders are considered. A 2D RCWA is performed that considers stacking along the x3-direction and a periodic grating along the x1-direction. The inputs to the RCWA model are the design dimensions, material parameters, and parameters of the incident light. The model solves the forward problem and provides the reflection and absorption coefficients.
In the variable single wavelength model, we recorded a total of 60 000 observations for TE polarization. This dataset included the optical response measured at 40 evenly spaced wavelength points ranging from 500 to 700 nm. Each wavelength point was evaluated for 1500 unique design parameters. We used these 1500 distinct design parameters along with their corresponding optical response spectra measured at 40 different wavelength points as the dataset for the multiple wavelength model. For the fixed single wavelength model, we collected 40 000 observations for TE polarization at a fixed incident wavelength of 450 nm. The dataset was divided into a 75-25% train-test split, and within the training data, a further 75-25% train-validation split was applied. This resulted in a total of 22 500 observations used for training, 10 000 for testing, and 7500 for validation in the fixed single wavelength model. In contrast, for the variable single wavelength model, we utilized a larger dataset, which consisted of 33 750 observations for training, 15 000 observations for testing, and 11 250 observations for validation.
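The split sizes quoted above follow from applying the 75-25% split twice, which can be verified with a short calculation. The helper name is hypothetical, and the sketch only reproduces the counts; it does not reflect how the samples were actually shuffled or assigned.

```python
def nested_split(n_total, test_frac=0.25, val_frac=0.25):
    """Return (train, validation, test) counts for a train-test split
    followed by a train-validation split of the remaining data."""
    n_test = round(n_total * test_frac)
    n_trainval = n_total - n_test
    n_val = round(n_trainval * val_frac)
    n_train = n_trainval - n_val
    return n_train, n_val, n_test


# Fixed single wavelength model: 40 000 observations at 450 nm.
print(nested_split(40_000))

# Variable single wavelength model: 1500 designs x 40 wavelength points.
print(nested_split(1500 * 40))
```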

Model Description
We implemented our physics-informed DL model using PyTorch. As discussed in Section 4.1.1, the dimensions of the design parameters have different ranges. Hence, to avoid biases, we normalize the data with min-max scaling, which ensures that the data lie between 0 and 1. To tackle nonlinearity, we used a mix of nonlinear activation functions, namely SiLU [70] and Tanh. The output layer of our model has a sigmoid activation. This ensures that the output ranges between 0 and 1, in accordance with the range of the variables considered in the model. All objective functions are computed using the mean absolute error (MAE), which calculates the L1 deviation. We use the MAE loss to avoid a heavy penalty on outliers. The simplified physics model is introduced as a regularizer to train the DL model. The efficacy of the simplified physics for the metamaterial structure in Figure 1a is evaluated in Figure 5. We observed that our dataset validates the expected behavior of the homogenized model in comparison with the full-feed simulation. The spectra graphs of Figure 5b show that as λ0/p decreases, the absorption spectrum computed with the homogenized model increasingly diverges from that of the full-feed simulation. Figure 5b shows that the absolute difference between the absorption coefficient computed from the homogenized model and from the full-feed model (y-axis) decreases as λ0/p (x-axis) increases. Therefore, our dataset corroborates the limitation of using an effective medium theory-based physics model, which restricts the homogenization of a periodic optically responsive structure to large values of λ0/p. Thus, it is necessary to introduce a weight (w_phy) for the physics-based model, which is proportional to λ0/p and is set to 0 when λ0/p < 2 in the PIL model.
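The gating of the physics term can be sketched as follows. Only the cutoff at λ0/p < 2 and the proportionality to λ0/p come from the text; the proportionality constant `scale`, the additive combination of the three MAE terms, and all function names are assumptions made for illustration (the actual objective is the one defined in Section 3.2.1).

```python
def mae(pred, target):
    """Mean absolute error (L1 deviation) between two sequences."""
    return sum(abs(a - b) for a, b in zip(pred, target)) / len(pred)


def physics_weight(wavelength_nm, period_nm, scale=0.1, cutoff=2.0):
    """Weight w_phy: proportional to lambda0/p, zero below the cutoff.
    `scale` is a hypothetical proportionality constant."""
    ratio = wavelength_nm / period_nm
    return 0.0 if ratio < cutoff else scale * ratio


def total_loss(design_pred, design_true, resp_recon, resp_input,
               resp_physics, wavelength_nm, period_nm):
    """Assumed additive objective: design + reconstruction + physics terms."""
    w_phy = physics_weight(wavelength_nm, period_nm)
    return (mae(design_pred, design_true)
            + mae(resp_recon, resp_input)
            + w_phy * mae(resp_physics, resp_input))


# Subwavelength case (lambda0/p = 6): the physics term contributes.
print(total_loss([0.5, 0.5], [0.4, 0.6], [0.2], [0.25], [0.35], 600.0, 100.0))
# lambda0/p = 1.5 < 2: the physics term is switched off.
print(total_loss([0.5, 0.5], [0.4, 0.6], [0.2], [0.25], [0.35], 450.0, 300.0))
```

Switching the term off below the cutoff keeps the homogenized model from penalizing designs in the regime where it is known to disagree with the full simulation.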

Results and Discussion
We evaluate our physics-informed DL model for fixed single wavelength, variable single wavelength, and multiple wavelengths of incident light. For the fixed single wavelength model, the entire model is trained on a fixed wavelength of incident light. For variable single wavelength, the model predicts design parameters given the optical response and wavelength information for a single wavelength of incident light. For multiple wavelengths, the model predicts design parameters given the optical response for a spectrum of wavelengths of incident light in the visible range.

Overall Performance of the Tandem Physics-Informed DL Model
We evaluate our physics-informed DL models to solve the inverse problem for the dataset described in Section 4.1.2. For all three aforementioned input cases, the model is able to perform inverse design with proper reconstruction of the optical response. The inverse design model for fixed single wavelength trained with λ = 450 nm converges with ≈7% MAE. For variable single wavelength and multiple wavelengths, our model converges with ≈4% MAE and 2.6% MAE, respectively, for inverse design. Furthermore, our model is able to achieve <1% reconstruction error for the single wavelength models and ≈3% reconstruction error for the multiple wavelength model. The comparison of performance for inverse design and reconstruction by our PIA model with the tandem-NN and PIL models is depicted in Figure 6.
We validate and quantify the expected behavior from various studies [29,73] of the simple feed-forward NN and tandem-NN models using our dataset. We observe that for the single wavelength response input cases, Figure 6a,b, the reconstruction error significantly decreases on using the tandem-NN model. As the design error is also a component of the objective function (Equation (6)), the tandem-NN model improves the reconstruction while maintaining the accuracy of the inverse design.
In spite of the improvement in performance in tandem-PIL, physics knowledge is not utilized during training of all design parameters, to prevent erroneous prediction, as the simplified physics deviates substantially from the true governing physics. Thus, to make better use of the simplified physics model, we develop the PIA model. In this model, we embed the simplified physics-obeying design parameters into an intermediate layer of neurons in the DL model, thus enabling us to model

Performance on Fixed Single Wavelength Input
For fixed single wavelength, we see that PIA performs best in terms of reconstruction of the optical response and design prediction error. In Figure 6a, we see that tandem-NN outperforms the vanilla NN by achieving a three-fifths reduction in response reconstruction error. This improvement is expected, as a simple feed-forward neural network does not perform well for inverse design problems. The reconstruction error is improved by one-fourth relative to the tandem-NN model by using the PIA model. The inverse design error is seen to reduce with the addition of the physics-based constraint, with comparable error values for PIL and PIA, as seen in Figure 6d. Figure 6a,d depicts the reconstruction error and design prediction error for the DL models for light with a wavelength of 450 nm incident on the structure. Inverse design is more challenging for shorter wavelengths due to the emergence of multiple diffraction orders, which increases the complexity of the underlying physics. Consequently, we present the result of our model evaluated on 450 nm incident light for all fixed single wavelength evaluations. However, it is to be noted that the same model architecture is used for inverse design at longer wavelengths within the visible light region as well.

Performance on Variable Single Wavelength Input
PIA performs design prediction with the lowest reconstruction error for variable single wavelength input. Figure 6b,e depicts the optical response reconstruction error and design prediction error for the DL models for a single wavelength of incident light with wavelengths between 500 and 700 nm. Figure 6b shows that tandem-NN outperforms the vanilla NN by achieving a one-fourth reduction in response reconstruction error, as anticipated given the challenges faced by basic feed-forward neural networks in inverse design. The reconstruction error improves by one-tenth relative to the tandem-NN model by using the PIA model. The inverse design error reduces by one-tenth due to the addition of the physics-based constraint, with comparable error values for PIL and PIA, as seen in Figure 6e.

Performance on Multiple Wavelength Input
The predictive power of the physics-based inverse design and response reconstruction is analyzed for multiple wavelength input, i.e., with the wavelength spectrum response as input. In Figure 6f, we see that physics-informed DL improves on the design prediction accuracy of data-based DL, reducing the prediction error by one-fifth. However, we also note that the introduction of tandem-NN does not improve the reconstruction of the optical response, unlike in the single wavelength input cases. This is because different designs are unlikely to produce identical responses for all optical parameters at every wavelength value in the spectrum. Since there are multiple optical response parameters (complex reflection and absorption coefficient) for each of the 40 wavelength points, the dataset we considered does not pose a nonunique inverse design problem when the response of a wavelength spectrum is the input. Hence, the introduction of the tandem-NN model does not increase reconstruction accuracy for the multiple wavelength case, as seen in Figure 6c.
By comparing Figure 6f,e, it is noted that the inverse design loss for multiple wavelength input is lower than that of the inverse model with single wavelength information as input. This is due to the larger amount of information fed as input to the multiple wavelength model, as the optical response for 40 wavelength points is the input, as opposed to the optical response at one wavelength point in the single wavelength cases.
We also validated the performance of the PIA model in reconstructing the optical response. As seen in Figure 7e, the optical response generated by the design predicted by our physics-informed DL model very closely approximates the optical response calculated from the full-feed RCWA simulation for the true design. The misfit between the reconstructed response spectrum and the input appears as a slight increase in the variance of the reconstructed spectrum. The increased variance can be attributed to the use of two DL models (PIA for inverse design and the forward DL model for prediction of the response), as DL models tend to overfit and produce high-variance results. However, this difference is small (≈3%), and the trend of the response is captured by the PIA model.

Robustness of the PIA Model
We analyze the ability of the PIA model to perform design prediction for low data and out-of-training sample cases.
In Figure 7a, we observe that our PIA model is able to predict design parameters with very few training examples. The reported evaluations in Figure 6 are based on DL models trained with >20 000 data points. It is well known that DL models perform better function approximation and generalization with an increased amount of training data. However, since collecting data involves expensive forward computation, we add external knowledge to limit the training data requirement without loss of model performance. The introduction of the physics provides a source of knowledge about the behavior of the function we aim to approximate. The addition of physics allows for better generalizability with low data dependency. Figure 7a shows that for the model trained on a fixed single wavelength of 450 nm, the physics-informed DL model (PIA) consistently achieves a lower prediction error than the data-based model when fewer training examples are available. The difference in error grows with a decreasing number of training samples. Therefore, the addition of physics to the inverse design DL model decreases the data burden and improves test accuracy. In addition, we evaluated the model's ability to predict test examples beyond the range of the training data using the fixed single wavelength model trained on 450 nm incident light. The generalizability of the model was tested on the design parameters grating period (p) and polymer thickness (t2) and on the input parameter incident light absorption (A), with values outside the range used to train the DL model. Figure 7b-d displays the model's performance for lower p and t2 values and higher A values than those in the training set. The results show that, in all cases, PIA has a lower design prediction error than the data-based model.

Model Characterization
We analyzed the importance of the variables in the forward problem, as shown in Figure 8a. This could provide a basis for weighting the prediction error of each design parameter in the inverse model according to the variable importance. The importance of each variable is computed from the response prediction error obtained when that variable does not contribute to the network's calculation of the response. We analyzed the importance of design, light, and material parameters. For the full forward model, the average error is ≈0.4%. We observe that the presence of either the refractive index information or the wavelength of the incident light is sufficient for the forward model to perform as well as the full model. However, if neither the incident light nor the material parameter information is provided, the model cannot predict the response properly, recording an average loss of ≈5.36%. Similarly, it is seen that the design parameters are paramount to calculating the response, with the most important parameter being the thickness of the film (MAE ≈ 13%), followed by the period of the grating (MAE ≈ 6.5%), the thickness of the metal grating (MAE ≈ 4.2%), and the width of the grating (MAE ≈ 2.3%).
In this study, we reported inverse design prediction results by the DL model for all the variables in the design space. That is to say, we considered all the design parameters to be unknown, so that the entire parameter set that properly reconstructs the response is to be determined. However, in practice, during the design or fabrication of photonic structures there are often constraints on multiple design parameters. In this case, not all design parameters need to be predicted by the inverse model. If the design information about a subset of design parameters is fed in, it reduces the uncertainty of the model prediction and the size of the target space, thus reducing the prediction error. We corroborated this in Figure 8b, for variable single wavelength input, which shows that as we fix each design parameter for the prediction model, the design error for the other variables reduces. If only two parameters are to be predicted (the thicknesses t1 and t2, or the grating design parameters w and p), the design error is lower than for the full design prediction, and the error for single design parameter prediction is lower still. Also, it is noted that the period of the grating, p, and the thickness of the metal bars, t1, are the most difficult design parameters to predict.
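The variable-importance analysis described at the start of this section can be illustrated with a small ablation sketch. It is not the authors' procedure: here a variable is "removed" by replacing it with its dataset mean, the forward model is a hypothetical linear stand-in with made-up coefficients, and the three-row dataset exists only to make the example self-contained.

```python
def mae(pred, target):
    """Mean absolute error between two sequences."""
    return sum(abs(a - b) for a, b in zip(pred, target)) / len(pred)


def ablation_importance(model, rows, targets):
    """Increase in MAE when one input is replaced by its dataset mean."""
    baseline = mae([model(r) for r in rows], targets)
    importance = {}
    for name in rows[0]:
        mean_value = sum(r[name] for r in rows) / len(rows)
        ablated = [dict(r, **{name: mean_value}) for r in rows]
        importance[name] = mae([model(r) for r in ablated], targets) - baseline
    return importance


def toy_forward(row):
    """Hypothetical stand-in for the trained forward model."""
    return 0.6 * row["t2"] + 0.3 * row["p"] + 0.1 * row["w"]


# Tiny synthetic dataset in normalized units (illustrative only).
rows = [
    {"t2": 0.0, "p": 0.5, "w": 1.0},
    {"t2": 0.5, "p": 1.0, "w": 0.0},
    {"t2": 1.0, "p": 0.0, "w": 0.5},
]
targets = [toy_forward(r) for r in rows]
scores = ablation_importance(toy_forward, rows, targets)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking, scores)
```

A ranking of this kind could then be used to weight the per-parameter terms of the inverse design loss, as suggested in the text.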

Conclusion
In summary, to enable efficient and scientifically consistent design of metasurfaces in a supervised learning setting, we introduce a physics-informed machine-learning model. This model leverages the power of neural networks to uncover the interdependence between device topology and optical response. To address the issues arising when using a deep neural network to solve inverse problems, such as nonunique predictions and high data burden, we propose a tandem architecture that predicts explainable and scientifically consistent design parameters while accurately reconstructing the optical response. By combining the forward and inverse models in the tandem architecture, we overcome the issue of nonunique prediction. Additionally, the inclusion of a physics-based penalty reduces data burden and increases generalizability while ensuring scientifically consistent prediction. To compute this physics-based term, we simplify the metamaterial structure into a stratified medium and solve for the optical response analytically. However, as our model deals with data where the dimensions are approximately equal to the wavelength, we restrict the physics-based loss function to subwavelength observations. To make use of physics knowledge for all observations, we developed a PIA that includes physics-consistent design parameters as intermediate neurons. This approach drives the model toward scientifically consistent prediction without constraining the final design output to be physics consistent. Our proposed model achieves high accuracy in design prediction and optical response reconstruction, including up to 96% accuracy (4% MAE) for design prediction, 99.5% accuracy for the reconstruction of optical response for variable single wavelength input, and 97% accuracy in design prediction for multiple wavelength input.

Figure 1. Approximation of a) the heterogeneous layer in the metal-polymer-metal structure b) with a layer having an "average" property, which simplifies computation.

Figure 2. Overview of the components of the physics-informed deep learning (DL) model. R_input is the input optical response, and D is the design predicted by the inverse model. The predicted design, D, is then fed into a pretrained forward model to ensure proper reconstruction of the optical response, R_recon. The structure of D is then simplified according to the homogenization principle, and the physics equations are solved to obtain R_physics, which ensures physics-informed learning.

by the DL model is tailored to generate the true response through simplified physics calculations. This adjusted f′ diverges from the genuine f since it generates distinct responses for varying p values. After the intermediate layer, the NN consists of a few layers that learn the true design parameters, i.e., those that produce the response as per the true physics model. This portion of the NN learns the change in the design parameters due to the introduction of the grating in the top layer. The final output predicts the width (w) and period (p) of the gold bars, and the thickness of the gold bars (t1) and film (t2). Hence, in this method, the physics is incorporated in the DL architecture and can guide the DL model by using simplified physics without constraining the final design to the requirements of the simplified model. This modification enables the use of physics knowledge for all the design parameters. The DL architecture of the PIA model is depicted in Figure 3. This model is called PIA. The initial portion of PIA learns the inverse function consistent with the physics of the homogenized model and guides the later part of the model to learn the effect of introducing rectangular metal blocks with air gaps instead of a homogeneous metal block.
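The two-stage structure described above can be sketched as a forward pass in plain Python. This is a structural illustration only: the layer sizes, the number of intermediate physics-consistent variables (three here), the random untrained weights, and the class and method names are all assumptions, and a real implementation would use trainable PyTorch layers.

```python
import math
import random


def silu(z):
    return z / (1.0 + math.exp(-z))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(x, weights, biases, act):
    """Fully connected layer followed by an activation function."""
    return [act(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


def init_layer(n_in, n_out, rng):
    """Random (untrained) weights, for shape illustration only."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class PIASketch:
    def __init__(self, n_response=3, n_hidden=8, n_physics=3, n_design=4, seed=0):
        rng = random.Random(seed)
        self.encoder = init_layer(n_response, n_hidden, rng)
        self.physics_head = init_layer(n_hidden, n_physics, rng)
        self.refiner = init_layer(n_physics, n_hidden, rng)
        self.design_head = init_layer(n_hidden, n_design, rng)

    def forward(self, response):
        h = dense(response, *self.encoder, silu)
        # "Design (Physics) layer": variables tied to the homogenized model.
        design_physics = dense(h, *self.physics_head, sigmoid)
        h2 = dense(design_physics, *self.refiner, math.tanh)
        # "Output (Design) layer": normalized w, p, t1, t2.
        design = dense(h2, *self.design_head, sigmoid)
        return design_physics, design


model = PIASketch()
# Input: real part, imaginary part of reflection, and absorption (hypothetical).
design_physics, design = model.forward([0.6, -0.3, 0.55])
print(design_physics, design)
```

Returning both outputs lets a physics-based penalty act on the intermediate variables while the design error is computed on the final output, so the final design is not forced to obey the simplified model.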

Figure 3. Physics-informed DL architecture (PIA). The "Design (Physics) layer" includes intermediate variables that are consistent with the underlying physics. These variables are calculated based on the design parameters that result in a certain response, as predicted by the electromagnetic equations for the simplified homogenized structure. The "Output (Design) layer" then uses these intermediate variables to predict the original complex design parameters.

Figure 6. Design and reconstruction accuracy: a,d) model trained for incident light of fixed single wavelength; b,e) model trained for incident light of variable single wavelength; and c,f) model trained for incident light of multiple wavelengths. a-c) Reconstruction accuracy improves on using the tandem neural network (tandem-NN) model instead of the inverse NN model. Addition of the physics-based constraint in the loss (PIL) does not change the reconstruction error, but it improves further on using PIA instead of PIL. d-f) Design accuracy improves on addition of the physics-informed loss. Design accuracy remains approximately the same for PIA and PIL, with an increase in reconstruction accuracy.

Figure 7. Robustness of physics-informed learning versus data-based learning: a) with a decreasing number of training examples, the PIA model performs increasingly better than tandem-NN. b-d) The generalization power of PIA is better than that of tandem-NN when the design parameter (period of grating, thickness of polymer) or input parameter (absorption of incident light) is out of the range of the training samples. e) The PIA model has the ability to reconstruct the absorption response for a wavelength spectrum.

Figure 8. a) Importance of variables in the forward modeling of the optical response from input parameters. The figure shows the error in the prediction of the optical response when any of the design, material, or incident light parameters is absent. b) The prediction power of the inverse DL model increases when partial information on the design parameters is given.