Deep‐Learning Assisted Polarization Holograms

Multiplexing holography with metasurfaces using different degrees of freedom of light has enabled recent applications in display and information processing. For polarization‐multiplexed holograms, the most general form is an arbitrary Jones matrix profile, which stores the maximum amount of information. Realizing it requires a relaxation from the conventional single‐layer implementation of nanostructures to bianisotropic metasurfaces, but this also complicates both the inverse design of the nanostructures and the hologram generation algorithm. Here, an integrated neural network approach, extended from the recent DeepCGH algorithm, is developed to obtain metasurface structural profiles directly from independent holograms from an arbitrary set of polarizations to another, with up to four different co‐ and cross‐polarization conversion channels. Such an information‐driven approach enables designing complex polarization holograms directly from an existing metamaterial library without detailed physical knowledge of the constraints, and can be extended to other multiplexing holograms to further facilitate efficient usage of the information stored on a metasurface.


Introduction
Recently, computer-generated holograms (CGH) have found practical applications with metasurfaces due to their superior capabilities in storing huge amounts of information. By shining laser light on these metasurfaces, the stored information can then be revealed as optical holograms with designed amplitude, phase, or both amplitude and phase profiles. [1,2] Furthermore, there has been a series of developments in metasurfaces to […] is expected to add the cross-polarization flexibility and further increase information capacity. Nevertheless, to design these polarization holograms of multiple channels, CGH techniques can be adopted by modifying the Gerchberg-Saxton (GS) algorithm. [23][24][25][26][27][28][29] However, such an approach is not easily scalable to more complex geometries (to break $t_{xy} = t_{yx}$) or to a wider variety of nanostructures. To overcome this issue and to further enhance our design capability for polarization holograms, it is beneficial to adopt a machine learning approach, which can be generic enough to handle more complex information stored on metasurfaces or a larger multiplexing capability. A machine-based approach, focusing on the information rather than the physics perspective, does not need a case-by-case extension of the GS algorithm. Automation of the inverse design of structures [35][36][37][38][39][40] from required phases has recently been adopted in wavelength-multiplexed metasurface holograms with a hybrid of neural network and evolution strategy optimization, [41] and in polarization-multiplexed metasurfaces with an end-to-end framework to facilitate full exploitation of the prescribed design space and push the multifunctional design capacity to its physical limit. [42] On the other hand, the GS algorithm for hologram generation can also be replaced by an unsupervised neural network, called the DeepCGH algorithm.
[43] The deep-learning-based algorithm gives higher accuracy and faster hologram generation than the GS algorithm.
In this work, we develop an integrated deep neural network that implements both hologram generation and inverse design of the metamaterial nanostructures at the same time, as a generic solution to metasurface holograms. Taking bianisotropic metasurfaces as an example for achieving the general form of polarization holograms, the integrated network is able to design the metasurface profile to obtain independent phase holograms for the four combinations of polarization conversion channels. By integrating the hologram generation and inverse design components into one network, our proposed method does not need case-by-case optimization for each target configuration and can generate the metasurface designs more automatically and efficiently. As the constraints on the Jones matrix elements are now simply hidden in the accessible geometries of the nanostructures as a library, the approach can be extended to other scenarios for metasurfaces with more degrees of freedom (DOFs), such as orbital angular momenta and diffraction orders. It can also be applied to situations in which the incident or output polarizations are not orthogonal but have arbitrary specifications.

Jones Matrix Library for a Family of Bianisotropic Metamaterials
To have a higher chance of generating very different Jones matrix elements, we start from a family of bianisotropic metasurfaces. We note that bianisotropic metasurfaces have been previously shown to give asymmetric linear polarization (LP) conversion in forward and backward incidence, or equivalently $t_{xy} \neq t_{yx}$ for the same side of incidence. [44,45] […] [48][49][50] Here, the bianisotropic structures we use as a template are made of silicon fins (green, with permittivity 12) within a glass matrix (cyan, with permittivity 2.25) in a square lattice (with periodicity a = 1000 nm), as shown in the upper panel of Figure 1a. The silicon fin has a three-layer structure, with the top and bottom layers (both with height h = 250 nm) being L-shaped and a square pillar (of tunable width w and thickness t = 125 nm) connecting the two layers. The two L-shaped bars break the mirror symmetry in the z-direction, generating bianisotropy, while the pillar in the middle enhances the polarization cross-coupling between the two layers. Without the center pillar, the transmission amplitudes in the cross-polarization channels are lower and the phase coverage in the co-polarization channels is narrower (please see Section S3 and Figure S3, Supporting Information for more details). The size of the silicon structure in the y-direction is l = 637.5 nm. The vertical pillar is shifted by Δx and Δy relative to the two vertical middle planes of the unit cell (dashed line frames). This structure is defined as "right-handed." The "left-handed" structure flips the handedness through a mirror operation about the plane x = y, as shown in the lower panel of Figure 1a.
Next, we perform full-wave simulations (COMSOL Multiphysics) to obtain the Jones matrix elements of the right-handed structure as we scan the geometric parameters. The results are shown in Figure 1b for normal incidence (along the positive z-direction) and a fixed wavelength of 1550 nm. The scanning ranges of the geometric parameters in the full-wave simulations are: w from 187.5 to 237.5 nm (in steps of 5 nm) and both Δx and Δy from −125 to 125 nm (in steps of 12.5 nm). From the interpolated results, we can see that the argument of each Jones matrix element in the LP basis (e.g., $t_{xy}$ is the transmission coefficient from incident y- to x-polarization) can cover the full range of 2π, and that $t_{xy}$ is now significantly different from $t_{yx}$, as needed to generate the most general form of polarization holograms. As the argument of $t_{yy}$ does not change much over the whole family, we add a parameter θ, the orientation of the nanostructure, by rotating the structure in the counter-clockwise direction. In this case, the Jones matrix of the rotated (right-handed) structure is
$$J^{\mathrm{RH}}(w,\Delta x,\Delta y,\theta) = R(\theta)\, J^{\mathrm{RH}}(w,\Delta x,\Delta y)\, R(-\theta),$$
where
$$R(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$$
and the Jones matrix before rotation (with the phases shown in Figure 1b) is
$$J^{\mathrm{RH}}(w,\Delta x,\Delta y) = \begin{pmatrix} t_{xx}^{\mathrm{RH}}(w,\Delta x,\Delta y) & t_{xy}^{\mathrm{RH}}(w,\Delta x,\Delta y) \\ t_{yx}^{\mathrm{RH}}(w,\Delta x,\Delta y) & t_{yy}^{\mathrm{RH}}(w,\Delta x,\Delta y) \end{pmatrix}.$$
The superscript "RH" on the Jones matrix elements indicates that they are defined for the right-handed structure. The whole library also includes the left-handed structures (with superscript "LH"), whose Jones matrix before rotation is related to that of the right-handed structures by the mirror operation about the plane x = y as
$$J^{\mathrm{LH}}(w,\Delta x,\Delta y) = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} J^{\mathrm{RH}}(w,\Delta x,\Delta y) \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} t_{yy}^{\mathrm{RH}} & t_{yx}^{\mathrm{RH}} \\ t_{xy}^{\mathrm{RH}} & t_{xx}^{\mathrm{RH}} \end{pmatrix},$$
in order to flip the phase relationship between $t_{xy}$ and $t_{yx}$, which is otherwise fixed by the handedness of the structure. Now, both the left-handed and right-handed structures with different geometric parameters w, Δx, Δy, and θ form the whole metamaterial/nanostructure library, whose structures can be used to generate holograms in the next stage. We have four continuously tunable geometric parameters, available by design from the bianisotropic structure, to control the target phase-space: four transmission phases in different 
polarization channels. However, whether the library is sufficient to cover the whole phase-space is still not clear at this stage, but the question can be assessed at a later stage with the integrated neural network. For more details, we also show the amplitudes and the phases of the Jones matrix at a fixed w = 202.5 nm (and θ = 0°) in Figure S1 (Supporting Information). In fact, the arguments of the Jones matrix elements cover the full 2π range through local resonances, which are also revealed by the fluctuation of the amplitudes. It is found that, on average, $t_{xy}$ and $t_{yx}$ have smaller amplitudes than $t_{xx}$ and $t_{yy}$. Such amplitudes will also be taken into account as a constraint in the whole algorithm for generating holograms. For completeness, the Jones matrix in the circular polarization (CP) basis can be expressed from the one in the LP basis as
$$\begin{pmatrix} t_{LL} & t_{LR} \\ t_{RL} & t_{RR} \end{pmatrix} = \Lambda^{-1} \begin{pmatrix} t_{xx} & t_{xy} \\ t_{yx} & t_{yy} \end{pmatrix} \Lambda,$$
where the columns of the unitary matrix $\Lambda$ are the left- and right-handed circular unit vectors written in the LP basis, in order to generate polarization holograms in the CP basis.
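The library operations described above (rotation by θ, mirroring about the plane x = y, and the change to the CP basis) can be sketched with plain 2 × 2 matrix algebra. This is a minimal illustration and not the authors' code: the function names and the numerical Jones matrix are made up, and the CP convention $e_L = (\hat{x} + i\hat{y})/\sqrt{2}$, $e_R = (\hat{x} - i\hat{y})/\sqrt{2}$ is an assumption (the opposite sign convention simply swaps the L and R labels).

```python
import math

def matmul2(a, b):
    """Product of two 2x2 (complex) matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def rotation(theta):
    """Counter-clockwise rotation matrix R(theta)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def rotate_jones(jones, theta):
    """Jones matrix of a structure rotated counter-clockwise by theta:
    R(theta) J R(-theta)."""
    return matmul2(matmul2(rotation(theta), jones), rotation(-theta))

def mirror_jones(jones):
    """Mirror about the plane x = y, i.e. swap the x and y axes.
    This exchanges t_xx <-> t_yy and t_xy <-> t_yx."""
    swap = [[0, 1], [1, 0]]
    return matmul2(matmul2(swap, jones), swap)

def lp_to_cp(jones_lp):
    """Change of basis from (x, y) to (L, R).
    ASSUMED convention: e_L = (x + i y)/sqrt(2), e_R = (x - i y)/sqrt(2).
    Returns [[t_LL, t_LR], [t_RL, t_RR]]."""
    s = 1 / math.sqrt(2)
    lam = [[s, s], [1j * s, -1j * s]]        # columns: e_L, e_R
    lam_inv = [[s, -1j * s], [s, 1j * s]]    # conjugate transpose of lam
    return matmul2(matmul2(lam_inv, jones_lp), lam)

# Illustrative (made-up) right-handed Jones matrix with t_xy != t_yx.
j_rh = [[0.8 + 0.1j, 0.2 - 0.3j], [-0.1 + 0.4j, 0.5 + 0.6j]]
j_lh = mirror_jones(j_rh)                  # left-handed counterpart
j_rot = rotate_jones(j_rh, math.pi / 6)    # same structure rotated by 30 degrees
j_cp = lp_to_cp(j_rot)                     # the rotated structure in the CP basis
```

As a sanity check of the assumed convention, an ideal half-wave plate `[[1, 0], [0, -1]]` maps to a purely cross-polarized matrix in the CP basis, as expected.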

Integrated Deep Neural Network for Complex Polarization Holograms
With the nanostructure library in place, we move forward to establish an integrated deep neural network to design metasurface holograms. The schematic of the network is shown in Figure 2, which integrates an existing DeepCGH network [46] for designing scalar holograms (with only a transmission phase profile) with an inverse-design component for nanostructures. In essence, the whole network turns an input target hologram of amplitude $\{B_{ki}\}$ into output geometric parameters {w, Δx, Δy, θ, R/L} of all the nanostructures (a total number of $n^2$) on the metasurface. In the reverse process, the Jones matrix elements $\{t'_{ki}\}$ predicted by the decoder are Fourier transformed ($\mathcal{F}$) to the reconstructed hologram $\{B'_{ki}\}$. A primed notation here denotes the reconstructed quantities. The whole network, i.e., the weights of the far-field predictor network and those of the encoder network, is trained in an unsupervised manner with the loss function being the geometric error (GE) between the target and reconstructed holograms, the same loss function adopted in the existing DeepCGH network, [46] which is now extended by the incorporation of the nanostructure library: the encoder performs the inverse design from Jones matrix elements to geometric parameters, and the decoder, a forward deep-learning-based surrogate network, is pre-trained with supervised learning using the full-wave simulation results based on Figure 1, before the training of the integrated network.
Hereafter, we refer to this extension with the inverse-design (ID) component (the orange dashed box) as the DeepCGH-ID network.
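The reverse path of the network, from a Jones matrix profile to a reconstructed hologram, amounts to a Fourier transform followed by a comparison with the target. The sketch below illustrates this for a single polarization channel on a tiny 8 × 8 grid, using a plain discrete Fourier transform and the Pearson correlation coefficient (the quality metric quoted in the text) in place of the geometric-error loss, whose explicit form is not reproduced here. The grid size, the phase-ramp test profile, and all function names are illustrative assumptions.

```python
import cmath
import math

def dft1(v):
    """1D discrete Fourier transform of a list of complex numbers."""
    n = len(v)
    return [sum(v[x] * cmath.exp(-2j * math.pi * u * x / n) for x in range(n))
            for u in range(n)]

def dft2(field):
    """2D DFT of a square complex array; output is indexed as [v][u],
    with u the frequency along x (columns) and v along y (rows)."""
    n = len(field)
    rows = [dft1(row) for row in field]
    cols = [dft1([rows[y][u] for y in range(n)]) for u in range(n)]
    return [[cols[u][v] for u in range(n)] for v in range(n)]

def pearson(a, b):
    """Pearson correlation coefficient between two real 2D arrays."""
    fa = [x for row in a for x in row]
    fb = [x for row in b for x in row]
    ma, mb = sum(fa) / len(fa), sum(fb) / len(fb)
    cov = sum((x - ma) * (y - mb) for x, y in zip(fa, fb))
    va = sum((x - ma) ** 2 for x in fa)
    vb = sum((y - mb) ** 2 for y in fb)
    return cov / math.sqrt(va * vb)

n = 8        # nanostructures per side (the text uses 64 x 64 holograms)
shift = 3    # a linear phase ramp steers all light to one far-field pixel

# One channel of the metasurface: unit amplitude, linear phase along x.
t_profile = [[cmath.exp(2j * math.pi * shift * x / n) for x in range(n)]
             for _ in range(n)]

# Reconstructed hologram amplitude |F{t}| and the matching one-pixel target.
recon = [[abs(value) for value in row] for row in dft2(t_profile)]
target = [[1.0 if (r == 0 and c == shift) else 0.0 for c in range(n)]
          for r in range(n)]
pcc = pearson(target, recon)
```

In the integrated network the same comparison is made for all four channels at once, and the resulting loss is back-propagated through the pre-trained decoder to the encoder and the far-field predictor.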
The decoder is constructed with four hidden fully connected (linear) layers with structure 7-200-200-100-100-8. We note that we replace the cyclic variable θ with {cos θ, sin θ} and the discrete value R/L with a one-hot vector {1,0}/{0,1} in the data representation to facilitate training. Together with w, Δx, and Δy, these account for the input dimension of 7. The four Jones matrix elements, with real and imaginary parts, constitute the output dimension of 8. The activation functions of the first five linear layers and the last layer are the exponential linear unit (ELU) function and the hyperbolic tangent (tanh) function, respectively. A total of 300k sets of data are generated from full-wave simulations (around 41 h), with 234k sets used as training data, 26k sets as validation data, and the remaining sets as testing data. The mean square error (MSE) is used as the loss function in training, with the Adam optimizer and a learning rate of 0.0005. In the testing phase of the decoder, the Pearson correlation coefficients (PCCs) between the ground truth and the predicted Jones matrix elements (including real and imaginary parts) are all higher than 0.999, which validates replacing the full-wave simulations (COMSOL) with the pre-trained decoder in calculating Jones matrix elements, giving a higher computational efficiency for the training of the integrated network in the next stage.
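The data representation and layer structure of the decoder can be sketched as follows. This is only an untrained, pure-Python illustration under stated assumptions: the ordering of the seven inputs, the scaling of w, Δx, and Δy to roughly [−1, 1], the random weight initialization, and the reading of 7-200-200-100-100-8 as five linear maps (ELU after the hidden ones, tanh after the last) are not specified in this form by the text.

```python
import math
import random

LAYER_SIZES = [7, 200, 200, 100, 100, 8]   # structure quoted in the text

def encode_geometry(w, dx, dy, theta, handedness):
    """Map (w, dx, dy, theta, R/L) to the 7-dimensional decoder input.
    ASSUMPTION: lengths in nm are rescaled using the scan ranges
    (w in 187.5..237.5 nm, dx and dy in -125..125 nm)."""
    one_hot = [1.0, 0.0] if handedness == "R" else [0.0, 1.0]
    return [(w - 212.5) / 25.0, dx / 125.0, dy / 125.0,
            math.cos(theta), math.sin(theta)] + one_hot

def elu(x):
    """Exponential linear unit with alpha = 1."""
    return x if x > 0 else math.exp(x) - 1.0

def init_layers(sizes, seed=0):
    """Random (untrained) weights, for illustration only."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        scale = 1.0 / math.sqrt(n_in)
        weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
                   for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers

def forward(layers, x):
    """Forward pass: ELU after every hidden layer, tanh after the last one."""
    for index, (weights, biases) in enumerate(layers):
        z = [sum(wi * xi for wi, xi in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        is_last = index == len(layers) - 1
        x = [math.tanh(v) if is_last else elu(v) for v in z]
    return x

layers = init_layers(LAYER_SIZES)
x_in = encode_geometry(w=202.5, dx=25.0, dy=-50.0, theta=math.pi / 3,
                       handedness="L")
out = forward(layers, x_in)   # 8 numbers: Re/Im parts of four Jones elements
```

The tanh output keeps every real and imaginary part within [−1, 1], which is compatible with transmission coefficients of a passive structure.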
From the perspective of deep learning, the autoencoder latent space, which couples the encoder and decoder together, represents a low-dimensional projection of the training dataset. The latent space constitutes the set of all possible geometric parameters of the metasurfaces. The corresponding training data fed into the encoder and output from the decoder are the required Jones matrix phase profiles. The encoder (four hidden linear layers in our implementation) solves an inverse problem by projecting training data onto the latent space, i.e., converting the phase profiles to the geometric parameters. Similarly, the decoder solves a forward problem by expanding the latent space into training data, i.e., mapping geometric parameters to Jones matrix elements. For the inverse design problems of metasurfaces, there may exist multiple structures that generate nearly the same optical responses. Unlike the forward network (decoder), the inverse network (encoder) is hard to train directly with supervised learning, because conflicting labeled data make convergence difficult. By combining an encoder and the pre-trained decoder in a tandem-like architecture, the non-uniqueness issue can be mitigated. 
[51] In our scheme, such an autoencoder is then further embedded into the integrated network (DeepCGH-ID), which evaluates the loss on the hologram quality directly. Then, the far-field predictor network and the encoder are trained together. The advantage of this further-embedding approach is that the entire integrated network is trained directly toward the goal of obtaining the best required holograms, and the inverse function (encoder) is optimized at the same time. More details about the DeepCGH-ID network are shown in Figure S2 and Table S1 (Supporting Information). On the other hand, unlike the conventional GS algorithm for generating holograms, the current approach applies constraints on both the amplitude and the phase from the metamaterial library, so that the holograms constructed during training are more realistic. Using only the phase profiles in the training process reduces the quality of the holograms (please see Section S4 and Table S2, Supporting Information for more details).
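The non-uniqueness argument can be made concrete with a deliberately simple toy problem (not the metasurface model): a scalar "decoder" f(g) = g² for which the geometries +g and −g give the same response. A directly supervised inverse model is pulled toward the average of the conflicting labels, whereas a tandem loss, evaluated on the response after the fixed forward model, is satisfied by either valid solution. All names and numbers below are illustrative assumptions.

```python
def forward_model(g):
    """Toy 'decoder': two geometries +g and -g give the same response."""
    return g * g

target_response = 1.0
conflicting_labels = [1.0, -1.0]   # both geometries reproduce the target

# Direct supervised inverse: the MSE-optimal prediction for one input with
# several labels is the mean of the labels, which is not a valid design.
direct_prediction = sum(conflicting_labels) / len(conflicting_labels)
direct_response_error = (forward_model(direct_prediction) - target_response) ** 2

def tandem_inverse(target, g0=0.3, learning_rate=0.05, steps=500):
    """Gradient descent on the tandem loss (f(g) - target)^2,
    with the forward model f kept fixed."""
    g = g0
    for _ in range(steps):
        residual = forward_model(g) - target
        gradient = 2.0 * residual * 2.0 * g   # chain rule through f(g) = g^2
        g -= learning_rate * gradient
    return g

tandem_prediction = tandem_inverse(target_response)
tandem_response_error = (forward_model(tandem_prediction) - target_response) ** 2
```

Here the direct prediction is g = 0, which gives the wrong response, while the tandem optimization settles on one of the two valid geometries. In DeepCGH-ID the loss is moved one step further, to the hologram itself.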

Performance of the Integrated Deep Neural Network
Here, we generate 500 configurations of polarization holograms of dice patterns, with nine white dots turned on or off randomly and independently. Each set of polarization holograms consists of 4 independent holograms, one for each of the 4 combinations of polarization conversion channels. 450 of them are used as training data and 50 as validation data for the integrated network. In fact, we do not need a very large dataset for the integrated network, as the network is capable of extrapolation, as shown later. Each hologram of a polarization conversion channel has a size of 64 by 64 pixels. With these data, we can test both the interpolation and extrapolation capabilities of the network. In the training process, the learning rate is initially $2 \times 10^{-6}$ and decays exponentially to $2 \times 10^{-7}$ at the end of the training. Two independent integrated networks are trained separately (but with the same nanostructure library) to generate the polarization holograms in the LP and CP bases. After the networks are well trained, we select two different configurations (dice and number patterns), which are not included in the training data, to test the network as the first step, as shown in the first row of Figure 3a,b. By feeding the test data to the whole network, the metasurface designs (geometric parameters) for the two configurations of target holograms are obtained from the encoder as the output. Finally, the generated holograms are calculated from the Fourier transform of the Jones matrix profiles selected from the nanostructure library using the geometric parameters.
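A dataset of the kind described above can be sketched as follows: each target is a set of four independent 64 × 64 dice patterns, one per polarization channel, with nine dots switched on or off at random. The dot positions, the dot radius, and the function names are assumptions chosen for illustration; the text only fixes the image size, the number of dots, and the number of channels.

```python
import random

SIZE = 64                                        # hologram size from the text
CENTERS = (SIZE // 4, SIZE // 2, 3 * SIZE // 4)  # assumed 3 x 3 dot grid
DOT_RADIUS = 4                                   # assumed dot radius in pixels

def dice_pattern(rng):
    """Return (image, states): a SIZE x SIZE binary image and the on/off
    state of each of the nine dots in row-major order."""
    image = [[0.0] * SIZE for _ in range(SIZE)]
    states = []
    for cy in CENTERS:
        for cx in CENTERS:
            is_on = rng.random() < 0.5           # each dot is independent
            states.append(is_on)
            if not is_on:
                continue
            for y in range(cy - DOT_RADIUS, cy + DOT_RADIUS + 1):
                for x in range(cx - DOT_RADIUS, cx + DOT_RADIUS + 1):
                    if (x - cx) ** 2 + (y - cy) ** 2 <= DOT_RADIUS ** 2:
                        image[y][x] = 1.0
    return image, states

def hologram_set(rng):
    """One configuration: four independent patterns, one per channel."""
    return [dice_pattern(rng) for _ in range(4)]

rng = random.Random(0)
example_set = hologram_set(rng)   # repeat 500 times for 450 training + 50 validation sets
```

Because each of the four channels has 2⁹ possible dot states, the number of distinct configurations is far larger than the 500 used for training and validation, so freshly drawn test configurations rarely coincide with them.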
Figure 3 shows the testing results for the two sets of polarization holograms. The first row lists the target holograms, and the second (third) row lists the generated holograms in the LP (CP) basis. Each column represents one of the four polarization channels, with the first (second) orange arrow drawn on the same figure denoting the analyzing (incident) polarization. The generated polarization holograms are clear and have low crosstalk between the holograms of different polarization conversion channels (please see Section S6 and Table S3, Supporting Information for more details). The corresponding PCCs between the generated and the target holograms are all larger than 0.74 in the LP (CP) case. The transmission efficiencies for the holograms of the dice pattern (numbers pattern) in the LP case (from left to right) are 39.2%, 7.8%, 14.9%, and 45.0% (36.7%, 8.6%, 19.4%, and 42.0%), respectively. The lower transmission efficiency of the holograms in the cross-polarization channels in the LP basis is due to the smaller transmission amplitudes of the cross-polarization elements (see Figure S1, Supporting Information). The generated metasurface design and the distribution of its geometrical parameters are shown in Figure S4 (Supporting Information). For the polarization holograms in the CP basis, the transmission efficiencies for each channel of the dice patterns (numbers patterns) are 22.2%, 31.4%, 29.0%, and 21.5% (21.6%, 32.0%, 29.0%, and 20.8%), respectively, showing less differentiation between co-polarization and cross-polarization channels. On the other hand, we can also observe that the quality of the generated holograms for the number patterns is only slightly inferior to that for the dice patterns, with slightly smaller PCCs. These results show the extrapolation capability of the integrated network, as the training data are similar in style to those of Figure 3a but distinctly different from those of Figure 3b. For more general testing, we have also generated 100 configurations of the 
polarization holograms of dice patterns, again with nine white dots turned on or off randomly and independently, as testing data. For LP holograms, the mean PCCs for the four channels (xx, xy, yx, and yy) are 0.87, 0.82, 0.81, and 0.89, respectively. For CP holograms, the mean PCCs for the four channels (LL, LR, RL, and RR) are 0.84, 0.85, 0.89, and 0.82, respectively, confirming the results shown in Figure 3a. Because the trained network produces designs for new target holograms without case-by-case optimization, this may speed up the whole design process when a large number of metasurface designs are needed. Now, we compare the DeepCGH-ID network to a conventional approach based on the GS algorithm, given a generic nanostructure library. In this case, the conventional approach uses the GS algorithm to obtain the four independent transmission phase profiles (Jones matrix) required on the metasurface. Then, at each nanostructure location on the metasurface, we choose an optimal structure from the library to fit all four transmission phases as closely as possible. We separately train the encoder-decoder pair in Figure 2, i.e., we train the encoder with the previously pre-trained decoder to minimize the MSE between the input phases (to the encoder) and output phases (from the decoder). The encoder can then be used for such optimization, and we call the overall process the "GS+Encoder" approach. The comparison in terms of PCCs is shown in Figure 4a for the previous 8 holograms of different polarization conversions for the dice patterns. In this case, the PCCs of the generated holograms from the DeepCGH-ID method are higher than those from "GS+Encoder" on average. Next, we turn off some of the available geometric parameters in the designs. Figure 4b shows the results when we turn off w by fixing it at 202.5 nm (i.e., as in Figure S1, Supporting Information), while Figure 4c shows the results when we further turn off θ by requiring it to be always zero. The PCCs decrease for both methods as fewer geometric parameters 
are available for designing the metasurface structures, while the PCCs of the "GS+Encoder" method fall off more significantly. Such a trend can be explained from the starting point of our whole design process. We have taken a "complication" approach in choosing a complex structure (in Figure 1) to guarantee a significant bianisotropy effect, and hence very different values of the Jones matrix elements as functions of the geometric parameters and among the four different elements. The mild difference between DeepCGH-ID and "GS+Encoder" when all parameters are available reveals that our choice of structures is complicated enough to cover the whole phase-space of the four transmission phases using four geometric parameters. So the global optimization approach (in the integrated network) gives some advantage. Our machine-assisted algorithm can cope with the complexity of the design phase-space. When the number of available geometric parameters decreases, we no longer have enough DOFs to cover the phase-space with individual nanostructures, and we need a collaborative effect from different nanostructures to construct the polarization holograms. In these cases, the DeepCGH-ID method shows PCCs much higher than the other approach. We note that the DeepCGH-ID method is generic, as the constraints are contained only in the library and we do not necessarily need to know whether the library is rich enough or not. In addition to the "GS+Encoder" method, we also compared DeepCGH-ID with another conventional approach without machine learning. We first obtain the required phase profiles from the GS algorithm, and then select, for each unit cell, the best-matching structure from the material library with the lowest error (please see Section S7 and Table S4, Supporting Information for more details). DeepCGH-ID generates the polarization holograms with higher PCCs on average than this conventional approach, and is also 40 times faster than the GS algorithm.
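The conventional library-matching step described above, in which each unit cell is assigned the library structure whose four transmission phases are closest to the phases requested by the GS algorithm, can be sketched as a nearest-neighbor search with a wrapped phase error. The three-entry library, the squared-error metric, and the function names are illustrative assumptions; the actual error measure is given in the Supporting Information.

```python
import cmath

def wrap_phase(delta):
    """Wrap a phase difference to the interval (-pi, pi]."""
    return cmath.phase(cmath.exp(1j * delta))

def phase_error(target_phases, jones_elements):
    """Sum of squared wrapped phase differences over the four channels."""
    return sum(wrap_phase(cmath.phase(t) - phi) ** 2
               for phi, t in zip(target_phases, jones_elements))

def best_match(target_phases, library):
    """Library entry with the lowest phase error for one unit cell."""
    return min(library,
               key=lambda entry: phase_error(target_phases, entry["jones"]))

# Tiny made-up library: each entry stores its geometry label and the four
# complex elements (t_xx, t_xy, t_yx, t_yy) built from amplitude and phase.
library = [
    {"name": "A", "jones": [cmath.rect(0.8, 0.1), cmath.rect(0.3, 1.0),
                            cmath.rect(0.3, -2.0), cmath.rect(0.7, 3.0)]},
    {"name": "B", "jones": [cmath.rect(0.7, 1.5), cmath.rect(0.2, -0.5),
                            cmath.rect(0.4, 0.5), cmath.rect(0.8, 0.0)]},
    {"name": "C", "jones": [cmath.rect(0.9, -3.0), cmath.rect(0.3, 2.5),
                            cmath.rect(0.2, 1.0), cmath.rect(0.6, -1.0)]},
]

# Phases requested at one unit cell; -3.1 rad is close to +3.0 rad once
# the 2*pi periodicity is taken into account.
requested = [0.0, 1.1, -2.1, -3.1]
chosen = best_match(requested, library)
```

Such a cell-by-cell match ignores the amplitudes and cannot compensate for a missing phase combination at one location with neighboring cells, which is consistent with the larger drop in PCC reported for the conventional approaches when geometric parameters are removed.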
This "complication" route is to guarantee that the algorithm is able to find suitable designs to fulfill a specific functionality. In the current work, we make this "complication" route possible by developing an integrated deep neural network that works directly with a given generic nanostructure library. In practice, for a more general consideration, we may expect that there is an existing library of metamaterials/nanostructures that can be fabricated using existing facilities without difficulty. Then, the complexity will lie in the different possible structures instead of the single complex structure used in this work. Our proposed deep learning method is more than a surrogate solver (the decoder in our whole model): the network provides both hologram generation and inverse design capabilities and, as such, requires no prior physical knowledge. It can also be easily extended to other structures and situations. For example, our approach is applicable in such a situation by just replacing our existing library of bianisotropic nanostructures with another library (please see Figure S3, Supporting Information for more details). Further extension of the current approach can also go in the direction of generating holograms with more specifications, e.g., going from phase holograms to amplitude-plus-phase holograms, and going from polarization multiplexing to orbital-angular-momentum multiplexing. We have also demonstrated the generation of vectorial holograms with both required amplitude and polarization behaviors, by controlling the complex far-field profiles rather than only the far-field amplitude profiles (please see Section S8 and Figure S5, Supporting Information for more details). Compared with conventional physical-principle approaches, the deep learning method requires less physical knowledge to achieve the metasurface design, needing only the addition of the phase consideration in the training process.
Finally, we discuss polarization conversion channels more complicated than those demonstrated above. For example, we can have polarization conversion channels xx, Rx, yy, and Ly for the four holograms. In this case, we only need to modify the previously pre-trained decoder by adding an additional linear layer for the change of basis. Figure 5 shows the target patterns in the first row and the generated ones in the second row. While we have shown the possibility of generalizing the algorithm to more complicated situations, further investigations can be performed on the optimal choice or more general choices of the polarizations.
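The reason a single additional linear layer suffices is that the mixed-basis channels are fixed linear combinations of the LP Jones matrix elements produced by the decoder. A minimal sketch of such a map for the channels xx, Rx, yy, and Ly is given below. It assumes the convention $e_L = (\hat{x} + i\hat{y})/\sqrt{2}$, $e_R = (\hat{x} - i\hat{y})/\sqrt{2}$ and reads a channel label "ab" as output polarization a for input polarization b; the function name is hypothetical.

```python
import math

def mixed_channels(jones_lp):
    """Fixed linear map from (t_xx, t_xy, t_yx, t_yy) to the four channels
    xx, Rx, yy, Ly.
    ASSUMED convention: e_L = (x + i y)/sqrt(2), e_R = (x - i y)/sqrt(2)."""
    (t_xx, t_xy), (t_yx, t_yy) = jones_lp
    s = 1 / math.sqrt(2)
    t_rx = s * (t_xx + 1j * t_yx)   # projection of J e_x onto e_R
    t_ly = s * (t_xy - 1j * t_yy)   # projection of J e_y onto e_L
    return {"xx": t_xx, "Rx": t_rx, "yy": t_yy, "Ly": t_ly}

# Check with an element that converts x-polarized light fully into e_R.
s = 1 / math.sqrt(2)
x_to_r = [[s, 0.0], [-1j * s, 1.0]]
channels = mixed_channels(x_to_r)
```

For this test element the Rx channel has unit amplitude while the xx channel carries 1/√2, as expected when x-polarized light is converted into right-handed circular light under the assumed convention.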

Conclusion
In conclusion, to fully utilize polarization holograms, we start with a possible metamaterial or nanostructure library as the template. By varying the tunable geometric parameters, a rich enough library guarantees a large range of phases and a difference between the cross-polarization elements of the Jones matrix. We developed a machine-assisted approach to accelerate the metasurface design process by integrating both the hologram design procedure and the inverse design of nanostructures into the same neural network. Without detailed physical knowledge of the nanostructure constraints, this integrated network method can generate metasurface designs directly from the required complex polarization holograms. Our information-driven (machine-learning) approach enables a more systematic and automated process for designing metasurface holograms with high quality and efficiency. Our approach can be extended in a straightforward manner to more degrees of freedom in multiplexing holograms, e.g., orbital angular momentum (OAM), or to metamaterial structures other than the one used in this work.

Figure 1. Bianisotropic structural units. a) Schematic of the unit cell with the silicon "fin" structure embedded in glass in a square lattice of periodicity a = 1000 nm. The structure has two layers of L-shaped bars (width w and thickness h = 250 nm) connected by a square pillar (width w and thickness t = 125 nm) in between. The square pillar is shifted by Δx and Δy from the vertical middle planes of the unit cell (the dashed frame). All bars have width w. The total length of the "fin" along the y-direction is l = 637.5 nm. The upper panel shows the right-handed structure, while the lower panel shows its mirrored structure (with mirror plane x = y), defined as the left-handed structure. b) The phases of the Jones matrix in the LP basis, obtained by scanning the parameters w, Δx, and Δy (with dimensions in nm) in full-wave simulations at normal incidence along the z-direction and at a fixed wavelength of 1550 nm.

Figure 2. Integrated deep neural network to design metasurface holograms: turning input target holograms $\{B_{ki}\}$ (upper green box) into the output geometric parameters {w, Δx, Δy, θ, R/L} of the $n^2$ nanostructures on the metasurface (the cyan box). Subscript k runs over the polarization channels (xx, xy, yx, yy or LL, LR, RL, RR for LP or CP holograms). Subscript i runs over the pixels on the hologram. The decoder is pre-trained with supervised learning to turn geometric parameters into the Jones matrix elements (transmission coefficients), using the full-wave simulation results based on Figure 1. The integrated network trains the far-field predictor network and the encoder network with unsupervised learning. The encoder-decoder pair (orange dashed box) is an autoencoder variant that performs the inverse design of nanostructures from Jones matrix elements. It is further embedded into the integrated network, which acts as an autoencoder turning target holograms into geometry and then into reconstructed holograms, with the loss function being the reconstruction error.

Figure 3. Testing results of two sets of polarization holograms on a) the dice patterns (an interpolation example) and b) the numbers (an extrapolation example). The first row lists the target holograms. The second (third) row lists the generated holograms in the LP (CP) basis. Each column shows one of the four polarization channels, with the first (second) arrow indicating the analyzing (incident) polarization. A horizontal (vertical) arrow means x (y) polarization. A clockwise (counterclockwise) arrow means right (left)-handed CP. PCCs between the reconstructed and the target holograms are shown in the upper left corner of each hologram.

Figure 4. Comparison between the DeepCGH-ID and "GS+Encoder" methods for different numbers of available geometric parameters. The horizontal and vertical axes denote the different polarization channels and the Pearson correlation coefficients (ρ), respectively. The orange and blue bars show the results from the DeepCGH-ID and "GS+Encoder" approaches, respectively.

Figure 5. Testing results of polarization holograms on the dice patterns with specifications for the xx, Rx, yy, and Ly polarization conversion channels. The second row lists the generated holograms. The PCCs are shown in the upper left corner, while the first (second) arrow represents the output (input) polarization.
Such a process is shown in the upper part of the diagram. The target hologram $\{B_{ki}\}$ is transformed by the far-field predictor network to guess the far fields with phases $\{B_{ki}\exp(i\varphi_{ki})\}$ at the hologram, which is inverse Fourier transformed […]. To formulate such a process, the whole network is trained as an autoencoder of the holograms with the reverse process. The lower part of the diagram starts from the geometric parameters fed to a decoder network, which transforms them into Jones matrix elements $\{t'_{ki}\}$. Subscript k runs over the polarization conversion channels: xx, xy, yx, yy for an LP hologram or LL, LR, RL, RR for a CP one, and subscript i runs over the pixels on the hologram. R/L is a discrete variable indicating whether the structure is right-handed or left-handed. $\{t'_{ki}\}$ is then Fourier transformed ($\mathcal{F}$) to the reconstructed hologram $\{B'_{ki}\}$ in the reverse process.
Adv. Optical Mater. 2024, 12, 2202663