Achromatic and Resolution Enhancement Light Field Deep Neural Network for ZnO Nematic Liquid Crystal Microlens Array

Nematic liquid-crystal microlens arrays (LC-MLAs) often exhibit chromatic aberration and low resolution, severely compromising their optical imaging quality. This study proposes an achromatic and resolution enhancement light field (ARELF) deep neural network (DNN) to address these issues. The training set is constructed by retrofitting the vimeo90k dataset with the degradation characteristics of the LC-MLA. The network's hidden layers are trained to learn the chromatic aberration and low resolution of the LC-MLA while extracting imaging features and fusing the complementary feature information of the light field under varying voltages. The loss function accounts for both chromatic aberration and overall resolution. Light field images of a ZnO LC-MLA captured under seven consecutive voltages are used as input to test the proposed DNN model. Experimental verification shows that the proposed model effectively eliminates chromatic aberration while enhancing the spatial and temporal resolution of the LC-MLA. This novel network can be used to optimize the design process of LC-MLAs and significantly improve their imaging performance.


Introduction
Traditional optical microlens arrays (MLAs) are susceptible to aperture, depth of field (DOF), exposure time, exposure level, and other parameters. These factors make it challenging to combine MLAs effectively with sensors, leading to a substantial reduction in image bandwidth and significant degradation in image quality. [1] In contrast, liquid crystal (LC) is a high-performance optoelectronic material with unique properties. [2,3] By feeding the light field images captured by an LC-MLA at different voltages into a video super-resolution architecture for computation, the complementary information of the light field at the different voltages can be fully exploited to complete the resolution-enhanced reconstruction. This study proposes an achromatic and resolution enhancement light field (ARELF) deep neural network (DNN) for LC-MLA to address the issues of chromatic aberration and low resolution. Figure 1a illustrates the light field data processing flow. First, the ZnO LC-MLA is used to construct the light field acquisition system, and light field images under different voltages are collected. The light field acquisition system also reveals the cause of the chromatic aberration produced by the LC-MLA, namely LC dispersion. Then, the acquired light field images are decoupled into multiview images. Subsequently, the decoupled multiview images under different voltages are fused into multi-view (MV) format light field images, namely all-in-focus images, corresponding to each voltage. [16,17] Finally, the 2D MV-format light field images under multiple voltages are input into the ARELF network to obtain light field images that are both achromatic and resolution enhanced. The main modules of the ARELF network are presented in Figure 1b: the chromatic aberration feature extraction module (spatial and complementary information (SC) module), the reconstruction module, and the upsampling module.
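The decoupling step described above, splitting a lenslet light field image into per-angle sub-aperture (multiview) images, can be sketched as follows. This is a minimal NumPy illustration; the pitch value and pixel layout are illustrative assumptions, not the actual ZnO LC-MLA geometry.

```python
import numpy as np

def decouple_multiview(lenslet_img, pitch):
    """Rearrange a lenslet light field image into sub-aperture (multiview) images.

    lenslet_img: (H, W, 3) array in which each pitch x pitch block of pixels
    comes from one microlens. Returns an array of shape
    (pitch, pitch, H // pitch, W // pitch, 3): one view per angular
    position (u, v) under a microlens.
    """
    H, W, C = lenslet_img.shape
    h, w = H // pitch, W // pitch
    # Crop to a whole number of microlenses, then regroup each block.
    img = lenslet_img[: h * pitch, : w * pitch]
    views = img.reshape(h, pitch, w, pitch, C).transpose(1, 3, 0, 2, 4)
    return views
```

Each resulting view collects the same angular sample from every microlens; fusing such views acquired at several voltages yields the all-in-focus MV-format input described above.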

Principle and Device Fabrication
Figure 1 depicts a schematic diagram of the proposed network configuration. The network structure is multibranch, similar to a video super-resolution framework, with the addition of multivoltage imaging information. This approach enables the inclusion of semantic and complementary information, resulting in a more comprehensive representation of the light field data. The work utilizes residual dense blocks and skip connections to enhance expressive power and resolution reconstruction quality. Likewise, the designed SC module contains two branches, a spatial information branch (S-branch) and a complementary information branch (C-branch), which are utilized to extract the chromatic aberration features. In addition, skip connections allow features from the high-resolution layer to be sent directly to the upsampled feature map, providing precise results. Subpixel convolution converts a feature map into a high-resolution image through rearrangement and convolution. The loss function of the designed network comprises two components, an overall resolution loss function and a chromatic aberration loss function.
The first part, the overall resolution loss, is the L1 norm

$$\mathcal{L}_{\mathrm{res}} = \frac{1}{N}\sum_{i=1}^{N}\bigl|\,y(i)-\hat{y}(i)\,\bigr|$$

where $y(i)$ represents the real image and $\hat{y}(i)$ the predicted image. The second part, the chromatic aberration loss, is characterized by the L2 norm over the per-channel gradients

$$\mathcal{L}_{\mathrm{ca}} = \sqrt{\sum_{i=1}^{N}\Bigl(\bigl\|\nabla r(i)-\nabla r'(i)\bigr\|^{2}+\bigl\|\nabla g(i)-\nabla g'(i)\bigr\|^{2}+\bigl\|\nabla b(i)-\nabla b'(i)\bigr\|^{2}\Bigr)+\varepsilon}$$

where $r(i)$, $g(i)$, $b(i)$ refer to the theoretical intensity values of the red, green, and blue channels; $r'(i)$, $g'(i)$, $b'(i)$ are the corresponding actual intensity values; $\varepsilon$ is a very small positive number; and $\nabla$ denotes the gradient operator. It is well known that LCs are dispersive materials. As a result, the main factor contributing to chromatic aberration in LC-MLA is the wavelength dispersion of the LC materials. To quantitatively evaluate the chromatic aberration characteristics of the LC-MLA, this study introduces the intensity values of the red, green, and blue (RGB) channels and characterizes them using the L2 norm. The network's chromatic aberration feature extraction module comprises a two-branch structure with eight 3 × 3 convolutional layers and four rectified linear unit (ReLU) layers, as shown in Figure 1b. According to the MV format, the light field images collected at seven consecutive voltages are stored as 2D images. [16] Then, those 2D images are superimposed into an all-in-focus image and input into the proposed network.
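A minimal PyTorch sketch of this two-part loss follows. The relative weighting `lam`, the use of horizontal pixel differences as the gradient `∇`, and the value of `eps` are assumptions for illustration; the paper does not report these specifics.

```python
import torch

def arelf_loss(pred, target, lam=0.1, eps=1e-6):
    """Sketch of the two-part ARELF loss (lam and eps are assumed values).

    pred, target: (N, 3, H, W) tensors with channels ordered R, G, B.
    Part 1: L1 norm between predicted and real images (overall resolution).
    Part 2: L2 norm over per-channel gradient mismatch (chromatic aberration).
    """
    l1 = torch.mean(torch.abs(pred - target))
    # Horizontal first differences of each colour channel stand in for the
    # gradient operator in the chromatic aberration term.
    grad_pred = pred[..., :, 1:] - pred[..., :, :-1]
    grad_tgt = target[..., :, 1:] - target[..., :, :-1]
    l_ca = torch.sqrt(torch.mean((grad_pred - grad_tgt) ** 2) + eps)
    return l1 + lam * l_ca
```

When prediction and target agree, both terms vanish (up to the ε stabilizer), so the minimum of the loss corresponds to an achromatic, full-resolution reconstruction.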
The input information [17] is decomposed into spatial and complementary information at the different voltages, which is fed to the chromatic aberration feature extraction module (SC module), each branch of which comprises four 3 × 3 convolutional layers and two ReLU activation layers, as shown in Figure 1b. The SC module contains two branches, S and C. The S-branch mainly extracts texture and channel information at the current voltage, whereas the C-branch mainly extracts complementary information across the multiple voltages. The network uses the information from the two branches to enhance the image structure and restore missing details. The resolution reconstruction module forms the network's backbone and fully extracts local features. It includes two 3 × 3 convolutional layers, one ReLU layer, and eighteen residual dense block (RDB) modules, [18] as shown in Figure 1b. The RDB module consists of multiple 1 × 1 convolutional layers, 3 × 3 convolutional layers, ReLU layers, and skip connections. The 1 × 1 convolutional layers reduce the computational effort through dimensionality reduction while fusing features between channels. At the same time, the dense skip connections improve gradient propagation and exploit shallow features to increase the accuracy of the reconstructed features. The upsampling module consists of 3 × 3 convolutional layers, 2D PixelShuffle, and ReLU layers, as shown in Figure 1b. The subpixel convolution strategy [19] is utilized to convolve and reconstruct the low-resolution chromatic aberration feature maps across multiple channels, obtaining a high-resolution achromatic feature image and thereby realizing the upsampling function. The ultimate network output is an achromatic, high-resolution sequence of all-in-focus 2D images at the corresponding voltages.
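The two building blocks described above can be sketched in PyTorch as follows. Layer counts and channel widths are reduced for brevity and are assumptions, not the paper's exact configuration; the sketch only illustrates the two-branch SC structure and the dense connectivity with a local residual in the RDB.

```python
import torch
import torch.nn as nn

class SCModule(nn.Module):
    """Two-branch chromatic aberration feature extractor (widths assumed).

    S-branch: spatial texture/channel features at the current voltage.
    C-branch: complementary features across neighbouring voltages.
    """
    def __init__(self, in_ch=3, feat=64):
        super().__init__()
        self.s_branch = nn.Sequential(
            nn.Conv2d(in_ch, feat, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat, feat, 3, padding=1))
        self.c_branch = nn.Sequential(
            nn.Conv2d(in_ch, feat, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat, feat, 3, padding=1))

    def forward(self, cur, neighbors):
        # Fuse spatial features of the current voltage with complementary
        # features drawn from the neighbouring-voltage frames.
        return self.s_branch(cur) + self.c_branch(neighbors)

class RDB(nn.Module):
    """Residual dense block: dense 3x3 convs, 1x1 fusion, local skip."""
    def __init__(self, feat=64, growth=32, layers=3):
        super().__init__()
        self.convs = nn.ModuleList()
        ch = feat
        for _ in range(layers):
            self.convs.append(nn.Sequential(
                nn.Conv2d(ch, growth, 3, padding=1), nn.ReLU(inplace=True)))
            ch += growth
        # 1x1 conv fuses all densely connected features back to `feat` channels.
        self.fuse = nn.Conv2d(ch, feat, 1)

    def forward(self, x):
        feats = [x]
        for conv in self.convs:
            feats.append(conv(torch.cat(feats, dim=1)))
        return x + self.fuse(torch.cat(feats, dim=1))
```

The final upsampling stage would then apply `nn.PixelShuffle` after a convolution that expands the channel count by the square of the scale factor, which is the standard subpixel convolution layout.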

Dataset Construction
The vimeo90k dataset [20] was selected for this network for several reasons, including the availability of low-resolution and high-resolution image pairs that can be utilized for resolution enhancement training. The training set for the ZnO LC-MLA-oriented ARELF network is constructed by adding chromatic aberration to the vimeo90k dataset. To achieve this, the prepared ZnO LC-MLA images real scenes as light field data at different voltages, represented as 2D images in MV format. The chromatic aberration bias value Δ of the ZnO LC-MLA is calculated using the CMC(l:c) color difference formula. [21] The RGB color space must be converted to the CIELAB (L*a*b*) color space to calculate the color offset value of the ZnO LC-MLA. Multiple sets of real scenes are imaged several times to ensure accuracy, and the average values are used for the calculation. The calculated chromatic aberration bias value Δ is then used to modify the vimeo90k dataset, completing the construction of the degraded LC-MLA dataset. The above implementation steps show that no additional camera with the same field of view (FOV) as the LC-MLA light field camera is required. Therefore, the proposed method of dataset construction not only simplifies the construction of the training data but also avoids errors caused by FOV and focal plane mismatch when two cameras are used.
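The degradation step can be sketched as below. The actual per-channel offsets in the paper come from CMC(l:c) measurements of the ZnO LC-MLA in L*a*b* space; the `shift_r` and `shift_b` values here are placeholder assumptions standing in for the measured bias Δ.

```python
import numpy as np

def add_chromatic_bias(img, shift_r=1, shift_b=-1):
    """Degrade a clean RGB image with a lateral chromatic aberration bias.

    img: (H, W, 3) array, channels ordered R, G, B. Red and blue channels
    are displaced horizontally relative to green, mimicking the colour
    fringing produced by LC dispersion. shift_r / shift_b are placeholders
    for the measured bias of the real device.
    """
    out = img.copy()
    out[..., 0] = np.roll(img[..., 0], shift_r, axis=1)  # red channel offset
    out[..., 2] = np.roll(img[..., 2], shift_b, axis=1)  # blue channel offset
    return out
```

Applying such a transform to every clean frame yields (degraded, clean) training pairs without needing a second camera co-registered with the light field camera.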

ZnO LC-MLA Fabrication
Conventional LC-MLAs typically utilize the rubbing-alignment method, which can lead to problems such as static electricity generation and leftover dust. [22] These issues can significantly impact the optical imaging quality of the system. The photoalignment method was later introduced, effectively addressing the contamination problems caused by static electricity and dust in the traditional rubbing-alignment method. However, the photoalignment method suffers from thermal and optical instability of the LC orientation, which can also degrade optical imaging quality. [23] To eliminate the imaging quality degradation arising from the orientation structure, this study introduces ZnO material and designs an LC-MLA device with a new ZnO microstructure-based orientation structure, [24-26] as shown in Figure 2a. The device comprises two glass substrates, an aluminum (Al) film layer, a ZnO microstructure, and a nematic LC layer. The ZnO microstructure anchors the LC molecules without rubbing alignment, serving as the orientation layer on the bottom substrate of the LC-MLA. [27] The anchoring energy of this ZnO microstructure is ≈2.07 × 10⁻⁴ J m⁻². Polarization optical microscope (POM) results for this ZnO microstructure orientation are presented in Figure 2d, demonstrating that the orientation structure leads to more stable horizontal orientation, avoiding the defects of rubbing and light-controlled orientation and contributing to high-quality optical imaging. Figure 2b shows the transmittance in the visible range at different voltages, while Figure 2c presents the point spread function (PSF) of parallel white light converging after passing through this LC-MLA at 2.0 Vrms. The highest energy peak of the PSF is 16 134, the average energy peak is 14 370.2, and the average full width at half maximum (FWHM) is 16.15 μm, indicating that the prepared LC-MLA has good optical focusing ability and optical imaging quality.
The preparation process of the proposed ZnO LC-MLA device is as follows. The ZnO orientation microstructure was fabricated using a microelectronic process. The top ITO glass substrate of the LC-MLA was coated with an Al film, and a circular hole array with a hole diameter of 34 μm and a hole center distance of 42 μm was patterned as the top electrode using photolithography. An empty LC cell was then assembled with 15 μm spacer microspheres to set the cell gap. The LC material (Merck E7, n_o = 1.526, n_e = 1.762) was filled into the prepared LC cell by capillary action and sealed with UV adhesive.

Training Details of ARELF
The degraded vimeo90k dataset was utilized as the training set. We cropped and reorganized the training patches, each 64 × 64 pixels in size. Seven consecutive images from the video data were fed in at the same time. The total number of training patches used was 54 768. PyTorch was employed for development, with Adam as the optimizer. For the proposed network, the number of epochs was set to 300, the batch size was eight, and the learning rate was 10⁻⁴. Network training was performed in the following software and hardware environment. The operating system was Ubuntu 18. The hardware configuration consisted of an Intel E5-2660 v3 CPU (2.6 GHz, ten cores, twenty threads) and 16 GB of DDR4 ECC RAM. The GPU was an Nvidia RTX 3060 with 12 GB of RAM. The total time spent training the model was ≈13 h.
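The reported training configuration can be summarized as a minimal PyTorch setup. The stand-in model and random tensors below are illustrative only; the hyperparameter values are the ones stated above.

```python
import torch

# Hyperparameters reported for ARELF training.
EPOCHS, BATCH_SIZE, LR, PATCH = 300, 8, 1e-4, 64

# Stand-in for the ARELF network; the real model takes seven consecutive
# MV-format frames, collapsed here to a single frame for brevity.
model = torch.nn.Conv2d(3, 3, 3, padding=1)
optimizer = torch.optim.Adam(model.parameters(), lr=LR)

# One illustrative optimization step on a random batch of 64x64 patches.
x = torch.randn(BATCH_SIZE, 3, PATCH, PATCH)
y = torch.randn(BATCH_SIZE, 3, PATCH, PATCH)
loss = torch.mean(torch.abs(model(x) - y))  # L1 part of the loss
loss.backward()
optimizer.step()
optimizer.zero_grad()
```

In the full pipeline this step would run over the 54 768 degraded-vimeo90k patches for 300 epochs with the composite resolution-plus-chromatic-aberration loss.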

Comparative Experiments
The model was tested on real-world scenes, where three toys with different DOFs were selected and imaged using an LC-based light field camera. [25] The test results were compared with traditional linear interpolation, white balance, and demosaicing methods, as shown in Figure 3. Subsequent results were compared using all-in-focus images. Linear interpolation is a conventional method for enhancing resolution, white balance is a traditional image correction method, and demosaicing is an image processing algorithm for image resolution enhancement and correction. These three methods were selected for comparison with our proposed network. The experimental results in Figure 3 indicate that our proposed method effectively enhances the spatial resolution, with smoother edges and no jagged artifacts in the locally magnified results. For quantitative analysis, we provide the grayscale histogram of the local magnification map; the horizontal axis indicates the gray level, and the vertical axis indicates the proportion of gray values. Our proposed method performed much better than the traditional linear interpolation, white balance, and demosaicing methods in solving the LC-MLA low-resolution problem.
The proposed network adopts a video super-resolution-like framework, particularly in the SC module, which fully uses the LC-MLA light field imaging information at different voltages, especially the complementary information. This gives the final result richer detail and compensates for the texture information lost by traditional methods. The network structure was designed to learn the dominant causes of chromatic aberration in LC-MLA, using the quality-degraded vimeo90k dataset as the training set. After applying the model, the locally enlarged images show that the test data are free of purple fringes and chromatic aberration contours. To further demonstrate that our method can eliminate chromatic aberration, we present the line contours of the corresponding red, green, and blue channels along the yellow line of each local image.
The intensity contrast curves of the RGB pixels shown in Figure 3 indicate that, for the traditional methods (linear interpolation, white balance, and demosaicing), the red and blue channels do not match the green channel well in the line contour area. In contrast, with the proposed method the red and blue channels match the green channel well, indicating that the chromatic aberration problem of the LC-MLA is better solved. [17] It can be observed that the white balance and demosaicing algorithms are not specifically developed for chromatic aberration. While these algorithms can solve some problems related to chromatic aberration, they cannot completely solve the aberration caused by the LC-MLA. In contrast, the proposed method is explicitly developed for LC-MLA. It uses the deep learning ability of convolutional networks to learn the chromatic aberration problem and obtain the chromatic aberration feature information of the LC-MLA at different voltages. The proposed method eliminates chromatic aberration while achieving super-resolution using the complementary feature information.
To provide a comprehensive comparison of the proposed method in terms of achromaticity and resolution enhancement, we selected three state-of-the-art (SOTA) methods for side-by-side evaluation: the video restoration network EDVR, [11] the saturation-value total variation model, [28] and explicit temporal difference modeling. [29] We selected these algorithms because no existing network simultaneously solves both the super-resolution and the achromatic problems; they are all single-function networks. Furthermore, few specialized networks are designed explicitly for LC-MLAs. In summary, we compared our proposed method with one achromatic network and two video super-resolution networks, as shown in Table 1. The scene was the same as in Figure 3. The energy and Laplace values in the red square area of Figure 3 were compared. The results demonstrate that our proposed method is capable of eliminating chromatic aberration while also achieving resolution enhancement. In contrast, the control group's algorithms were limited to one function each, either eliminating chromatic aberration or improving resolution. Moreover, our proposed method was able to compete with the current SOTA algorithms in both chromatic aberration elimination and resolution enhancement, which highlights the effectiveness of our proposed method in addressing these two critical challenges in light field imaging.
In summary, the proposed method exploits the electronically controlled focal length feature of the LC-MLA by fully utilizing the information acquired at different voltages to obtain high-quality imaging results with enhanced resolution and achromaticity. The traditional methods used for comparison, such as white balance and demosaicing, are not specifically designed to solve the chromatic aberration problem of LC-MLA, and the compared achromatic network and two video super-resolution networks are single-function networks. In contrast, the proposed method explicitly addresses both issues, making it a more effective solution.

Ablation Experiments
Currently, only a limited number of networks can effectively reduce chromatic aberration while improving resolution, and even fewer deep learning networks are designed explicitly for LC-MLA devices. To further highlight the unique characteristics of our network, we conducted an ablation experiment. This experiment allows us to analyze the structural modules of our network individually against the different problems and identify the critical factors that contribute to the final structure of our network. The ablation experiments used the three toys with different DOFs, as in the previous experiments, to evaluate the proposed network's performance; the goal was to assess the effectiveness of each module. Figure 4 presents the results of the ablation experiments. Figure 4a shows the network with the SC module retained and the remaining modules removed. Figure 4b shows the network with the RDB module retained and the remaining modules removed. Figure 4c shows the network with both the SC and RDB modules retained. The results indicate that the proposed method achieved significantly better quantized values, with an energy of 263.9033 and a Laplace value of 29.9349 in the resulting images, compared to an energy of 153.8775 and a Laplace value of 17.6631 in Figure 4a and an energy of 193.9188 and a Laplace value of 22.6986 in Figure 4b. Moreover, the partial magnification shows that Figure 4a eliminates some chromatic aberration but does not significantly enhance the resolution, whereas Figure 4b enhances the resolution but does not eliminate chromatic aberration effectively.
Furthermore, the area bounded by the orange dashed line of the RGB pixel intensity map in Figure 4 demonstrates that, with the proposed method, the red and blue channels are most consistent with the green channel. The color signal difference between the green and red channels of the proposed method between pixels 600 and 700 of the RGB pixel intensity map is much smaller. Thus, the SC module extracts chromatic aberration feature information to achieve the achromatic function, while the RDB module enhances the resolution. The resolution is not significantly enhanced when only the SC module is retained. When only the RDB module is retained, chromatic aberration persists while the resolution is enhanced. The relatively best results are obtained when both the SC and RDB modules are retained, as the chromatic aberration is eliminated and the resolution is enhanced.
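The energy and Laplace sharpness values used above can be computed in several standard ways; the paper does not give its exact definitions, so the following NumPy sketch uses two common formulations (energy of gradient and variance of a 4-neighbour Laplacian) as an assumed illustration. Higher values indicate a sharper image in both cases.

```python
import numpy as np

def energy_of_gradient(gray):
    """Sum of squared first differences of a grayscale image."""
    gx = np.diff(gray, axis=1)
    gy = np.diff(gray, axis=0)
    return float((gx ** 2).sum() + (gy ** 2).sum())

def laplace_measure(gray):
    """Variance of a 4-neighbour Laplacian response over the interior."""
    lap = (-4 * gray[1:-1, 1:-1] + gray[:-2, 1:-1] + gray[2:, 1:-1]
           + gray[1:-1, :-2] + gray[1:-1, 2:])
    return float(lap.var())
```

A blurred reconstruction flattens local intensity differences, so both metrics drop; the higher scores of the full SC+RDB network are consistent with sharper output.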

Dynamic Experiments
Finally, we conducted a dynamic experiment to evaluate the proposed network's performance in capturing scenes with enhanced temporal resolution. We used a movable toy car for this experiment and captured the scene using our trained model. As the toy car was moved closer to the camera, we adjusted the voltage of the LC-MLA in steps of 1 Vrms, as shown in Figure 5.

Conclusion
This study proposes a novel deep learning approach to solve the chromatic aberration problem of LC-MLA-based imaging systems and enhance their spatial and temporal resolution. To our knowledge, this is the first time deep learning methods have been applied to address chromatic aberration in LC-MLA systems. Our proposed network offers a solution to the chromatic aberration problem, which has long been a significant challenge in LC-based imaging systems. Moreover, the network streamlines the design process of LC-based imaging systems by exploiting their full potential. The proposed network demonstrates the feasibility of using deep learning techniques to improve the performance of LC-MLA-based imaging systems, paving the way for further research in this field.
Our findings propose a novel approach for achieving high-quality LC-based imaging systems, which can potentially benefit a wide range of applications, including microscopy, medical imaging, and remote sensing.

Figure 1 .
Figure 1. Framework for the ARELF DNN based on LC-MLA. a) Light field data processing flow. b) ARELF network structure.

Figure 2 .
Figure 2. Schematic diagram of the structure of the ZnO LC-MLA and related experimental results. a) Structure diagram. b) ZnO LC-MLA transmittance results. c) 3D beam profile of the converging PSF after parallel white light passes through the ZnO LC-MLA at 2.0 Vrms. d) POM results of the alignment at different voltages and angles.

Figure 3 .
Figure 3. Field scene test results, compared from left to right: linear interpolation, white balance, demosaicing, and the method proposed in this study. a-d) All-in-focus images after applying bilinear interpolation, white balance, demosaicing, and our proposed method, respectively. e-h) Side-by-side comparison of the locally enlarged images for each method. i-l) Intensity contrast curves of the RGB pixels. m-p) Histograms for each method.

Figure 4 .
Figure 4. Ablation experiments. a) The SC module retained and the remaining modules removed. b) The RDB module retained and the remaining modules removed. c) Both the SC and RDB modules retained. The two rightmost columns display line contour images with red, green, and blue intensity values and bar charts of the red-green channel difference, for pixel values ranging from 600 to 700.

Figure 5 .
Figure 5. Dynamic experiments (with video). a) The start state of the dynamic scene. b) The end state of the dynamic scene. c) The light field array images acquired by the ZnO LC-MLA. d) The decoupled and fused light field sequence image from (c), used as the initial input to the ARELF network. e) The output result of the ARELF network. The red box on the right side shows the enlarged view.

Figure 5a,b shows the initial and final states of the dynamic scene, respectively. Figure 5c shows the light field data acquired by the ZnO LC-MLA in MV format. Figure 5d presents the decoupled and fused light field sequence 2D images obtained from Figure 5c, serving as the initial input to the ARELF network. The locally enlarged image in Figure 5d is clearly blurred, and chromatic aberration can be observed. Figure 5e shows the output of the ARELF network, in which the locally enlarged image reveals significant resolution enhancement and eliminated chromatic aberration. Our experimental results demonstrate that the proposed network is effective not only for static scenes but also for dynamic scenes. The network leverages the light field acquisition characteristics of the LC-MLA and achieves high-quality imaging results with enhanced spatial and temporal resolution. The reconstructed video obtained from the dynamic scene highlights the effectiveness of our network in capturing scenes with enhanced temporal resolution.

Table 1 .
Results of image evaluation functions for four methods.The results marked in bold are the optimal values of the experimental results.