A Multimodal Sensing CMOS Imager Based on Dual‐Focus Imaging

Abstract Advanced machine intelligence is empowered not only by the ever‐increasing computational capability for information processing but also by sensors for collecting multimodal information from complex environments. However, simply assembling different sensors can result in bulky systems and complex data processing. Herein, it is shown that a complementary metal‐oxide‐semiconductor (CMOS) imager can be transformed into a compact multimodal sensing platform through dual‐focus imaging. By combining lens‐based and lensless imaging, visual information, chemicals, temperature, and humidity can be detected with the same chip and output as a single image. As a proof of concept, the sensor is equipped on a micro‐vehicle, and multimodal environmental sensing and mapping is demonstrated. A multimodal endoscope is also developed, and simultaneous imaging and chemical profiling along a porcine digestive tract is achieved. The multimodal CMOS imager is compact, versatile, and extensible and can be widely applied in microrobots, in vivo medical apparatuses, and other microdevices.

Optical microscope image (A) and SEM image (B) of the OV5640 CMOS sensor.

(A) Images (i) and (ii) were acquired by the CMOS imager at RHs of 9.25% and 80.52%, respectively, and (iii) shows the differential of (i) and (ii). (B) Relationship between the intensity standard deviation and the unit size at an RH of 9.25% for 100 s. (C) Relationship between the initial intensity and the unit size at an RH of 9.25%. The error bars denote min-to-max intervals of 6 sensing units.

The R, G, and B channels were divided, and the B channel showed the best sensitivity.

Data analysis
The video stream was transmitted to a smartphone in AVI format at a resolution of 2592×1944 pixels and a frame rate of 10 frames/s. We converted the video file to a JPG image sequence using the software "Free video to JPG converter". We then imported the image sequence into the open-source software ImageJ and split the images into gray, R, G, and B channel intensity stacks. The area of each sensing unit was defined as a region of interest, and the mean intensity of the pixels in that area was calculated for every frame. We used the resulting mean-intensity stacks to quantify the color change of the sensing units over time. For the endoscope application, a video resolution of 1600×1200 pixels and a frame rate of 10 frames/s were used to ensure that the CMOS chip did not overheat in vivo.
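The per-unit intensity extraction described above can be sketched in a few lines of Python. This is a minimal illustration rather than the ImageJ workflow itself: the function names, the rectangular `(x, y, width, height)` region layout, and the unweighted gray conversion are assumptions made for the example.

```python
from statistics import fmean


def roi_channel_means(frame, roi):
    """Mean R, G, B, and gray intensity inside a rectangular region.

    frame: list of rows, each row a list of (r, g, b) tuples (0-255).
    roi:   (x, y, width, height) in pixels; this layout is hypothetical.
    """
    x, y, w, h = roi
    pixels = [px for row in frame[y:y + h] for px in row[x:x + w]]
    r = fmean(p[0] for p in pixels)
    g = fmean(p[1] for p in pixels)
    b = fmean(p[2] for p in pixels)
    # Assumption: gray is the unweighted average of the three channels.
    gray = (r + g + b) / 3
    return {"R": r, "G": g, "B": b, "gray": gray}


def intensity_stacks(frames, roi):
    """One mean-intensity time series per channel for an image sequence."""
    per_frame = [roi_channel_means(frame, roi) for frame in frames]
    return {ch: [m[ch] for m in per_frame] for ch in ("R", "G", "B", "gray")}
```

At 10 frames/s, the sample at index `i` of each series corresponds to time `i / 10` s.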
To reduce background color interference, an adjacent area with the same shape and size was selected as a reference. The response of the sensing unit was then corrected according to Equation (1), which combines three quantities: the mean intensity of the pixels in the sensing unit at time $t$, the mean intensity of the pixels in the reference region at time $t$, and the mean intensity of the pixels in the reference region at the beginning of the measurement (time $t_0$).
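A reference-based correction of this kind can be illustrated as follows. The sketch assumes an additive form, in which the drift of the reference region relative to its starting value is subtracted from the sensing-unit signal. That form is an assumption made for illustration; Equation (1) may combine the same three quantities differently (for example, as a ratio). The function name `compensate` is hypothetical.

```python
def compensate(sensing, reference):
    """Remove background drift from a sensing-unit intensity series.

    ASSUMPTION: additive correction,
        corrected(t) = sensing(t) - (reference(t) - reference(t0)),
    chosen for illustration only; the published equation may differ.
    """
    if len(sensing) != len(reference):
        raise ValueError("sensing and reference must have the same length")
    ref0 = reference[0]  # reference intensity at the start of the measurement
    return [s - (r - ref0) for s, r in zip(sensing, reference)]
```

With this form, an illumination change that shifts both regions by the same amount cancels out, while a color change confined to the sensing unit is preserved.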
The compensation effect of Equation (1) is shown in Figure S12. We placed an M-imager in an environment with fluctuating illumination and calculated the intensity changes of two adjacent imaging areas, one serving as the target signal and the other as the reference signal. The compensated signal was obtained by Equation (1).

Figure S13 shows a typical intensity response of a CO₂ sensing unit. The unit color changed from blue to yellow, and the R, G, and B stacks responded with different sensitivities. In this case, we used the most sensitive channel (blue) to represent the CO₂ concentration. The same evaluation method was applied to select the blue channel for the 20°C temperature unit, the CO₂ unit, and the humidity unit; the green channel for the 35°C temperature unit; the red channel for the 50°C temperature unit; and the gray channel for the 65°C temperature unit.
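Choosing the most sensitive channel can be automated along the following lines. The sketch assumes that "sensitivity" is measured as the total intensity span of a channel over the exposure, which is one simple criterion among several; the function name is hypothetical.

```python
def most_sensitive_channel(stacks):
    """Return the channel whose mean intensity changes most over the run.

    stacks: dict mapping a channel name ("R", "G", "B", "gray") to its
            mean-intensity time series.
    ASSUMPTION: sensitivity is taken as max - min of each series.
    """
    def span(series):
        return max(series) - min(series)

    return max(stacks, key=lambda channel: span(stacks[channel]))
```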
In the CO₂ calibration, we employed a binding-saturation equation to fit the correlation between the intensity responses and the gas concentrations. The equation relates the concentration of the gas samples to the intensity response of the sensing unit through four fitted parameters.
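Once such a calibration curve has been fitted, it can be inverted to convert a measured response into a concentration. The sketch below assumes one common four-parameter binding-saturation form (a saturable term plus a linear term plus a constant background); this form, the parameter names, and the assumption that the response rises monotonically with concentration are illustrative and are not taken from the calibration itself.

```python
def binding_saturation(c, bmax, kd, ns, background):
    """ASSUMED model form: saturable + linear + constant terms."""
    return bmax * c / (kd + c) + ns * c + background


def concentration_from_response(y, params, c_max=1e5, tol=1e-6):
    """Invert the fitted curve by bisection on [0, c_max].

    params: (bmax, kd, ns, background) from a prior fit (hypothetical values).
    Assumes the response increases monotonically with concentration.
    """
    lo, hi = 0.0, c_max
    y_lo = binding_saturation(lo, *params)
    y_hi = binding_saturation(hi, *params)
    if not y_lo <= y <= y_hi:
        raise ValueError("response lies outside the calibrated range")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if binding_saturation(mid, *params) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

Bisection is used here because it needs no derivative and cannot overshoot the calibrated range.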
To calculate the detection limit, we kept the sensing unit in standard air (0 ppm CO₂) for 100 s.

In the humidity calibration, we employed a power series equation to fit the correlation between the intensity responses and the relative humidity. The equation relates the relative humidity (RH) to the intensity response of the sensing unit through four fitted parameters.
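For the detection-limit measurement mentioned above (a 100 s baseline in standard air), a widely used estimate divides a multiple of the baseline noise by the calibration slope. The sketch assumes the conventional factor of 3 and a locally linear sensitivity `slope` (intensity units per ppm); both, along with the function name, are assumptions made for illustration.

```python
from statistics import stdev


def detection_limit(baseline, slope, k=3.0):
    """Estimate a detection limit as k * sigma / |slope|.

    baseline: intensity responses recorded with no analyte present
              (for example, 1000 samples for 100 s at 10 frames/s).
    slope:    calibration sensitivity near zero concentration (hypothetical).
    k:        noise multiplier; 3 is a common convention, assumed here.
    """
    if slope == 0:
        raise ValueError("slope must be non-zero")
    sigma = stdev(baseline)  # sample standard deviation of the baseline
    return k * sigma / abs(slope)
```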
In the temperature calibration, the intensity responses were normalized to compare the responses of the 4 sensing units. The normalization was formulated as follows:

$$I_{\mathrm{norm}}(t) = \frac{I(t) - I_{\min}}{I_{\max} - I_{\min}}$$

where $I(t)$ represents the intensity response with background correction at time $t$, and $I_{\min}$ and $I_{\max}$ represent the minimum and maximum response values of the sensing unit during the calibration.
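A normalization of this kind can be sketched as follows, assuming the usual min-max form in which the minimum response maps to 0 and the maximum to 1. The function name is hypothetical.

```python
def normalize(responses):
    """Min-max normalize a background-corrected response series to [0, 1].

    ASSUMPTION: the minimum and maximum are taken over the whole
    calibration run of a single sensing unit.
    """
    lo, hi = min(responses), max(responses)
    if hi == lo:
        raise ValueError("response is constant; normalization is undefined")
    return [(value - lo) / (hi - lo) for value in responses]
```

Applying the function to each of the 4 temperature units separately puts their responses on a common scale, regardless of which color channel each unit uses.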