Advances in Machine-Learning Enhanced Nanosensors: From Cloud Artificial Intelligence Toward Future Edge Computing at Chip Level

of massively parallel signal processing to realize the learning-updating-memorizing capabilities. Especially for the growing IoT with its large number of sensor nodes, it is highly desirable to develop neuromorphic computing that integrates computing functions into sensor networks. This inherent feature, stemming from a design inspired by human neural networks, ensures that even as we push for miniaturization and efficiency, computational power and adaptability are not only preserved but often amplified. In conclusion, as we chart the trajectory of these interconnected technologies, we hope that this analysis serves as a beacon, illuminating the profound implications and potential of cloud and edge computing, especially when intertwined with the marvel of neuromorphic systems.


Introduction
In the past decade, the integration of various sensors and artificial intelligence (AI) has empowered various fields, such as healthcare, environmental monitoring, human-machine interaction, and smart homes.[1,2,5] The current state of cloud computing reveals rapidly growing platforms such as Amazon Web Services, Microsoft Azure, and Google Cloud.[6,7] These diverse cloud computing platforms enable businesses and individuals to utilize AI models tailored to specific needs, thereby eliminating expensive hardware investments and providing on-demand access to powerful computing capabilities.[25-28] Additionally, cloud AI facilitates highly customizable applications and features for wearables, enabling users to tailor their devices to fulfill individual preferences and needs.
From design to data analysis, the applications of cloud computing in nanophotonics and wearable electronics encompass a wide spectrum, driving technological advancements and unlocking broader possibilities for the future.
Environmental monitoring systems leverage cloud computing to aggregate and analyze data from various sensors, contributing to more accurate predictions and timely responses to environmental changes.[36,39,40] In the age of rapidly advancing IoT technology, where immense data is continuously generated and processed, the quest for efficient computational methods has become paramount.
Edge computing has been proposed to enhance the efficiency and capabilities of modern computing systems, particularly in the context of AI and IoT applications.[47-50] Neuromorphic computing can therefore be used at the edge to improve processing capabilities by efficiently handling tasks such as pattern recognition and anomaly detection in real time. This synergy reduces the need to send massive amounts of raw data to centralized data centers for processing, and it stands to revolutionize fields ranging from health monitoring and robotic control to smart home solutions and human-machine interfaces. Traditional computing paradigms, though powerful, often require multiple components to process information. They tend to lack the fluidity and efficiency evident in biological systems, where sensing and computational functions coexist symbiotically.[51,52] Figure 1 illustrates the evolution of advanced AI sensors from cloud AI systems, which depend on remote data processing, to the innovative future of edge computing, which plays an integral role in enabling efficient, real-time, and localized AI processing at the chip level. The inclusion of in-memory computing showcases the capacity to execute machine-learning algorithms directly within nanoscale memory components, eliminating the need for extensive data transmission to cloud servers. Neuromorphic computing is inspired by the architecture and algorithms of the human brain to enhance energy efficiency. Integrating neuromorphic units, such as synaptic devices, into edge devices holds the potential to revolutionize local processing and bring intelligence closer to the data source.
Yet, the arena of neuromorphic computing is not solely tethered to materials like metal-organic frameworks (MOFs) or conventional 2D structures.[53] A rising trend is the exploration of all-optical neuromorphic computing. Unlike earlier paradigms that relied on optical-to-electronic conversions, all-optical solutions handle both sensing and computation solely in the optical domain. This approach promises to bolster processing bandwidth and improve operational efficiency. Recent innovations in waveguide-based neural networks and photonic deep neural networks are a testament to the burgeoning potential of this domain. These networks not only match, but in some respects surpass, the capabilities of state-of-the-art electronic platforms.[56,57] For neuromorphic systems to truly flourish and find widespread real-world applications, they need to scale beyond chip-level arrays. Achieving wafer-level scalability is a critical step toward the realization of high-density neuromorphic computing systems. In this light, recent breakthroughs, such as the wafer-scale solution utilizing the 2D material MoS2, herald a new era. In summary, the future of computing is on the cusp of a transformative era; neuromorphic in-sensor and in-memory computing, backed by novel material platforms and innovative designs, are paving the way.[60] The purpose of this review is to comprehensively highlight recent advances in cloud computing and edge computing and to discuss opportunities and challenges for future research. We first discuss recent advances in various nanophotonic and electronic devices based on cloud computing and their applications. We also present the algorithms, architectures, and systems of in-sensor computing, in addition to emerging edge computing applications and AI accelerators. Finally, forward-looking perspectives on the future prospects of in-sensor computing are presented.

Cloud AI-Enabled Sensor Inverse Design
[63-68] Among numerous photonic sensors, nanoantenna-based sensors combined with machine-learning algorithms are promising, because the performance of nanoantenna sensors depends on their structural patterns.[79-81] Manually exploring this vast design space to identify optimal designs is time-consuming and computationally demanding. Furthermore, manual design iterations for nanoantenna-based sensors involve a time-consuming trial-and-error process, requiring repeated fabrication, characterization, and optimization steps. Machine-learning algorithms can address these challenges by efficiently exploring the design space, uncovering nonintuitive design relationships, facilitating multi-objective optimization, handling complex design parameters, and expediting design iterations.[82] The integration of machine-learning algorithms with the design process enables automatic design generation, leading to enhanced sensor performance, reduced design time, and increased innovation in the field of nanoantenna-based sensors. For instance, machine-learning algorithms can automate the optimization process by iteratively adjusting design parameters to achieve the desired objectives.[83,84] By learning from existing data or simulations, these algorithms can explore the design space more efficiently, identifying optimal or near-optimal configurations. Through techniques such as genetic algorithms, evolutionary strategies, or gradient-based optimization,[85-91] machine-learning algorithms can search for optimal designs while considering intricate relationships between multiple parameters. In particular, the inverse design of sensors based on requirements demonstrates the superiority of AI.
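To make such an optimization loop concrete, the following toy sketch (not from the cited works; the analytic forward model is a hypothetical stand-in for a full electromagnetic solver) evolves a small vector of design parameters toward a target response with a basic genetic algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy figure of merit: how close a candidate design's (simulated) response
# is to a target response. A real workflow would call an electromagnetic
# solver here; a simple analytic map stands in for it.
TARGET = np.array([0.2, 0.8, 0.5])

def simulate_response(design):
    # Hypothetical forward model mapping design parameters to a response.
    return np.tanh(design)

def fitness(design):
    return -np.sum((simulate_response(design) - TARGET) ** 2)

# Genetic algorithm: selection, crossover, and mutation over design vectors.
pop = rng.uniform(-2, 2, size=(40, 3))          # 40 candidate designs
for generation in range(60):
    scores = np.array([fitness(d) for d in pop])
    elite = pop[np.argsort(scores)[-10:]]       # keep the 10 best designs
    parents = elite[rng.integers(0, 10, size=(40, 2))]
    mask = rng.random((40, 3)) < 0.5            # uniform crossover
    pop = np.where(mask, parents[:, 0], parents[:, 1])
    pop += rng.normal(0, 0.1, size=pop.shape)   # mutation

best = pop[np.argmax([fitness(d) for d in pop])]
print(best, fitness(best))
```

In practice the fitness evaluation is the expensive step, which is why surrogate models trained on simulation data are often used in its place.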
[91-96] The fundamental concept underlying machine-learning-assisted inverse design entails training a machine-learning model that acquires knowledge of the relationship between physical responses and structures. Subsequently, this model generates structural patterns based on the desired physical responses, thereby eliminating the necessity for computationally intensive numerical simulations. A notable demonstration of this approach is presented in Figure 2a, where So and colleagues illustrate the simultaneous inverse design of material and structural parameters for a core-shell nanoparticle-based nanophotonic substrate.[97] Given that the structural parameters are continuous quantities while the material parameters are discrete, achieving the simultaneous inverse design of these two parameters through an algorithmic approach poses challenges. However, by integrating regression and classification within a unified implementation (Figure 2a-ii), it becomes feasible to reverse engineer core-shell nanoparticles in accordance with user-defined spectra. During the model training process, a substantial quantity of parameters and their corresponding spectra, obtained via forward design, is indispensable. The core-shell nanoparticle parameters are derived, and the resultant predicted spectra exhibit a close correspondence with the target spectra (Figure 2a-iii).
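The unified regression-classification idea can be sketched as a network with a shared trunk and two output heads. The sketch below uses random, untrained weights and invented dimensions purely to show the architecture, not the trained model of ref. [97]:

```python
import numpy as np

rng = np.random.default_rng(1)

# A shared trunk maps a target spectrum to hidden features; two heads then
# emit (i) continuous structural parameters via regression and (ii) a
# discrete material choice via classification. Weights here are random --
# this sketches the architecture only.
N_SPECTRUM, N_HIDDEN, N_STRUCT, N_MATERIALS = 100, 32, 2, 4

W_trunk = rng.normal(0, 0.1, (N_SPECTRUM, N_HIDDEN))
W_reg   = rng.normal(0, 0.1, (N_HIDDEN, N_STRUCT))     # e.g., core/shell radii
W_cls   = rng.normal(0, 0.1, (N_HIDDEN, N_MATERIALS))  # material classes

def inverse_design(spectrum):
    h = np.tanh(spectrum @ W_trunk)                    # shared features
    structure = h @ W_reg                              # regression head
    logits = h @ W_cls
    material = np.exp(logits) / np.exp(logits).sum()   # softmax probabilities
    return structure, material

spectrum = rng.random(N_SPECTRUM)      # stand-in for a user-defined spectrum
structure, material = inverse_design(spectrum)
print(structure.shape, material.sum())
```

Training such a model jointly minimizes a regression loss on the continuous head and a cross-entropy loss on the discrete head.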
Semi-supervised learning algorithms can also be applied in the context of inverse design; this technique reduces the training data requirement by utilizing both labeled and unlabeled data. Ma et al. introduced a novel network architecture for inverse design that distinguishes itself from other existing methods. The proposed deep generative model, as illustrated in Figure 2b-i, consists of three distinct submodels: the recognition model, the prediction model, and the generation model.[98] These submodels are implemented using four neural networks that are intentionally designed with specific structures to serve different purposes. The recognition model is responsible for encoding the optical response of the metamaterial pattern into a low-dimensional latent space. In contrast, the prediction model generates a deterministic prediction of the optical response based on the given metamaterial design. The generation model combines the optical response and a sampled latent variable to generate feasible metamaterial designs according to specific requirements. By randomly sampling and decoding these latent variables within the latent space, it becomes possible to reconstruct the original structural geometry, thereby facilitating the inverse design process. Figure 2b-ii showcases the simulated spectra obtained from the inverse design parameters (middle and bottom panels), which closely align with the desired spectrum (upper panel). Through the sampling process, a multitude of outputs is generated for the same target spectrum, thereby producing numerous candidates for the inverse design task.
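The role of the sampled latent variable in producing multiple design candidates can be illustrated with a toy linear "generation model" (invented weights and dimensions, not the trained networks of ref. [98]): feeding the same target spectrum with different latent samples yields distinct candidate designs.

```python
import numpy as np

rng = np.random.default_rng(2)

N_SPECTRUM, N_LATENT, N_DESIGN = 50, 8, 16

# Toy generation model: maps a target spectrum plus a sampled latent
# variable to a candidate design. A linear map stands in for the trained
# generative network described in the text.
W_s = rng.normal(0, 0.1, (N_SPECTRUM, N_DESIGN))
W_z = rng.normal(0, 0.1, (N_LATENT, N_DESIGN))

def generate(spectrum, z):
    return np.tanh(spectrum @ W_s + z @ W_z)

target = rng.random(N_SPECTRUM)        # one desired optical response
candidates = [generate(target, rng.normal(size=N_LATENT)) for _ in range(5)]

# Same target, different latent samples -> distinct design candidates.
print(len(candidates), np.allclose(candidates[0], candidates[1]))
```

This is precisely why latent sampling produces "numerous candidates for the inverse design task": the latent variable absorbs the one-to-many ambiguity of inverse problems.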
The feasibility of employing unsupervised learning algorithms for the inverse design of nanophotonic structures has also been demonstrated. A generative adversarial network (GAN) is a type of machine-learning model that consists of two primary components: a generator and a discriminator. The generator begins with random noise as input and progressively refines its output to resemble the characteristics of the real data, while the discriminator evaluates the data it receives and distinguishes between ground truth and the data generated by the generator. The core of a GAN's operation lies in the adversarial training process: the GAN reaches equilibrium when the generator produces data so convincing that the discriminator cannot tell it apart from real data.

Figure 2. Cloud AI-enabled sensor inverse design. a) Inverse design of a core-shell nanoparticle-based nanophotonic substrate using a supervised machine-learning model. Copyright 2019, John Wiley and Sons Ltd. i) Schematic drawing. ii) Schematic diagram of the supervised machine-learning model used in the inverse design. iii) Validation of the inverse design approach: the provided design parameters are utilized to obtain spectra for both the target input (solid lines) and the predicted responses (open circles). b) Inverse design of nanophotonic devices using a semi-supervised deep-learning algorithm. Reproduced with permission.[98] Copyright 2019, John Wiley and Sons Ltd. i) Architecture of the proposed deep generative model. ii) The required reflection spectra (upper panel) and the results of inverse design (middle and bottom panels); insets show the design patterns.

Liu et al. utilized a GAN in their network model to achieve inverse design of arbitrary substrate geometries (Figure 2c).
[78] This method engages the two networks in a competitive process in which they learn simultaneously to produce authentic patterns. The generator takes in random noise and generates a structural pattern that possesses the desired optical properties. Subsequently, the critic evaluates the pattern, determining whether it originates from the target structural geometry. The primary objective of the generator network is to deceive the critic network by generating genuine-looking patterns. Through training, the generator model acquires the ability to create designs that closely resemble the patterns observed in the actual geometric data (Figure 2c-ii). Following unsupervised training, the model becomes capable of generating structural patterns corresponding to a given spectrum. Notably, when subjected to specific spectral requirements, the simulated spectra of both the test pattern and the generated pattern (via inverse design) exhibit a high degree of concordance (Figure 2c-iii).
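The adversarial loop itself can be condensed to a minimal one-dimensional example with hand-derived gradients (a generic toy illustration of GAN training, not the architecture of ref. [78]): a linear generator learns to mimic samples from a target distribution while a logistic discriminator tries to tell real from fake.

```python
import numpy as np

rng = np.random.default_rng(3)

# Minimal 1D GAN: generator g(z) = a*z + b tries to mimic samples from
# N(4, 1); discriminator D(x) = sigmoid(w*x + c) tries to separate real
# from generated samples. Gradients are written out by hand.
a, b = 1.0, 0.0          # generator parameters
w, c = 0.0, 0.0          # discriminator parameters
lr, batch = 0.05, 64

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

for step in range(2000):
    real = rng.normal(4.0, 1.0, batch)
    z = rng.normal(size=batch)
    fake = a * z + b

    # Discriminator ascent on  E[log D(real)] + E[log(1 - D(fake))].
    d_real, d_fake = sigmoid(w * real + c), sigmoid(w * fake + c)
    w += lr * (np.mean((1 - d_real) * real) - np.mean(d_fake * fake))
    c += lr * (np.mean(1 - d_real) - np.mean(d_fake))

    # Generator ascent on  E[log D(fake)]  (non-saturating objective).
    d_fake = sigmoid(w * fake + c)
    g = (1 - d_fake) * w            # d log D / d fake
    a += lr * np.mean(g * z)
    b += lr * np.mean(g)

print(round(np.mean(a * rng.normal(size=1000) + b), 2))
```

The two updates alternate every step; near equilibrium the generated samples become statistically hard to distinguish from the real ones, which is the property the inverse-design GAN exploits to emit realistic structural patterns.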

Cloud AI-Enabled Electric Sensing
[105] Environmental effects, such as temperature, humidity, and background gases, can affect sensor performance and accuracy. Drift over time can result in reduced sensitivity and compromised accuracy, while limited selectivity makes it challenging to distinguish between similar gases or detect target gases in the presence of interfering gases. Additionally, regular calibration is often necessary to maintain accurate measurements, which can be time-consuming and costly.
[108-112] This enables the sensor to detect multiple gases simultaneously and accurately. Machine-learning algorithms improve accuracy and reliability by leveraging the complex relationships between gas concentrations and sensor responses. They compensate for cross-sensitivity and environmental effects, resulting in more reliable measurements. Gas identification and classification are enhanced through the learning of unique response patterns, enabling accurate detection of specific gases even in complex mixtures. In particular, machine-learning algorithms can analyze the response patterns of electric gas sensors to different gases and learn the distinctive features associated with each gas.[113] By training on a dataset of known gas concentrations, an algorithm can classify and identify the presence of specific gases based on the sensor's response pattern.[116,117] By analyzing multiple sensor outputs or fusing data from different sensor types, machine learning improves selectivity and enables the detection of target gases in the presence of interfering gases. Adaptability to changing conditions and continuous learning from new data allow the algorithms to update their models and improve performance. Furthermore, machine learning reduces the need for frequent manual calibration by compensating for variations, resulting in cost and time savings.[118] For instance, Acharyya et al. reported the successful integration of a chemiresistive sensor based on a single metal oxide with various soft computing tools, aiming to achieve accurate identification of tested analyte molecules through signal processing, feature extraction, and machine-learning techniques (Figure 3a).
[119] The sensor device was fabricated using chemically synthesized SnO2 hollow spheres as the sensing material (Figure 3a-i). Notably, the sensor exhibited outstanding gas-sensing performance toward different volatile organic compounds (VOCs) despite cross-sensitivity (Figure 3a-ii). To extract distinct characteristic features associated with each VOC, the transient response curves obtained from the sensor were processed using the fast Fourier transform (Figure 3a-iii) and the discrete wavelet transform (Figure 3a-iv). Comparative analysis of these signal transform tools was conducted to evaluate their effectiveness in terms of feature extraction and support for pattern recognition. The extracted features were then utilized as input for supervised machine-learning algorithms, enabling qualitative discrimination among the tested VOCs. Additionally, a quantitative estimation of the concentration of each VOC was achieved with acceptable accuracy. The primary focus of this article lies in the meticulous and efficient selection of features from the transformed signal, which significantly contributed to the exceptional performance of the machine-learning algorithms in terms of classification (best average accuracy: 96.84%) and quantification (Figure 3a-v).
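A stripped-down version of this pipeline can be sketched as follows; the response time constants are invented for illustration, and a nearest-centroid rule stands in for the supervised classifiers of the study:

```python
import numpy as np

rng = np.random.default_rng(4)
t = np.linspace(0, 10, 256)

# Synthetic transient responses: each "gas" is given a characteristic
# response time constant (values invented, not from the cited sensor);
# noise mimics measurement variation.
TAUS = {"ethanol": 0.8, "acetone": 2.0, "toluene": 4.5}

def transient(tau):
    return 1 - np.exp(-t / tau) + rng.normal(0, 0.02, t.size)

def features(curve):
    # Low-frequency FFT magnitudes as compact characteristic features.
    return np.abs(np.fft.rfft(curve))[:8]

# Build a small labeled training set and per-gas feature centroids.
train = {gas: np.mean([features(transient(tau)) for _ in range(20)], axis=0)
         for gas, tau in TAUS.items()}

def classify(curve):
    f = features(curve)
    return min(train, key=lambda gas: np.linalg.norm(f - train[gas]))

print(classify(transient(2.0)))   # classify a fresh acetone-like transient
```

The essential point matches the study's message: carefully chosen transform-domain features make even a very simple classifier discriminative.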
The issue of poor selectivity has been a persistent problem in the field of miniaturized chemiresistive gas sensors, and it can be addressed by machine-learning techniques. In a recent study by Hayasaka et al., a novel gas-sensing method was presented that utilizes a single graphene field-effect transistor (GFET) in conjunction with machine-learning techniques to achieve gas selectivity under specific conditions (Figure 3b-i).[120] This approach combines the unique properties of the GFET with the concept of an electronic nose (e-nose). Instead of employing multiple functional materials, the gas-sensing conductivity profiles of the GFET are recorded and separated into four distinct physical properties. These properties are then projected onto a feature space as 4D output vectors, which are subsequently classified into different target gases using machine-learning analyses (Figure 3b-ii). By employing the single-GFET approach along with trained pattern recognition algorithms, accurate quantitative classification of water, methanol, and ethanol vapors was achieved when they were tested individually. When disparate chemical gases are mixed, it becomes imperative to establish a vector space for elucidating the distinct sensor responses corresponding to each gas. Specifically, a 3D/4D vector can be formed as follows: q1, the electron mobility (μe); q2, the carrier concentration (n); q3, the hole mobility (μh); and q4, the ratio of the residual carrier concentration to the charged impurity concentration (n*/nimp). These parameters can be obtained from the sensor output shown in Figure 3b-ii. Utilizing binary mixtures in water vapor (methanol, MeOH; ethanol, EtOH; H2O) as an illustrative case, it is evident that the characteristics inherent in their respective 3D vectors exhibit notable distinctions, as delineated in the middle panel of Figure 3b-iii. Then, using a multi-class classification model, each component in the binary gas mixture can be well
distinguished, with an accuracy of 96.2% (right panel of Figure 3b-iii). This demonstrated the capability of the proposed scheme to differentiate between gases in a realistic ambient environment with varying levels of background humidity.

Figure 3. Cloud AI-enabled electric sensing. a) Machine-learning-enabled chemiresistive gas sensor. Reproduced with permission.[119] Copyright 2022, Elsevier. i) Schematic drawing of the sensor. ii) Sensor response to gases. iii) Power density spectrum. iv) Characteristic coefficient values corresponding to the response curves. v) Algorithm performance. b) Machine-learning-enabled graphene field-effect transistor (GFET) gas sensor. Reproduced with permission.[120] Copyright 2020, Springer Nature. i) Diagrammatic representations depicting the variations in conductivity profiles relative to the applied gate voltage, accompanied by the corresponding underlying physical phenomena observed in a GFET. ii) Transient conductivity profiles versus the gate voltage with respect to time for water, methanol, and ethanol. iii) The 3D vectors of sensor outputs, which contain the characteristics of the sensor's response to gases and are used for machine-learning-enabled classification. c) Machine-learning-enabled triboelectric nanogenerator gas sensor. Reproduced with permission.[105] Copyright 2021, American Chemical Society. i) Machine-learning-assisted and plasma-enhanced mid-IR methodology. ii) Machine-learning analysis. iii) Healthcare diagnosis applications.
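The idea of projecting sensor outputs into a low-dimensional feature space and classifying them can be sketched as follows. The per-gas clusters are invented (arbitrary units), and a softmax classifier trained by gradient descent stands in for the multi-class model used in the cited study:

```python
import numpy as np

rng = np.random.default_rng(5)

# Each measurement is a 4D vector: electron mobility, carrier
# concentration, hole mobility, and n*/n_imp -- drawn here from invented
# per-gas clusters standing in for extracted GFET parameters.
CENTERS = np.array([[1.0, 0.2, 0.9, 0.5],    # "water"
                    [0.6, 0.8, 0.4, 0.3],    # "methanol"
                    [0.3, 0.5, 0.7, 0.9]])   # "ethanol"

X = np.vstack([c + rng.normal(0, 0.05, (30, 4)) for c in CENTERS])
y = np.repeat(np.arange(3), 30)              # class labels, 30 samples each

# Multi-class softmax classifier trained with the cross-entropy gradient.
W = np.zeros((4, 3))
for _ in range(500):
    logits = X @ W
    p = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    p[np.arange(len(y)), y] -= 1             # softmax cross-entropy gradient
    W -= 0.1 * X.T @ p / len(y)

pred = np.argmax(X @ W, axis=1)
print("training accuracy:", np.mean(pred == y))
```

Because the clusters are well separated in the 4D feature space, even this linear classifier separates the classes cleanly, mirroring how distinct physical signatures enable high-accuracy gas discrimination.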
In addition, electrical signals can be converted into optical signals to obtain richer characteristic signals. Zhu et al. proposed utilizing machine-learning techniques to enhance plasma discharge in the mid-infrared (mid-IR) range for the detection of various VOC species, including methanol, ethanol, and acetone (Figure 3c-i).[105] They achieved voltages on the order of kilovolts through the multi-switched manipulation of a triboelectric nanogenerator. The output voltage from this nanogenerator was effectively utilized in a specific tip-plate electrode configuration, enabling plasma discharge across a wide range of VOC species. The authors demonstrated the synergistic effect of the strong electric field generated by the plasma and the mid-IR absorption characteristics of VOC molecular bonds, resulting in improved VOC sensing and identification capabilities. Leveraging plasma-enhanced IR absorption, accurate quantification of VOC species (such as methanol, ethanol, and acetone) was achieved even in mixed environments. Additionally, the authors visually represented the relationship between different VOC classifications at sub-parts-per-million (sub-ppm) concentration levels using machine-learning-assisted plasma-enhanced IR absorption (Figure 3c-ii). Lastly, the feasibility of plasma-enhanced IR absorption in healthcare diagnosis applications was demonstrated through the analysis of breath samples from simulated patients (Figure 3c-iii).

Cloud AI-Enabled Optical Sensing
Compared with electric sensors, optical sensors can offer high selectivity by leveraging the unique optical absorption or emission spectra of different analytes.[51,121,122] By using specific wavelengths of light, optical sensors can be tailored to target specific analytes and minimize interference from other analytes or environmental factors.[123,126-130] Optical waveguides can experience losses due to material absorption, scattering, or radiation. Minimizing these losses is crucial to maintain signal integrity and maximize the sensor's sensitivity.[123,125,131,132] Strategies such as using low-loss materials, optimizing waveguide geometries, or employing effective cladding and coatings are employed to mitigate losses.[139] Nanoantennas typically exhibit resonant behavior, leading to narrowband sensing responses.[19,140] Extending the sensing capability to a broader range of wavelengths or enabling multimodal sensing (e.g., polarization or phase) is a challenge. Machine-learning techniques can assist in extending the sensing capabilities of nanoantennas to broader wavelength ranges or multiple modalities. By analyzing large datasets of nanoantenna responses, machine-learning models can identify patterns and correlations that enable broadband or multimodal sensing, providing a more comprehensive understanding of the analyte or parameter being sensed. Furthermore, the enhancement of sensor pattern recognition by machine learning is promising.
For instance, Zhou et al. presented an AI-enhanced metamaterial waveguide sensing platform (AIMWSP) that utilizes AI to analyze aqueous mixtures in the mid-IR range (Figure 4a-ii).[141] The authors achieved enhanced sensitivity of the waveguide sensor in a compact design by carefully designing the waveguide geometry on a silicon-on-insulator platform and employing a subwavelength grating metamaterial (Figure 4a-i). To confine the sensing length to a small region, a microfluidic channel was formed by bonding a polydimethylsiloxane (PDMS) chamber onto the chip surface, limiting the length to only 2 mm. The AIMWSP successfully realizes two key sensing functions: spectral recognition and decomposition of a ternary mixture consisting of acetone, isopropyl alcohol, and glycerin in a water solution. For the first function, the authors employed a convolutional neural network (CNN) to recognize the absorption spectra of mixtures with 64 predefined mixing ratios, achieving an impressive classification accuracy of 98.88%. Furthermore, the AIMWSP accurately discriminates the spectra of glycerin solutions with concentrations below the limit of detection of 972 ppm, achieving an accuracy of 92.86%. In addition to spectral recognition, the authors took a step further by utilizing a multilayer perceptron (MLP) regressor to perform spectrum decomposition and concentration prediction on the 64 mixture spectra (Figure 4a-iii). By accurately decomposing the spectrum into its pure components, the AIMWSP achieves reliable prediction results. Specifically, 62% of the prediction values have a root-mean-squared error (RMSE) within 0.5 vol%, and over 81% of the prediction values have an RMSE within 1 vol%, indicating the accuracy of the concentration predictions (Figure 4a-iv-vi).
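The decomposition step can be illustrated with a toy Beer-Lambert mixing model: given (invented) pure-component spectra, linear least squares recovers the mixing concentrations. This stands in for the MLP regressor described above, which learns the same mapping without assuming linear mixing:

```python
import numpy as np

rng = np.random.default_rng(6)
wavenumber = np.linspace(1000, 1500, 200)

def band(center, width):
    # Gaussian absorption band as a toy pure-component spectrum.
    return np.exp(-((wavenumber - center) / width) ** 2)

# Invented pure-component spectra for a three-component mixture
# (acetone / isopropyl alcohol / glycerin in the cited work).
PURE = np.stack([band(1100, 30), band(1250, 40), band(1400, 35)], axis=1)

true_conc = np.array([0.2, 0.5, 0.3])
mixture = PURE @ true_conc + rng.normal(0, 0.005, wavenumber.size)

# Under a linear (Beer-Lambert) mixing assumption, least squares recovers
# the component concentrations from the measured mixture spectrum.
est, *_ = np.linalg.lstsq(PURE, mixture, rcond=None)
print(np.round(est, 3))
```

When absorption bands overlap strongly or mixing is nonlinear, a learned regressor such as the MLP in the cited work replaces this closed-form inversion.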
[144] This alleviates the burden imposed by the vast amount of spectral data through dimensionality reduction. In applications involving real-time monitoring, the resulting output data comprise 3D information encompassing spectral intensity, wavelength, and time. When multiple analytes are targeted, the information expands to a 4D representation incorporating category information. Consequently, optical methods encounter difficulties in accurately and swiftly analyzing and processing the substantial volume of spectral data. However, machine-learning algorithms such as principal component analysis (PCA) can reduce the dimensionality of the information while preserving pertinent features.[145,146] This leads to a reduction in data quantity, simplification of data processing, and expedited generation of test results. As sensor technology advances and the variety of monitored gases grows, data volume will inevitably increase. Hence, the utilization of machine-learning algorithms for dimensionality reduction of spectral data is an invaluable asset for VOC sensors. Ren et al. devised a hook-shaped nanoantenna array that utilizes wavelength multiplexing to achieve continuous broadband detection of multiple absorption peaks in the fingerprint region (Figure 4b-i).[147] The surface-enhanced infrared absorption (SEIRA) spectra of different analytes possessing similar functional groups often overlap, making it challenging to distinguish them in mixtures when using narrowband SEIRA substrates (Figure 4b-ii). However, through the integration of PCA and support vector machine (SVM) algorithms, the authors achieved 100% accuracy in recognizing methanol, ethanol, and isopropanol, as demonstrated in Figure 4b-iii-vi. In summary, advancements in AI techniques hold significant potential for enhancing VOC sensors by enabling rapid sensor design and automated data processing.
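A minimal PCA-based pipeline of this kind can be sketched as follows, with invented, slightly shifted toy peaks in place of measured SEIRA spectra and a nearest-centroid rule in place of the SVM:

```python
import numpy as np

rng = np.random.default_rng(7)
x = np.linspace(0, 1, 300)

# Three "analytes" with overlapping, slightly shifted toy absorption peaks
# (shapes invented for illustration).
def spectrum(shift):
    return (np.exp(-((x - 0.4 - shift) / 0.08) ** 2)
            + 0.6 * np.exp(-((x - 0.7 - shift) / 0.1) ** 2)
            + rng.normal(0, 0.01, x.size))

shifts = [0.0, 0.02, 0.04]        # e.g., methanol / ethanol / isopropanol
X = np.vstack([[spectrum(s) for _ in range(20)] for s in shifts])
y = np.repeat(np.arange(3), 20)

# PCA via SVD: project 300-point spectra onto the top two principal
# components, then classify by nearest class centroid in that 2D space
# (nearest-centroid stands in for the SVM used in the cited work).
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = Xc @ Vt[:2].T                      # 60 spectra -> 60 x 2 scores
centroids = np.array([scores[y == k].mean(axis=0) for k in range(3)])

pred = np.argmin(((scores[:, None, :] - centroids) ** 2).sum(-1), axis=1)
print("accuracy:", np.mean(pred == y))
```

Reducing each 300-point spectrum to two principal-component scores preserves the class-separating shift information while shrinking the data volume by two orders of magnitude, which is exactly the benefit argued for above.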

Cloud AI-Enabled Wearable Electronics
Figure 4. Cloud AI-enabled optical sensing. a) AI-enhanced metamaterial waveguide sensing platform. Reproduced with permission.[141] Copyright 2023, American Chemical Society. i) The scanning electron microscope image and the distribution of electric-field magnitude of the subwavelength grating metamaterial. ii) Schematic illustration. iii) Machine-learning algorithm. iv-vi) Prediction of component concentration and accuracy assessment. b) Machine-learning-enabled optical nanoantenna sensor. Reproduced with permission.[147] Copyright 2021, John Wiley and Sons Ltd. i) Schematic drawing of the platform. ii) The reflection spectra of sensing data for the machine-learning model. iii) Data dimension reduction. iv) The confusion map for the machine-learning outcome.

In addition to helping design and analyze optical sensors, AI cloud computing has recently been widely used in wearable sensors. Wearable sensors combined with AI data analytics can capture the signals of muscle deformation, joint bending, temperature change, and heartbeat frequency, among others; such information is crucial and widely applied for healthcare, environmental monitoring, human-machine interaction, and smart home applications. The following examples highlight the importance of built-in tactile sensors. As shown in Figure 5a, Sundaram et al. proposed a low-cost and scalable tactile glove (STAG), which can identify objects, estimate the weight of objects, and recognize hand poses.[148] A STAG employing deep CNNs is leveraged to establish that a uniformly distributed array of sensors placed across the hand can effectively identify distinct objects, estimate their weight, and uncover characteristic tactile patterns that emerge during object manipulation. The sensor array, consisting of 548 sensors, is intricately integrated into a knitted glove; it comprises a piezoresistive film interconnected by a network of conductive thread electrodes. The dataset comprises 135 000 frames, each capturing the entirety of hand interactions while engaging with 26 diverse objects. With the help of AI, this encompassing range of interactions with various objects effectively unveils the crucial correlations spanning different regions of the human hand during manipulation, extrapolating insights from the tactile signatures observed during human grasping through the perspective of an artificial emulation of the innate mechanoreceptor network.
As depicted in Figure 5b, Li et al. successfully developed a flexible quadruple tactile sensor that lets a robot hand perceive grasped objects of different materials and shapes, and further used an MLP containing three hidden layers to realize automatic garbage classification.[149] The tactile sensor features a construction comprising two sensing layers enclosing a central layer of porous silver nanoparticle-infused PDMS. Each sensing layer is composed of two sensing elements. The upper and lower layers of the sensor are responsive to the thermal conductivity of the contact object and the applied pressure, respectively. This response is grounded in the disparity of thermal conductivity among various materials and the alteration in thermal conductivity within the porous material due to deformation. Concurrently, the cold films within the sensor function as local temperature detectors, registering both object and ambient temperatures. The developed tactile sensor is adept at simultaneously detecting multiple stimuli without encountering significant cross-coupling errors. This capability provides richer object-related features, leading to improved accuracy in object recognition during the machine-learning process. This innovation holds the potential to considerably alleviate the challenges associated with environmental conservation and sustainable development within smart homes, demonstrating its practicality in lessening the burdens faced by individuals in these contexts.

Figure 5. Cloud AI-enabled wearable electronics. Advanced AI-enhanced wearable glove sensors. a) A scalable tactile glove (STAG) consisting of a sensing sleeve with 548 piezoresistive sensors. Reproduced with permission.[148] Copyright 2019, Springer Nature. b) A flexible quadruple tactile sensor for robot hand perception of grasped objects. Reproduced with permission.[149] Copyright 2020, American Association for the Advancement of Science. c) A TENG strain sensor based on a unique yarn structure for the smart glove application. Reproduced with permission.[150] Copyright 2020, Springer Nature. d) A low-cost, self-powered, and intuitive glove-based HMI combining superhydrophobic triboelectric textile sensors. Reproduced with permission.[151]
Compared with traditional and dominant resistive and capacitive sensors, piezoelectric and triboelectric sensors can produce a self-generated voltage upon mechanical deformation, eliminating the need for external power supplies. Zhou et al. proposed a triboelectric nanogenerator (TENG) strain sensor based on a unique yarn structure, as illustrated in Figure 5c.[150] The core of the sensing unit is composed of a conductive yarn coiled around a rubber microfiber, with the entire body sheathed by a PDMS sleeve. Varying degrees of deformation result in a constant and continuous change in the contact area between the PDMS sleeve and the coiled conductive yarn, endowing the sensor with good linearity and sensitivity within a large strain range (20-90%). After integrating a wireless printed circuit board for signal collection, processing, and transmission, a wearable sign-to-speech translation system was achieved with a multi-class SVM algorithm, whose overall accuracy remained higher than 98.63% with a fast response time (<1 s), showing a cost-effective approach for assisted communication between signers and nonsigners, as well as the prospect of TENG-based human-machine interfaces (HMIs) in the field of healthcare.
In addition, a low-cost, self-powered, and intuitive glove-based HMI was developed by combining superhydrophobic triboelectric textile sensors with machine learning, as shown in Figure 5d. [151] This innovative design allows complex gesture recognition and control in both real and virtual spaces while minimizing the negative effects of humidity and sweat on performance. A carbon nanotube (CNT)/thermoplastic-elastomer coating method creates superhydrophobic textiles with improved energy-harvesting and human-motion-sensing capabilities. Compared with pristine textiles, this textile recovers more quickly from high-humidity environments, delivers threefold-boosted triboelectric performance, and scavenges biomechanical energy more effectively. The glove-based HMI, enhanced with machine learning, achieves a high recognition accuracy of 96.7%, outperforming non-superhydrophobic systems (92.1%), and maintains 80% voltage output even after an hour of exercise. The glove interface has been successfully applied to various virtual reality (VR)/augmented reality controls, including shooting games, baseball pitching, and flower arrangement.
Moreover, glove-based gesture recognition systems hold great potential for assisting the speech- and hearing-impaired, particularly in sign language recognition. AI-enhanced glove systems can recognize and translate various sign language gestures in real time, facilitating seamless communication. Additionally, these systems can be further refined by training on diverse and extensive datasets, improving their accuracy and enabling a universal platform for recognizing complex gestures in various applications. As shown in Figure 5e, Wen et al. demonstrated a sign language recognition and communication system based on smart glove sensors. [152] The deep-learning algorithm first identifies word elements and subsequently reconstructs the original sentences, achieving accuracies of 82.81% and 85.58%, respectively. Moreover, the segmentation method opens new possibilities for recognizing new or previously unseen sentences: recognized word units can be rearranged to form new sentences, while the deep-learning model recognizes all basic word elements in the new sentence and provides a reasonable translation. In this manner, sentences not included in the training dataset can be recognized. Lastly, the recognition results are projected into virtual space, where the signer can communicate in familiar sign language while nonsigners type directly in their VR interface. This ability to recognize both existing and new sentences enhances the practicality of sign language recognition systems, paving the way to reduce communication barriers between signers and nonsigners. Furthermore, recent tactile sensors have focused on regression analysis to evaluate the impact of single or multiple stimuli on perceptual systems. The key focus is not only capturing tactile information but also
analyzing and interpreting its quantitative effects. Luo et al. proposed a textile-based tactile-learning platform for the regression problem of pose prediction, to record, monitor, and learn human-environment interactions. [153] Predictive models established through regression analysis quantify the relationship between tactile sensor data and stimulation, which is crucial for optimizing sensory systems and delivering more accurate feedback. These studies span healthcare, VR, robotics, and other fields, providing strong support for improving human-computer interaction and enhancing user experience. This trend brings new opportunities and challenges for the further development and application of tactile sensor technology.
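The regression step described above can be sketched with a closed-form ridge fit mapping tactile frames to continuous pose parameters; the 32-taxel array, 3-DoF pose, and linear ground truth below are illustrative assumptions, not the platform's actual model.

```python
import numpy as np

def fit_ridge(X, Y, lam=1e-2):
    """Closed-form ridge regression: W = (X^T X + lam*I)^(-1) X^T Y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

rng = np.random.default_rng(1)
X = rng.standard_normal((500, 32))        # 500 frames of a 32-taxel pressure map
W_true = rng.standard_normal((32, 3))     # hidden linear map to a 3-DoF pose
Y = X @ W_true + 0.01 * rng.standard_normal((500, 3))
W = fit_ridge(X, Y)
err = np.abs(X @ W - Y).mean()            # mean absolute residual
```

Real tactile-to-pose maps are nonlinear, so published systems typically replace this linear fit with a neural network, but the regression objective (continuous targets rather than class labels) is the same.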

Cloud AI Sensors Toward Multimodality and Artificial Intelligence of Things
The current cloud AI sensors are gradually developing toward multimodality, whose diverse advantages are pivotal in a variety of applications. [154,155] Combining different sources of information can enhance system reliability by providing backup alternatives in case of failures. Through the amalgamation of multiple modalities, these systems can adapt to changing conditions while maintaining performance, which is particularly critical in dynamic environments where external factors vary significantly. These advantages make smart sensor systems an essential component of fields such as robotics, environmental monitoring, healthcare, and security, where reliable and comprehensive information is vital for effective decision-making, allowing smart homes to make more informed decisions. As depicted in Figure 6a, a bioinspired data fusion architecture was developed for human gesture recognition. [156] This architecture combines visual data with somatosensory data from skin-like stretchable strain sensors. The strain sensors were made from single-walled CNTs, and the learning architecture employed a CNN for visual processing, followed by a sparse neural network for sensor-data fusion and recognition at the feature level. The data fusion approach achieves a remarkable recognition accuracy of 100% and handles noise as well as the under- or overexposure conditions that typically confound image sensors. The approach is stable and robust, with an error rate of only 1.7% under normal illumination and 3.3% in darkness, further attesting to its reliability.
Additionally, a novel method of data fusion from multiple sensors using a hierarchical SVM (HSVM) algorithm is presented in Figure 6b. [157] The approach is validated experimentally with an intelligent learning system that combines radar technology to detect hand and finger movements with a flexible pressure sensor array that gauges the pressure distribution around the wrist. The HSVM architecture is designed to amalgamate diverse data modalities, encompassing differences in sampling rates, data formats, and gesture information sourced from both the pressure sensors and the radar system. On datasets collected from 15 distinct participants, the standalone radar approach achieves an average classification accuracy of 76.7%, while pressure sensors alone yield 69.0%. Upon integrating the pressure sensor outputs with radar data via the proposed HSVM algorithm, the classification accuracy rises markedly to 92.5%.

Figure 6. Cloud AI sensors toward multimodality and artificial intelligence of things (AIoT). a) Bioinspired data fusion architecture integrating visual data with somatosensory data from skin-like stretchable strain sensors. Reproduced with permission. [156] Copyright 2020, Springer Nature. b) HSVM algorithm for radar and pressure sensors. Reproduced with permission. [157] Copyright 2019, John Wiley and Sons Ltd. c) A mole-inspired olfactory-tactile-associated machine-learning architecture. Reproduced with permission. [158] Copyright 2022, Springer Nature. d) A self-powered piezoelectric AIoT node for smart mining, factory automation, transportation, and smart city applications. Reproduced with permission. [159] Copyright 2022, American Chemical Society. e) A multifunctional walking stick for the care of the elderly. Reproduced with permission. [160] Copyright 2021, American Chemical Society.
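A minimal sketch of the two-stage fusion idea, with least-squares linear classifiers standing in for the SVMs of the reported HSVM and purely synthetic radar/pressure features:

```python
import numpy as np

def lstsq_classifier(X, y, n_classes):
    """Least-squares fit to one-hot targets; returns a class-score function."""
    T = np.eye(n_classes)[y]
    W, *_ = np.linalg.lstsq(X, T, rcond=None)
    return lambda Z: Z @ W                          # raw class scores

rng = np.random.default_rng(2)
n, y = 300, np.repeat(np.arange(3), 100)
# Two noisy modalities that carry complementary views of the same 3 gestures.
radar = np.eye(3)[y][:, :2] + 0.3 * rng.standard_normal((n, 2))
press = np.eye(3)[y][:, 1:] + 0.3 * rng.standard_normal((n, 2))

# Stage 1: one classifier per modality produces class scores.
f_r = lstsq_classifier(radar, y, 3)
f_p = lstsq_classifier(press, y, 3)
# Stage 2: a fusion classifier is trained on the concatenated scores.
fused = np.hstack([f_r(radar), f_p(press)])
f_2 = lstsq_classifier(fused, y, 3)
acc = (np.argmax(f_2(fused), 1) == y).mean()
```

The hierarchy lets each stage-1 model handle its own sampling rate and data format, while the stage-2 model only sees fixed-length score vectors, which is the structural point of the HSVM design.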
Furthermore, a bioinspired olfactory-tactile (BOT)-associated machine-learning architecture was proposed to process multimodal data and achieve object recognition (Figure 6c). [158] This architecture incorporates a CNN, a multilayer neural network, and a decision neural network. The CNN extracts a 512-dimensional feature vector from pressure-related data, while the multilayer neural network produces a 100-dimensional feature vector from olfactory data. The decision neural network merges these two feature vectors into a unified 612D feature vector, which then undergoes learning to achieve precise object recognition. These endeavors validate the effectiveness of amalgamating data from multiple sensors and employing machine-learning algorithms to create a robust learning system that demonstrates remarkable accuracy and adaptability, suitable for intricate tasks such as high-precision object recognition and decision-making within complicated environments.
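The feature-merging step can be illustrated as follows; the random projections stand in for the trained CNN and multilayer network, so only the dimensions (512 + 100 → 612) follow the text, and the 10 object classes are an assumption.

```python
import numpy as np

rng = np.random.default_rng(3)
P_cnn = rng.standard_normal((64, 512))   # stand-in for the CNN pressure branch
P_mlp = rng.standard_normal((20, 100))   # stand-in for the olfactory MLP branch

pressure_raw = rng.standard_normal(64)   # one pressure reading (illustrative)
odor_raw = rng.standard_normal(20)       # one olfactory reading (illustrative)
f_pressure = pressure_raw @ P_cnn        # 512-D pressure feature vector
f_odor = odor_raw @ P_mlp                # 100-D olfactory feature vector
f_joint = np.concatenate([f_pressure, f_odor])   # unified 612-D vector

W_dec = rng.standard_normal((612, 10))   # decision layer over 10 object classes
scores = f_joint @ W_dec
label = int(np.argmax(scores))
```

Feature-level concatenation like this preserves per-modality information for the decision network, in contrast to the decision-level fusion used by hierarchical schemes.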
Increasing attention is being paid to the energy consumption of cloud AI sensors. Huang et al. introduced a self-powered piezoelectric artificial intelligence of things (AIoT) node called the intelligent cubic-designed piezoelectric node (iCUPE), designed for smart mining, factory automation, transportation, and smart city applications (Figure 6d). [159] The iCUPE has a modular design fashioned as a 3D hexahedron, with replaceable sensing and functional modules on each of its six faces, including a temperature-and-humidity-sensing module, a Bluetooth module, a core data-processing module, and a frequency up-conversion piezoelectric generator (FUC-PEG) module. The FUC-PEG module extends the iCUPE's operational frequency span by integrating a low-frequency PEG (LF-PEG) with a thick-film high-frequency PEG (HF-PEG), converting LF stimulations into HF self-oscillations and ultimately yielding an open-circuit voltage output of 48 V under LF conditions. The iCUPE detects ambient vibration signals without requiring an additional power source, capturing sensing details such as frequency and acceleration. The autonomous triaxial piezoelectric sensor (TPS), coupled with machine-learning techniques, incorporates three perpendicular piezoelectric sensing units based on the LF-PEG technology, achieving high-precision, multifunctional vibration recognition with resolutions of 0.01 g for acceleration, 0.01 Hz for frequency, and 2° for tilting angle. Consequently, the TPS yields an exceptional recognition accuracy ranging from 98% to 100%. Moreover, in response to the increasing global population of over one billion elderly individuals and people with mobility limitations, and recognizing the requirements to cater to
healthcare, the concept of a multifunctional walking stick has been introduced in Figure 6e. [160] This innovative walking stick features two primary functional units: a hybridized unit and a rotational unit. The hybridized unit comprises a top press TENG (P-TENG), a middle EMG, and a bottom rotational TENG (R-TENG), while the rotational unit contains only the EMG component. The P-TENG incorporates two aluminum layers, a nitrile layer, and a silicone rubber layer, generating output voltages that vary with the applied pressure. The bottom aluminum layer is divided into five electrodes, which record the entire process of the walking stick contacting and leaving the ground, including contact point, force, time, and sequence. Using a deep-learning 1D-CNN to analyze the P-TENG output, the walking stick can distinguish five distinct movements (stand up, sit down, walk, climb upstairs, and go downstairs), assess three different statuses, and identify ten separate users. Simultaneously, the R-TENG can detect irregular gait patterns, such as the user falling, through abnormal output signals. A virtual environment mimicking real-life situations has been developed to accurately represent the user's real-time motion status. The output signals of both the P-TENG and R-TENG are collected by a microcontroller unit module and wirelessly transmitted to a computer for analysis. With the deep-learning model, the user's real-time motion status within the home is effortlessly obtained and reflected in the virtual environment. In addition, the smart walking stick detects anomalous gait immediately, enabling a swift call for help when the user falls. The caregiving walking stick monitors only the user's motion status as a critical well-being indicator, addressing the privacy concerns raised by camera-based solutions. Meanwhile, through the linear-to-rotary structure, which converts LF linear motion to high-speed rotation, the two units can efficiently
harvest the ultra-LF motion of mobility-impaired individuals. A maximum average power density of 0.595 mW cm−3 at 1 Hz driving frequency and the ability to charge a 4 mF capacitor to 5 V in 8 s have been successfully demonstrated. The harvested biomechanical energy can power a self-sufficient IoT system featuring GPS location tracking and environmental temperature and humidity sensing, achieving comprehensive monitoring for users. The future development of cloud-AI-enabled sensors holds immense potential for various industries, including IoT, healthcare, and environmental monitoring. Combined with cloud computing, these sensors offer advanced data processing, analysis, and accessibility for real-time analytics, with high scalability, flexibility, efficiency, security, and privacy. In addition, future autonomous systems combining cloud computing with sensors can be applied in various industries, from self-driving cars to smart agriculture.
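The 1D-CNN analysis of P-TENG waveforms can be sketched as a minimal forward pass; the filter bank, layer sizes, and (untrained) random weights are illustrative, with only the five movement classes taken from the text.

```python
import numpy as np

def conv1d(x, kernels):
    """Valid 1D convolution of a signal x with a bank of kernels."""
    k = kernels.shape[1]
    win = np.lib.stride_tricks.sliding_window_view(x, k)   # (L-k+1, k)
    return win @ kernels.T                                  # (L-k+1, n_kernels)

def forward(x, kernels, W_out):
    h = np.maximum(conv1d(x, kernels), 0.0)   # ReLU feature maps
    h = h.max(axis=0)                          # global max pooling over time
    return h @ W_out                           # class logits

rng = np.random.default_rng(4)
signal = rng.standard_normal(500)             # one P-TENG voltage trace
kernels = rng.standard_normal((8, 15))        # 8 temporal filters of width 15
W_out = rng.standard_normal((8, 5))           # 5 movements: stand up, sit down, ...
logits = forward(signal, kernels, W_out)
```

A trained model would learn the kernels and output weights by backpropagation and likely stack several such layers, but the convolve-pool-classify pipeline is the essential structure.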

Edge Computing at Chip Level
Software implementations of brain-inspired computing have been widely employed in various important AI computational tasks. However, the high energy consumption and significant delay caused by data transfer in the traditional von Neumann computing architecture are aggravated by the recent explosive growth of highly data-centric IoT applications. To overcome these limitations, a more radical approach is to design hardware that mimics the basic building blocks of the biological brain to the greatest extent, where highly interconnected elements allow massively parallel processing with learning-updating-memorizing capabilities. [40] Especially for the growing IoT with its large number of sensor nodes, it is highly desirable to develop a novel computation paradigm that integrates computing functions into sensor networks. Many sensors, especially IoT devices, have limited computational resources and memory, making it challenging to perform complex computations locally. To address this limitation, developers can use lightweight algorithms and data-compression techniques to reduce the computational burden on the sensors, while advances in hardware design, such as more energy-efficient processors, can improve processing power. In-sensor computing can itself consume a significant amount of energy, which is a concern for battery-powered devices. A current solution is to use low-power hardware components and power-management strategies to optimize energy consumption; energy-efficient machine-learning models and algorithms can further reduce the processing load and energy usage. [163]
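As one concrete example of the lightweight data-compression tactic mentioned above, a uniform 8-bit quantizer shrinks float32 readings fourfold before transmission at a bounded reconstruction error; the normalized strain range below is an assumed example.

```python
import numpy as np

def quantize_u8(x, lo, hi):
    """Map readings in [lo, hi] to uint8 codes (4x smaller than float32)."""
    codes = np.clip((x - lo) / (hi - lo), 0.0, 1.0) * 255.0
    return np.round(codes).astype(np.uint8)

def dequantize_u8(codes, lo, hi):
    """Recover approximate readings from the uint8 codes."""
    return lo + (codes.astype(np.float32) / 255.0) * (hi - lo)

x = np.linspace(-1.0, 1.0, 1000, dtype=np.float32)   # e.g. normalized strain
q = quantize_u8(x, -1.0, 1.0)                         # payload to transmit
x_hat = dequantize_u8(q, -1.0, 1.0)
max_err = float(np.abs(x - x_hat).max())              # bounded by half a step
```

The worst-case error is half a quantization step, here (hi − lo)/255/2 ≈ 0.004, which a node can trade off against payload size by choosing the bit width.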

Near-Sensor Computing
Integrating intelligent sensors with a dedicated processor to implement a data-centric computing architecture, where tasks including data generation, collection, and computation are performed close to or within the sensory devices, can eliminate data movement and conversion at the sensor/processor interface. Figure 7a schematically illustrates the different computational architectures. In a near-sensor computing architecture, dedicated adjacent processing units enable simple and specialized tasks at sensor endpoints. In contrast to conventional sensors that typically sum/accumulate or calculate a linear function, analogue designs capable of brain-inspired algorithms, such as accelerators for deep neural networks and CNNs, are involved. That is, the analogue signals from the sensor are processed directly, without the analogue-to-digital conversion of a conventional sensing system. More specifically, recent AI-specific computing systems, that is, AI accelerators, are constructed with arrays of parallel computing and storage units.

Fully Connected Layer in ANN
Loosely inspired by biological neural networks, artificial neural networks (ANNs) have been widely adopted as a prevalent near-sensor computing solution, delivering remarkable human-like performance in diverse tasks, notably image and voice recognition. An ANN can be mapped onto multiple crossbar arrays of analogue devices whose resistance (e.g., two-terminal memristors), conductance (e.g., three-terminal transistors), or transmittance (e.g., optical devices), representing the neuron weights, can be effectively tuned to generate the trained readout. Tactile sensing can be processed with a near-sensor computing architecture for recognition or intelligent actuation.
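The crossbar mapping can be made concrete with a toy example: conductances store the weights, Ohm's law gives per-device currents, and Kirchhoff's current law sums them along each column, so a single read-out performs a matrix-vector multiply. All values below are illustrative.

```python
import numpy as np

# Hypothetical 4x3 crossbar: G[i, j] is the conductance (in siemens) of the
# device joining input row i to output column j; ANN weights map onto G.
G = np.array([[1.0, 0.2, 0.5],
              [0.3, 1.1, 0.4],
              [0.7, 0.6, 0.9],
              [0.1, 0.8, 0.2]]) * 1e-6      # microsiemens scale

V = np.array([0.2, 0.1, 0.3, 0.05])        # input voltages applied to the rows

# Ohm's law per device (I = G*V) plus Kirchhoff summation down each column
# realize the matrix-vector multiply I = G^T V in one analogue step.
I = G.T @ V                                # column currents = layer output
```

The analogue array thus computes the fully connected layer in O(1) read operations, which is the efficiency argument behind crossbar-based accelerators.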

Convolution Layer
On the one hand, the convolutional operation in CNNs dominates the power consumption and operation time, so it is well suited to being performed close to the sensor, as it is typically the first neuron layer; this accelerates the computing. On the other hand, for a device array of limited size, the most typical and promising hardware implementation scenario is using the processor as a convolutional kernel (or filter), because the required MVM size is comparatively small and the required number of weighting levels is far lower than in a fully connected layer; this enables a hardware implementation of edge detection. However, a significant challenge lies in the practical convolution of the entire image. To address this, the implementation requires a selector to manage the light input for each pixel and to switch the image pixels during the convolution computation on a per-pixel basis. Although this process ensures that the convolution operation is appropriately performed for all pixels in the image, the storage of sensing information or signal shift during selection may become another issue.
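In software terms, the per-pixel selection scheme amounts to sweeping a small kernel across the image one center pixel at a time; the 3×3 Laplacian below is a common edge-detection kernel, used here as an assumed example rather than the specific filter of any cited work.

```python
import numpy as np

LAPLACIAN = np.array([[0, 1, 0],
                      [1, -4, 1],
                      [0, 1, 0]], float)   # classic edge-detection kernel

def convolve_per_pixel(img, k):
    """Sweep the kernel over the image one center pixel at a time (valid
    region), as a selector would switch pixels into the device array."""
    H, W = img.shape
    out = np.zeros((H - 2, W - 2))
    for i in range(H - 2):
        for j in range(W - 2):
            out[i, j] = np.sum(img[i:i + 3, j:j + 3] * k)
    return out

img = np.zeros((8, 8))
img[:, 4:] = 1.0                           # a vertical step edge
edges = convolve_per_pixel(img, LAPLACIAN)  # nonzero only near the edge
```

Each inner-loop step corresponds to one selector configuration, which makes the cost of per-pixel switching, and the need to buffer intermediate results, explicit.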

Reservoir Layer
Originating from recurrent neural networks, reservoir computing (RC) is well suited for real-time time-series analysis of information generated by dynamic systems, and requires only small training datasets. With a pool of interconnected neurons, the reservoir, RC adjusts only the output weights toward the target signal, giving a simple and fast learning scheme. The reservoir device extracts the temporal dynamics of the input stream and maps them onto a higher-dimensional computational space, where a trained readout function implements the high-level processing. [166]
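A minimal echo-state-network sketch of the RC scheme: the recurrent reservoir weights are random and fixed, and only the linear readout is trained, here by ridge regression on an illustrative delay-recall task.

```python
import numpy as np

rng = np.random.default_rng(5)
N = 50                                        # reservoir size (illustrative)
W_in = rng.uniform(-0.5, 0.5, (N, 1))         # fixed input weights
W = rng.uniform(-0.5, 0.5, (N, N))            # fixed recurrent weights
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))  # spectral radius 0.9 < 1

u = rng.uniform(-1, 1, 400)                   # input stream (e.g. a sensor trace)
x, states = np.zeros(N), []
for t in range(400):
    x = np.tanh(W_in[:, 0] * u[t] + W @ x)    # untrained reservoir dynamics
    states.append(x.copy())
S = np.array(states)[50:]                     # discard the washout period
target = u[49:-1]                             # task: recall the previous input

# Only the linear readout is trained (ridge regression on reservoir states).
lam = 1e-6
W_out = np.linalg.solve(S.T @ S + lam * np.eye(N), S.T @ target)
mse = float(np.mean((S @ W_out - target) ** 2))
```

In a physical RC device, the tanh reservoir above is replaced by the intrinsic nonlinear dynamics of the sensor material, and only the inexpensive linear readout is ever trained.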

Spiking Neural Network
Inspired by biological systems, spiking neural networks (SNNs), in which neurons exchange and transmit information via trains of spikes, have attracted ever-growing interest. Different spiking neuron models with proper synaptic plasticity, that is, learning rules, have been continuously developed in neuromorphic hardware for intelligent applications such as inference, recognition, and event-driven processing. [167] Yet SNNs for in-sensor computing have rarely been reported because of the complexity of implementing spike coding with the device itself.
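The neuron model underlying most SNN hardware can be sketched as a leaky integrate-and-fire unit; the parameters below are illustrative.

```python
import numpy as np

def lif_spikes(current, v_th=1.0, tau=20.0, dt=1.0):
    """Leaky integrate-and-fire: dv/dt = (-v + I)/tau; spike and reset at v_th."""
    v, spikes = 0.0, []
    for I in current:
        v += dt * (-v + I) / tau       # leaky integration of the input current
        if v >= v_th:
            spikes.append(1)           # emit a spike ...
            v = 0.0                    # ... and reset the membrane potential
        else:
            spikes.append(0)
    return np.array(spikes)

strong = lif_spikes(np.full(200, 2.0))   # supra-threshold drive -> spike train
weak = lif_spikes(np.full(200, 0.5))     # sub-threshold drive -> stays silent
```

Because the membrane potential decays toward the input level, a constant drive below threshold never fires, while a stronger drive fires at a rate set by the input, which is the rate-coding principle spike-based sensors exploit.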
The utilization of sensors and computation circuits typically entails diverse materials and manufacturing technologies. Consequently, practical near-sensor computing requires meeting specific requirements to integrate these components seamlessly. [168] The integration of the aforementioned dedicated processors with emerging sensors can be realized through advanced integrated-circuit packaging technologies. These cutting-edge packaging techniques enable the seamless integration of multiple components, ensuring efficient communication and collaboration among them and fostering the development of complex and sophisticated systems. In addition, flexible or wearable synaptic devices are envisaged to open an avenue toward new integration schemes for fully wearable intelligent systems. [169,170] In the in-sensor computing architecture, individual self-adaptive sensors or multiple connected sensors can be specifically engineered to combine the sensory information by arranging the units into a square array of multiple pixels. For example, currents from two-terminal electrical units or photodetectors are summed along a row or column naturally, according to Kirchhoff's law. Figure 7c shows a schematic of the sensor array, where the dimension of the input stimuli is n and the stimuli contain m classes; that is, the dimensions of the input and output layers are n and m, respectively.
The discrepancy between the currents produced by the array and the "inference" currents can be analyzed and programmed off-chip and then updated on-chip to adjust the neuron weights. Once trained, the chip can execute edge computing with large data samples, which further eliminates the sensor/processor interface and combines the sensing and computing functions. In general, a sensor network can be a collection of sensors that measure the same external stimuli simultaneously, or it can detect multimodal sensing information, which involves physical coupling between stimuli. It benefits from improved footprint, time delay, and energy efficiency through direct processing of the raw analogue data at the sensor endpoint, and is hence expected to be a promising approach for real-time and data-intensive applications.
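The off-chip analyze / on-chip update loop can be sketched as a normalized delta rule: the discrepancy between the array currents and the target "inference" currents is computed off-chip and written back as conductance increments. Array size, voltage range, and learning rate below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(6)
n_in, n_out = 8, 3
G = rng.uniform(0.0, 1e-6, (n_in, n_out))        # on-chip conductance weights
G_target = rng.uniform(0.0, 1e-6, (n_in, n_out)) # weights training should reach

lr = 0.5
for _ in range(2000):
    v = rng.uniform(0.0, 0.3, n_in)      # stimulus voltages on the input rows
    i_meas = v @ G                       # currents read out from the array
    i_ref = v @ G_target                 # desired "inference" currents
    # Off-chip: compute the discrepancy; on-chip: apply conductance increments.
    G += lr * np.outer(v, i_ref - i_meas) / (v @ v)

err = float(np.abs(G - G_target).max())  # remaining weight error after training
```

On real hardware each increment is a programming pulse with finite resolution and nonlinearity, so the clean convergence of this sketch is an idealization; the division by the input energy (normalized LMS) keeps the update stable across stimulus magnitudes.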

In-Sensor/Near-Sensor-Processing Applications
In recent years, 2D materials have gained much attention as an emerging material platform for demonstrating various neuromorphic and in-sensor computing schemes. By skipping the traditional von Neumann architecture, where massive amounts of data are transferred between the CPU and memory unit, data-processing speed and efficiency can be greatly improved. [173] One area with huge potential is using the optoelectronic properties of 2D materials for neuromorphic vision functions. Similar to human eyes, which preprocess the perceived light information through bipolar neurons and ganglion cells before sending signals to the brain, 2D-material-based heterojunctions with gate tunability can mimic retinal functions and reduce the back-end data-handling load through in-sensor processing. [174,175] Mennel et al. reported a neural network vision sensor based on a reconfigurable 2D material photodiode array (Figure 8a). [176]

Figure 8. a) Neural network vision sensor based on a reconfigurable 2D material photodiode array. Reproduced with permission. [176] Copyright 2020, Springer Nature. b) Gate-tunable van der Waals heterostructure for a reconfigurable neural network vision sensor. Reproduced with permission. [177] Copyright 2020, American Association for the Advancement of Science. c) In-sensor optoelectronic computing using electrostatically doped silicon. Reproduced with permission. [178] Copyright 2020, Springer Nature. d) Broadband convolutional processing using band-alignment-tunable heterostructures. Reproduced with permission. [179] Copyright 2020, Springer Nature. e) A 2D mid-infrared (mid-IR) optoelectronic retina enabling simultaneous perception and encoding. Reproduced with permission. [180] Copyright 2020, Springer Nature.
In the gate-tunable vdW heterostructure vision sensor (Figure 8b), [177] a small gate voltage sets the responsivity of each pixel, which can be made negative; the resulting positive and negative photocurrents are summed via Kirchhoff's law, performing the in-sensor multiply-and-accumulate functions. Three different operations using inverse, difference-of-Gaussian, and Laplacian filters were performed for image enhancement. In addition, the 3 × 3 reconfigurable vision sensor array has been used for letter recognition with the help of off-chip activation and backpropagation. Jang et al. demonstrated in-sensor optoelectronic computing based on a pure silicon platform (Figure 8c). [178] Highly doped silicon photodiodes are patterned with dual gates that electrostatically tune the responsivity, providing the in-sensor-processing capability. The whole design is compatible with wafer-scale complementary metal-oxide-semiconductor (CMOS) fabrication, which brings it closer to real-world applications. As a demonstration, a 3 × 3 network of the photodiodes was used for in-sensor image processing with seven different convolutional filters. Broadband convolutional processing (BCP) using 2D vdW heterostructures is another developing trend, as it covers multiple spectral bands from the UV to the IR regime, enabling key spectral and spatial features in remote sensing. To achieve BCP, Pi et al. developed gate-tunable vdW heterostructures using PdSe2 and MoTe2 to perform in-sensor convolutional processing (Figure 8d).
[179] Compared with other works, they achieved multiband in-sensor convolutional processing instead of single-band solutions, and in each band the kernel functions of sharpness and edge enhancement were independently demonstrated. Neuromorphic vision sensors based on 2D materials work not only in the visible range but also in the IR regime, which greatly extends their operational capability, as IR imaging holds potential for various applications including LiDAR, sensing, and communication. Moreover, Wang et al. developed an IR machine vision system that works in the all-optical regime (Figure 8e). [180] The 2D b-AsP/MoTe2 vdW heterostructure can simultaneously perceive and encode data on a single device using two wavelengths, near-infrared (730 nm) and mid-infrared (4.6 μm). As a result, an inference accuracy of more than 96% on the MIR MNIST dataset encoded by the device was achieved. In addition to wavelength multiplexing, the number of convolution layers can be extended for high-order in-sensor computing. Wang et al. recently proposed a 3D neuromorphic photosensor array for nonvolatile in-sensor visual processing using vertical graphite/CuInP2S6/graphite photosensor units. As shown in Figure 8f, three layers of the photosensor array (3 × 3 × 3) are stacked on top of each other, representing three kernels that improve the time and area efficiency of image processing.
[181] The nonvolatile in-sensor computation was enabled by directional Cu+ ion migration under voltage-pulse programming. Near-sensor vision processing that integrates the sensing device and the synaptic device can also reduce the data load on the back-end neural network processing. As demonstrated by Seo et al., h-BN/WSe2 vdW heterostructures serve as both sensing and synaptic devices that handle optical signals in RGB channels (633, 532, 405 nm), with the weight controlled via an additional weight-control layer based on an electron-trapping/detrapping mechanism (Figure 8g). [182] Synaptic plasticity, postsynaptic current, and long-term potentiation/depression were investigated to develop an optical neural network (ONN) that emulated the colored and color-mixed pattern recognition capability of the human vision system, and over 90% accuracy was achieved in the color-pattern recognition task. In short, neuromorphic in-sensor computing has changed the image-data-processing paradigm by integrating processing capability at the sensor end, solving the data-transfer bottleneck through weighted pixels acting as neurons. Depending on the application, a wide range of operating wavelengths from UV to IR has been explored on different 2D material platforms.

In-Memory Computing and Neuromorphic Applications
The aforementioned works have shown that in-sensor vision processing using 2D-material-based vdW heterostructures has huge potential for fast data preprocessing with a compact device footprint, with the neural network weight applied either through electrical bias or optical excitation to continuously perform the multiplication functions. However, to make it more energy efficient, nonvolatile memory is a crucial component for fully exploiting the neuromorphic computing capabilities. In recent years, resistive random-access memory (RRAM), one of the common technologies, has been integrated with optoelectronic devices to realize neuromorphic in-memory computing. [183] Zhou et al. first demonstrated an integrated optoelectronic RRAM synaptic device for neuromorphic visual processing, featuring both a nonvolatile optical resistive switch and light-tunable synaptic behaviors (Figure 9a). [184] The simple two-terminal Pd/MoOx/ITO stack performs ultraviolet (UV) light sensing and optically triggered resistance switching, enabling image memorization and real-time preprocessing functions such as contrast enhancement and noise reduction. For purely electronic in-memory computing, the system usually requires at least one transistor and one resistor (1T1R), so that the transistor isolates the electrical current to selected cells. Lee et al. reported a back-end-of-line (BEOL)-compatible all-oxide memristive crossbar array that performs morphological image processing for defect identification (Figure 9b).
[185] The HfO2-based memristor is integrated with an indium-rich indium zinc oxide (IZO) thin-film transistor (TFT) to form a 1T1R pixel of the crossbar array. Together with a morphological image-processing algorithm, the defect-identification task can be done with 10^4 times higher energy efficiency than traditional CPU-based solutions. In addition to the 1T1R structure, the 1PT1R structure incorporating one phototransistor and one memristor brings direct light sensing into the implementation of an optic ANN (OANN). Dang et al. demonstrated a zinc oxide (ZnO) phototransistor and a Mo/SiO2/W nonvolatile memristor for an image recognition task in an OANN (Figure 9c). [186] The 16 × 3 device array, with highly linear weight updates and uniform multilevel conductance states, achieved a recognition accuracy of 99.3% after online training of only ten epochs. An even higher level of integration with on-chip light sources has been proposed for noise reduction in UV image processing. Seung et al. added quantum-dot light-emitting diodes (QLEDs) to UV-responsive synaptic phototransistors for visualization and recognition (Figure 9d). [187] The integration was inspired by the all-or-none potentiation of the human synapse: the on-chip QLEDs with threshold switching enable nonlinear filtering of the preprocessed signal, amplifying the signal output with reduced background noise. While various works have reported in-sensor computing systems, the capability of processing stored images directly within the sensor has been lacking. To address this gap, Lee et al. demonstrated heterogeneously integrated 16 × 16 one-photodiode-one-memristor crossbar arrays for in-sensor image preprocessing using InGaAs photodetectors and HfO2-based RRAM (Figure 9e).
[188] The major difference in their approach is that the image is first stored in the crossbar array and the trained weight values are then applied as input voltages, reducing the need for sensory data transport. After the encoded images were conveyed to an off-chip ANN for classification, an accuracy of 82% was achieved with 100 training epochs. 2D materials have also been explored with nonvolatile memory integration for neuromorphic computing. Lee et al. reported a black phosphorus (bP) phototransistor array with dual programmability (both electrical and optical), long charge-retention time, and a high 5-bit memory resolution (Figure 9f). [189] A stack of Al2O3/HfO2/Al2O3 as the gate dielectric and charge-storage layer provides the nonvolatile memory function through a charge-trapping mechanism. With a multispectral image input covering the S, C, and L bands, in-sensor computing for edge detection was demonstrated, and the device array was also used as an optoelectronic CNN for image recognition, with a binary image classification accuracy of 92%. Most recently, Fu et al. developed a simple two-terminal graphene/MoS2−xOx/graphene photomemristor with tunable nonvolatile responsivities and demonstrated computationally complete logic with photoresponse-stateful operations (Figure 9g).
[190] The whole device can be used both as a logic gate and as a memory unit, and the nonvolatile photoresponse, rather than physical state variables of light, voltage, and conductance, can be jointly controlled through electric-field-driven ion migration and photoinduced redox reactions, which expands the functional diversity of edge-side neural networks. This work proposes a new way of implementing on-chip neuromorphic computing that can lead to both versatility and high-density integration. Compared with devices without memory functions, in-memory computing generally requires more complicated device structures, such as 1T1R or one-photomemristor one-resistor (1P1R), to allow either the weights or the image to be stored in the memristor; however, the energy consumption can be improved, and data manipulation becomes more flexible, as delayed processing is possible.
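The store-then-read scheme described above (image stored as memristor conductances, trained weights applied as read voltages) amounts to an analog vector-matrix multiply via Ohm's and Kirchhoff's laws. A minimal numerical sketch, with all values and array shapes chosen purely for illustration:

```python
# Sketch of in-memory crossbar readout: the sensed image is stored as
# memristor conductances G (siemens), and the trained weights are
# applied as row voltages V (volts). Column currents I = V·G then equal
# the weight-image dot products (Kirchhoff current summation), so the
# raw image never has to leave the array. Values are illustrative.

def crossbar_readout(G, V):
    """Return column currents of a crossbar with conductance matrix G
    (rows x cols) driven by row voltages V (length rows)."""
    rows, cols = len(G), len(G[0])
    return [sum(V[r] * G[r][c] for r in range(rows)) for c in range(cols)]

# Toy 3x2 crossbar: two stored "image" columns, one weight vector.
G = [[1e-6, 2e-6],
     [3e-6, 0.0],
     [1e-6, 1e-6]]          # conductances encoding pixel values
V = [0.5, -0.2, 0.1]        # trained weights encoded as read voltages
I = crossbar_readout(G, V)  # analog dot products, one per column
```

Each output current is then digitized once per column, rather than once per pixel, which is the data-movement saving the 1P1R approach targets.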

From Artificial Sensory Perception to Sensing-Computing Systems
In addition to the codesign of sensor units and computing networks discussed in Section 3.2, sensors with inherent computing capability, which can execute signal conversion or information processing at the sensor node, can reduce data transfer and simplify the system structure. In this strategy, the sensor output is not linearly dependent on the input stimuli as in conventional sensors, but represents temporal, spatial, or relational information. Intelligent matter based on a single device with functionalities inspired by novel AI concepts has been reported for low-level in-sensor computing. [47,173] Although the original intention behind certain bioinspired devices was to emulate the behavior of sensory neurons, this objective aligns remarkably well with the prevailing trend of neuromorphic sensing and computing.

Multimodal Sensing-Computing Device
The sense of touch plays a fundamental role in human perception and is the most important part of building artificial sensory perception. Tactile sensing, including force/pressure, temperature, and proximity (humidity) sensing, has long been established and exploited in health monitoring, human-machine interfaces, robotic control, and smart homes. Within the realm of artificial tactile sensory systems, the pressure sensor functions as a sensor cell, responsively detecting external stimuli. The propagation of electrical pulses within the device closely resembles the information transmission process in axons, while the postsynaptic current undergoes intricate processing. [191] As shown in Figure 10a, Kim et al. developed an artificial afferent nerve based on organic devices, which converts pressure information from clusters of pressure sensors into action potentials through integrated ring oscillators and further integrates the action potentials at a synaptic transistor. [192] As a proof of concept, it can identify braille characters pressed on an array of three pixels. A stretchable synaptic transistor based on elastomeric electronic materials has been implemented with mechanoreceptors in an array format to form a deformable sensory skin, illustrated in Figure 10b. [193] As the synaptic transistor exhibits filtering behavior for multiple input pulses, a soft neurorobot was demonstrated with the ability to perform adaptive locomotion in a programmable manner upon physical tapping of the top skin. As one of the pioneering works combining synaptic behavior with intelligent robotics, it suggests a promising direction for the development of bioinspired systems: not only mimicking biological behavior but also developing inspiring functionalities for sensors toward engineering systems. Liu et al.
leveraged a flexible multi-gate electrolyte-gated transistor, exhibiting inherent nonlinearity and short-term dynamics enabled by ion movement, to implement nonlinear parallel integration of time-series signals (Figure 10c). [194] The device can execute multichannel signal integration and temporal-feature extraction, such as correlations, at the sensor node, which reduces neural-network analysis and computational costs.
Pain perception is an important function of the sensory system that prevents potential or actual harm. A nociceptor is a specialized sensory receptor primarily responsible for detecting noxious stimuli and promptly alerting the central nervous system to initiate motor responses. In the biological system, when the intensity of a noxious stimulus surpasses the nociceptor's threshold, the firing rate of the nociceptor increases proportionally with the rising stimulus intensity. This heightened firing response indicates the severity of the noxious stimulus, a behavior that can be realized through a diffusive memristor. Yoon built an artificial sensory system based on a diffusive memristor, with the ability to raise an alarm when an external stimulus surpasses a predetermined threshold value, shown in Figure 10d. [195]

Figure 10. From artificial sensory perception to sensing-computing systems. Tactile sensing-computing devices, including a) an artificial afferent nerve system. Reproduced with permission. [167] Copyright 2018, American Association for the Advancement of Science. b) A stretchable synaptic transistor for performing adaptive locomotion. Reproduced with permission. [193] Copyright 2023, American Association for the Advancement of Science. c) A flexible synaptic transistor for nonlinear parallel integration of time-series signals. Reproduced with permission. [194] Copyright 2023, John Wiley and Sons. d) An artificial nociceptor based on a diffusive memristor. Reproduced with permission. [195] Copyright 2018, Springer Nature. e) An olfaction sensing-computing device to simulate alcohol-inhibited human brain-nerve behavior. Reproduced with permission. [198] Copyright 2021, Royal Society of Chemistry. f) A vision sensing-computing device to mimic the locust nervous system. Reproduced with permission. [200] Copyright 2021, Springer Nature. g) A multimodal sensing-computing device combining visual and haptic receptors. Reproduced with permission. [203] Copyright 2020, Springer Nature.
Olfaction represents a crucial biological function in organisms, serving to discern diverse odors, detect hazardous gases, and evade toxic environments. The capability to identify gas or liquid compositions through smell aids in assessing their safety for human exposure. For olfactory sensing, it has been pointed out that a crucial step involves canceling the DC baseline of a chemosensory array, which often exhibits significant variation among different types of sensors. To address this challenge, olfactory chips have been created using planar system-on-chip integration. [196] These chips feature olfactory sensors linked to adaptive circuits specifically designed for baseline cancellation. The incorporation of adaptive elements within the circuits empowers the sensors to self-adapt effectively within the circuit's working range, making them highly responsive to the different odors encountered during operation.
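The baseline-cancellation step above can be approximated in software by tracking the slow baseline with an exponential moving average and outputting only the fast deviation. This is a generic signal-processing sketch, not the cited adaptive-circuit design; the adaptation rate and sample values are illustrative:

```python
# Adaptive DC-baseline cancellation for one chemosensor channel:
# a slow exponential-moving-average baseline is subtracted from each
# sample, so only the fast odor response survives. Illustrative only.

def cancel_baseline(samples, alpha=0.05):
    """Return the baseline-subtracted signal; alpha sets how quickly
    the baseline tracks slow drift."""
    baseline = samples[0]
    out = []
    for s in samples:
        out.append(s - baseline)
        baseline += alpha * (s - baseline)   # slow drift tracking
    return out

# A slowly drifting sensor with a brief odor pulse on top.
raw = [10.0, 10.1, 10.2, 14.0, 14.1, 10.4, 10.5]
resp = cancel_baseline(raw)                  # pulse stands out; drift ~0
```

The odor pulse dominates the output while the drifting baseline is suppressed toward zero, mirroring what the on-chip adaptive elements accomplish in hardware.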
From the perspective of biomimicry, device engineering aims to develop intelligent matter that comprehends environmental variations and enables reconfigurable responses to external stimuli. For instance, the integration of gas detectors with artificial synaptic devices facilitates the emulation of olfactory perception, enabling the realization of olfactory bionics. Ban et al. integrated a selective gas sensor for VOCs with a memristive device. [197] Olfactory memory functionality is defined by switching the memristive device from the high-resistance state to the low-resistance state (LRS) and retaining it at the LRS after removing the gas stimulus; the memory device responds only to gas above a certain threshold concentration. As shown in Figure 10e, Hang et al. employed a gas-mediated covalent organic framework RRAM to simulate alcohol-inhibited human brain-nerve behavior, by observing the conductance of the device in alcohol gas environments. [198] Notably, the inhibition effect gradually increases with increasing methanol concentration. Although the idea and its computing demonstrations remain at a preliminary stage, this work paves the way for simulating the perception of smell through the integration of gas detectors with synaptic devices.
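The olfactory-memory behavior just described reduces to a two-state latch: the device sets to the low-resistance state only when the gas concentration crosses a threshold, and retains that state after the gas is removed. A minimal sketch, with resistance values and threshold chosen for illustration only:

```python
# Toy olfactory-memory memristor: latches from the high-resistance
# state (HRS) to the low-resistance state (LRS) once gas concentration
# crosses a threshold, and retains the LRS after the gas is removed.
# Resistance values and threshold are illustrative, not measured.

HRS, LRS = 1e6, 1e3      # ohms

def olfactory_memory(concentrations, threshold=50.0):
    """Return the resistance trace for a gas-concentration sequence."""
    state = HRS
    trace = []
    for c in concentrations:
        if c >= threshold:
            state = LRS   # nonvolatile set: no reset on gas removal
        trace.append(state)
    return trace

# Gas pulse (60, 80 ppm) above the 50 ppm threshold, then removed.
trace = olfactory_memory([0, 10, 60, 80, 0, 0])
```

The final entries stay at the LRS even though the gas is gone — the retained state is the "memory" of the odor exposure.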
Among various sensors, image sensors are among the most well-developed and CMOS-compatible devices at large scale. In addition to electronic-signal-based systems, an optoelectronic artificial efferent nerve, in which signal transmission between layers remains in the optical domain through a light emitter, receiver, and memristor, has been constructed to control manipulators intelligently. [199] Some researchers are working on in-sensor visual adaptation based on emerging bioinspired vision sensors. As shown in Figure 10f, Han et al. combined memristive switching characteristics with a carefully designed wide field-of-view artificial vision neuron device to mimic the lobula giant movement detector, the wide-field movement-sensitive neuron located in the lobula layer of the locust nervous system. [200] The fabricated memristor is strategically connected in parallel with a capacitor and in series with a resistor: the capacitor charges during the neural refractory period until the memristor's threshold is reached, inducing neuron firing.
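The memristor-capacitor-resistor arrangement just described behaves like a leaky integrate-and-fire neuron: the photocurrent charges the capacitor against an RC leak until the memristor's threshold fires and resets the node. A discrete-time sketch with purely illustrative parameters:

```python
# Leaky integrate-and-fire sketch of the memristor/RC vision neuron:
# input current charges the capacitor voltage; crossing the memristor
# threshold fires a spike and resets the node. Parameters illustrative.

def lif_neuron(currents, dt=1e-3, C=1e-6, R=1e4, v_th=1.0):
    """Return a boolean spike train for an input current sequence (A)."""
    v, spikes = 0.0, []
    for i in currents:
        v += dt * (i - v / R) / C        # RC charging with leak
        if v >= v_th:                    # memristor threshold switching
            spikes.append(True)
            v = 0.0                      # reset after firing
        else:
            spikes.append(False)
    return spikes

# A looming-like stimulus: current ramps up as the object approaches.
ramp = [k * 1e-4 for k in range(1, 21)]
spk = lif_neuron(ramp)                   # firing rate grows with input
```

The spike rate increases as the ramp grows, which is exactly the intensity-to-rate coding that lets the LGMD-style neuron signal an approaching object.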
Similar to near-sensor computing for vision sensing, auditory sensors have been developed to emulate biological hearing, enabling sound/voice recognition. By developing neuromorphic fibers to build a dendritic neural network, Kim et al. proposed an approach to a simulated neuromorphic computing device with enhanced learning capability that is compatible with speech recognition. [201] In addition, multisensory integration by the nervous system enhances performance in a number of perceptual and behavioral domains, such as object identification, spatial and temporal perception, recognition, and recall. Presently, research efforts predominantly concentrate on singular sensory-processing mechanisms. However, considering the remarkable capacity of the human perception system to concurrently sense diverse external stimuli within complex environments, the development of multimodal tactile sensing systems becomes crucial. Such systems should be capable of simultaneously processing multiple types of stimuli across various modalities. [202] One approach toward multimodal tactile sensing employs a single tactile sensor designed to respond to multiple stimuli. As shown in Figure 10g, to implement an accurate depiction of the environment based on multiple sensory cues, Chen's group enhanced the recognition capabilities of an artificial sensory neuron by fusing visual and haptic receptors, deploying a resistive pressure sensor, a perovskite-based photodetector, a hydrogel-based ionic cable, and a synaptic transistor.
[203] Moving toward intelligent edge computing, the design of a multimodal sensing system goes beyond mere sensing and integration of external stimuli; it also incorporates the crucial capacity for learning. The plasticity of single synaptic devices provides the basis for the learning ability of multimodal neural networks. The learning capability in a multimodal sensing system facilitates the rapid and reliable combination of signals from different sensory organs and their sensing fields. Indeed, achieving the hardware implementation of multimodal integration, from the underlying ANN algorithm to a single sensor, involves a considerable amount of work. It necessitates a comprehensive approach encompassing several stages of development, including hardware design, sensor integration, and algorithm optimization.
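A hypothetical software analogue of such visual-haptic fusion: two receptor channels are combined through per-channel synaptic weights, with a simple delta-rule update standing in for the device-level plasticity that provides the learning capability. The data, targets, and learning rule are all illustrative, not the cited implementation:

```python
# Hypothetical bimodal sensory neuron: visual and haptic channels are
# fused through synaptic weights, trained with a delta rule as a
# stand-in for device-level plasticity. Data and rule are illustrative.

def fuse(x, w):
    """Weighted sum of the (visual, haptic) cues at the synapse."""
    return sum(xi * wi for xi, wi in zip(x, w))

def train(samples, targets, w, lr=0.1, epochs=50):
    """Delta-rule training: nudge each weight toward lower error."""
    for _ in range(epochs):
        for x, t in zip(samples, targets):
            err = t - fuse(x, w)
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
    return w

# (visual, haptic) cues; target 1.0 = both cues agree on an object,
# 0.5 = only one modality responds, 0.0 = nothing present.
samples = [(1.0, 1.0), (1.0, 0.0), (0.0, 1.0), (0.0, 0.0)]
targets = [1.0, 0.5, 0.5, 0.0]
w = train(samples, targets, [0.0, 0.0])   # converges near (0.5, 0.5)
```

The learned weights split the evidence evenly between the two modalities, illustrating how plastic synapses let the fused response outperform either channel alone.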
To date, bioinspired sensing-computing systems consisting of artificial synapses and neurons have been proposed to implement synaptic functions, because they enable efficient parallel information processing at ultralow power consumption; most reported work has focused on the resemblance of these systems to biological behaviors. Although the complexity and practicality of the neurological functions enabled in these demonstrated systems are still far from those of biological neurons and synapses, they are a first step toward future sophisticated and intelligent networks. In addition, many challenges remain at the system level. For instance, the major obstacle for artificial neurons with sensors is the cascading issue in complicated neural networks with several layers. To construct novel intelligent systems, more investigation is necessary to realize spiking-time-dependent plasticity. Yet, at least for single-layer architectures, constructing a synaptic device suggests a broad range of future-oriented applications beyond soft machines, ranging from real-time pattern recognition to neuroprosthetics.
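The spiking-time-dependent plasticity mentioned above follows the classic pair-based rule: a presynaptic spike shortly before a postsynaptic one potentiates the weight, the reverse order depresses it, and both effects decay exponentially with the timing gap. A standard-form sketch, with amplitudes and time constants chosen for illustration:

```python
import math

# Pair-based STDP rule: the weight change depends on relative timing
# dt = t_post - t_pre. Pre-before-post (dt > 0) potentiates; post-
# before-pre (dt < 0) depresses; both decay exponentially with |dt|.
# Amplitudes and time constants below are illustrative.

def stdp(dt, a_plus=0.1, a_minus=0.12, tau_plus=0.02, tau_minus=0.02):
    """Return the weight change for one pre/post spike pair (dt in s)."""
    if dt > 0:
        return a_plus * math.exp(-dt / tau_plus)     # potentiation
    elif dt < 0:
        return -a_minus * math.exp(dt / tau_minus)   # depression
    return 0.0

dw_pot = stdp(0.01)    # pre fires 10 ms before post -> strengthen
dw_dep = stdp(-0.01)   # post fires 10 ms before pre -> weaken
```

Realizing this timing-asymmetric update directly in device conductance, rather than in software, is precisely the system-level capability the paragraph above identifies as still needing investigation.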

Other Promising Solutions: Possible New Technologies and Developing Trends for In-Sensor Computing
Artificial sensory perception is important for in-sensor computing. Recent developments in neuromorphic in-sensor and in-memory computing are not limited to crossbar arrays, 2D vdW heterostructures, or the various phototransistors and photomemristors. Without being so broad as to cover every aspect of neuromorphic computing, here we briefly mention a few promising solutions and developing trends beyond the majority of the currently reported material platforms. One aspect is the novel material platform. Ding et al. have provided a summary of porphyrin-based metal-organic frameworks for neuromorphic electronics (Figure 11a). [204] As MOFs have been widely used for optical-sensing applications, the idea of incorporating memory and computation capability is plausible and feasible with certain materials that possess extraordinary optoelectronic characteristics. Nonvolatile thin-film porphyrin-based metal-organic framework (PP-MOF) memristors that utilize charge trapping/detrapping have been widely reported, with on/off ratios as large as 10⁷ and retention times over 10⁵ s, and PP-MOF-based synaptic devices have been demonstrated with both all-optical and electronic stimuli. While there is not yet an actual demonstration of integrated devices performing in-sensor computing, and the device performance still needs to be improved, this direction is worth further investigation owing to versatile and promising characteristics such as low-dimensional nanostructures, high solution stability, and enhanced optoelectronic properties. Another area that has gained huge attention in both academia and industry is all-optical neuromorphic computing, in particular waveguide-based neural-network chips. Unlike the abovementioned works, which involve optical-to-electronic conversion at the sensor side, all-optical solutions perform the sensing and computation in the optical domain, which can further increase the processing bandwidth and operational efficiency. Typical configurations of waveguide-based all-optical neuromorphic computing include coherent optical computing of the interference type and the incoherent broadcast-and-weight (B&W) scheme of the coupler type. [205,206] Ashtiani et al. have presented an end-to-end on-chip photonic deep neural network with integrated image sensing and classification (Figure 11b). [207] The input light signal carrying the image information is weighted through on-chip electronically controlled PIN attenuators, and on-chip SiGe photodetectors perform the summation operation and send the results to a microring-resonator-based modulator for nonlinear activation. The whole ONN chip consists of three layers, and a proof-of-concept demonstration of two-class and four-class classification of handwritten letters yields accuracies of 93.8% and 89.8%, respectively, within 570 ps, comparable to state-of-the-art electronic platforms. More recently, phase-change-material-based all-optical neuromorphic computing has seen progress as well. Dong et al. have successfully demonstrated an on-chip in-memory photonic dot-product engine with electrically programmable weight banks using GST material.

Figure 11. Possible new material platforms and developing areas for neuromorphic in-sensor computing. a) Porphyrin-based metal-organic frameworks for neuromorphic electronics. Reproduced with permission. [204] Copyright 2023, John Wiley and Sons. b) An on-chip photonic deep neural network for image classification. Reproduced with permission. [207] Copyright 2022, Springer Nature. c) Wafer-scale solution-processed 2D material analog resistive memory array for memory-based computing. Reproduced with permission. [209] Copyright 2022, Springer Nature. d) A multiply-add engine with monolithically integrated 3D memristor crossbar/CMOS hybrid circuit. Reproduced with permission. [210] Copyright 2017, Springer Nature.
[208] A record-high 4-bit weight encoding and a low energy consumption of 1.7 nJ dB⁻¹ per unit modulation depth have been achieved, realizing 86% inference accuracy on the MNIST database. With all the progress in recent years, we anticipate that all-optical neuromorphic in-sensor and in-memory computing will advance rapidly in the next few years. Scalability is another important aspect of the growth of neuromorphic in-sensor computing. Since most reported works have demonstrated in-sensor and in-memory computation with chip-level device arrays only up to 16 × 16, wafer-level scalability is desired to achieve high-density neuromorphic computing systems. Tang et al. have reported a wafer-scale solution-processed 2D material (MoS₂)-based memristor array for in-memory computing. Inter-flake sulfur-vacancy diffusion was utilized for conductance modulation to achieve linear operations (Figure 11c). [209] As a result, an MNIST handwritten-digit recognition accuracy of more than 98% was achieved. 3D stacking is another direction for high-density integration of neuromorphic computing systems. By layer-stacking thin-film 2D materials in the manner of 3D NAND flash memory devices, computation density can be multiplied within a limited footprint. Especially in optoelectronic vision applications, the top layer can serve as the input layer, while the middle layers, filled with multiple memristors, can serve as hidden layers to fully execute on-chip neural network functions. A similar concept has been reported by Chakrabarti et al., in which a monolithically integrated multiply-add engine in a memristor crossbar/CMOS hybrid circuit was used for mathematical operations (Figure 11d).
[210] Here, only two layers of memristive crossbar were integrated on a prefabricated CMOS substrate, and the crossbar array was operated through the underlying CMOS circuitry. A few other works have also reported 3D-based neuromorphic computing systems, including bioinspired 3D artificial neuromorphic circuits, [211] flexible 3D memristor arrays, [212] and organic electrochemical transistors for 3D neuromorphic engineering. [213]
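The photonic pipeline described above (attenuators for weighting, photodetector summation, a microring modulator for nonlinearity) maps directly onto a conventional feed-forward layer. A minimal numerical analogue, with a saturating Lorentzian-style transfer standing in for the microring response and all parameters illustrative:

```python
# Numerical analogue of one layer of a waveguide-based photonic NN:
# attenuators scale each input optical power (weights in [0, 1]),
# a photodetector sums the weighted powers, and a microring-style
# saturating transfer provides the nonlinear activation. Illustrative.

def microring(x, half_power=1.0):
    """Saturating Lorentzian-style activation standing in for the
    microring modulator's nonlinear transmission."""
    return x * x / (x * x + half_power * half_power)

def photonic_layer(powers, attenuations):
    """One neuron: per-channel attenuation, photodetector sum, ring."""
    detected = sum(p * a for p, a in zip(powers, attenuations))
    return microring(detected)

# Same "weights" applied to a weak and a strong optical input pattern.
out_lo = photonic_layer([0.2, 0.1, 0.0], [0.5, 0.9, 0.3])
out_hi = photonic_layer([2.0, 1.5, 1.0], [0.5, 0.9, 0.3])
```

Because every operation here corresponds to a passive or analog optical element, the entire layer evaluates at the propagation speed of light through the chip — the origin of the sub-nanosecond classification times reported above.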

Conclusion
By reviewing the recent advances in the spheres of cloud computing and edge computing, we point out that neuromorphic computing offers exciting possibilities for high-speed and low-power AI computational tasks with unprecedented compactness. On the one hand, with the astronomical capability of capturing essential features from vast amounts of high-dimensional data, AI models based on cloud computing with high computational power have become a promising tool to aid photonic design and enhance sensing performance in various ways. On the other hand, for computing at the edge, hardware that mimics the basic building blocks of the biological brain can overcome the limitations of massively parallel signal processing to realize learning, updating, and memorizing capabilities. Especially for the growing IoT with its large number of sensor nodes, it is highly desirable to develop neuromorphic computing that integrates computing functions into sensor networks. This inherent feature, stemming from a design inspired by human neural networks, ensures that even as we push for miniaturization and efficiency, computational power and adaptability are not only preserved but often amplified. In conclusion, as we chart the trajectory of these interconnected technologies, we hope that this analysis serves as a beacon, illuminating the profound implications and potential of cloud and edge computing, especially when intertwined with the marvel of neuromorphic systems.

Figure 2. Cloud artificial intelligence (AI)-enabled sensor inverse design. a) Simultaneous material and structural inverse design through a supervised deep-learning algorithm. Reproduced with permission. [97] Copyright 2019, John Wiley and Sons Ltd. i) Schematic drawing. ii) Schematic diagram of the supervised machine-learning model used in the inverse design. iii) Validation of the inverse design approach. The provided design parameters are utilized to obtain spectra for both the target input (solid lines) and the predicted responses (open circles). b) Inverse design of nanophotonic devices using a semisupervised deep-learning algorithm. Reproduced with permission. [98] Copyright 2019, John Wiley and Sons Ltd. i) Architecture of the proposed deep generative model. ii) The required reflection spectra (upper panel) and the results of inverse design (middle and bottom panels). Insets are the design patterns obtained through the algorithms. c) Inverse design of nanophotonic devices using an unsupervised deep-learning algorithm. Reproduced with permission. [78] Copyright 2018, American Chemical Society. i) Network architecture to inverse-design structural images. ii) Generating patterns with a predesigned class of geometric data. iii) Examples of the results of the inverse design.

Figure 3. Cloud AI-enabled electric sensing. a) Machine-learning-enabled metal-oxide gas sensor. Reproduced with permission. [119] Copyright 2022, Elsevier. i) Schematic drawing of the sensor. ii) Sensor response to gases. iii) Power density spectrum. iv) Characteristic coefficient values corresponding to the response curves. v) Algorithm performance. b) Machine-learning-enabled graphene field-effect transistor (GFET) gas sensor. Reproduced with permission. [120] Copyright 2020, Springer Nature. i) Diagrammatic representations depicting the variations in conductivity profiles relative to the applied gate voltage, accompanied by the corresponding underlying physical phenomena observed in a GFET. ii) Transient conductivity profiles versus the gate voltage with respect to time for water, methanol, and ethanol. iii) The 3D vectors of sensor outputs, which contain the characteristics of the sensor's response to gases and are used for machine-learning-enabled classification of the gas sensing. c) Machine-learning-enabled triboelectric nanogenerator gas sensor. Reproduced with permission. [105] Copyright 2021, American Chemical Society. i) Machine-learning-assisted and plasma-enhanced mid-IR methodology. ii) Machine-learning analysis. iii) Healthcare diagnosis applications.

Figure 4. Cloud AI-enabled optical sensing. a) Machine-learning-enabled optical waveguide sensor. Reproduced with permission. [141] Copyright 2023, American Chemical Society. i) The scanning electron microscope image and the distribution of electric-field magnitude of the subwavelength grating metamaterial. ii) Schematic illustration. iii) Machine-learning algorithm. iv-vi) Prediction of component concentration and accuracy assessment. b) Machine-learning-enabled optical nanoantenna sensor. Reproduced with permission. [147] Copyright 2021, John Wiley and Sons Ltd. i) Schematic drawing of the platform. ii) The reflection spectra of sensing data for the machine-learning model. iii) Data dimension reduction. iv) The confusion map for the machine-learning outcome.

Figure 7. Illustrations of emerging computation paradigms for edge computing. a) Near-sensor processing with different sensory information. b) AI-inspired computing algorithms for on-chip processing. c) In-sensor computing architecture through sensor networks. d) Internet of Things (IoT) computation tasks ranging from low-level processing to high-level AI computing.
Both supervised and unsupervised learning and training for image classification and encoding have been demonstrated with a throughput of 20 million bins s⁻¹. Similarly, a neural network vision sensor enabled by vertically stacked WSe₂/h-BN/Al₂O₃ vdW heterostructures with positive and negative gate tunability to mimic the biological retina has been reported by Wang et al.

Figure 8. The 2D-material-based emerging devices for neuromorphic in-sensor computing. a) Ultrafast machine vision with 2D material neural network image sensors. Reproduced with permission. [176] Copyright 2020, Springer Nature. b) Gate-tunable van der Waals heterostructure for a reconfigurable neural network vision sensor. Reproduced with permission. [177] Copyright 2020, American Association for the Advancement of Science. c) In-sensor optoelectronic computing using electrostatically doped silicon. Reproduced with permission. [178] Copyright 2020, Springer Nature. d) Broadband convolutional processing using band-alignment-tunable heterostructures. Reproduced with permission. [179] Copyright 2020, Springer Nature. e) A 2D mid-infrared (mid-IR) optoelectronic retina enabling simultaneous perception and encoding. Reproduced with permission. [180] Copyright 2020, Springer Nature. f) The 3D integrated photosensor array for multilevel on-chip convolution and image processing. Reproduced with permission. [181] Copyright 2020, American Chemical Society. g) Artificial optic-neural synapse for colored and color-mixed pattern recognition. Reproduced with permission. [182] Copyright 2018, Springer Nature.