Intelligent Optical Microresonator Imaging Sensor for Early Stage Classification of Dynamical Variations

Although machine learning (ML) solutions are spreading across a growing number of branches of science and technology, only a few examples have been demonstrated so far that boost highly sensitive, label‐free optical detectors. Nevertheless, ML methods can provide novel strategies for emerging optical sensing technologies to overcome the challenges of real‐world implementation. In this way, the sensing techniques become intelligent self‐learning systems capable of predicting the probed parameters in an automated and self‐adjusting manner. Herein, the upgraded ML engine of an affordable multiresonator imaging sensor operating at a fixed frequency is reported. It is demonstrated for the first time that such a sensor, supplemented by a long short‐term memory (LSTM) network processing engine, enables accurate prediction of the dynamical sensor responses already within the early stages of the observed variations. Using experimental data on the mixing dynamics of solutions as an example, accurate (>95%) differentiation of dynamical variations at the refractive index level of 5.5 × 10⁻⁴ is shown within a period six times shorter than the whole time series. The impact of the parameters and architecture of the LSTM‐based processing engine on the speed and prediction accuracy is analyzed.


Introduction
A label-free detection method based on the excitation of standing waves inside an optical microcavity with circular symmetry is known as the whispering gallery mode (WGM) sensing mechanism. [1][2][3] Here, the light energy can be stored effectively (with a high quality factor of the resonance peaks) within a tiny structure thanks to total internal reflection at the resonator's boundary, [4,5] whereas the sensing approach is based on tracking the spectral variations caused by the interaction of the confined optical field with the surrounding medium. Depending on the morphology, material, and/or geometry of the microresonators, it is possible to monitor various external properties, e.g., pressure, [6] temperature, [7,8] biomolecular composition, [9][10][11] refractive index, [12,13] humidity, [14] or gas/vapor type. [15,16] Recent years have shown significant progress in the development of optical microresonator-based sensors in terms of sensitivity enhancement [17,18] and the level of integration of the microcavities with the optical components for excitation and signal collection. [19] Nevertheless, the vast majority of WGM sensing solutions still rely on the original concept of spectrally resolved data collection, whose realization requires costly equipment on the side of either the light source or the detector. A paradigm-changing method that exploits concurrent imaging of the WGM signal radiated from more than a hundred microresonators offers an affordable and efficient alternative, which is especially attractive for real-world applications. [20,21] The multidimensionality of the signal of such a sensor and the dispersion of the properties of the individual microresonators allow a cost- and time-effective fixed-frequency interrogation approach to be applied, although they complicate the interpretation of the measured parameters.
Here, the well-known and commonly used analytic models that describe the response of the optical field to external influences via the resonance frequency shift [22,23] and line broadening [24] are no longer applicable. The reason is that the collected signal now contains only the projection of the spectral information onto the single selected frequency. The translation of the sensor signal into the measured parameters is thus addressed using deep-learning methods. [21]
DOI: 10.1002/adpr.202100242
Deep-learning-powered sensors, and optical ones in particular, form an emerging research field in which only a few examples have been reported so far. In addition to the aforementioned solution based on imaging the signal from multiple microresonators at a fixed frequency, [21] the other machine learning (ML)-based application for WGM sensors utilizes the complete spectrally resolved information of a single microresonator. [25][26][27][28] Further examples of optical sensors supported by ML techniques include the surface plasmon resonance method [29] and the attenuated total reflectance approach. [30] The reported intelligent WGM sensor [21] has proven highly efficient for parameter prediction under the equilibrium condition, when a steady state of the sensed medium around the optical sensor is reached. The corresponding response time, and thus the waiting time, may however vary from several milliseconds up to several hours, depending on the type and value of the measured parameter. A characteristic example is the affinity biosensors dedicated to detecting the type and/or concentration of biomolecular components, [31][32][33] e.g., proteins, viruses, bacteria, DNA, or RNA. In this case, the sensing response is governed by the binding of the particles to the ligand previously immobilized on the resonator's surface, so that a description of the measured parameter within the steady state alone is not comprehensive. Instead, the complete response curve, also referred to as a sensorgram, in which the shift of the resonance frequency is monitored over time, has to be considered. The measured dynamical response in terms of the spectral shift of the resonance frequency can then be analyzed using the Langmuir adsorption model, which is defined by the association and dissociation rates of the molecules together with their concentrations. [34][35][36]
The model-based description is, however, not available for the signal collected from multiple microresonators at a fixed frequency. Instead, the dynamics analysis can be addressed by extending the previously reported [21] deep-learning engine to learn from time series data through the use of recurrent neural networks (RNNs) and their particular type, the long short-term memory (LSTM) network. The reported examples of LSTM networks applied to the processing of dynamical variations predominantly cover the problems of handwriting recognition, speech recognition, and machine translation. At the same time, several examples of spectral data processing applications, e.g., classification of time series spectral images [37] and mass spectrometry data, [38] have already been reported.
This work reports a novel strategy based on LSTM network training for the quantification of dynamical responses in optical sensing, demonstrated for a multimicrocavity imaging optical sensor operating at a fixed frequency. The performance of the proposed approach is studied together with the impact of the training dataset and of the complexity of the deep-learning engine on the accuracy of the external parameter prediction. The benefits for early decision making, without the need to analyze the whole dynamics up to the steady state, are discussed.

Data Collection
To match the variations of the radiated intensities with the particular microresonators on the chip, a map of the microcavities is established. For this, a multistep image processing approach for edge detection has been applied, as described in a previous study. [39] The radiated light observed within each microcavity area is then taken as the microcavity signal. The intensity of the signal for each microresonator is determined by the location of the fixed laser frequency in the WGM spectrum (Figure 1). Within a particular sensing event, a set of intensities with the number of degrees of freedom equal to the number of resonators (of the order of several hundred) is produced.
www.advancedsciencenews.com www.adpr-journal.com
The most frequent scenarios for WGM-based sensing are the monitoring of the bulk refractive index and of the adsorption of biochemical molecules. In both cases, the overall value of the resonance frequency shift is primarily analyzed, although the dynamical variations between the steady states may also be important for a complete description of the measured entity. For the molecule adsorption process, the sensorgram follows a differential rate equation described by the Langmuir model, where the actual concentration affects both the shift and the kinetic rate. For bulk refractive index sensing, the period between the steady states for different refractive index values (in refractive index units, RIU) is mainly determined by the mixing of the sensed medium inside the flow chamber with the incoming fluid. The larger the difference between the previous and the subsequent concentrations, the longer the period required to reach the final resonance frequency position. A still more complex behavior of the dynamical variations of the spectral shift can be observed for microresonators with functional layers or with structural response enhancement. [11]
In the case of fixed-frequency sensing, the response of each resonator is nonlinearly modulated by the WGM spectrum, which is unique for each microcavity (Figure 1). This results both in the nonlinear character of the captured intensity variations between the steady states, as discussed in the study by Saetchnikov et al., [21] and in the complex nature of the dynamical variations between them. As a result, depending on the physical and/or chemical nature of the process to be sensed and on the time scale of the variations produced by the external process, the collected signals differ greatly.
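The nonlinear modulation of the fixed-frequency signal can be illustrated with a minimal sketch. It assumes a single Lorentzian resonance per microcavity and a resonance position that shifts linearly with the refractive index change; the line shape, the function names, and all numerical values are illustrative assumptions and not parameters of the reported sensor.

```python
def lorentzian_intensity(detuning, linewidth):
    """Normalized Lorentzian line: 1 at zero detuning, 0.5 at |detuning| = linewidth / 2."""
    half = linewidth / 2.0
    return half * half / (detuning * detuning + half * half)


def fixed_frequency_signal(index_change, initial_detuning, shift_per_riu, linewidth):
    """Intensity seen at a fixed laser frequency while the resonance shifts.

    The resonance is assumed to move by shift_per_riu * index_change
    (arbitrary frequency units); the laser frequency itself never changes.
    """
    detuning = initial_detuning - shift_per_riu * index_change
    return lorentzian_intensity(detuning, linewidth)


# Hypothetical index steps of 5.5e-4 and hypothetical resonator parameters.
index_steps = [k * 5.5e-4 for k in range(6)]

# Resonator whose peak sweeps through the laser line: rises, then falls.
sweeping = [fixed_frequency_signal(dn, 1.5, 1000.0, 1.0) for dn in index_steps]

# Resonator probed on the flank of its peak: monotonic over the same range.
flank = [fixed_frequency_signal(dn, 5.0, 1000.0, 1.0) for dn in index_steps]
```

Under these assumptions, the same refractive index change produces a non-monotonic trace for one resonator and a monotonic one for the other, which is why a model based on a single resonance shift cannot be applied to the imaged intensities directly.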

Preprocessing Strategy
The preprocessing of the dynamical variations in the fixed-frequency illumination scheme is proposed as follows. Initially, the set of radiated intensities (number of observations: o_i) for each particular resonator (r) is scaled to the [0, 1] range. Then the dataset is resampled at a frequency of 1 Hz, giving o_r observations (the mean frame rate of the camera is 60 Hz). The sampling frequency has been selected with respect to the average duration required to observe changes in the measured temporal variations beyond two standard deviations of the signal for the water environment. Next, n different groups representing different external impacts, each including m repetitions, are produced. Because only a limited number of repetitions can be produced within an experiment, a set of artificial data is generated for each group (output state) according to the normal distribution with the mean value and the standard deviation calculated at each time point of the experimental data. The total number of repetitions for each group is set to 200. The aforementioned steps are shown schematically in Figure 2.
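The preprocessing steps can be sketched for a single resonator as follows. This is a minimal illustration in plain Python, not the authors' implementation: block averaging as the resampling method, the assumption that the 200 repetitions include the measured ones, and all function names and the toy recording are hypothetical.

```python
import random
import statistics


def minmax_scale(trace):
    """Scale one resonator's intensity trace to the [0, 1] range."""
    lo, hi = min(trace), max(trace)
    if hi == lo:  # flat trace: avoid division by zero
        return [0.0 for _ in trace]
    return [(v - lo) / (hi - lo) for v in trace]


def resample_mean(trace, factor):
    """Downsample by averaging consecutive blocks of `factor` samples
    (factor=60 maps a 60 Hz trace to 1 Hz). Block averaging is an assumption."""
    n = len(trace) // factor
    return [statistics.fmean(trace[i * factor:(i + 1) * factor]) for i in range(n)]


def augment(repetitions, total, rng):
    """Extend the measured repetitions (equal-length traces) to `total` traces
    with synthetic ones drawn from N(mean_t, std_t) at each time point t."""
    length = len(repetitions[0])
    means = [statistics.fmean([r[t] for r in repetitions]) for t in range(length)]
    stds = [statistics.stdev([r[t] for r in repetitions]) for t in range(length)]
    synthetic = [
        [rng.gauss(means[t], stds[t]) for t in range(length)]
        for _ in range(total - len(repetitions))
    ]
    return repetitions + synthetic


# Toy recording: 4 repetitions of 10 s each, captured at 60 Hz.
rng = random.Random(0)
recording = [rng.random() for _ in range(4 * 10 * 60)]

scaled = minmax_scale(recording)          # step 1: scale to [0, 1]
resampled = resample_mean(scaled, 60)     # step 2: 60 Hz -> 1 Hz
repetitions = [resampled[i * 10:(i + 1) * 10] for i in range(4)]
group = augment(repetitions, 200, rng)    # step 3: 200 repetitions per group
```

In a full pipeline the same steps would be applied to every resonator and every group (output state) before the sequences are passed to feature selection.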
The demand on computational resources is expected to rise when switching from the classification task to the regression task, i.e., when the number of groups representing various dynamical variations is increased. To keep the training and testing times within reasonable limits, the data preprocessing could additionally be supplemented by translating the measured intensity variations over time into a series of characteristic points by fitting with parametric curves. In this case, however, one should account for several substantial drawbacks: first, the temporal resolution of the sensing data is reduced; second, the efficiency of such a conversion is expected to be low provided that the captured variations of the intensities radiated by the microresonators possess rather simple dynamics with a minimum of inflection points over the sensing period.
During the analysis of the experimental dataset on the variations of the radiated intensities, a number of microcavities are observed that show no variation in the collected signal while the parameters of the sensed medium are changed. As the measured signal is a function of the resonance spectrum of the microcavity, in which the resonance peaks are narrow, the illumination frequency for these resonators falls into a part of the spectrum with no resonance. The absence of resonance behavior in general, or resonances with low quality factors that appear for damaged microcavities, is a further reason for observing insignificant or no temporal variations in the signal. To reduce the dimensionality of the data and to extract the most significant features (microcavities), the features have been reordered according to the ReliefF algorithm [40,41] with a modification for temporally resolved data proposed in the study by Radovic et al. [42] This method has been chosen as the most beneficial in terms of algorithm efficiency for feature selection and tested on various types of synthetic data. [43] The ReliefF algorithm is an iterative method that estimates feature weights based on the variation of the feature values for different output states. The weights of the features are initialized with zeros. At each iteration of the algorithm, a random instance (R) is selected. For the considered type of data, this is a single sample of the dynamical response. Then two different sets are formed: one with the k nearest neighbors (here k = 20) representing the same class as the instance R, and another with the k nearest neighbors from a different class. The neighbors of the same class are called hits (H) and those from a different class are referred to as misses (M). The weight of a feature is updated by subtracting the average distance to the hits and adding the average distance to the misses.
In the original algorithm, the instance has no dynamical feature, and thus the Euclidean distance can act as the measure. To handle the time series that describes the instance, dynamic time warping (DTW) is selected for measuring the similarity between the dynamical variations of R, H, and M. The single weight-update iteration is repeated several times (here, as many times as there are time series in the training data). Finally, the features are sorted in descending order of their weights.
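The weight update with a DTW distance can be sketched as follows. This is a simplified illustration rather than the modification of Radovic et al.: the neighbor search by the summed per-feature DTW distance, the pooling of all other classes into one set of misses, the division of each update by the number of iterations, and the toy data are assumptions made for the example.

```python
import random


def dtw_distance(a, b):
    """Dynamic time warping distance with |x - y| as the local cost."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)
    for x in a:
        curr = [inf] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            curr[j] = abs(x - y) + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[len(b)]


def relieff_dtw(samples, labels, k, iterations, rng):
    """samples[i][f] is the time series of feature f in sample i."""
    n_features = len(samples[0])
    weights = [0.0] * n_features
    for _ in range(iterations):
        r = rng.randrange(len(samples))  # random instance R
        # Per-feature DTW distances from R to every other sample.
        dists = {
            i: [dtw_distance(samples[r][f], samples[i][f]) for f in range(n_features)]
            for i in range(len(samples)) if i != r
        }
        nearest = sorted(dists, key=lambda i: sum(dists[i]))
        hits = [i for i in nearest if labels[i] == labels[r]][:k]
        misses = [i for i in nearest if labels[i] != labels[r]][:k]
        if not hits or not misses:
            continue
        for f in range(n_features):
            hit_mean = sum(dists[i][f] for i in hits) / len(hits)
            miss_mean = sum(dists[i][f] for i in misses) / len(misses)
            # Subtract the mean distance to hits, add the mean distance to misses.
            weights[f] += (miss_mean - hit_mean) / iterations
    return weights


# Toy data: feature 0 differs between the classes, feature 1 does not.
samples = [[[0, 1, 2, 3], [5, 5, 5, 5]]] * 3 + [[[0, 0, 0, 0], [5, 5, 5, 5]]] * 3
labels = [0, 0, 0, 1, 1, 1]
weights = relieff_dtw(samples, labels, k=2, iterations=6, rng=random.Random(0))
ranking = sorted(range(len(weights)), key=lambda f: weights[f], reverse=True)
```

In the toy data the informative feature receives a positive weight and the constant feature a weight of zero, so the ranking places the informative feature first.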

Background on RNN and LSTM
An RNN is a neural network that, in contrast to conventional feedforward networks, is supplemented by loops in its connections. [44] These loops enable the processing of input sequences by introducing a recurrent hidden value whose state at each time point of the sequence is calculated from its state at the previous time point. Depending on the task, the RNN architecture can be designed for different types of input and output values.
LSTM networks are likewise based on recurrent connections, here between LSTM cells. [45] However, unlike the conventional RNN, the LSTM network is specifically designed to learn long-term dependencies in the input sequences and thus overcomes the main problems of RNNs: vanishing and exploding gradients when long sequences are used for training. This makes the network less sensitive to time gaps in the data sequence. In addition to the hidden state, the LSTM block delivers a cell state at each time point of the recurrent structure. The mechanism for calculating the outputs of the LSTM cell is based on the input (i), output (o), and forget (f) gates. The input gate controls the level to which the cell state is updated with the input value, the forget gate controls the level to which the previous cell state is retained, and the output gate controls the level to which the cell state is passed into the hidden state delivered to the next time step.
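A single time step of a standard LSTM cell can be written out as a short sketch. It uses the textbook update with a tanh cell candidate (denoted g here, a symbol not used in the text) and plain nested lists for the weights; the parameter layout and function names are assumptions chosen for readability.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, x, R, h, b):
    """Return W @ x + R @ h + b for plain nested lists."""
    return [
        sum(w * xi for w, xi in zip(W[u], x))
        + sum(r * hi for r, hi in zip(R[u], h))
        + b[u]
        for u in range(len(b))
    ]


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step.

    params maps each of 'i', 'f', 'g', 'o' to a tuple (W, R, b) holding
    the input weights, the recurrent weights, and the bias.
    """
    pre = {name: affine(W, x, R, h_prev, b) for name, (W, R, b) in params.items()}
    i = [sigmoid(z) for z in pre["i"]]    # input gate
    f = [sigmoid(z) for z in pre["f"]]    # forget gate
    g = [math.tanh(z) for z in pre["g"]]  # cell candidate
    o = [sigmoid(z) for z in pre["o"]]    # output gate
    c = [f_u * c_u + i_u * g_u for f_u, c_u, i_u, g_u in zip(f, c_prev, i, g)]
    h = [o_u * math.tanh(c_u) for o_u, c_u in zip(o, c)]
    return h, c


# One hidden unit, one input, all weights and biases zero: every gate is 0.5.
zero = ([[0.0]], [[0.0]], [0.0])
params = {"i": zero, "f": zero, "g": zero, "o": zero}
h, c = lstm_step([1.0], [0.0], [2.0], params)
```

With all parameters set to zero, half of the previous cell state is kept and nothing is added from the input, which makes the separate roles of the forget and input gates easy to check by hand.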

Results and Discussion
In this section, the performance of the LSTM network for predicting the sensed parameters characterized by the dynamical response of the WGM signal is analyzed. The algorithms for the LSTM network construction have been implemented in MATLAB R2021a using the Deep Learning Toolbox and the Statistics and Machine Learning Toolbox. A local workstation with an Nvidia Quadro RTX 6000 GPU has been used for training and testing the LSTM network. Within this research, two studies on the prediction accuracy of the LSTM network were carried out. The first was intended to verify the impact of the number of features (microresonators) in the dataset used for training and testing. The second focused on the optimization of the LSTM network architecture. The experimental dataset, which comprises the dynamical changes in the collected signal of the WGM imaging sensor in the fixed-frequency illumination scheme, has been produced using the example of a small temporal gradient of the refractive index in the chamber. This has been realized by mixing the initial medium in the fluidic chamber (water), at a constant pumping speed, with incoming liquids of different refractive indices.

Dataset Description
Time series data have been produced with five different concentrations of ethanol from 1% to 5% in steps of 1%. In terms of the refractive index change, this corresponds to 5.5 × 10⁻⁴ to 2.7 × 10⁻³ relative to the refractive index of water. The measurement for every concentration has been repeated four times with 2000 s in each phase. After each measurement, the flow chamber has been flushed with deionized water for 2000 s to ensure a pure water environment around the sensor. The whole experimental run thus lasts 80 000 s, which at the 60 Hz frame rate of the camera corresponds to 4.8 million observations. The collected signal, which includes the intensities radiated from 181 microresonators, has been grouped into five datasets of 2000 s sequences so that each set represents a different temporal gradient of the refractive index in the flow chamber. The obtained variations of the radiated light for six representative microcavities on the imaging sensor are shown in Figure 3.
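The size of the experimental run follows from the measurement protocol and can be checked with a few lines of arithmetic. The variable names below are illustrative; the numbers are those stated in the text.

```python
concentrations = 5       # 1% to 5% ethanol in steps of 1%
repetitions = 4          # measurements per concentration
measure_s = 2000         # duration of each measurement phase, in seconds
flush_s = 2000           # flushing with deionized water after each measurement
frame_rate_hz = 60       # mean camera frame rate

total_s = concentrations * repetitions * (measure_s + flush_s)
total_frames = total_s * frame_rate_hz

# After resampling to 1 Hz, each measured sequence holds one value per second.
samples_per_sequence_1hz = measure_s
```

The result reproduces the stated 80 000 s and 4.8 million camera observations per resonator.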
The graphs show the time period of 2000 s in which the slow mixing of the pure water inside the flow cell with five different ethanol solutions (1%, 2%, 3%, 4%, and 5%) takes place. The whole time period can be split into three parts. The first lasts up to ≈250 s from the beginning (it varies between the resonators due to their different positions relative to the input channel), while the solution is transported from the valve to the sensing chamber. Within this period, the signal remains constant and corresponds to the pure water environment. The second is the period of mixing of the present solution with the incoming one. Here, the uniqueness of the response among the microresonators as well as between the different concentrations is clearly observed. Each of the resultant responses is a combination of two nonlinear functions: the first is common to all microcavities and describes the temporal gradient of the bulk refractive index inside the chamber, and the second describes the individual WGM spectrum of the microcavity. This can be clearly observed for microresonators #1-4 in Figure 3, where the temporal variations of the radiated intensity exhibit the signatures of the resonance peaks of the microcavities. Here, the number of these peaks and their linewidth depend on the concentration of the filling medium, where higher concentrations lead to the appearance of more spectral features and to a faster variation rate. In contrast to microcavities #1-4, resonators #5-6 show less complex behavior in the variations, which is explained by the almost linear part of the resonance curve on which the refractive index changes appear: the falling (#5) and rising (#6) part of the resonance peak. The third period, starting from ≈800 s, represents a steady state for the chosen ethanol solution.
The repeatability between identical sensing events (captured with a time delay of up to several hours) varies between 0.01 (#6) and 0.1 (#2) arb. units for the steady state. The dispersion of the repeatability value between different resonators arises from the different nonlinear response of each microresonator collected at the fixed wavelength of the illumination source. Here, the same repeatability in terms of spectral shift is translated into different variations of the intensity signal, depending on the shape of the spectral curve and the location of the illumination wavelength on this curve. At the same time, when several hundred microresonators are considered simultaneously, the overall repeatability of the sensor approaches the lowest value among them. This is tuned by the distribution of the weights among the features during the training of the neural network (NN). The signal at the steady state (third period) shows a relative drift below 1% for the vast majority of the microcavities on the sample. The particular cases in which the relative drift increases, such as those observed for microcavities #1 and #4, are expected to be caused by incomplete enveloping of the resonator with liquid.
As previously reported, [21] the RIU states can be accurately predicted by a deep-learning engine that is pretrained on the data from the third period. However, for the presented dataset, one has to wait at least 550 s during the experiment until the steady state is reached. Therefore, the deep-learning engine is intended to learn from the time series data of the second period to enable the prediction of the external impact without measuring the signal until equilibrium.

LSTM Network Architecture
The measured experimental data represent a sequence-input to single-output classification problem, where the temporal variations of the radiated intensities correspond to one particular concentration of the incoming agent. The LSTM network has been constructed from the following blocks (Figure 4): a sequence input layer, a bidirectional LSTM (BiLSTM) layer, a dropout layer with a probability of 0.2, a fully connected layer for five output classes, a softmax layer to compute the probability of each dynamic belonging to a certain class, and a classification output layer. Compared with the conventional LSTM layer, the bidirectional one processes the sequence data in the forward and backward directions in two separate hidden layers. The dropout layer is intended to avoid overfitting by dropping random neurons from the network during training with the given probability. The Adam optimizer with an L2 regularization factor has been selected for the network training, with a gradient threshold of 1 and an initial learning rate of 0.01. The number of training epochs for the LSTM network is controlled both by the maximum number of epochs (here equal to 100) and by the validation patience parameter (here 5). The latter is likewise introduced to avoid overfitting and specifies the number of times that the loss on the validation set may be larger than or equal to the previously smallest loss before the training stops. If the loss decreases continuously during the training process, the training stops when the maximum number of epochs is reached. The performance of the LSTM network for classification of the sensed mixture concentrations is finally evaluated by the correct classification rate.
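The interplay of the validation patience and the maximum training length can be illustrated with a small sketch of the stopping rule alone. It is not the toolbox implementation: it assumes one validation check per epoch and a stall counter that resets whenever a new smallest loss is found, and the function name and loss values are hypothetical.

```python
def stopping_check(val_losses, patience=5, max_checks=100):
    """Return the 1-based validation check at which training stops.

    A check counts as a stall when its loss is larger than or equal to the
    smallest loss seen so far. Training stops after `patience` stalls in a
    row (an assumption) or when `max_checks` is reached.
    """
    best = float("inf")
    stalls = 0
    for check, loss in enumerate(val_losses[:max_checks], start=1):
        if loss < best:
            best = loss
            stalls = 0
        else:
            stalls += 1
            if stalls >= patience:
                return check
    return min(len(val_losses), max_checks)


# The loss stops improving after the second check: training ends at check 7.
early = stopping_check([1.0, 0.9, 0.95, 0.9, 0.91, 0.92, 0.93, 0.5])

# A continuously decreasing loss runs until the maximum is reached.
full = stopping_check([1.0 / (n + 1) for n in range(200)])
```

In the first case the later improvement to 0.5 is never seen, which shows the trade-off that the patience parameter makes between training time and the risk of stopping too early.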
The preprocessed experimental dataset, with the features reordered by descending ReliefF weight, has been split into 10 subsets with different numbers of features [10:10:100]. The sets have then been further divided into subsets of different sequence durations (from 100 to 400 s in steps of 10 s) with the starting point at 150 s (see Figure 3). These operations are intended to study the impact of the number of the most significant features (microresonators) and of the duration of the sequence response of the multiple microresonator imaging sensor on the prediction accuracy. For each hyperparameter tuning operation, the input data have been split into training (70%), validation (15%), and test (15%) parts, where the temporal sequences have been randomly selected from the whole dataset. For generalization, every LSTM network training procedure has been repeated 100 times.
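The resulting grid of dataset variants and the random split can be sketched as follows. The sketch assumes 1 Hz sequences, a dataset of 5 groups with 200 repetitions each, and a split by whole sequences; the helper names are hypothetical.

```python
import random

feature_counts = list(range(10, 101, 10))   # 10:10:100 most significant features
durations_s = list(range(100, 401, 10))     # 100 s to 400 s in steps of 10 s
start_s = 150                               # common starting point of every window


def window(trace_1hz, duration_s):
    """Cut the part of a 1 Hz trace that starts at 150 s and lasts duration_s."""
    return trace_1hz[start_s:start_s + duration_s]


def split_indices(n, rng, train_pct=70, val_pct=15):
    """Randomly split n sequence indices into training, validation, and test parts."""
    idx = list(range(n))
    rng.shuffle(idx)
    n_train = n * train_pct // 100
    n_val = n * val_pct // 100
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


n_sequences = 5 * 200  # five output states with 200 repetitions each
train_idx, val_idx, test_idx = split_indices(n_sequences, random.Random(0))
n_variants = len(feature_counts) * len(durations_s)
```

Under these assumptions there are 310 combinations of feature count and sequence duration, each of which is trained repeatedly on a fresh random split.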

Features Selection
For the first test, intended to determine the optimal number of features (microresonators) in the dataset, the number of hidden LSTM units has been fixed at 5. The evolution of the correct prediction rate for different numbers of significant features (10, 20, 30, 50, 70, 100) in the input data is shown in Figure 5. Each plot shows the calculated classification errors together with their mean values over 100 training repetitions for different durations of the input sequence. In addition, the exponential functions (y = a e^(1 − bx) + c) fitted to the mean error values over time are shown.
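Once an exponential of the form y = a e^(1 − bx) + c has been fitted to the mean error, the sequence duration at which a target error is reached follows by inverting it: x = (1 − ln((y − c)/a))/b. The sketch below shows this inversion; the parameter values are purely illustrative and are not the fitted values of this work.

```python
import math


def error_model(x, a, b, c):
    """Mean classification error as a function of the sequence duration x."""
    return a * math.exp(1.0 - b * x) + c


def duration_for_error(target, a, b, c):
    """Duration at which the fitted error falls to `target`.

    Valid only for a decaying curve (a > 0, b > 0) and a target above
    the asymptote c.
    """
    if a <= 0 or b <= 0 or target <= c:
        raise ValueError("target error is not reached by this curve")
    return (1.0 - math.log((target - c) / a)) / b


# Hypothetical fit parameters; 0.05 corresponds to the 95% accuracy level.
a, b, c = 0.5, 0.02, 0.0
x95 = duration_for_error(0.05, a, b, c)
```

Reading the 95% and 99% levels from the fitted curve rather than from individual runs reduces the influence of the scatter among the training repetitions.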
Except for the results for the 10 and 20 most significant features, the selected function accurately follows the mean accuracy values for the represented sequence durations and approaches 100% accuracy for the longer ones. For the cases of 10 and 20 features, one can observe a reduction of the mean prediction accuracy when comparing longer input sequences with shorter ones (e.g., 160 s with 140 s and 180 s with 160 s). Given the moderate number of features together with the rather short duration of the input sequences, these results are expected to be related to the dominant impact of short-term local variations of the collected intensities on the LSTM network parameters. The reduction of the prediction accuracy when the number of features increases for the same sequence duration confirms this interpretation. Here, the impact of the long-term dependencies in the dataset on the LSTM network output rises. The prediction accuracy for longer durations of the training data follows the expected behavior, and the 95% rate level shifts from 260 to 240 s when the 20 most weighted features are considered. A subsequent increase in the number of significant features to 30 makes it possible to slightly shorten the sequence duration needed to ensure the same accuracy. Here, the prediction accuracy increases continuously when longer sequences are used for training. For the dataset with 50 features, the 95% level is observed at ≈230 s and the variability among the repetitions is reduced. When more features, up to 70 and 100, are considered, the 95% accuracy level shifts slightly toward longer durations. Among them, the results for 100 significant features stand out in particular, where an improvement of the mean prediction error for shorter input sequence durations (up to 200 s) is observed.
Beyond a certain number of significant features, the portion of those with low variability, and thus with less distinguishable dynamical changes between the output states, grows. When those features become the majority, more weight during the training of the neural network is redistributed to the short-term memory representing the latest local variations in the radiated microcavity signal. In fact, the result for 100 features represents exactly the case in which the LSTM network starts to follow predominantly the local variations. The reduced prediction accuracy for 70 features compared with the case of 50 features agrees with this explanation as well. Here, the dataset has been supplemented with microresonators of low variability compared with the case of 50 features. However, the number of such resonators remains minor compared with those showing a clear distinction in the dynamics.
The aforementioned results reveal that an optimum number of features in the training set exists. This makes it possible to obtain a high prediction rate for the minimal duration of the measured temporal variations in the WGM data obtained at the fixed frequency. The generalized strategy for defining the required number of significant features for training the LSTM network can be formulated as the need to pick all the microcavities that demonstrate clearly distinguishable dynamical changes for different external conditions. Accordingly, the number of features in the reported experimental set can be reduced from 181 to the first 50 significant ones.

LSTM Layer Complexity Optimization
The second study is intended to determine the optimal number of hidden LSTM units that ensures low-error-rate classification of the sequences within the minimal time. The results of classifying the sequence dataset with 50 significant features and varying duration (from 140 up to 280 s) using LSTM networks of different architectural complexity are shown in Figure 6.
The obtained results demonstrate a clear correlation between the number of hidden units and the prediction accuracy. An increase in the number of hidden units in the LSTM network from 5 to 10 is followed by a shift of the 95% mean detection rate level from 230 s of the required sequence duration down to ≈200 s. Among these architectures, the 99% detection rate in the case of 10 hidden units is observed for an input sequence duration of ≈275 s. The results for more complex network architectures with 15, 25, and 30 hidden units exhibit minor variability in the sequence duration required to reach the 95% and 99% accuracy. The first level of the correct detection rate is reached for a duration of the dynamical variations of ≈190 s and the second at ≈250 s. Accordingly, 15 hidden units in the architecture of the LSTM network are determined to be sufficient to reach the best prediction accuracy at the minimum duration of the training sequence.
When analyzing the error rates for the individual output states, the difference among them in the time required to reach the desired accuracy rate is examined. Here, the shortest sequence length to reach the 95% level, equal to ≈150 s, has been observed for the first output state (1% ethanol) and the longest, ≈250 s, for the fourth (4% ethanol). This is predominantly caused by the nature of the experiment (Figure 3), where a lower gradient between the incoming and the present liquids in the sensing chamber results in an earlier steady state.
As the part of the experimental sequence before 150 s has been excluded from the LSTM training process, this value has to be added to the actual sequence duration. At the same time, as discussed in Section 3.1 and shown in Figure 3, the first 250 s are required to transport the agent to the sensing chamber and have to be excluded. Thus, the measured time series of dynamical variations can be accurately classified with 95% correctness within the first 90 s after the start of the reaction in the chamber. A prediction rate of 99% is possible already after 150 s. Accordingly, accurate prediction of the external impact that produces a set of dynamical variations of the intensities radiated by the microresonators is possible ≈6 times faster than the timescale (550 s) required to observe the steady state in the microresonator sensor signal. The averaged training time for this configuration of the input sequence and LSTM network is measured to be 12 s and the testing time 0.08 s.
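The timing figures can be retraced with the values quoted in the text: sequence durations of about 190 s and 250 s for the 95% and 99% levels, a window start at 150 s, a transport delay of about 250 s, and a steady state from about 800 s. The variable and function names below are illustrative.

```python
window_start_s = 150     # part of the record excluded before the training window
transport_s = 250        # time needed to bring the agent to the sensing chamber
steady_state_s = 800     # approximate onset of the steady state in the record


def time_after_onset(sequence_s):
    """Convert a required sequence duration into time after the reaction starts."""
    return sequence_s + window_start_s - transport_s


t95 = time_after_onset(190)                  # 95% accuracy level
t99 = time_after_onset(250)                  # 99% accuracy level
steady_wait = steady_state_s - transport_s   # waiting time for the steady state
speedup = steady_wait / t95
```

The values give 90 s and 150 s after the onset of mixing, compared with 550 s for the steady state, which corresponds to a speed-up of roughly a factor of six at the 95% level.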

Conclusion
This article demonstrates the first example of the application of recurrent neural networks to build an intelligent WGM sensor capable of quantifying external impacts from the time series dynamics of the sensor response. This has been realized by combining the LSTM network with an affordable instrumental approach based on a multiple microcavity imaging sensor illuminated by a diode laser at a fixed frequency. The reported method for quantifying the parameters of an external impact characterized by dynamical changes in the measured signal generalizes the feedforward network-based intelligent WGM sensing solution. [21] From the viewpoint of the LSTM-based approach, the latter is a limiting case in which a set of the microresonator signals under the equilibrium condition is used for prediction.
By optimizing the number of microresonator signals considered for the LSTM network training together with the complexity of its architecture, we demonstrated the discrimination of a set of complex, nonlinear, highly specific time series signals of the intensities radiated by the microcavities within a timescale substantially shorter than that required to reach the steady state. The exact portion of the time series data required for precise estimation of the external parameters is expected to depend on the dynamics of the measured response, the sensor spectral features, and the difference in the parameters to be predicted. At the same time, unlike in the common study of spectral variations of the microresonators, here the presence of additional subpeaks in the initial WGM spectrum is beneficial, as it provides more variability (being modulated by the nonlinear function) between the measured temporal variations. In this article, for the reported example of five different dynamical variations produced by the temporal gradient of the refractive indexes with a relative difference of 5.5 × 10⁻⁴ RIU due to slow mixing of the solutions inside the flow chamber, a decision on the concentration of the incoming analyte with >95% accuracy is possible 6 times faster than the steady state is observed.
The reported WGM-based sensing device supplemented by the LSTM network processing engine is a highly promising solution within the emerging field of ML-inspired sensing devices and fulfills the main critical requirements such as affordability, reduced instrument complexity, and fast response time. The discussed mechanism for accurate prediction of the external impact using a part of the captured set of temporal variations in the multiple microresonator imaging sensor is not limited to the demonstrated experimental data of a slowly changing bulk refractive index. The proposed LSTM network-based approach can also be directly implemented for other applications, e.g., for prediction of protein concentrations. The latter is possible due to the similar nature of the collected signal for the adsorption of molecules on the resonator surfaces in the fixed frequency illumination scheme. [20] In general, the collected signal in the fixed frequency illumination scheme includes the whole palette of the WGM spectrum variations (spectral shift, linewidth broadening, and mode splitting) that may arise in optical microresonator-based sensing. Thus, the proposed method for the prediction of dynamical changes is applicable to the whole variety of possible sensing tasks where the training data can be collected in an automated manner. Depending on the dispersion and sampling rate of the sensed parameter in the experimental dataset used for the LSTM network training, the prediction problem can transform from a classification into a regression problem. Albeit with reduced efficiency due to the small variability of the dynamical changes among the microresonators, the LSTM-based approach could also be employed with the conventional frequency sweeping scheme. In addition, the method can be extended to sensor types other than WGM that are capable of demonstrating a reproducible dynamical response under the same conditions. Among them, the most promising might be the classification of biomolecule concentrations for affinity-based sensors in different configurations, e.g., electrochemical, surface plasmon resonance, or attenuated total reflectance.

Experimental Section
Multiple Microresonator Sensor: Spherical microcavities (Cospheric LLC) made of soda-lime glass, with diameters varying in the range from 90 to 120 μm, were placed on a 150 μm thick 20 × 20 mm² cover glass substrate with a ≈500 nm water-matched adhesive layer (MyPolymers MY-133-MC). The latter ensured both stable coupling conditions for excitation of the WGMs and minimal energy dissipation in the water-based environment. No special arrangement procedures were required to place the microcavities. A detailed description of the optical microresonator-based sensor fabrication was reported in previous studies. [7,20,21] To enable the WGM detection, each of the fabricated chips was enclosed in a two-component structure made of polyether ether ketone with polydimethylsiloxane (Dow Corning Sylgard 184) layers as gaskets. When assembled, the structure formed a flow chamber whose dimensions were numerically optimized in the CFD Module of Comsol Multiphysics to ensure a homogeneous distribution of the fluid velocities within the whole sensor. A viewing window of 10 × 10 mm² covered with a glass plate enabled the imaging of the microresonators. The loaded quality factors of the individual microspheres on a single chip varied in the range of 10³–10⁶ and the sensitivity to the bulk refractive index ranged from 4 nm/RIU to 20 nm/RIU. [7,20,21] These values were common for glass microresonators of the selected dimensions operating in the spectral range around 785 nm and agreed well with the numerical estimations for the fundamental mode. The variability of both the quality factors and the sensitivity was due to the dispersion of the microresonator properties, the difference in the excitation efficiency, and the order of the excited resonance mode among the microresonators on the sample.
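As a rough illustration of how differently individual microresonators may respond at a fixed frequency, the sketch below compares the resonance linewidth, estimated from the standard relation Δλ = λ/Q, with the spectral shift expected from the bulk sensitivity for the 5.5 × 10⁻⁴ RIU refractive index difference used in this work. Pairing the extreme quality factors with the extreme sensitivities is a hypothetical choice made only to bracket the range; the actual combinations on a given chip are not specified here.

```python
WAVELENGTH_NM = 785.0          # operating wavelength
DELTA_N_RIU = 5.5e-4           # refractive index difference between states


def linewidth_pm(q_factor):
    """Resonance linewidth in picometers from the loaded quality factor."""
    return WAVELENGTH_NM / q_factor * 1e3


def shift_pm(sensitivity_nm_per_riu):
    """Spectral shift in picometers for the refractive index difference."""
    return sensitivity_nm_per_riu * DELTA_N_RIU * 1e3


# Hypothetical pairings that bracket the reported ranges.
cases = {
    "low Q, low sensitivity": (1e3, 4.0),
    "high Q, high sensitivity": (1e6, 20.0),
}

ratios = {}
for label, (q, s) in cases.items():
    # Shift expressed in units of the linewidth: values well above 1 mean the
    # resonance moves fully across the fixed laser line, values far below 1
    # mean only a small intensity change is expected.
    ratios[label] = shift_pm(s) / linewidth_pm(q)
    print(f"{label}: linewidth {linewidth_pm(q):.3f} pm, "
          f"shift {shift_pm(s):.1f} pm, ratio {ratios[label]:.4f}")
```

Under these assumptions the shift-to-linewidth ratio spans several orders of magnitude across the chip, which gives an intuition for the strong variability of the intensity responses among the microresonators.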
Instrument: A BK-7 optical prism with an antireflection coating on the legs was selected for robust excitation of the multiple microresonators in a parallel manner. The fiber-coupled fixed frequency diode laser (Thorlabs LP785-SAV50) with a central wavelength of 785 nm and 10 MHz linewidth was supplemented with a polarization controller (Thorlabs FPC030), after which the collimated beam was guided to the prism (Figure 7).
The incidence angle and polarization state of the laser light were selected to excite the first-order transverse electric (TE) mode (numerically estimated for the average microresonator size of 105 μm). The pressure-based controller (Fluigent LINEUP FLOW EZ) supplemented with the flow rate sensor (Fluigent FLOW UNIT) and the selection valve (Fluigent M-SWITCH) were utilized for sequential pumping of different fluids through the sensor with a constant pumping speed of 100 μL min⁻¹. The variations of the radiated energies due to the changes of the properties of the resonators and/or external medium were collected by the objective and captured with the camera.