Highly Uniform All‐Vacuum‐Deposited Inorganic Perovskite Artificial Synapses for Reservoir Computing

The development of artificial synapses is inspired by the energy‐efficient recognition ability of the central nervous system of living organisms and is proposed for use as the basic building units of next‐generation neuromorphic computing networks. Herein, the perovskite/MoO3/Ag synaptic device exhibits very uniform electrical characteristics, a low operation voltage, and linear and repeatable analog switching. Important synaptic functions such as paired‐pulse facilitation (PPF) and short‐ and long‐term potentiation (STP/LTP) are all successfully and stably implemented in the device. The diffusion of Ag+ and I− ions is believed to contribute to the synaptic behavior of the device. Furthermore, the uniform and stable volatile characteristics are utilized in a proof‐of‐concept reservoir‐computing‐based simulation program using all the experimental parameters; the results strongly suggest that the neuromorphic computing that utilizes such components can simultaneously reduce the circuitry complexity and increase its recognition accuracy with noisy inputs. The findings of this work promise to significantly facilitate the utilization of highly homogeneous vacuum‐deposited perovskite/metal oxide structures in high‐performance neuromorphic computing systems.

inputs. Therefore, RC is inherently suitable for dealing with time-related input signals. Recently, various systems have been used to build RC frameworks. [37] In addition, several time-series tasks, such as digit recognition and chaotic system forecasting, have been solved. [2,38] Halide perovskites are among the most fascinating semiconductors, exhibiting intriguing optoelectronic properties. They have been used to develop very efficient solar cells, light-emitting diodes, and lasers. [39][40][41][42][43][44] The easy migration of ions in the perovskite thin film is believed to be able to mimic the biological synapse. [31] In previous works, we demonstrated the effectiveness of a vacuum process for the production of thin, large, and homogeneous high-quality perovskite films for the fabrication of solar cells. [45][46][47][48][49][50] However, owing to their ease of implementation and versatility, solution processes are used in most studies on halide perovskites, including the pioneering works that revealed the synaptic characteristics of these materials. [29,31,51] In the present study, we demonstrate high-performance artificial synapses with all-vacuum-deposited inorganic halide perovskite/metal oxide/metal structures. The devices were found to exhibit excellent synaptic characteristics, along with superior device-to-device and cycle-to-cycle uniformity and stability. The measured device parameters were further evaluated using an RC-based handwritten digit recognition simulation program and were shown to achieve a recognition accuracy of 89%. Notably, the proposed devices attained an acceptable accuracy of 74% even with exceedingly noisy images. The recognition accuracy was close to that of a traditional non-reservoir NN with a significantly larger network size.
The results also strongly suggested that the device-to-device and cycle-to-cycle variations were the two most important factors determining the recognition accuracy, indicating the promise of all-vacuum-fabricated perovskite artificial synapses for the development of future hardware-based neuromorphic computing systems.
All-vacuum deposition affords great flexibility for the fabrication of a variety of layered structures without having to deal with the dissolution problems associated with solution processes. Cesium-based inorganic perovskites, CsPbX3 (where X denotes a halide), were chosen owing to their compatibility with the vacuum process. [45,50] The ability of the device resistance to change from high impedance to low impedance and vice versa, which is analogous to the potentiation and depression of a synaptic link, was evaluated. Previous studies have revealed that the strong reaction of halides with the metal eventually compromises the stability of the device. [52][53][54][55] To counter this issue, we introduced a thin MoO3 layer between the perovskite and the Ag layer to prevent direct contact between the two, and the device was configured as indium tin oxide (ITO)/CsPbI2Br/MoO3/Ag. Top-view photographs of the devices with and without the MoO3 layer at the CsPbI2Br/Ag interface are shown in Figure S1a,b, Supporting Information, respectively. These devices were stored in a N2 glove box for ≈3 months. The one without the MoO3 layer clearly exhibited a ring-like appearance around the Ag electrode, which was due to the chemical reaction between Ag and the halides in the perovskite under direct contact. The corresponding current (I)-voltage (V) and current (I)-time (T) characteristics of the device are shown in Figure S2a,b, Supporting Information. The device exhibited conspicuous changes in the current under electrical loop and pulse stimulation. Nevertheless, exceedingly high current densities were observed, which are unsuitable for practical low-power applications, irrespective of the change in resistance.
By inserting another thin MoO3 underlayer between the ITO substrate and the CsPbI2Br layer, we found that the device current could be reduced significantly. This is due to the much larger bandgap of MoO3 and the large carrier injection barrier from the perovskite to the MoO3 layer, which efficiently decrease the overall device current. [56][57][58][59] The optimized device not only exhibited synaptic characteristics, but also showed a greatly reduced driving voltage and operation current density, which are highly desirable for low-power computing. The structure and typical I-V sweeping characteristics of the device are presented in Figure 1a,b. The ITO anode and the Ag cathode mimic the pre-synapse and post-synapse, respectively, whereas the electrical stimulation of the device induces the mimetic synaptic behavior. The device resistance decreased under negative sweeping loops and could be restored to the high-resistance state (HRS) via positive sweeping loops.
We investigated the lowest spiking voltage that could be used to drive the device. As is evident from Figure S3a,b, Supporting Information, the device clearly shows a gradual increase in conductance under stimulation by very low −0.1 V pulses and sweeping voltages. This operation voltage is much lower than those of other two-terminal devices and three-terminal synaptic transistors and is among the lowest values observed for any type of artificial synapse (see Table S1, Supporting Information). [1,3,31,51][60][61][62][63][64][65] The energy consumption of the device is ≈2 nJ mm^−2, but a smaller device area can substantially reduce the operating current of the devices, as shown in Figure S4, Supporting Information. Hence, it is believed that a smaller device area can reduce the energy consumption to a value comparable to that of the human brain (≈10 fJ per synaptic event), [51,66] as listed in Figure S5, Supporting Information. It is evident from Figure S6, Supporting Information, that the device exhibits a smooth curve and analogue change under different spiking conditions. Unlike conventional memories, [67][68][69][70] in which the device current changes abruptly from the HRS to the low-resistance state (LRS), artificial synapses are capable of exhibiting continuous change between the HRS and LRS. Continuous resistance changes in the present device between 1000 and 10 000 Ω are shown in Figure 1c; the high stability of the device can also be observed from the figure. The device shows good potentiation linearity, with a non-linearity factor of −0.05, and moderate depression linearity, with a non-linearity factor of −2.8. [1] To fully elucidate how well an artificial synapse mimics the behavior of a natural synapse, it is necessary to assess several cognitive mechanisms of the artificial synapse, especially in the time domain.
www.advancedsciencenews.com www.advintellsyst.com
In a biological neural system, synaptic plasticity, which is the ability of the synapse to adjust its synaptic weight, is an important factor in learning and memory, according to Hebbian theory. [11,71,72] Synaptic weight is the strength of a connection between neurons, which can be mimicked by different conductance states of an electronic device. When a voltage spike stimulates a synapse, the concentration of the neurotransmitters temporarily increases, and the original state is restored with time. If a second stimulating voltage spike arrives within a short time interval (Δt) before the restoration, i.e., before the complete disappearance of the neurotransmitters induced by the first spike, the concentration of the neurotransmitters after the second spike would be higher than that after the first spike. Conversely, if Δt is sufficiently long for the synapse to recover to its original state after the first spike, the neurotransmitter concentrations induced by the two spikes would be similar. This phenomenon is referred to as paired-pulse facilitation (PPF) and is shown in Figure 2a. [73,74] As shown in Figure S7a,b, Supporting Information, for a small Δt, the current after the second pulse is higher than that after the first pulse, whereas the two are the same for a large Δt. Figure S7c, Supporting Information, shows the PPF index as a function of Δt. As can be observed, the PPF index increases with decreasing Δt, and when Δt is sufficiently large, the PPF index decreases to almost 100%. Figure 2b shows the I-T curve of the device under consecutive stimulations by a series of −0.5/+0.05 V pulses. The write and read currents were both found to increase linearly with continuous pulse stimulation, indicating a highly stable PPF behavior. Another basic synaptic parameter is the spiking-rate-dependent plasticity (SRDP), which relates the plasticity of the synapse to the frequency of the input spikes.
[75][76][77] Similar to the PPF, when a train of high-frequency spikes arrives at the neuron, the synaptic weight increases with each successive spike. The SRDP of the present device is shown in Figure 2c,d, which reveals that at sufficiently small Δt values between successive pulses, there is a significant increase in the device current; this behavior is typical of SRDP. The device current after the application of ten pulses is plotted against Δt in Figure S7d, Supporting Information. Figure 2e shows the decay stability of the present device. The decay curves remained unchanged over 10^5 stimulations and under consecutive measurements, thus establishing the reliability of the short-term and long-term stability characteristics for further simulations. Short-term potentiation (STP) and long-term potentiation (LTP) are two important types of plasticity in the human brain. They are considered to reflect the brain's basic functions of memory and learning and are distinguished by their retention times, with STP lasting for tens of milliseconds and LTP lasting for minutes to hours. [78][79][80] We successfully implemented the STP-to-LTP transition behavior, as shown in Figure S8a, Supporting Information. Notably, the LTP current increases with increasing stimulation voltage, and the on/off ratio remains sufficiently high compared with the STP state. In addition, stimulation of the device by a train of −1.8 V pulses produces a long retention time of over 10^6 s (18 days), as shown in Figure S8b, Supporting Information. Figure S9, Supporting Information, shows that the device can still be tuned back to the HRS after the application of a series of 1.5 V pulses, indicating that the device remains intact even under extreme LTP. Figure 3a shows the cross-sectional transmission electron microscope (TEM) image of the fabricated device. The complete separation of the top metal layer from the perovskite layer by a 15 nm thick MoO3 layer can be observed.
The smoothness and uniformity of the device morphology are considered to be the basis of the observed uniform electrical properties. The high quality of the vacuum-deposited CsPbI2Br layer was evident from its lattice diffraction patterns, as shown in Figure 3b. [...] hybrid interfaces between the perovskite and MoO3 layers, [59] and the formation of a high-carrier-mobility (tens of cm^2 V^−1 s^−1) [81] AgI layer at this interface provided a stepwise energy level alignment, which enhanced the charge conductivity of the devices. As a result, we believe that the synaptic mechanism of our device is based on the migration of Ag+ and I− ions, accompanied by AgI formation/annihilation at the CsPbI2Br/MoO3 interface, which modifies the charge transfer properties. [82] Under reverse bias, the Ag+ ions experience an electric field and migrate across the MoO3 layer to encounter the easily migrating I− ions [83][84][85] and form a thin AgI composite, [86] which gradually improves the conductance of the device, owing to the enhanced charge transfer ability. The AgI layer is unstable under weak stimulations, and the ions eventually drift back. This corresponds to the short-term behavior, as shown in Figure 3c. To verify this hypothesis, we replaced the top electrode (Ag) with Au. The I-V curves of these devices are shown in Figure S10a,b, Supporting Information. The absence of synaptic behavior in these devices indicates that the highly mobile Ag+ ions play a crucial role in the synaptic phenomenon. We then substituted the MoO3 underlayer with different carrier transporting layers. The I-V curves in Figures S2a and S10c,d, Supporting Information, exhibit synaptic behavior under distinctive current ranges, thus verifying that the mechanism is unrelated to the underlayer material. Based on these results, vacuum sublimation was used to introduce a thin 2 nm AgI layer at different positions in the device.
As is evident from the results in Figure 3d, the fresh device 1) with deliberately deposited AgI at the perovskite/MoO3 interface exhibits a significantly higher current than the devices 2) without the AgI layer, 3) with AgI inside the MoO3 layer, and 4) with AgI at the MoO3/Ag interface. Hence, we conclude that the formation/annihilation of AgI at the CsPbI2Br/MoO3 interface contributes to the synaptic mechanism. The I-V results of other device structures in Figure S10e-g, Supporting Information, are also consistent with the proposed mechanism. For the STP-to-LTP transition, the AgI layer exhibits enhanced stability under stronger stimulations, as shown in Figure S11, Supporting Information. Energy-dispersive X-ray spectroscopy (EDS) analyses of the Ag/MoO3/CsPbI2Br structure in the HRS (fresh device) and LRS (after applying a large stimulation to reach LTP) revealed trace Ag in the perovskite layer, as indicated in Figure 3e. It has been verified by various groups that halides in perovskite materials are highly reactive with Ag; [68,82,86,87] hence, we believe that after the Ag migration, the AgI layer is formed at the perovskite/MoO3 interface. The LTP results were in good agreement with those of the EDS analyses and the HRS/LRS characteristics of the device.
The STP behavior of the artificial synapses can be implemented for efficient RC. To evaluate the performance of the present device in an RC framework, we conducted a few related measurements before running a simulation program. Only volatile STP was utilized here, and the devices were operated at a relatively small stimulation voltage (−0.3 V) to prevent them from entering the LTP region. Figure 4a shows the results of the 4 bit test. We applied various arrangements of "1" (a −0.3 V, 100 ms pulse) and "0" (no additional pulse; the device current decays naturally) to the device, and the device conductance was read with a −0.05 V, 50 ms pulse.
The device responded to the different inputs with distinct final conductance values, thus verifying its ability to distinguish different 4 bit inputs. In Figure 4b, a specific 4 bit pattern is applied repetitively to the device 50 times with a 2 s time interval between successive patterns to obtain the cycle-to-cycle variation. The standard deviation (σ) of the final conductance value was appreciably low (≈2.5%). In Figure 4c, a specific 4 bit pattern is applied to different devices, and the σ of the final conductance values does not exceed 15%, thus indicating a low device-to-device variation. The variation is small compared with other studies (≈15%). [88] The stability of the present device under the same conditions as the 4 bit tests (Figure 2e) supports the reliability of the results.
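The way a volatile device can map each 4 bit pulse train to its own final state can be illustrated with a toy leaky-integrator model. This is only a sketch under stated assumptions: the exponential decay law and the names `PULSE_GAIN`, `DECAY_TAU`, and `final_state` are hypothetical and are not fitted to the measured device.

```python
import math
from itertools import product

# Hypothetical toy parameters; none are fitted to the measured device.
PULSE_GAIN = 1.0   # state increment per "1" pulse (arbitrary units)
DECAY_TAU = 1.5    # relaxation time constant, in units of one bit slot


def final_state(bits: str) -> float:
    """Return the toy reservoir state after a 4 bit pulse train."""
    state = 0.0
    for bit in bits:
        # The state relaxes during every slot (volatile, STP-like memory).
        state *= math.exp(-1.0 / DECAY_TAU)
        # A "1" adds a pulse-induced increment; a "0" only lets the state decay.
        if bit == "1":
            state += PULSE_GAIN
    return state


# Enumerate all 2^4 input patterns and their final states.
patterns = ["".join(p) for p in product("01", repeat=4)]
responses = {p: final_state(p) for p in patterns}

for pattern in patterns:
    print(pattern, round(responses[pattern], 4))
```

With these assumed parameters, every one of the 16 patterns ends in a different state, and the most recent bit carries the largest weight. Real devices need not follow this simple decay law.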
Based on the framework of RC and the measured real-device parameters, we carried out a series of simulated experiments on handwritten digit recognition to evaluate the strength of our device for real-world applications. The recognition system was trained and tested with the commonly used Modified National Institute of Standards and Technology (MNIST) database, which contains a training dataset of 60 000 images and a testing dataset of 10 000 images. Before being fed to the dynamic reservoir, the input images went through a pre-processing stage. Based on our optimization results (Figure 4a), we decided to divide each row into seven 4 bit sections, which leads to a total of 196 sections (seven sections per row multiplied by 28 rows), and each 4 bit section has 2^4 = 16 possible patterns. This setting offers a good tradeoff between efficiency and accuracy. From the efficiency perspective, there is a tremendous reduction in the size of the recognition system (i.e., the NN), and the recognition is more efficient. From the accuracy perspective, the 2^4 possible combinations remain distinguishable even in the presence of variation, and thus, high recognition accuracy can be maintained. We conducted two sets of experiments on handwritten digit recognition. In the first experiment, we focused on comparing our RC-based approach against a traditional NN in terms of recognition accuracy. The details of the RC framework setup are presented in Figure 5a and in the Experimental Section. The setup of the traditional NN is shown in Figure 5b. Furthermore, we introduced the commonly encountered salt-and-pepper noise into the testing images to showcase the overall performance of the device in a more realistic scenario. The results are presented in Figure 5c and Table S2, Supporting Information.
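Salt-and-pepper noise of a given percentage can be sketched as follows. This is a minimal illustration, not the simulation program used in this work: the function name `add_salt_and_pepper`, the equal salt/pepper probability, and the all-black placeholder image are assumptions.

```python
import random


def add_salt_and_pepper(image, noise_fraction, rng):
    """Return a copy of a binary image with a fraction of pixels replaced.

    Each selected pixel is set to 0 or 1 with equal probability
    (an assumed convention; the source does not specify one).
    """
    rows, cols = len(image), len(image[0])
    noisy = [row[:] for row in image]
    n_corrupt = round(noise_fraction * rows * cols)
    # Sample distinct pixel positions so that exactly n_corrupt pixels are touched.
    for index in rng.sample(range(rows * cols), n_corrupt):
        r, c = divmod(index, cols)
        noisy[r][c] = rng.choice((0, 1))
    return noisy


rng = random.Random(0)                    # fixed seed for repeatability
clean = [[0] * 28 for _ in range(28)]     # placeholder all-black 28 x 28 image
noisy = add_salt_and_pepper(clean, 0.25, rng)
print(sum(px for row in noisy for px in row), "pixels flipped to white")
```

At 25% noise, 196 of the 784 pixels are selected; roughly half of them change value on an all-black image because a selected pixel may be redrawn with its original value.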
Upon comparison of the recognition accuracy of our RC-based approach with that of the traditional NN, it is evident that there is some accuracy loss (≈3-5.7%), depending on the percentage of noise in the testing images. However, as mentioned earlier, based on the RC framework and our experimental settings, the RC-based simulation only requires a dynamic reservoir and a readout function. The former does not need any training, and the latter is a 196 × 10 single-layer network. In contrast, the traditional NN is a 784 × 10 single-layer network, which is four times the size of the RC-based readout network and therefore incurs significantly higher costs in terms of computation and energy. Considering the overall system size, it is clear that our RC-based approach would offer a significantly reduced training time. Meanwhile, the recognition accuracy remains promising, with minimal accuracy loss in comparison with a much larger system.
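The size comparison above is simple arithmetic, sketched here for concreteness. The helper `readout_weights` is a hypothetical name, and bias terms are ignored as an assumption.

```python
def readout_weights(n_inputs: int, n_classes: int = 10) -> int:
    """Number of weights in a single-layer, fully connected readout (biases ignored)."""
    return n_inputs * n_classes


rc_inputs = 7 * 28    # seven 4 bit sections per row, 28 rows
nn_inputs = 28 * 28   # one input per pixel

rc_weights = readout_weights(rc_inputs)
nn_weights = readout_weights(nn_inputs)
ratio = nn_weights / rc_weights
print(rc_weights, nn_weights, ratio)
```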
In the second experiment, the aim was to demonstrate the effects of imperfect devices and noisy images on recognition accuracy. To this end, we not only considered noisy images, but also introduced device-to-device and cycle-to-cycle variations during the training and testing procedures. Figure 5d and Table S3, Supporting Information, present the results of the second experiment. The value of σ was set at 0.15, based on the measurements of our device instances. This means that the device-to-device and cycle-to-cycle variations of our device instances, when considered jointly, are of this magnitude and do not exceed 0.15. In contrast, the settings of σ = 0.5, 1, and 2 represent devices with much higher variation, which significantly exceeds that of our device.
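One plausible way to inject such variation into a simulation is sketched below. It assumes the variation is multiplicative (σ as a fraction of the nominal value); the text only states that normally distributed variation was applied to the reservoir outputs, so `apply_variation` and the placeholder data are hypothetical.

```python
import random


def apply_variation(outputs, sigma, rng):
    """Perturb each reservoir output with zero-mean Gaussian relative noise.

    Assumption: the variation is multiplicative, i.e. sigma is a fraction of
    the nominal value. An additive model would be an equally valid reading.
    """
    return [g * (1.0 + rng.gauss(0.0, sigma)) for g in outputs]


rng = random.Random(1)
nominal = [1.0] * 196                          # placeholder reservoir outputs
ideal = apply_variation(nominal, 0.0, rng)     # sigma = 0: outputs unchanged
measured_like = apply_variation(nominal, 0.15, rng)
print(min(measured_like), max(measured_like))
```

Applying the same function with σ = 0.5, 1, or 2 gives the high-variation cases discussed in the text.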
Compared with the ideal case of σ = 0, the accuracy loss (as a result of σ = 0.15) is consistently marginal (<0.7%) across all noise levels. The results confirm that the device-to-device and cycle-to-cycle variations of our device instances are well controlled, such that the accuracy of a real-world recognition system will not be noticeably affected in the presence of variations. As a reference, when σ = 0.5, a significant drop in the accuracy can be observed. In addition, larger noise levels result in a more significant drop. For the cases of σ = 1 and 2, the resulting accuracy is unacceptable. Based on the experimental results and our observations, we are confident that the device uniformity plays a significant role in the success of RC-based recognition tasks. Owing to the highly uniform characteristics of our device, we achieved promising recognition accuracy, even in the case of exceedingly noisy data. Therefore, our device has tremendous potential for real-world applications.
Figure 4. a) 4 bit test result. Various arrangements of "1" (a −0.3 V, 100 ms pulse) and "0" (no additional pulse; the device current decays naturally) are applied to the device, and the device conductance is measured with a −0.05 V pulse of duration 50 ms. The clearly separated final conductance values of different inputs prove the potential of the device for utilization in an RC network. b) Cycle-to-cycle variation measurement. A specific "1101" pattern is repetitively applied 50 times to a single device, and the standard deviation of the final conductance value is 2.38%. The inset shows the cycle-to-cycle variation of the "1010" input pattern with a standard deviation of 2.81%. c) Device-to-device variation measurement. A "1101" pattern is applied to over 20 devices, and the standard deviation of the final conductance value is 5.17%. The inset shows the device-to-device variation of the "1010" input pattern with a standard deviation of 14.4%.
Figure 5. a) Illustration of the RC framework. The handwritten digits are divided into smaller sections in the pre-processing stage and fed into the dynamic reservoir. The reservoir generates a total of 196 reservoir outputs, and a 196 × 10 single-layer NN can be utilized with 196 input neurons from the reservoir outputs and ten output neurons corresponding to ten classes of digits (0-9). b) Illustration of the traditional NN with a much larger 784 × 10 single-layer network, leading to much higher costs in terms of computation and energy. c) Accuracy results of handwritten digit recognition (MNIST) with our RC-based approach and the traditional NN. Both were tested with four different percentages of noise in the testing images. The accuracy of the RC-based approach is slightly lower than that of the traditional NN but still acceptable, and the network size and training cost are greatly reduced. The inset shows a series of handwritten digit "5" images with noise varying from 0% to 25%. d) Introducing cycle-to-cycle and device-to-device variation into the RC framework. Both types of variation were modeled as normal distributions and applied to the reservoir outputs to simulate the non-uniformity among real devices, where a larger σ value implies higher variation among devices, and vice versa. Compared with the ideal case (σ = 0), the accuracy loss of our worst real-device variation parameters (σ = 0.15) is consistently marginal (<0.7%) across all noise levels, confirming that the device-to-device and cycle-to-cycle variations of our device are well controlled and do not severely affect the accuracy of a real-world recognition system. e) The accuracy results with a new assignment strategy of input patterns to device conductance.
After elucidating the impact of variation and noise on recognition accuracy, we delved deeper into the interaction between these two properties from the perspective of software simulation. Namely, we set σ to 0.15 and investigated two different training approaches: variation-aware training and variation-unaware training. Furthermore, we applied different percentages of noise to the testing images, ranging from 0% to 25%, to perform in-depth testing of the two training approaches. For clarity, variation-aware training means that the device variation is already considered during the training process. Therefore, we apply the variation to the reservoir outputs of all the training images by setting the σ value to match the degree of device variation. On the other hand, variation-unaware training implies that no variation is added to the reservoir outputs; in other words, the σ value is set to zero during the training process. Figure S12 and Table S4, Supporting Information, present the experimental results. It is evident that the variation-aware training approach consistently outperformed the other approach. Moreover, as the noise percentage grows, the gap between the two curves increases. For instance, at 0% noise, the gap is ≈1%, and at 8% noise, it increases to ≈1.5%, following which it continues to increase and reaches an accuracy difference of 3.5% at 25% noise. Therefore, based on the observations, variation-aware training can be concluded to be the superior approach for simulations with imperfect devices. With this approach, we can not only achieve higher recognition accuracy across the board, but also reduce the accuracy loss caused by noisy images. Up to this point, we have shown various promising results and valuable findings about the proposed RC-based system. However, it came to our attention that we were using a naïve assignment strategy, which assigns input patterns (from the image) to the device's RC conductance (from measurement) in unaltered order, that is, input pattern (0000) to the device's RC conductance of (0000), input pattern (0001) to the device's RC conductance of (0001), etc.
We believe that with a more careful assignment, better performance for any given recognition task can be expected. Therefore, to further explore the potential of the RC-based system, we devised a new assignment strategy, as described below. First, we define the importance of an input pattern: a 4 bit input pattern is more important if it occurs more frequently and thus needs to be clearly distinguished from other patterns (for better recognition of the whole image); input patterns (0001), (0011), (0111), (1000), (1100), and (1110) are such important patterns. In contrast, input patterns such as (0010), (0100), (0101), (1010), (1011), and (1101) are relatively less important. We can then obtain the importance of each pattern based on simple statistics and, finally, assign a larger RC conductance to a more important input pattern. By doing so, the input patterns that are more important and need to be clearly distinguishable can be better recognized. Here, we would like to emphasize again that in the original naïve assignment strategy, the RC conductance was not assigned based on any deliberate strategy. Instead, it was assigned in unaltered order, regardless of whether an input pattern was complicated/important. Figure 5e and Table S5, Supporting Information, present the experimental results. By comparing Figure 5e (Table S5, Supporting Information) with Figure 5d (Table S3, Supporting Information), it is clear that under every experimental configuration, the new assignment strategy outperformed the original naïve assignment strategy. Moreover, the new assignment strategy is particularly beneficial for noisy images. A comparison of Figure 5e (Table S5, Supporting Information) and Figure 5d (Table S3, Supporting Information) shows that when σ = 0 and noise = 0%, there is a 0.75% accuracy improvement.
However, when σ = 0 and noise = 25%, the accuracy improvement is up to around 7.5%, which is much more significant than in the noiseless case. In summary, with the newly designed assignment strategy, we not only improved the overall recognition accuracy, but also strengthened the fault tolerance of the RC-based system.
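The idea of assigning larger conductance values to more important patterns can be sketched as a ranking problem. The sketch below uses raw occurrence counts as a stand-in for importance, which is an assumption: the text derives importance from "simple statistics" without giving the exact rule. The function `assign_conductances`, the section stream, and the conductance levels are all hypothetical.

```python
from collections import Counter


def assign_conductances(sections, conductances):
    """Map each 4 bit pattern to a conductance, most frequent pattern first."""
    counts = Counter(sections)
    # Rank patterns by descending frequency; ties are broken by the pattern string.
    ranked = sorted(counts, key=lambda p: (-counts[p], p))
    # Pair the highest-ranked pattern with the largest available conductance.
    levels = sorted(conductances, reverse=True)
    return dict(zip(ranked, levels))


# Hypothetical section stream and conductance levels (arbitrary units).
sections = ["0000", "0001", "0001", "1100", "0001", "1100", "0101"]
levels = [0.2, 0.9, 0.5, 0.7]
mapping = assign_conductances(sections, levels)
print(mapping)
```

A different importance measure can be swapped in by changing only the sort key.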
On a final note, given 1) a specific application of RC; 2) N for dividing each row of input data into multiple N-bit sections; and 3) measurements of the 2^N RC conductance values, there exists an optimal assignment leading to the best accuracy. Finding the optimal assignment efficiently is an algorithmic problem and is not specifically addressed in this work.
However, to showcase the strength of our device and our training methodology in more detail, we present two additional sets of pairwise comparisons. In both sets, as shown in Figure 6, we present the testing images and their corresponding distributions of output probabilities under different configurations. In the output probability distributions/histograms, the probabilities (y-axis) of the ten output neurons represent the probabilities of recognizing the testing image as digits 0-9 (x-axis). The highest probability determines the result of recognition.
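The step from ten output-neuron values to a recognized digit can be sketched as follows. The use of a softmax to obtain probabilities is an assumption, since the text does not state how the output probabilities were computed; the scores are made-up illustrative values.

```python
import math


def softmax(scores):
    """Convert raw output-neuron scores into probabilities that sum to 1."""
    shift = max(scores)                       # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def recognize(scores):
    """Return (digit, probabilities), where digit has the highest probability."""
    probs = softmax(scores)
    digit = max(range(len(probs)), key=probs.__getitem__)
    return digit, probs


# Hypothetical scores for the ten output neurons (digits 0-9).
scores = [0.1, 0.3, 2.5, 1.9, 0.0, 0.2, 0.4, 0.1, 0.6, 0.3]
digit, probs = recognize(scores)
print(digit, round(probs[digit], 3))
```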
The first comparison is between the simulation results of two different training methodologies: variation-aware and variation-unaware. In the variation-aware methodology, device instances in the system are assumed to be imperfect and to differ from one another. The deviation across all device instances follows a normal distribution with a device variation of σ = 0.15. In contrast, the variation-unaware methodology assumes the ideal case, where all device instances are (and behave) exactly the same. As shown in Figure 6a,b, with variation-aware training, the system can recognize the testing images accurately, even when the image is exceedingly noisy. However, with variation-unaware training, the system failed to do so. Therefore, it is clear that for RC-based simulation, variation-aware training is superior to variation-unaware training and compensates for the variation-induced accuracy loss. A similar experiment is shown in Figure S13, Supporting Information.
The second comparison is between simulation results using devices with two different degrees of variation. To be precise, the σ values of the devices were set to 0.15 and 0.5, and the variation-aware training methodology was used. As shown in Figure 6c,d, the recognition results are incorrect when σ = 0.5. On the other hand, when σ = 0.15, the system recognizes the same images correctly. Based on these observations, we conclude that even with variation-aware training (the superior training methodology), when the device variation surpasses a certain degree, the resulting accuracy loss cannot be compensated. Therefore, devices with large variation will inevitably lead to less satisfactory recognition accuracy. Another example is showcased in Figure S14, Supporting Information. Overall, the variation of our device is relatively small, with the σ value not exceeding 0.15. In conjunction with variation-aware training, we can realize an RC-based system with exceptional overall performance, which exhibits two highly desirable aspects: 1) marginal loss in accuracy and 2) significant savings in energy consumption.
In summary, we fabricated an all-vacuum-deposited perovskite/MoO3/Ag artificial synaptic device. The device was characterized by analogue switching behavior, a broad dynamic range, repeatable resistance control, and a low operation voltage. All its parameters exhibited highly desirable values, comparable with those of state-of-the-art artificial synapses. Moreover, the device exhibited crucial synaptic features, such as PPF, SRDP, and STP-to-LTP behavior. The results of TEM and EDS investigations performed on the AgI layer of the device indicated that the diffusion of Ag+ and I− ions was the basis of the analogue synaptic behavior of the device. The measured parameters of the device were used for the simulation of RC-based handwritten digit recognition. By utilizing a much smaller 196 × 10 single-layer network, we demonstrated a surprisingly high recognition accuracy, even with substantially noisy inputs. The results highlighted the advantage of the device uniformity and low cycle-to-cycle variation of the vacuum-deposited perovskite artificial synapses. The encouraging simulation results and the promising device performance demonstrate the strong potential of the proposed all-vacuum-deposited perovskite artificial synapse for the development of future neuromorphic computing networks.
Device Characterization: The electrical characteristics of the devices were measured in a nitrogen glove box using a Keithley 2636B SourceMeter and a Keysight B1500A semiconductor device analyzer with the B1530A waveform generator/fast measurement unit (WGFMU). All the characterization processes were carried out in a dark environment to avoid the effects of the photocurrents generated by the perovskite film. TEM lamella samples were prepared using a dual-beam focused ion beam system (FEI, HELIOS-660). The final TEM lamella sample thickness was ≈50 nm. TEM and EDS analyses were conducted using JEOL JEM-2800 equipment.
Network Simulation: In the pre-processing stage, each original 28 × 28 gray-scale handwritten digit image was converted into a binary-pixel (black-and-white) counterpart. Following this, we divided each row of the image into smaller sections; compared with directly feeding the entire undivided row into the reservoir, this effectively reduces the number of possible input patterns from 2^28 to a much smaller number, which significantly increases the distinguishability for the reservoir. Based on our optimization results, we decided to divide each row into seven 4 bit sections.
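The pre-processing steps described above can be sketched as follows. The binarization threshold of 128 and the function names `binarize` and `split_into_sections` are assumptions for illustration; the text does not state the threshold used.

```python
def binarize(image, threshold=128):
    """Convert a gray-scale image (0-255) to binary pixels. The threshold is an assumption."""
    return [[1 if px >= threshold else 0 for px in row] for row in image]


def split_into_sections(binary_image, n_bits=4):
    """Split every row into consecutive n_bits-wide sections, returned as bit strings."""
    sections = []
    for row in binary_image:
        for start in range(0, len(row), n_bits):
            sections.append("".join(str(b) for b in row[start:start + n_bits]))
    return sections


gray = [[0] * 28 for _ in range(28)]   # placeholder 28 x 28 image
gray[0][:4] = [0, 200, 255, 30]        # a few illustrative pixel values
sections = split_into_sections(binarize(gray))
print(len(sections), sections[0])
```

Each 28-pixel row yields seven 4 bit strings, so one image produces 7 × 28 = 196 sections, each taking one of 2^4 = 16 values instead of one of 2^28 row patterns.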
Figure 6. a,b) Output probability distributions of an image (digit "2") with a) 0% noise and b) 25% noise. The output probabilities were obtained by two different training methodologies: variation-aware training and variation-unaware training. The image is recognized correctly as "2" with variation-aware training (tallest red bars in both histograms) and is recognized incorrectly as "3" with variation-unaware training (tallest gray bars in both histograms). c,d) Output probability distributions of an image (digit "6") with c) 0% noise and d) 25% noise. The output probabilities were obtained by two different degrees of device variation: σ = 0.15 and σ = 0.5. The image is recognized correctly as "6" when σ = 0.15 (tallest red bars in both histograms) and is recognized incorrectly as "4" when σ = 0.5 (tallest gray bars in both histograms).
Based on the aforementioned pre-processing steps, for each handwritten digit image, the dynamic reservoir generates a total of 196 reservoir outputs (7 sections × 28 rows). Therefore, the second part of the RC framework (the readout) can be realized by a 196 × 10 single-layer NN with 196 input neurons from the reservoir outputs and ten output neurons corresponding to ten classes of digits (0-9). The readout function was trained with 60 000 images from the training dataset of MNIST for 4 000 epochs and tested with 10 000 images from the testing dataset of MNIST.
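A single-layer readout of this kind can be sketched as a softmax classifier trained by gradient descent. This is a toy illustration under stated assumptions, not the training code used in this work: the cross-entropy loss, the learning rate, the epoch count, the two-class synthetic data, and the names `predict` and `train_readout` are all hypothetical.

```python
import math
import random


def softmax(scores):
    shift = max(scores)                       # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(weights, x):
    """Class probabilities for one reservoir-output vector x."""
    return softmax([sum(w * xi for w, xi in zip(row, x)) for row in weights])


def train_readout(samples, labels, n_classes, epochs, lr):
    """Train a single-layer softmax readout with per-sample gradient descent."""
    n_inputs = len(samples[0])
    weights = [[0.0] * n_inputs for _ in range(n_classes)]  # one row per class
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            probs = predict(weights, x)
            for k in range(n_classes):
                # Gradient of the cross-entropy loss with respect to the class-k score.
                grad = probs[k] - (1.0 if k == y else 0.0)
                row = weights[k]
                for i, xi in enumerate(x):
                    row[i] -= lr * grad * xi
    return weights


# Hypothetical toy data: 196 reservoir outputs per sample, two classes.
rng = random.Random(0)
samples, labels = [], []
for _ in range(20):
    label = rng.randrange(2)
    x = [rng.uniform(-0.05, 0.05) for _ in range(196)]
    x[label] += 1.0                  # feature 0 or 1 marks the class
    samples.append(x)
    labels.append(label)

weights = train_readout(samples, labels, n_classes=2, epochs=20, lr=0.1)
accuracy = sum(
    max(range(2), key=predict(weights, x).__getitem__) == y
    for x, y in zip(samples, labels)
) / len(samples)
print(accuracy)
```

For the setting described in the text, `n_classes` would be 10 and the samples would be the 196 reservoir outputs of each MNIST image.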

Supporting Information
Supporting Information is available from the Wiley Online Library or from the author.