Toward Memristive Phase‐Change Neural Network with High‐Quality Ultra‐Effective Highly‐Self‐Adjustable Online Learning

Memristive hardware with reconfigurable conductance levels is a leading candidate for realizing artificial neural networks (ANNs). However, owing to difficulties in device-characteristic design and circuit integration, the ability to perform complicated online-learning tasks on a memristive network is not well understood. Here, tandem (T) material states are harnessed in a phase-change memory (PCM) element, i.e., the primed-amorphous state and the partially-crystallized state, by utilizing an impetus-and-consequent pair pulse that induces a large degree of configurational ordering, and the development of an integrated system for in-memory computing and neural networks (NNs) is illustrated. A correct classification of 96.1% of 10,000 separate test images from the conventional Modified-National-Institute-of-Standards-and-Technology (MNIST) database is achieved in the tandem neural-network (T-NN) model, as well as image recognition for 28×28-pixel pictures. The T-NN configuration exhibits in situ learning with 50% of the elements stuck in the low-conductance state while maintaining an identification accuracy of ≈90%. Theoretical studies reveal the structural origin of the configurational-ordering-enhanced improvement in the conductance uniformity of the T-based memristive element. This work opens the door to a widely relevant hardware system capable of performing artificial-intelligence tasks with high power-time efficacy.


Introduction
[9-12] However, traditional memristive experimental studies have been constrained to reduced problem types.[27] A maximum number of distinguished conductance levels of 2-64 used for NN learning has been shown in the experiments.[30] It has been shown that the test accuracy decreases with an increasing relative injected-noise level.[28] However, the shortage of balanced and high-degree-of-uniformity feedback to input stimuli has hindered the learning capability of memristive network systems.
[39] Herein, to alleviate the difficulties in memristive-conductance modulation, we harness the proposed tandem (T) material states in phase-change memory (PCM) elements, viz., the primed-amorphous state and the partially-crystallized state, by rejuvenating and modifying the crystallization mechanism, and demonstrate a design of a hybrid system for in-memory computing and NNs (Figure 1a). Administering an impetus-and-consequent pair pulse to generate a high degree of thermal configurational ordering is the strategy harnessed in this work, which facilitates highly uniform nucleation and growth for a series of subsequent pulses. This process allows the network to adjust and improve its dataset continually as additional training data are provided, which substantially enhances the training accuracy and the resilience to imperfections. A correct classification of 96.1% of 10,000 distinct test images was achieved in the T-NN variety, after training on 50,000 specimens from the Modified National Institute of Standards and Technology (MNIST) handwritten-digit dataset using an in situ algorithm. The T-NN design achieves image classification on 28 × 28-pixel pictures, along with in situ learning with 50% of the elements stuck in the low-conductance state, while retaining a classification accuracy of ≈90%, which illustrates the self-adjusting ability of the online-learning algorithm to hardware inadequacies. A distinguishable conductance-level number of ≈15 was also attained in the T-based memristive element. Theoretical studies have elucidated the crystallization kinetics and the structural origin of the configurational-ordering-facilitated enhancement in the conductance uniformity of the T-based memristive element. These findings suggest that analogue-memristive networks are able to attain an excellent training accuracy with a potentially substantial enhancement in the power-time efficacy.

Results
Balanced and High-Degree-of-Uniformity T Material-State Modulation.[54] The primed state can be distinguished from the amorphous state by their altered responses to thermal or electrical stimuli, although a negligible variation occurs in the electrical conductance.[57] Furthermore, the general topic of substantial changes in the macroscopic character of material states resulting from small variations in their atomic arrangement continues to intrigue the research community.
To achieve highly uniform and balanced conductance with T material states, we utilized the Ge-Sb-Te (GST) PCM system, viz., the material Ge2Sb2Te5 (Figures S1-S3, Supporting Information). The PCM layer was sandwiched between a heater electrode and a counter electrode, thereby creating a mushroom-type device configuration (Figure 1a). A 40 nm-thick TiW counter electrode was utilized as the starting material, on which a 50 nm-thick GeSbTe layer was deposited. Subsequently, a 40 nm-thick insulating layer was deposited, patterned, and etched to generate ≈200 nm-wide pores. A 40 nm-thick TiW heater electrode was finally deposited to complete the device structure.
The large conductance increase after an impetus pulse in the T material state may be understood through a growing cluster distribution: the primed-amorphous states can be represented by cluster distributions that provide the basis of the kinetic theory of nucleation. The primed-amorphous states may be described in terms of the degree of medium-range order in a disordered-network system, which varies transiently and spatially after a stimulation on a microscopic scale.[60] A thermally-induced event in the growth of clusters results in a time dependence, which involves atomic attachment or detachment through thermal fluctuations. The layer temperature increases after administering electrical pulses through Joule heating in the PCM layer. After a weak impetus pulse is administered, the cluster distribution n(t) continues to grow until time t, when the consequent pulse is applied. When the size of the largest clusters in n(t) lies nearer to the critical nucleation size, r_c, the number of clusters that grow past r_c at the temperature associated with the consequent pulse becomes higher (Figure 1b). As a result, n(t) at the time of administering the consequent pulse determines the conductance enhancement, which describes the large conductance increase for every impetus pulse (Figure 1c). The right-most clusters in the distribution, viz., the largest clusters, control the conductance increase after a consequent pulse, although the effect of the spatial temperature distribution, anticipated to appear in the PCM layer after administering set pulses, may be included in the broad cluster distribution depicted in Figure 1b. The crystallized clusters generated after the consequent pulse can then grow upon applying a series of specified-amplitude consequent pulses, resulting in an increase in the layer conductance (Figure 1d).
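The pair-pulse picture above can be illustrated with a toy nucleation model (the cluster counts, growth increments, jitter, and critical size below are hypothetical, chosen only to exhibit the mechanism): an impetus pulse shifts the cluster-size distribution n(t) toward the critical size r_c, so that the consequent pulse pushes far more clusters past r_c than either pulse alone would.

```python
import numpy as np

rng = np.random.default_rng(0)
r_c = 10.0                       # hypothetical critical nucleation size

def apply_pulse(sizes, mean_growth, jitter):
    """One thermal event: clusters grow on average, with fluctuations
    modelling atomic attachment/detachment."""
    return sizes + mean_growth + jitter * rng.standard_normal(sizes.shape)

sizes = rng.uniform(1.0, 6.0, 10_000)                    # sub-critical clusters
sizes = apply_pulse(sizes, mean_growth=3.0, jitter=0.5)  # impetus pulse
primed = np.mean(sizes > r_c)    # almost no cluster is super-critical yet
sizes = apply_pulse(sizes, mean_growth=3.0, jitter=0.5)  # consequent pulse
grown = np.mean(sizes > r_c)     # many clusters have now crossed r_c
```

The fraction of super-critical clusters jumps only after the consequent pulse, mirroring how n(t) at the moment of the consequent pulse determines the conductance enhancement.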
[63] This mechanism can be extended to the continuous melting induced by a sequence of targeted-amplitude partial-reset pulses. Heating the PCM layer beyond the melting point at the glass-crystal interface results in a substantial population of structural defects, e.g., dangling bonds, a "thermal shock" in the crystallized PCM layer, and bond breaking. The disordered state becomes "frozen-in" as a result of the rapid quenching after an electrical pulse, leading to a conductance decrease. The short injections of energy provided by subsequent pulses result in a delayed reordering, while inadequate thermal energy is provided during rest periods to allow the atomic diffusion needed to destroy disordered structures over a short period, given that the disordered configuration is metastable. The delayed reordering process involves the destruction of chemically disordered, i.e., "frustrated", structural entities, which readily reassemble as ordered units and subsequently function as attachment points for additional atoms. Thus, a succession of designated-strength subsequent pulses reduces the number of ordered entities, decreasing the degree of structural order in the PCM layer, which correspondingly results in a decreased layer conductance (Figure 1e).
The key achievement of this work utilizing T material-state modulation is depicted in Figure 1d,e, wherein selected combinations of impetus-and-consequent pulses and partial-reset pulses are utilized to enhance the PCM performance. The PCM layer exhibits a highly uniform conductance increase in the presence of a strong impetus pulse, based on a decreasing staircase-pulse type, with an increasing number of pulses, as shown in Figure 1d. Moreover, when partial-reset pulses in accordance with an increasing staircase-pulse archetype are implemented with an increased pulse number, a highly uniform conductance decrease results (Figure 1e). Based on these findings, and to achieve a highly uniform and balanced conductance variation, we utilized the strong-impetus decreasing staircase-pulse type and the partial-reset increasing staircase-pulse type. The data simulation of a T material-state-based NN model/element assemblage (we call it the T-NN model) was performed using the electrical character of a PCM element based on the conductance-alteration strategy utilized in this work. The element was in the initial amorphous state, and decreasing-type voltage stimuli were applied to the PCM element to increase the conductance, with the number of stimuli and the stimulus amplitudes determining the resulting element conductance (Figure 1d). The conductance was decreased by switching the element to the fully crystallized state and then administering increasing-type voltage stimuli with a specified stimulus number and amplitude to reset the PCM element to the targeted conductance level. For this work, it is of interest to train the T-NN type with an online algorithm. To recognize handwritten digits in the MNIST database, the T-NN configuration was trained utilizing a stochastic-gradient-descent (SGD) algorithm.[64-66] The T-NN model was trained online. The network infers the log-probability of each label output utilizing the softmax function for each new training data sample, updating the weights of each layer accordingly for future data at each step.
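The staircase programming strategy described above can be illustrated with a minimal sketch (the autocatalytic growth law `set_step` and all constants are assumptions chosen only to exhibit the idea, not the measured device response): when the conductance gained per set pulse grows with the crystalline fraction already present, a decreasing staircase of amplitudes keeps the conductance step per pulse uniform.

```python
import numpy as np

# Hypothetical device law: crystal growth is autocatalytic, so the
# conductance gained by one set pulse scales with both the pulse amplitude
# and the crystalline fraction already present (proxied here by G itself).
def set_step(G, amp):
    return amp * G

G_MIN, G_MAX, N_LEVELS = 1.0, 16.0, 15   # illustrative bounds, 15 levels
target = (G_MAX - G_MIN) / N_LEVELS      # uniform step between adjacent levels

G, amps, levels = G_MIN, [], [G_MIN]
for _ in range(N_LEVELS):
    amp = target / G        # decreasing staircase: amplitude falls as G grows
    G += set_step(G, amp)
    amps.append(amp)
    levels.append(G)

steps = np.diff(levels)     # per-pulse conductance increments
```

With the increments uniform by construction, the pulse count maps linearly onto a conductance level, in the spirit of the linear programming relation used for the weight updates.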
T-NN Online Training. Every synaptic weight was represented as the difference in conductance between two memristive elements to perform the SGD algorithm in a memristive-element assemblage for the T-NN variety. After inference was implemented by stimulating the memristive elements in the first layer with a series of stimuli, with the stimulus strength representing a picture, the element current of the final layer was read (Figure 2a,b).[69] A nonlinear activation, i.e., the rectified-linear function in the program utilized in this work, was applied by hidden neurons to the weighted sums calculated in the crosspoint after every layer. The desired weight update, Δw, for every layer was applied to the crosspoint after it was computed in the program, based on Equation 1 (Figure 2c).
ΔW_l = −(η/S) Σ_{n=1}^{S} δ_l(n) v_l(n)^T    (1)

where η describes the learning rate, δ_l is the output-error column vector of the lth layer, v_l denotes the input-voltage column vector of the lth layer, S represents the sample size, and n is the sample number. The output-error row vectors were calculated based on Equation 2 for a NN with L layers.
δ_j(L) = y_j − t_j;  δ_j(l) = Σ_i W_ij(l+1) δ_i(l+1), l < L and I_j > 0;  δ_j(l) = 0, l < L and I_j ≤ 0,    (2)

where y denotes the Bayesian probability calculated by the NN, and t_j is 1 if the sample is associated with the class j and 0 otherwise. The NNs optimize the log-likelihood of a correct recognition for every instance in the network calculation. Based on the readout-current values, the error backpropagation was computed in the program.
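A minimal NumPy sketch of this update for the 784-50-10 perceptron (the learning rate, the weight initialization, and the omission of the softmax current scale used on the hardware are simplifying assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
W1 = rng.normal(0.0, 0.05, (50, 784))   # layer-1 weights (G+ minus G- pairs)
W2 = rng.normal(0.0, 0.05, (10, 50))    # layer-2 weights
eta = 0.01                              # illustrative learning rate

def train_step(v1, t):
    """One online step: inference, then the Equation 1/2 updates (S = 1)."""
    I1 = W1 @ v1                        # weighted sums in the first crosspoint
    v2 = np.maximum(I1, 0.0)            # rectified-linear hidden activation
    I2 = W2 @ v2                        # output-layer readout "currents"
    e = np.exp(I2 - I2.max())
    y = e / e.sum()                     # softmax probabilities
    d2 = y - t                          # Equation 2, output layer
    d1 = (W2.T @ d2) * (I1 > 0)         # Equation 2, hidden layer (ReLU mask)
    dW2 = -eta * np.outer(d2, v2)       # Equation 1 with sample size S = 1
    dW1 = -eta * np.outer(d1, v1)
    return y, dW1, dW2

v = rng.uniform(0.0, 0.2, 784)          # pixel intensities as input voltages
t = np.zeros(10); t[3] = 1.0            # one-hot target, digit "3"
y, dW1, dW2 = train_step(v, t)
W1 += dW1                               # in hardware: programming pulses
W2 += dW2
```

After one update, the probability assigned to the target class rises, which is exactly the per-sample behaviour the online algorithm relies on.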

MNIST Handwritten-Digit Recognition with T-NN.
To train the T-NN design on the MNIST dataset of handwritten digits from 0 to 9, which is a conventional benchmark for assessing next-generation machine-learning algorithms, we generated a two-layer perceptron with 784 input neurons, 50 hidden neurons, and 10 output neurons by using twenty-eight 64 × 64 element assemblages. The full MNIST image, with a size of 28 pixels by 28 pixels, was utilized (Figure 3a). The intensity of every grayscale-picture pixel was represented as a 1D input element vector and converted to analogue voltages in the 0-0.2 V range; to facilitate negative effective weights, the voltage vectors were replicated across the two differential halves of the assemblage (Figure 2b). Twenty-eight 64 × 64 element assemblages were harnessed in the two-layered network, i.e., twenty-six 64 × 64 element assemblages were utilized to create the first layer, and two 64 × 64 element assemblages were used to generate the second layer.
The memristive elements were initially in the low-conductance state, and the memristive networks were trained on 50,000 pictures obtained from the training database for one hundred training cycles. Figure S4 (Supporting Information) reveals the linear-type relation between the programming voltage and the output conductance utilized for every update sequence. The memristive network correctly recognized 96.11% of the 10,000 pictures in a distinct test ensemble (Figure 3b-d; Figure S5, Table S1, Supporting Information).
We performed further computations based on the element parameters, i.e., the conductance/voltage levels, to examine the computing signature of the T-NN model. The MNIST recognition accuracy of the network utilized in this work (the fifteen-conductance-level neural network), i.e., 96.11%, is comparable with that of the same network design trained on the TensorFlow platform (the 32,768-conductance-level neural network),[70,71] viz., 99.10% (Figure 4a). This result indicates that the memristive network exhibits an excellent MNIST classification accuracy.
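The effect of the fifteen distinguishable conductance levels can be emulated in software by snapping trained weights to a uniform grid before inference (the symmetric weight range below is a hypothetical normalization, not a device parameter):

```python
import numpy as np

def quantize(W, levels=15, w_max=1.0):
    """Snap each weight to the nearest of `levels` uniformly spaced values,
    a software proxy for the finite conductance granularity of the element."""
    grid = np.linspace(-w_max, w_max, levels)
    idx = np.abs(W[..., None] - grid).argmin(axis=-1)
    return grid[idx]

W = np.random.default_rng(2).uniform(-1.0, 1.0, (10, 50))
Wq = quantize(W)
```

The quantization error per weight is bounded by half the grid spacing, which is why a fifteen-level network can remain close in accuracy to a full-precision one.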
Using an extensive network with a large number of hidden entities, an enhanced recognition accuracy is predicted in the T-NN model for classifying high-resolution pictures. To identify the 28 × 28-pixel pictures from the MNIST database, we conducted simulations for different numbers of hidden entities. We describe the degree of interpretability of the network by the number of hidden entities; thus, a smaller number of hidden entities is associated with a lower degree of interpretability. As we are interested in the variation of the synaptic weight, we record the classification-accuracy alteration. A clear dependence of the classification-accuracy difference on the degree of interpretability is observed in Figure 4b, with a high degree of interpretability of the network being the most effective and a low degree of interpretability being the least effective. An increased classification accuracy occurs with an increasing number of hidden entities owing to the increase in the degree of interpretability (Figure 4b). This means that hidden entities with a high degree of interpretability could facilitate learning and enable correct image recognition in the T-NN type. In a two-layer memristive network with 784 input neurons, 200 hidden neurons/entities, and 10 output neurons used to describe the synaptic weights, a classification accuracy of 97.49% was attained for the test data upon training on 10,000 pictures. This finding indicates that deeply-connected neural networks can be constructed on a combined platform using various large element assemblages for achieving high classification accuracy and more complex workloads.
The multi-layered T-NN configuration trained with the in situ program is robust to element inadequacies. Experiments have revealed that the percentage of memristive elements stuck in the high-conductance state or the low-conductance state is 1-50%.[71,72] Based on these findings, we utilized a percentage of memristive elements stuck in the high-conductance or low-conductance state of 1 to 50%. In this work, the simulations revealed that when 50% of the memristive elements are stuck in the low-conductance state, a classification accuracy of ≈90% is still maintained using the in situ training, as shown in Figure 4c. The recognition accuracy decreased with an increase in the imperfection ratio for the case where the pretrained weights were administered to the memristive crosspoint, viz., where ex situ training was utilized.[73] However, the conventional methodology requires modulating the network parameters using unique element information. Furthermore, the in situ training was able to revise the synaptic weights for the case where memristive elements were stuck in the high-conductance state (Figure 4d). Regarding the stuck-state recovery strategy, post-failure transmission-electron-microscopy (TEM) analysis and electrical testing on PCM-based memristive elements suggested that field-induced ion migration was responsible for stuck-set failure,[74,75] but it was unclear which atomic species was most important for electromigration.[76] Nevertheless, a small number of reverse voltage stimuli is required to alleviate the stuck-set failure. A two-stage mechanism for electromigration has been suggested: a first stage in the molten phase-change region, and a second stage during crystallization at low temperatures.
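The stuck-element experiment can be sketched as a fault mask over the weight array (the array size, the 50% fault ratio applied here, and the zero effective weight of a stuck-LO differential pair are illustrative): in situ training keeps applying its updates to the array, the healthy elements absorb them, and the pinned elements simply never change.

```python
import numpy as np

rng = np.random.default_rng(3)
shape = (10, 50)
stuck_lo = rng.random(shape) < 0.5     # 50% of elements stuck in the LO state
G_LO = 0.0                             # effective weight of a stuck-LO pair

def apply_faults(W):
    """The device ignores any value programmed into a stuck cell."""
    W = W.copy()
    W[stuck_lo] = G_LO
    return W

def in_situ_update(W, dW):
    # Updates reach the whole array, but stuck cells stay pinned afterwards.
    return apply_faults(W + dW)

W = apply_faults(rng.normal(0.0, 0.1, shape))        # faulty initial array
W_new = in_situ_update(W, rng.normal(0.0, 0.01, shape))
```

Because the gradient is always recomputed against the array as it actually is, the healthy weights compensate for the pinned ones, which is the mechanism behind the ≈90% accuracy retained in Figure 4c.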
[77] The large enhancement in the power-time efficacy is a potential advantage of harnessing analogue computing in the T-NN type. The ability to implement computing in the same locality harnessed to store the network information in the T-NN variety is a key benefit, which decreases the power cost and latency of retrieving the network parameters needed in the traditional von Neumann design. Another key advantage is the ability of the T-NN model to handle analogue-type data from sensor systems, which decreases the power overhead from analogue-to-digital conversion. Furthermore, a previously unreported combination of the primed-amorphous state and the partially-crystallized state was utilized in the T-NN mode, which resulted in a high degree of conductance/voltage uniformity. This enables the utilization of the stimulus amplitude as the analogue input for the NN layer, decreasing the circuit complexity, and therefore the power consumption, of the output-current readout and hidden units.

Discussion
Applications such as analogue computing are challenging for NNs because of several requirements: 1) high classification accuracy; 2) large input-data-size recognition; 3) highly imperfection-tolerant in situ learning; and 4) a large number of distinguished conductance levels. Currently, none of the available NNs fulfills all the requirements listed above. The examples shown in this work indicate that the current state of the T-NN model is able to achieve most of these requirements with a reasonable accuracy. The key improvement in the T-NN design to enable these applications is the achievement of a classification accuracy of 96.1% on the MNIST dataset with 10,000 separate test images, which is 5.3% higher than the average of ≈90.8% for existing memristive NNs (Figure S6, Supporting Information). As a result, a large proportion of the input data is identified correctly for achieving effective NNs. An enhancement in the input-data size was also achieved. The attainment of image classification with a picture size of 28 × 28 pixels in the T-NN type, which is ≈2 times larger than the average of 14 × 14 pixels for current memristive NNs (Figure S7, Supporting Information), allows the identification of complicated input-data features, facilitating high-quality NNs. Another key advantage of the NN is the imperfection tolerance. For instance, the T-NN variety demonstrates in situ learning with a classification accuracy of ≈90% with 50% of the elements stuck in the low-conductance state, which is much larger than the average of 10% of stuck elements tolerated by state-of-the-art memristive NNs (Figure S8, Supporting Information). This allows the utilization of an online algorithm resistant to a large degree of imperfections for achieving highly self-adaptive in situ-learning NNs. Moreover, the T-based memristive element reveals ≈15 distinct conductance levels, which is 87.5% larger than the average of 8 distinguished conductance levels for current PCM elements utilizing GeSbTe-based layers for NN learning. This enables the utility of a large number of bits per element for achieving highly efficient NNs (Figure S9, Supporting Information).

Conclusion
High classification accuracy, large input-data-size recognition, highly imperfection-tolerant in situ learning, and a large number of distinct conductance levels have been achieved through an electrically driven configurational-ordering process in the T-NN design that utilizes the primed-amorphous state and the partially-crystallized state of a GeSbTe PCM material. This approach is pertinent to all PCM material systems and device architectures, so that a suitable combination of writing strategy and PCM material type can pave the way for enhancing the T-NN performance. The memristive crosspoint archetype is an emerging rapid and energy-efficient technology for next-generation artificial-intelligence workloads and the presumed "mortal" computing, i.e., the utilization of analogue hardware to achieve low energy or cost but at the expense of algorithm loss, as indicated by the demonstrated inference and online-training performance.

Experimental Section
Device Fabrication: A phase-change-memory (PCM) element with a mushroom-type structure and with deposit-only electrodes was fabricated.[78] Each layer was patterned utilizing electron-beam lithography (JEOL). The materials were deposited using a standard sputtering system (Balzer Cube) with RF and DC power sources. A 4″ Si wafer with a 1 μm-thick thermal-oxide (SiO2) coating was utilized as the starting material, on which a 40 nm-thick titanium-tungsten (TiW) counter electrode was deposited. A 50 nm-thick Ge2Sb2Te5 (GST) active layer was deposited, and a 40 nm-thick silicon-nitride (Si3N4) insulating layer was then deposited and etched to generate a via with a diameter of ≈200 nm. Finally, a 40 nm-thick TiW heater electrode was deposited to complete the device structure.
Testing Setup: Electrical testing was implemented utilizing a standard semiconductor-characterization platform (Keithley 4200-SCS). During the alternating-current (AC) pulse testing, the positive voltage was connected to the counter electrode and the heater electrode was grounded. The maximum current range was fixed at 100 μA. The rise and fall times of the pulses were kept the same, at ≈20 ns. The PCM elements were switched ≈10 times prior to each testing mode.
Dataset: Datasets from MNIST were utilized to train and test the PCM assemblage on handwritten digits from 0 to 9. The dataset included the input-feature vector, i.e., x(n) for the sample n, and the target-output vector, viz., t(n). The value t_c(n) = 1 if the sample n belonged to the class c, and was 0 otherwise. For the MNIST dataset, the feature vectors were the unrolled grayscale-pixel values of the handwritten-digit two-dimensional images. The full images, with a size of 28 pixels by 28 pixels, were utilized. The 28 × 28 grayscale images were unrolled to 784-dimensional input-feature vectors. The input-feature vectors were converted to voltage values in the 0-0.2 V range. The output vectors comprised ten dimensions, each corresponding to a digit.
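A minimal sketch of this dataset mapping (the 8-bit grayscale input range is an assumption; the 0-0.2 V output range and the one-hot target encoding follow the text):

```python
import numpy as np

def to_voltage_vector(image_28x28, v_max=0.2):
    """Unroll a 28 x 28 image to a 784-dimensional feature vector and
    rescale it to input voltages in the 0-0.2 V range (assumes 0-255 input)."""
    x = image_28x28.reshape(784).astype(float) / 255.0
    return v_max * x

def to_target(label, n_classes=10):
    """Ten-dimensional one-hot target vector t(n)."""
    t = np.zeros(n_classes)
    t[label] = 1.0
    return t

img = np.random.default_rng(4).integers(0, 256, (28, 28))  # stand-in image
v = to_voltage_vector(img)
t = to_target(7)
```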
Inference: The online training consisted of two stages: i) feedforward inference; and ii) feedback weight update. The multilayer inference was implemented sequentially in a layer-by-layer manner. The input-voltage vector to the first layer was a feature vector from the dataset, while the input vector for each subsequent layer was the output vector of the previous layer. The analogue weighted-sum step was administered in the memristive assemblage, as described by

I_l = W_l v_l    (3)

where v_l was the input-voltage vector of the lth layer, applied to the top electrodes of the memristive assemblage; I_l was the readout-current vector from the bottom electrodes of the memristive-crosspoint assemblage; and W_l was the weight matrix of the layer l. Each weight value was represented by the difference in the conductance of two memristive elements, so the weights can be negative. This was accomplished by replicating the voltage vector V_i, i.e., with +V_i applied to half of the memristive assemblage and −V_i applied to the other half. A rectified-linear-unit activation function was utilized for the hidden layer, which was defined as

v_{l+1} = c max(0, I_l)

where c was a scaling factor corresponding to the voltage range used.

Softmax and Cross-Entropy Loss Function: The inference result was computed as a Bayesian probability, using the conversion represented by

y_c(n) = e^{k I_c(n)} / Σ_{m=1}^{C} e^{k I_m(n)}    (7)

where y_c(n) was the probability that the sample n belongs to the class c, and C was the total number of classes. The k value was fixed at 5 × 10^7 /A for the MNIST network utilized in this work. The goal of the training procedure was to alter the weight values to maximize the log-likelihood of the true class. A cross-entropy loss function was utilized, which was defined as

E = −(1/N) Σ_n Σ_c t_c(n) ln y_c(n)

where N was the total number of samples.

SGD with Back-Propagation: To estimate the gradient of the loss function for training the weights, a subset of B samples (called a minibatch) was drawn from the training set without replacement at each round of the training. The SGD algorithm was used to update the weights along the direction of the steepest descent of E. The desired weight update, Δw, for each layer was applied to the crosspoint after it was computed in the program, based on Equation 1. The error-row vectors were calculated based on Equation 2 for a network with L layers.
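The feedforward stage of Equation 3 and the probability conversion of Equation 7 can be sketched as follows (the conductance and voltage magnitudes are illustrative; only the scale k = 5 × 10^7 /A is taken from the text):

```python
import numpy as np

K = 5e7                                   # 1/A, stated softmax current scale

def weighted_sum(W_l, v_l):
    """Equation 3: readout currents I_l = W_l v_l from the crosspoint."""
    return W_l @ v_l

def softmax_probability(I_out):
    """Equation 7: y_c = e^{k I_c} / sum_m e^{k I_m}."""
    z = K * I_out
    z = z - z.max()                       # numerical stabilization only
    e = np.exp(z)
    return e / e.sum()

def cross_entropy(y, t):
    """Single-sample term of the cross-entropy loss E."""
    return -np.sum(t * np.log(y + 1e-12))

rng = np.random.default_rng(5)
W = rng.normal(0.0, 1e-8, (10, 784))      # illustrative conductance scale (S)
v = rng.uniform(0.0, 0.2, 784)            # input voltages, 0-0.2 V
y = softmax_probability(weighted_sum(W, v))
t = np.zeros(10); t[0] = 1.0
loss = cross_entropy(y, t)
```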
The memristive assemblage received the weight updates. The stimulus amplitudes and the number of stimuli applied to the memristive element were adjusted using Equation 9,
where ΔV_l denotes the change in the voltage value, and ΔW_l denotes the weight update. The stimulus amplitudes injected into the positive and negative branches of the differential paired-element ensemble were of the same magnitude but of opposite sign, for a targeted stimulus-amplitude alteration. For both the positive and negative branches of the differential paired-element ensembles, upper and lower amplitude limits of 1.275 and 0.925 V were utilized, respectively. For a positive weight update, the positive branch of the differential paired-element ensemble was initialized to the initial amorphous state. The branch conductance was subsequently increased to the specified conductance level by administering the updated number of stimuli and the revised stimulus amplitudes to the positive branch. In the case of a negative weight update, the negative branch of the differential paired-element ensemble was initialized to the initial amorphous state; subsequently, the updated number of stimuli and revised stimulus amplitudes were applied to the negative branch, increasing its conductance to the targeted conductance level.
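This differential update scheme can be sketched as follows (the linear gain standing in for Equation 9 is an assumed form; the 0.925-1.275 V amplitude window and the opposite-sign branch convention follow the text):

```python
import numpy as np

V_LO, V_HI = 0.925, 1.275       # stated amplitude limits (V)
GAIN = 0.05                     # hypothetical volts per unit weight change

def pulse_amplitudes(v_pos, v_neg, dW):
    """Map a desired weight change dW to new stimulus amplitudes for the
    positive and negative branches of a differential element pair."""
    dV = GAIN * dW              # linear mapping assumed for Equation 9
    v_pos = np.clip(v_pos + dV, V_LO, V_HI)   # positive branch
    v_neg = np.clip(v_neg - dV, V_LO, V_HI)   # equal magnitude, opposite sign
    return v_pos, v_neg

vp, vn = pulse_amplitudes(1.1, 1.1, dW=2.0)   # a positive weight update
```

Clipping to the programming window models the physical amplitude limits; a weight change too large for one update would be spread over several pulses.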

Figure 1. The T material state for in situ learning. a) PCM elements, based on the reversible switching between the crystallized state and the glassy state of a chalcogenide layer and exhibiting a marked contrast in the optical reflectivity and electrical conductivity, can reveal T material states to develop new types of neural networks. Here, memristive elements with reconfigurable conductance modes are utilized for achieving accurate image classification. The neural network is implemented based on the following design flow: i) a staircase-based programming strategy is harnessed to achieve adjustable conductance modes; ii) alteration of the cluster-size distribution enables conductance-mode variations; iii) the memristive network utilizes the differences in conductance modes; and iv) image classification is performed based on a conductance-mode update methodology. b) Schematic representation of the evolution of the cluster-size distribution upon the application of different impetus-pulse types. c) Variation of the conductance modes for different pulse numbers. d) Dependence of the conductance mode on the pulse number for different pulse types. e) Evolution of the conductance modes for different voltages/pulse numbers.

Figure 2. Online training algorithm for T-NNs. a) Schematic representation of a two-layered neural network. Every neuron applies a nonlinear activation operation and calculates a weighted aggregate of its inputs. b) The network realization with a memristive-element assemblage. The conductance difference between two memristive elements (orange lines/columns) represents a synaptic weight in a) (black arrows). The weighted aggregates of the input voltages are calculated in the memristive elements. A layer of circuitry, viz., the hidden-layer units, which implements the activation operation, obtains the current from a wire, and converts the current value to a voltage, is utilized between the element ensembles, i.e., between the input-layer units and the output-layer units. c) Flow-chart of the online training. A description of the workings of the algorithm is included in the Methods section.

Figure 3. T-NN inference and online training for the MNIST handwritten-digit classification. a) Conventional handwritten digits from the MNIST dataset. b-d) Prototypical correctly-identified digit "1"/"2" results (cyan bars) and misidentified digit "5" findings (pink bars) upon the online training. b) Pictures of the actual digits from the MNIST test set harnessed as the network input. c) The current values obtained from the output-layer neurons. The largest output current is attained for the neuron describing the targeted digit, e.g., digit "1", output neuron 1; digit "2", output neuron 2, which indicates a correct recognition. d) The associated Bayesian probabilities of the digits, computed using a softmax operation.

Figure 4. More complicated case-type computation analysis. a) The computation of the network utilized in this work, viz., the memristive system with fifteen conductance levels, has an accuracy close to that of the network based on the TensorFlow platform, i.e., the approximately 32,768-conductance-level system. The data bar describes the recognition accuracy for a test set of 10,000 pictures. b) The computation of a large memristive network built with a high number of hidden units could attain an accuracy above 96%, which indicates that extended memristive networks could minimize the accuracy-performance gap between a traditional memristive system and a conventional CMOS platform. c) The influence of non-receptive elements on the inference accuracy using the ex situ and in situ methodologies. The in situ training adjusts to the imperfections, resulting in a stronger imperfection resilience than the ex situ training procedure, i.e., the process with preloaded network weights. The network with in situ training can still retain an accuracy of ≈90% for the case where 50% of the elements are stuck in the low-conductance state (we call it the LO state). d) The in situ training further exhibits an excellent imperfection resilience for memristive elements stuck in the alternate conductance state. The in situ training procedure is able to update the weights for the case where memristive elements are stuck in the high-conductance state (denoted the HI state).