A Reliable All-2D Materials Artificial Synapse for High Energy-Efficient Neuromorphic Computing

High-performance artificial synaptic devices are indispensable for developing neuromorphic computing systems with high energy efficiency. However, the reliability and variability issues of existing devices such as nonlinear and asymmetric weight update are the major hurdles in their practical applications for energy-efficient neuromorphic computing. Here, a two-terminal floating-gate memory (2TFGM) based artificial synapse built from all-2D van der Waals materials is reported. The 2TFGM synaptic device exhibits excellent linear and symmetric weight update characteristics with high reliability and tunability. In particular, the high linearity and symmetric synaptic weight realized by simple programming with identical pulses can eliminate the additional latency and power consumption caused by the peripheral circuit design and achieve an ultralow energy consumption for the synapses in the neural network implementation. A large number of states up to ≈ 3000, high switching speed of 40 ns and low energy consumption of 18 fJ for a single pulse have been demonstrated experimentally. A high classification accuracy up to 97.7% (close to the software baseline of 98%) has been achieved in the Modified National Institute of Standards and Technology (MNIST) simulations based on the experimental data. These results demonstrate the potential of all-2D 2TFGM for high-speed and low-power neuromorphic computing.


Introduction
With the rapid development of computing technology, computing tasks such as image and speech recognition, game playing, and unmanned driving have become increasingly complicated. [1,2] Amid the slowdown of Moore's law scaling, conventional hardware based on the von Neumann architecture faces performance constraints in power dissipation and energy efficiency due to the physical separation of data storage and processing. [3][4][5] Inspired by the architecture and working principle of the human brain, which is adaptive, massively parallel, and fault-tolerant, innovative neuromorphic computing systems have been proposed to address the rapidly growing computation power and efficiency requirements. [6,7] Extensive efforts have been devoted to developing electronic devices that are capable of mimicking the behaviors of biological neurons and synapses. [8][9][10] For example, a variety of devices, including memristors, [10,11] phase change memory (PCM), [12] spintronic devices, [13] and synaptic transistors (including floating-gate, ferroelectric-gate, electrolyte-gate, and optoelectronic synaptic transistors), [14][15][16][17][18][19] have been reported to emulate synaptic functions.
However, reliability and variability issues, such as nonlinear/asymmetric weight update and a limited number of states, are still the major hurdles in practical applications for energy-efficient neuromorphic computing. [20,21] In general, large nonlinearity leads to complex weight modulation as well as high energy and time costs in the training process. By contrast, a linear and symmetric weight update behavior with a sufficient number of states can effectively improve the inference accuracy and reliability of neuromorphic computing. [22] Therefore, to build a low-power and high-accuracy artificial neuromorphic network, it is necessary to improve the linearity and symmetry of the artificial synapse device. [22,23] To this end, various strategies have been developed. [24][25][26] For example, nearly linear and symmetric weight update was achieved by continuously ramping up the pulse amplitude or by other complex programming pulse designs. [16,22] However, the generation of such non-identical pulses puts an additional burden on the peripheral circuit design, such as the digital-to-analog converter (DAC) circuits, causing additional latency and power consumption. Thus, the search for an ideal artificial synapse device with good linear/symmetric weight update and low power consumption is still ongoing.
Recently, 2D materials have been considered for neuromorphic computing systems because of the great potential offered by their atomically thin bodies and facile electronic tunability. [27][28][29][30][31][32][33][34][35][36] Here, we fabricated all-2D 2TFGM synapse devices and realized highly symmetric and linear weight update with low power consumption and good reliability. In particular, the linearity, on/off ratio, and number of analog states of the 2TFGM artificial synapse can be tuned by the amplitude and width of the operation pulses. Image classification based on the experimental potentiation/depression (P/D) results with different linearities was simulated using the MNIST dataset, and a high recognition accuracy of 97.7% was achieved. Taking advantage of its high linearity/symmetry, large number of states, fast speed, and low energy consumption per spike event, the all-2D materials-based 2TFGM is a very promising candidate for building energy-efficient neuromorphic systems.

Device Structures and Memory Characteristics
Device structures and memory characteristics of the 2TFGM are shown in Figure 1. Figure 1a shows a schematic view of the MoS2-based all-2D 2TFGM fabricated through a layer-by-layer stacking process (see Experimental Section and Figure S2, Supporting Information). Monolayer MoS2 is used as the channel (see Figures S1 and S3, Supporting Information, for MoS2 films and device characterizations), exfoliated few-layer graphene (FLG) is used both as the floating gate (FG) electrode and as the contact electrode, and h-BN serves as the tunneling layer. Note that the graphene contact can reduce the contact resistance, and the atomically flat surface of h-BN can avoid trapped charges and electron scattering at the interface. [37][38][39] The device is measured by applying operation voltages to the drain electrode with the source electrode grounded. Figure 1b shows the current-voltage (I-V) switching hysteresis loops of typical 2TFGMs with different h-BN tunneling layer thicknesses of 7, 10, and 15 nm (the channel length/width of 3 µm/10 µm was kept unchanged). The sweeping directions are indicated by the dashed arrows. All 2TFGMs have nearly symmetric current hysteresis loops with ultra-high on/off ratios of ≈10^8 during the programming and erasing processes. The operation voltage decreases from ±20 to ±8 V as the h-BN thickness is reduced from 15 to 7 nm, which can be ascribed to the narrower tunneling barrier of the thinner tunneling layer. Figure 1c illustrates band diagrams of the drain/h-BN/FG in the programming (i) and erasing (ii) processes. For the programming process (Figure 1c(i)), a large potential drop between the FG and the drain is introduced by applying a positive voltage (V_ds), enabling electron tunneling from the FG to the drain and making the FG positively charged (i.e., hole accumulation). Owing to the strong electrostatic gating of the MoS2 channel through the thin h-BN dielectric layer, the device resistance state (RS) is changed from the high-resistance state (HRS) to the low-resistance state (LRS).
For the erasing process (Figure 1c(ii)), electrons tunnel from the drain to the FG and are stored in the FG when a negative voltage is applied, resulting in a change from LRS to HRS. As the electron tunneling probability depends strongly on the barrier thickness, higher drain voltages are required to operate a device with a thicker h-BN tunneling layer (Figure 1b). The charge tunneling and storage in the FG are further confirmed in a 2TFGM device with an extended FG (see Figure S4, Supporting Information).
The retention characteristics after a +/−12 V programming/erasing process of a 2TFGM with 10 nm-thick h-BN were measured at temperatures of 300, 400, and 500 K with V_read = 1 V (Figure 1d; see also Figure S5, Supporting Information, for more details). Both LRS and HRS feature good retention of >10^4 s at 300 K. This excellent retention is attributed to the high barrier height of h-BN, which prevents the stored electrons/holes from dissipating from the embedded FG without external stimulation. A large memory window is still preserved at 400 K, indicating the good robustness of the 2TFGM devices to the thermal environment. Note that the HRS level increases noticeably in the initial 1000 s at 400 K, which could be ascribed to electrons with higher kinetic energy tunneling back from the FG to the drain through the thermal-emission process. The decrease of the on/off ratio and the deterioration of the retention property indicate that, as the temperature increases, hot electron emission plays an important role in addition to the tunneling effect. The memory window almost disappears at 500 K. The retention characteristics of HRS/LRS become poorer as the h-BN thickness decreases (corresponding to a narrower tunneling barrier, Figure S6, Supporting Information). Thus, the retention properties are mainly affected by the kinetic energy of the carriers and the tunneling barrier height/width. Our 2TFGM devices also show good endurance of at least 10^5 cycles under programming voltage pulses with an amplitude of ±18 V and a width of 100 ms with V_read = 1 V (Figure 1e). The on/off ratio remains over 10^4 without degradation during the whole process. The 2TFGM devices still show stable resistance switching under ±23 V with pulse widths of 100/40 ns, indicating the high-speed pulse programmability of our devices (Figure 1f).
Good endurance (>10^5 cycles), long retention time (>10^4 s), high switching speed (40 ns), and good thermal stability are simultaneously realized, revealing such all-2D 2TFGMs as very promising candidates for digital data storage applications. We also demonstrate a 4 × 4 2TFGM array fabricated with a standard planar fabrication technology (Figure 1g). Figure 1h shows the stored map of the "N07" pattern (i-iii) obtained by operating the 2TFGM device array, suggesting good device uniformity and non-volatile properties (note that the pristine conductance states of the 4 × 4 2TFGM array are shown in Figure S7, Supporting Information).

Multi-Bit Memory States Operation
Tunable conductance states can be achieved when different V_ds are applied for programming, enabling multi-level states as illustrated in Figure 2a. Different conductance states for a device with 10 nm-thick h-BN are achieved by applying voltages from +8 to +12 V in the programming process, owing to the different amounts of holes stored in the FG under different applied voltages. The device conductance is unchanged for V_ds varying from 0 to +7 V, since electrons cannot tunnel from the FG into the drain through the barrier under small V_ds. Multi-level resistance switching behaviors are also observed in the erasing process, as shown in Figure S8, Supporting Information. Good retention longer than 10^4 s for each conductance state, tested with a 1 V read pulse, is shown in Figure 2b. Furthermore, each conductance state shows a low drift coefficient ranging from 0.003 to 0.175, extracted using a power-law equation, [40,41] where G_0 is the initial conductance at t_0 and α is the conductance drift coefficient. Note that highly reproducible multilevel resistive switching behavior can also be achieved by applying a series of voltage pulses with different amplitudes, as shown in Figure S9, Supporting Information. Quasi-continuous distinguishable conductance states can be realized in these 2TFGMs, as shown in Figure 2c. 131 conductance states (7 bits) from 0.1 to 13 nS with a 0.1 nS interval are obtained successfully by applying different numbers of pulses with amplitudes from +10 to +15 V (pulse width of 100 ms). The corresponding conductance drift coefficients calculated from the retention characteristics are mainly in the range of ≈0.001-0.01 (Figure S10, Supporting Information). Furthermore, the cumulative distributions of 16 representative conductance states are shown in Figure 2d. The 2TFGM device can be programmed to a target conductance value by controlling the number of identical pulses applied.
All these device characteristics indicate excellent reliability and stability of the 2TFGM devices.
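The drift coefficients quoted above come from a power-law fit to retention traces. As a minimal sketch of such an extraction, the snippet below assumes the commonly used form G(t) = G_0 (t/t_0)^(−α); this exact form and its sign convention are an assumption, not taken from the text. The function name `fit_drift_coefficient` and the synthetic trace are hypothetical.

```python
import math

def fit_drift_coefficient(times_s, conductances):
    """Fit log G = log G0 - alpha * log(t / t0) by ordinary least squares.

    Assumed drift law (not confirmed by the text): G(t) = G0 * (t / t0) ** (-alpha),
    with t0 taken as the first time point. Returns (G0, alpha).
    """
    t0 = times_s[0]
    xs = [math.log(t / t0) for t in times_s]
    ys = [math.log(g) for g in conductances]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return math.exp(intercept), -slope

# Synthetic retention trace (illustrative values only, not measured data).
times = [1.0, 10.0, 100.0, 1000.0, 10000.0]
true_g0, true_alpha = 5e-9, 0.01
trace = [true_g0 * (t / times[0]) ** (-true_alpha) for t in times]

g0_fit, alpha_fit = fit_drift_coefficient(times, trace)
print(f"G0 = {g0_fit:.3e} S, alpha = {alpha_fit:.4f}")
```

Fitting in log-log space turns the power law into a straight line, so the drift coefficient is simply the negative slope under the assumed convention.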

Artificial Synaptic Behaviors Investigation
Such reliable multi-bit data storage is highly desirable for high-performance artificial synaptic devices. Figure 3a shows a schematic illustration of biological neurons, each consisting of a soma, an axon, and dendrites, and of synapses, which are the junctions between two neighboring neurons. The enlarged area illustrates the transmission of chemical messengers between the axon terminal of the pre-synaptic neuron and the dendrite of the post-synaptic neuron. [42] When an action potential arrives at the pre-synaptic neuron, the synaptic membrane is depolarized and neurotransmitters are released, which diffuse across the synaptic cleft, dock with receptors on the post-synaptic neuron, and generate an excitatory postsynaptic current (EPSC) or inhibitory postsynaptic current (IPSC), whose amplitude is determined by the synaptic weight. Long-term potentiation (LTP) and depression (LTD), which are two essential synaptic functions in learning, were emulated by gradually tuning the conductance state of the 2TFGM device, as demonstrated in Figure 3b. 3000 quasi-continuous states were programmed for the LTP/LTD process through a series of programming/erasing pulses of ±15 V (40 ns pulse width), suggesting that an extremely large number of states are available for neuromorphic computations in this 2TFGM-based artificial synapse. Figure 3c,d shows the transient electrical responses during the LTP/LTD process captured by an oscilloscope. Figure 3e,f shows enlarged views of several cycles in Figure 3c,d, respectively, in which EPSC and IPSC biological behaviors are triggered by the pre-synaptic pulse of V_ds. The energy consumption for a single pulse event can be calculated as I_peak × t × V_ds, where I_peak, t, and V_ds are the peak value of the EPSC (IPSC), the pulse width, and the operation voltage, respectively.
The energy consumption for a single pulse is estimated to be 18 fJ (30 nA, +15 V, 40 ns), which is much lower than that of a conventional CMOS circuit (≈900 pJ). [43] Such all-2D 2TFGM artificial synaptic devices hence exhibit great potential for high-speed and low-power neuromorphic computing.
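The per-pulse energy estimate can be reproduced directly from the stated expression I_peak × t × V_ds. The short sketch below plugs in the quoted values (30 nA, +15 V, 40 ns) and compares the result with the ≈900 pJ CMOS figure cited in the text. The function name `pulse_energy_joules` is hypothetical, and the calculation treats the current as constant at its peak value over the whole pulse, as the expression does.

```python
def pulse_energy_joules(i_peak_a, v_ds_v, width_s):
    """Energy of one pulse event, E = I_peak * t * V_ds.

    Treats the current as constant at its peak value over the pulse width.
    """
    return i_peak_a * abs(v_ds_v) * width_s

# Values quoted in the text: 30 nA peak current, +15 V amplitude, 40 ns width.
energy_j = pulse_energy_joules(30e-9, 15.0, 40e-9)
energy_fj = energy_j * 1e15

# CMOS reference value quoted in the text (about 900 pJ).
cmos_energy_j = 900e-12
ratio = cmos_energy_j / energy_j

print(f"Energy per pulse: {energy_fj:.1f} fJ")
print(f"CMOS / 2TFGM energy ratio: {ratio:.0f}")
```

With these inputs the estimate works out to 18 fJ, about 5 × 10^4 times below the quoted CMOS value.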

Linear and Symmetric Weight Update Tuning
A linear and symmetric weight update behavior with a sufficient number of states is critical for improving the inference accuracy and reliability of neural networks. [5,44,45] While nearly linear and symmetric weight update can be realized by encoding the input pulse, this leads to restrictive circuit complexity, time delay, and additional energy consumption. [16,22,46,47] Therefore, an ideal artificial synapse device that can be simply programmed with identical pulses is required for a low-power and high-accuracy neuromorphic circuit. As shown in Figures S11 and S12, Supporting Information, the conductance state of a 2TFGM device changes as a function of the pulse amplitude and width. Note that the 2TFGM device is always set to the same starting conductance state before applying the pulses. The dependences of the device conductance change on the programming pulse width and amplitude were systematically studied, as shown in Figure 4a,b. The conductance change (∆G) in both LTP/LTD processes increases with pulse width (amplitude) and ultimately saturates for larger pulse width (amplitude). The short dashed lines are exponential fits to the experimental data. The saturation behavior is a consequence of the maximum number of trapped charges in the FG induced by a particular amplitude (width) of voltage. The plasticity of our 2TFGM artificial synapse is tuned by the amplitude of the operation pulse from ±13 to ±10 V (width of 100 ms), as seen in Figure 4c. When the ±13 V pulses are applied, the resulting on/off ratio is 20, while the conductance saturates after only 20 pulses. More conductance states (up to 400) with a small variation of conductance per pulse (∆G < 0.1 nS, Figure S13, Supporting Information) can also be realized by reducing the pulse amplitudes to ±10 V. The plasticity (including linearity, on/off ratio, and the number of states) can thus be tuned through the pulse amplitude.
Figure 4. a,b) The changes in the device conductance as a function of pulse width (a) and amplitude (b) during the P/D process. c) The linearity, on/off ratio, and number of states in the P/D process could be tuned by adjusting the amplitude of voltage pulses from ±10 to ±13 V (100 ms). d) Nonlinearity analysis on the weight update of the different P/D curves in (c). The orange dashed lines represent the ideal linearity and symmetry of weight update. e) Cycled P/D operations of the 2TFGM artificial synapse. At least 50 states are programmed using a series of pulses with amplitude of ±12 V and width of 100 ms, demonstrating good reproducibility, linearity, and symmetry in the synaptic weight update. f) The image classification accuracy for hand-written digits from the MNIST database under different P/D processes as a function of the training epoch. A high recognition accuracy of 97.7% is achieved using the P/D process of 400 states and V_ds = ±10 V. g) The error rate after 20 training epochs corresponding to different P/D processes in (f). Inset illustrates the simulated neural network structure. Here a three-layer perceptron (including one hidden layer) is simulated with the standard backpropagation algorithm.
Furthermore, quantitative analysis of the linearity of the weight update behavior in the LTP/LTD processes is shown in Figure 4d. The nonlinearity factor (v), [18,23] which describes the nonlinear behavior of the weight update, is calculated based on the normalized conductance (G_p or G_d) as a function of pulse number (p), where G_max and G_min represent the maximum and minimum conductance in the LTP/LTD processes, and p_max is the maximum pulse number. We found trade-offs among linearity, on/off ratio, and the number of states. The number of states could be increased by shortening the pulse length without sacrificing linearity or on/off ratio (Figure S14, Supporting Information).
Comparing the nonlinearity for operation pulses (±11 V, 100 ms) with 150 states, 100 states, and 60 states, it is found that better linearity of the weight update behavior can be obtained by reducing the number of states as well as the on/off ratio (Figure S15, Supporting Information). The best nonlinearity factors of v_p = 0.18 for the potentiation process and v_d = −0.29 for the depression process are achieved in our all-2D 2TFGM devices (Table S1, Supporting Information). Figure 4e shows the cycled LTP/LTD operations of the 2TFGM artificial synapse, demonstrating good reproducibility of the synaptic weight update behavior.
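To illustrate why early saturation under identical pulses degrades linearity, the sketch below uses a generic saturating-exponential potentiation model and a simple "maximum deviation from the ideal straight line" metric. Both are illustrative assumptions: the model is not the fitting formula behind the nonlinearity factor v reported above, its `nonlinearity` parameter is not comparable to v_p or v_d, and the function names are hypothetical.

```python
import math

def potentiation_curve(n_pulses, nonlinearity):
    """Normalized conductance (0..1) after each of n_pulses identical pulses.

    Hypothetical saturating-exponential model: nonlinearity -> 0 gives a
    straight line; larger values make the conductance saturate earlier.
    """
    if abs(nonlinearity) < 1e-12:
        return [p / n_pulses for p in range(n_pulses + 1)]
    norm = 1.0 - math.exp(-nonlinearity)
    return [
        (1.0 - math.exp(-nonlinearity * p / n_pulses)) / norm
        for p in range(n_pulses + 1)
    ]

def max_deviation_from_linear(curve):
    """Largest gap between the curve and the ideal linear update (0 = ideal)."""
    n = len(curve) - 1
    return max(abs(g - p / n) for p, g in enumerate(curve))

# Sweep the model parameter (illustrative values only).
deviations = {}
for nl in (0.0, 0.5, 2.0, 5.0):
    curve = potentiation_curve(100, nl)
    deviations[nl] = max_deviation_from_linear(curve)
    print(f"nonlinearity = {nl:>3}: max deviation = {deviations[nl]:.3f}")
```

In this toy model a curve that saturates after a few pulses spends most of its pulses near G_max, mirroring the observation that the ±13 V curve saturates after about 20 pulses while the ±10 V curve resolves far more states.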
To further demonstrate the potential of 2TFGMs for neuromorphic computing, an artificial neural network has been simulated using the MNIST dataset based on the measured long-term plasticity characteristics. The schematic view of the simulated neural network is shown in the inset of Figure 4g; it is a three-layer perceptron (including one hidden layer), and the standard backpropagation algorithm is used for training. The learning curves are displayed in Figure 4f for different programming conditions. A high recognition accuracy of 97.7% (equivalently, an ≈2.3% error rate as shown in Figure 4g), which approaches the software baseline accuracy of numerical simulations (≈98%), is achieved after 20 training epochs using 400 states for the LTP/LTD processes under V_ds = ±10 V. This excellent performance is mainly attributed to the large number of conductance states available as well as the good linearity and symmetry in the weight update. In comparison with existing neuromorphic devices, such as typical redox, electrochemical, phase change, ferroelectric, and magnetoresistive random access memory devices, and other memory devices based on 2D materials, [5,16,28,[44][45][46][47] the all-2D materials-based 2TFGM features a large number of analog states (≈3000) and low energy consumption per synaptic event (≈18 fJ). Most importantly, highly linear and symmetric weight update behavior (0.18/−0.29 for the P/D process) driven by identical pulses is realized in the device, a goal long pursued in the development of highly energy-efficient artificial neural networks. See Figure S16, Supporting Information, for more details.

Conclusion
In summary, we demonstrated a high-performance 2TFGM artificial synapse built from all-2D material van der Waals heterostructures. Ultralow drift coefficients, highly symmetric and linear synaptic weight update plasticity, a large number of analog states, and fast updating speed have been exhibited in 2TFGMs, which are critical for improving training efficiency and accuracy and for reducing the energy consumption of artificial neural networks. A high classification accuracy of up to 97.7% has been achieved in the MNIST simulations. Our results demonstrate that such highly reliable all-2D 2TFGMs are very promising candidates for high-accuracy, low-energy-consumption, and high-speed neuromorphic computing applications.

Experimental Section
Materials: High-quality CVD-grown monolayer MoS2 films [48] were used as channel materials in the all-2D 2TFGMs. The growth of the MoS2 films was carried out in a CVD system with three temperature zones, in which S (Alfa, 99.5%, 8 g) powder and MoO3 (Alfa, 99.9995%, 30 mg) powder were used as reaction sources and c-plane polished sapphire wafers were used as substrates. During the growth, carrier gases of Ar (40 sccm) and Ar (240 sccm)/O2 (10 sccm) were flowed over the S powder and the MoO3 powder, respectively, and the pressure in the chamber was ≈1 torr. The temperature was held at 130, 530, and 900 °C for the S source, the MoO3 source, and the substrates, respectively, and the growth process lasted about 30 min. Thin h-BN flakes and FLG were exfoliated from h-BN and natural flaggy graphite crystals (purchased from NGS Trading & Consulting GmbH, Germany). Large-scale thin FLG and h-BN flakes could be obtained by poly-propylene-carbonate (PPC) assisted thermal exfoliation methods.
Device Fabrications: The MoS2 films on a sapphire substrate were first spin-coated with polymethyl-methacrylate (PMMA) and then etched in KOH solution (1 mol L^-1, 110 °C) for 30 min. The as-received MoS2 triangles supported on PMMA films, and the FLG or h-BN flakes held by PPC films, were stacked precisely through layer-by-layer stacking methods in our home-made transfer station. The sacrificial layer of PMMA or PPC could be removed by rinsing in acetone for >1 h at room temperature. The FLG floating gate layer and contacts and the MoS2 channel layer were defined by electron beam lithography (EBL) and an oxygen reactive ion etching (RIE) process. Devices were finally wired out with Ti (3 nm)/Au (30 nm) electrodes for electrical measurements.
Sample Characterizations: Raman and photoluminescence (PL) spectra were acquired with a Horiba Jobin Yvon Lab RAM HR-Evolution Raman system with a 532-nm He-Ne laser (spot size ≈1 µm, power 10 mW) in ambient conditions. Surface morphology was characterized by an atomic force microscope (Asylum Research Cypher S instruments) with an AC160 TS tip in tapping mode. The electrical measurements were carried out in a closed-cycle cryogenic (70-500 K) probe station equipped with an Agilent B1500 semiconductor parameter analyzer. All measurements were performed at a base pressure of 10^-6 Torr in the dark.
[…] linear unit was used as the activation function of the hidden layer and softmax was used as the classification function of the output layer. For each synapse, the normalized conductance difference between a 2TFGM device and a reference resistor was used to represent its weight. All the weights were randomly initialized between −1 and 1, and stochastic gradient descent (SGD) with the cross-entropy loss function and a mini-batch size of 200 was used to train the network. The learning rate was 0.1. In each iteration, the update value of each synapse was first calculated according to the gradient and learning rate. Then a series of pulses was applied to each 2TFGM device whose corresponding update value was larger than 0.001, until the actual weight change exceeded the update value. The simulator could achieve a numerical accuracy limit of about 98% (baseline accuracy).
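The pulse-based update rule described above (skip updates no larger than 0.001, otherwise apply identical pulses until the accumulated weight change reaches the requested update) can be sketched as follows. This is a minimal illustration rather than the simulator used in this work: it assumes an ideal linear device whose weight step per pulse is the weight range divided by the number of states, and the names `pulses_for_update`, `step_per_pulse`, and `max_pulses` are hypothetical.

```python
def pulses_for_update(target_delta, step_per_pulse, threshold=0.001,
                      max_pulses=10_000):
    """Return (n_pulses, applied_delta) for one synapse update.

    Updates whose magnitude does not exceed `threshold` are skipped.
    Otherwise identical pulses are applied until the accumulated weight
    change reaches or exceeds the requested update.
    Assumption: every pulse changes the weight by the same `step_per_pulse`
    (ideal linear, symmetric device).
    """
    if abs(target_delta) <= threshold:
        return 0, 0.0
    direction = 1.0 if target_delta > 0 else -1.0  # potentiation or depression
    applied = 0.0
    n_pulses = 0
    while abs(applied) < abs(target_delta) and n_pulses < max_pulses:
        applied += direction * step_per_pulse
        n_pulses += 1
    return n_pulses, applied

# Assumed mapping: weights span [-1, 1] and the device offers 400 states.
n_states = 400
step = 2.0 / n_states  # 0.005 weight units per pulse

# Example update value, e.g. -learning_rate * gradient for one synapse.
n, applied = pulses_for_update(0.012, step)
print(n, applied)

# An update below the 0.001 threshold triggers no pulses.
print(pulses_for_update(0.0005, step))
```

Because the weight changes only in whole pulses, the applied change generally overshoots the requested one by less than one step, which is one reason a larger number of states helps accuracy.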

Supporting Information
Supporting Information is available from the Wiley Online Library or from the author.