Memristive Crossbar Arrays for Storage and Computing Applications

The emergence of memristors with potential applications in data storage and artificial intelligence has attracted wide attention. Memristors are assembled in crossbar arrays with data bits encoded by the resistance of individual cells. Despite the proposed high density and excellent scalability, the sneak‐path current, which causes cross interference, impedes their practical application. Therefore, developing novel architectures to mitigate sneak‐path current and improve efficiency, reliability, and stability may benefit next‐generation storage‐class memory (SCM). Moreover, conventional digital computers face the von Neumann bottleneck and the slowdown of transistor scaling, which pose a major challenge to hardware artificial intelligence. Memristive crossbars feature colocation of memory and processing units, as well as superior scalability, making them a promising candidate for hardware acceleration of machine learning and neuromorphic computing. Herein, the crossbar architecture is first introduced. Then, for storage, the origin of sneak‐path current is reviewed and techniques to mitigate this issue from the angle of materials and circuits are discussed. On the computing side, the applications of memristive crossbars in both machine learning and neuromorphic computing are surveyed, focusing on the structure of unit cells, the network topology, and the learning types. Finally, a perspective on future engineering and applications of memristive crossbars is discussed.

Another limitation of digital computers is the von Neumann architecture, where the physically separated memory and computing units incur large latency and high energy consumption due to data shuttling. [1][2][3][4][5] This is more evident in machine learning and neuromorphic computing due to the frequent transfer of massive network parameters. In contrast, our brain computes in a drastically different way, in which information is processed and stored at the same place, thanks to the massively intertwined neurons and synapses. [6][7][8][9][10][11][12][13] Numerous efforts have been made to build an electronic brain using traditional complementary metal–oxide–semiconductor (CMOS) technology. [14][15][16] However, no digital computing system can yet match both the intelligence and the efficiency of a human brain. [9,10,17] This is further intensified by the slowdown of Moore's law, because the size of transistors is approaching its physical limit. [9,15] Therefore, fundamental changes to the computing paradigm are required.
A memristor, revealed as the fourth passive electronic element, is a tunable resistor with memory, as conceived by Professor Chua [18,19] and demonstrated by researchers from Hewlett-Packard (HP) lab. [20] The HP memristor is essentially a resistive switch that consists of a dielectric layer sandwiched between two electrodes. The unique feature of memristors is that the conductance depends on the history of applied electrical signals, making them capable of working as nonvolatile memory. In addition, memristors may store multibit information with continuously tunable conductance, in contrast to the binary states "0" and "1" in traditional digital storage systems, equipping them with higher bit density. Nonvolatility, fast programming, low programming energy, and compact footprint [21][22][23] make memristors a promising solution for next-generation embedded memory, which may combine the advantages of SRAM and floating-gate transistors. In addition to memory and storage, memristors intrinsically mimic the dynamic behaviors of synapses and neurons, thanks to the bias-history-dependent conductance, which has led to various memristor-based artificial and spiking neural networks (SNNs). [24][25][26][27][28] The simple two-terminal metal–insulator–metal (MIM) structure of memristors allows them to be integrated into dense crossbar arrays. [29,30] As shown in Figure 1a, a typical crossbar array consists of two sets of parallel metal lines, termed word lines and bit lines, which serve as the top and bottom electrodes and are perpendicular to each other. The two-terminal memristors are formed at the intersections of word and bit lines. The red cylinder represents a selected cell during an operation to read its conductance (the black solid line). In this readout process, as shown in Figure 1a, a sneak path, represented by the red solid line, carries unwanted current, which is equivalent to series-connected resistors in parallel with the selected memristor, as shown in Figure 1b.
Such sneak paths lead to extra energy consumption from unselected cells, which also degrades the read margin and thus limits the size of arrays. It should be noted that the sneak current issue, which is prominent when isolated memristors in crossbar arrays are read and written sequentially, has a less critical impact on both machine learning and neuromorphic computing. [31] So far, extensive research has been reported to address this sneak-path leakage current in resistive random-access memory (RRAM) and phase-change memory (PCM) arrays. Such solutions include engineering the unit cells, such as introducing an access element to the 1-memristor (1R) cell to form composite cells like one-transistor-one memristor (1T1R), one diode-one memristor (1D1R), one selector-one memristor (1S1R), self-rectifying memristors, etc. [32][33][34][35] The introduction of the access device not only improves energy efficiency during array programming, but may also assist memristors in implementing synaptic plasticity, thus enabling novel analog machine learning and neuromorphic computing. [36][37][38][39][40] In this Review, we explore low-dimensional materials for memristive arrays, which are promising as the next-generation computing technology. In particular, with the recently reported wafer-scale growth of low-dimensional materials, [41][42][43] a complete Review of recent works covering memristive arrays based on both low-dimensional and traditional materials for information storage and neuromorphic computing becomes essential. Moreover, we present a comprehensive Review of the memory unit cell designs for RRAMs and PCMs to resolve the sneak-path current issue, including 1S1R, 1T1R, 1D1R, one-bipolar-junction transistor (BJT)-one memristor (1BJT1R), self-selective cell (SSC), self-rectifying cell (SRC), and complementary resistive switching (CRS) cell, as schematically shown in Figure 1c–i.
The types of bias schemes and the influence of wire resistances on the read/write operations are discussed. Recently proposed devices with staircase output electrodes and pillar input electrodes are also worth noting. [44] Finally, we survey the literature on how 1R and 1T1R arrays physically accelerate machine learning and neuromorphic computing, for example, how they implement different types of neural network topologies and how they perform different types of learning (e.g., supervised, unsupervised, and reinforcement learning, implemented either offline or online).
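The sneak-path picture described above (series resistors in parallel with the selected memristor, Figure 1b) can be illustrated with a small numerical sketch. The model below is an assumption made for illustration and is not taken from the reviewed works: it treats an n × n 1R array with floating unselected lines, ignores wire resistance, places every unselected cell in the low-resistance state (worst case), and lumps the sneak network into three series groups of (n − 1), (n − 1)², and (n − 1) parallel cells. The function names and the example resistances are hypothetical.

```python
def parallel(r_a, r_b):
    """Resistance of two resistors connected in parallel."""
    return r_a * r_b / (r_a + r_b)


def sneak_resistance(n, r_cell):
    """Lumped worst-case sneak resistance of an n x n 1R array.

    Assumed model: unselected lines float, wires are ideal, and all
    unselected cells have resistance r_cell. The sneak network is then
    three series groups of (n-1), (n-1)^2 and (n-1) parallel cells.
    """
    if n < 2:
        raise ValueError("need at least a 2 x 2 array")
    m = n - 1
    return r_cell / m + r_cell / (m * m) + r_cell / m


def apparent_on_off_ratio(n, r_lrs, r_hrs):
    """Ratio of the resistances seen at the terminals when the selected
    cell is in HRS versus LRS, with the sneak network in parallel."""
    r_sneak = sneak_resistance(n, r_lrs)
    return parallel(r_hrs, r_sneak) / parallel(r_lrs, r_sneak)


if __name__ == "__main__":
    # Hypothetical cell values: 10 kOhm LRS, 1 MOhm HRS.
    for n in (2, 8, 64, 512):
        print(n, round(apparent_on_off_ratio(n, 1e4, 1e6), 3))
```

Under these assumptions the apparent on/off ratio falls toward 1 as the array grows, which is the read-margin degradation that limits array size.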

RRAM Writing/Reading Voltage Schemes in the Crossbar Arrays
To avoid programming interference, different bias schemes, as shown in Figure 2, have been proposed to bias the unselected cells with a fraction of the selected cell voltage. [45][46][47][48] Despite the pursuit of memristors with ultralow "off" current/conductance for memory cells in the crossbar arrays, the choice of bias scheme for writing/reading processes could be helpful to mitigate the sneak-path current issue. The voltage schemes could be classified based on the voltages applied to the unselected bit and word lines when the selected cell is always kept under full voltage bias. As shown in Figure 2a,d, the floating scheme leaves all the unselected word and bit lines floating. The read margin of the floating scheme could be much lower than that of the 1/2 V scheme because all the sneak currents of the unselected cells will flow toward V if they cannot be suppressed appropriately. In other words, if the sneak current issue in the floating scheme is successfully handled, the crossbar RRAM in the floating scheme can exhibit better energy efficiency while achieving an extremely high density, which is mainly determined by its read margin. In the 1/2 V bias scheme, as shown in Figure 2b,e, the selected word line and selected bit line are applied with full voltage and 0 V, respectively, and the unselected word lines and bit lines are applied with 1/2 V. Thus, the selected cell (red circle) is under V bias, half-selected cells (green and yellow circles) are under 1/2 V, and the unselected cells (blue circles) are under 0 V. However, for the 1/3 V bias scheme, shown in Figure 2c,f, the selected word line and selected bit line are applied with full voltage and 0 V, respectively, same as the situation of 1/2 V. The unselected word lines are applied with 1/3 V, whereas the unselected bit lines are applied with 2/3 V.
Thus, the selected memory cell (red circle) is under V bias, half-selected memory cells (green and yellow circles) are under 1/3 V bias, and the unselected memory cells (blue circles) are under −1/3 V bias. Therefore, developing devices with nonlinear I–V curves, a large on/off ratio, and ultralow off-state current would be promising to decrease the energy consumption.
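The cell voltages quoted for the 1/2 V and 1/3 V schemes follow directly from the line voltages, since each cell sees the difference between its word-line and bit-line potentials. The short sketch below reproduces that bookkeeping; the function name, the scheme labels "half" and "third", and the sign convention (word line minus bit line) are choices made here for illustration, and ideal wires are assumed.

```python
def cell_voltages(n_rows, n_cols, sel_row, sel_col, v, scheme):
    """Voltage across every cell (word line minus bit line).

    scheme "half":  unselected word and bit lines at v/2.
    scheme "third": unselected word lines at v/3, bit lines at 2v/3.
    The selected word line is at v and the selected bit line at 0 V.
    """
    if scheme == "half":
        wl_unsel, bl_unsel = v / 2, v / 2
    elif scheme == "third":
        wl_unsel, bl_unsel = v / 3, 2 * v / 3
    else:
        raise ValueError("scheme must be 'half' or 'third'")
    word = [v if r == sel_row else wl_unsel for r in range(n_rows)]
    bit = [0.0 if c == sel_col else bl_unsel for c in range(n_cols)]
    return [[word[r] - bit[c] for c in range(n_cols)] for r in range(n_rows)]


if __name__ == "__main__":
    for scheme in ("half", "third"):
        print(scheme)
        for row in cell_voltages(3, 3, 0, 0, 3.0, scheme):
            print("  ", row)
```

With the cell at row 0, column 0 selected, the "half" scheme leaves fully unselected cells at 0 V, whereas the "third" scheme puts a reverse bias of one third of the full voltage on them while lowering the half-selected stress from V/2 to V/3.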

1S1R Cell and Crossbar Array
The 1S1R cell, a two-terminal circuit consisting of one selector and one memristor in series, as shown in Figure 3a, could lead to high-density integration, thanks to its 3D stacking ability. [49][50][51][52] The 1S1R device structure is considered the most preferable scheme for high-density 3D integration of RRAM. [34,35,53–55] The ideal selector should simultaneously have high conductance at a large voltage (on state) and small current at low voltage (off state), that is, a highly nonlinear I–V characteristic, [56][57][58] as well as a small variation of threshold voltage and hold voltage. [59,60] Moreover, the selectors should be compatible with the memory cell, in terms of operating current and voltage ranges, to ensure limited sneak-path current from the unselected memory elements during both read and write operations, [34,35] as well as enough current to "set" and "reset" memristors. The selectors should also be fast enough to avoid slowing down the operation of memory devices and have high reliability, with cycling endurance, array yield, and device variability comparable with those of the memristors. [34,35] Compared with unit cells and transistors, [61,62] which are very challenging to stack vertically and thus have limited ultimate density, [49] the selector is essentially a bidirectional, highly nonlinear resistor and is promising for high-density integration. Various material systems showing the function of selectors have been intensively studied, such as silicon-based selectors, [63][64][65][66] MIM-based selectors, [67][68][69][70][71][72] ovonic threshold switching selectors, [73][74][75][76][77][78] metal–insulator transition (MIT)-based selectors, [79][80][81][82][83][84] field-assisted superlinear threshold selectors, [85,86] and mixed ionic–electron conduction selectors. [87][88][89][90][91] Each of them has its merits and demerits, which have been discussed in detail by Aluguri et al.
[35] Moreover, to avoid the hard breakdown of materials used for selectors, self-compliance with strong nonlinearity is desirable for high-density crossbar array applications. [92,93] Figure 3b shows a typical nonlinear I–V curve measured from an integrated 1S1R cell with an MIM-based selector. The selector enables a low off current of around 10⁻¹² A and a memory window of around four orders of magnitude. In this particular case, the selector turns to the on state at around 0.7 V, and the memory cell turns to the on state at 1.3 V. The following positive sweep verifies the low-resistance state (LRS) of the integrated unit. For the negative voltage sweep, the selector turns to the on state at about −0.7 V and the resistance of the integrated cell goes back to an off state. Figure 3c–e shows the details of the nonlinear I–V curves from the selector, resistive memory, and their integrated cell, respectively, giving a direct impression of how the nonlinear I–V curve of a 1S1R device structure arises from the separate selector and memory device. The device structure of the selector opens another general method for designing a selector device using a structurally symmetric Pd/Ag/HfOx/Ag/Pd stack.
Figure 3. Electrical performance and typical features of 1S1R memory cell. a) Schematic illustration of the 3D crossbar array and the inset showing the structure of the memory cell with the integration of 1S and 1R. b) I–V curves of the 1S1R memory cell integrating the Cu/HfO2/Pt memory and a discrete-defect graphene selector under 500 μA compliance current level. The inset shows the typical electrical characterization of the Cu/HfO2/Pt memory device. Reproduced with permission. [52] Copyright 2018, Wiley-VCH. c) Continuous bidirectional threshold switching of the individual Pd/Ag/HfOx/Ag/Pd selector. d) Repeated bipolar I–V switching curves of the individual memristor with the structure of Pd/Ta2O5/TaOx/Pd. e) DC I–V curves of the vertically integrated selector and memristor. Reproduced with permission. [51] Copyright 2017, Wiley-VCH.
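Why a highly nonlinear selector matters for array size can be sketched with a back-of-the-envelope estimate. Everything in the block below is an illustrative assumption rather than a result from the cited works: the 1/2 V bias scheme is used, nonlinearity is defined as the ratio of the cell current at V to that at V/2, every half-selected cell on the selected bit line is taken to be in its on state, and the array-size criterion (total half-select leakage no larger than a chosen fraction of the selected-cell current) is a deliberately crude rule of thumb.

```python
import math


def half_select_leakage(n, i_selected, nonlinearity):
    """Total current from the (n - 1) half-selected cells on one line.

    nonlinearity is assumed to mean I(V) / I(V/2) for a 1S1R cell, so
    each half-selected cell contributes i_selected / nonlinearity.
    """
    if nonlinearity <= 0:
        raise ValueError("nonlinearity must be positive")
    return (n - 1) * i_selected / nonlinearity


def max_lines(nonlinearity, leakage_budget=1.0):
    """Largest n for which leakage <= leakage_budget * i_selected.

    Crude criterion chosen for illustration: (n - 1) / K <= budget.
    """
    if nonlinearity <= 0:
        raise ValueError("nonlinearity must be positive")
    return int(math.floor(1 + leakage_budget * nonlinearity))


if __name__ == "__main__":
    for k in (10, 1e2, 1e4, 1e6):
        print(f"nonlinearity {k:g}: up to {max_lines(k)} cells per line")
```

Under this criterion the tolerable line length grows linearly with the selector nonlinearity, which is one way to read the requirement for a large on-state current together with a very small off-state current.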

1T1R Cell and Crossbar Array
The 1T1R cell structure remains the most popular choice for RRAM or PCM arrays. The 1T1R crossbar architecture shares a large similarity with that of DRAM. Figure 4a,b shows the schematic of a typical 1T1R structure and the corresponding I–V curve. [94] The transistor not only allows flexible selection of memory cells but also facilitates the programming for computing-in-memory applications. For 1T1R RRAM crossbars, the cells can either be of an electrochemical metallization type (relying on the electrochemical dissolution and deposition of an active electrode metal to perform the resistive switching operation) or valence-change type (modification of the valence state of anions to induce changes in electrical conductivity, driven by underlying ion transport and redox processes). [96] In contrast, more works have been conducted on valence-change 1T1R RRAM crossbars, as valence-change RRAMs usually have a larger activation energy of ion migration and thus better reliability. Some of the widely reported valence-change material systems, such as Hf-, Ti-, and Ta-based transition metal oxides, have been paired with planar transistors. For example, for Hf-based RRAMs, Sheu et al. reported a 4 Mb 1T1R macro built on the 180 nm process of TSMC, with a TiN/Ti/HfO2/TiN RRAM structure that has a cross section of 640 nm × 640 nm. The same RRAM also revealed four-level conductance that can encode multiple bits per cell. [97] A similar RRAM material stack was reported by Ho et al. in 1T1R arrays built on the Winbond 90 nm process, showing improved reliability and high-temperature compatibility. [98] In addition, Chou et al. from TSMC reported an 11 Mb HfOx-based RRAM 1T1R macro, which was produced using the 40 nm logic process for embedded memory applications. The macro featured an RRAM programming scheme that balanced the data retention and programming energy/time, and it also showed robust switching behavior over a wide range of temperatures.
[99] For Ti-based cells, Chang et al. designed a 4 Mb RRAM macro for embedded memory application based on TSMC 64 nm technology. The macro was equipped with on-chip low-voltage current sense amplifiers, which worked with TiN/TiON/SiO2/Si RRAMs. [100] The same RRAM stack was also integrated with the TSMC 28 nm high-κ MG CMOS process to build a 1 Mb 1T1R RRAM macro. The advanced technology node reduced the size of the RRAM down to 0.0308 μm² cell⁻¹. The macro also featured improved sense margin and a low-energy RRAM programming scheme. [101] For Ta-based RRAMs, Hawahara et al. from Sony reported a 512 Kb 1T1R RRAM macro consisting of Ir/Ta2O5/TaOx/TaN RRAM cells. The macro was fabricated using the 180 nm process and used a special two-step forming scheme that could better control the filament size and thus lead to improved endurance (10⁷). [102] The same RRAM device was used in a 2 Mb 1T1R RRAM macro using both 28 and 40 nm processes by Hayakawa et al., which used a special process to confine the filament position to the center of the RRAM to improve reliability for embedded system applications (Figure 4c,d). [103] For 1T1R PCM crossbars, the mature Ge2Sb2Te5 cells are widely reported. In addition, developing special material combinations that can enhance reliability is also a hot research topic. For example, Close et al. reported a 4 Mbit 1T1R PCM macro built on a 90 nm process. The PCM cells were based on doped Ge2Sb2Te5 that showed multilevel conductance operation capability. [104] A similar 4 Mb 1T1R PCM macro was reported by Sandre et al., which also used a 90 nm process and Ge2Sb2Te5 PCMs, featuring a 1 Mb s⁻¹ write throughput. [105] In addition to planar transistors, valence-change RRAM 1T1R also shows good compatibility with fin field-effect transistor (FinFET) technology, which is suitable for embedded memory applications at advanced nodes. For example, Pan et al. demonstrated the first FinFET 1T1R RRAM crossbar array using a 16 nm process of TSMC. The HfOx RRAM was realized using a process similar to that of the gate stack of a FinFET, with a cell size as small as 0.07632 μm². [106] Jain et al. from Intel demonstrated a 3.6 Mb 1T1R RRAM macro using the 22 nm FinFET process. It has achieved one of the largest device densities and the shortest sense time, as well as a low bit-error rate in RRAM programming across a wide range of temperatures. [107] The failure and cycled retention loss in HfO2-based electrochemical metallization memory (ECM) cells with the 1T1R structure were systematically investigated by Lv et al. using a 1 Kbit device array (Figure 4e–g), which paves the way for understanding the mechanism of endurance and retention failure. [108] The 1T1R fabrication cost can also be minimized by engineering the device structure. For RRAM, as reported by Lv et al., a 1 Mb 1T1R macro, using transition metal oxide-based RRAM, was developed using a 28 nm Semiconductor Manufacturing International Corporation (SMIC) process with a single extra mask for the integration of RRAMs at small fabrication cost, as shown in Figure 4h,i. The macro shows decent switching performance and high-temperature stability for embedded memory applications. [109] For PCM, Wu et al. demonstrated that only two extra masks were needed for 1T1R PCM integration, which also allows extra footprint shrinking in a 1 Mb 1T1R PCM macro using the TSMC 40 nm process. The shrinkage and electrode material engineering lead to low write current and good resistance control, with applications in computing-in-memory. [110]
Figure 4. Nonvolatile memory based on one-transistor-one-resistor structure. a) Schematic of a typical 1T1R structure using a standard 0.13 μm logic process and integrated with a memory cell based on a Cu/HfOx/Pt structure. Reproduced with permission. [94] Copyright 2014, IEEE. b) The corresponding I–V curve for the 1T1R cell shown in (a) in drain voltage (Vd) sweeping mode. c) The cross-sectional TEM image of 40 nm Ir/Ta2O5/TaOx/TaN resistive memory. Ir and TaN are top and bottom electrodes, respectively. d) The image of a 2 Mbit memory array with 40 nm 1T1R TaOx-based RRAM. Reproduced with permission. [103] Copyright 2015, IEEE. e) The schematic of the 32 × 32 1T1R array based on Cu/HfO2/Pt structure reported by Lv et al. The gates of the regularly arranged transistors and the top electrodes of the memory cells were connected by the word line and bit line, respectively. f) The corresponding cross-sectional TEM image of the 1T1R structure. The transistor was fabricated with the same processes as shown in (a). g) The test conditions of the ECM cell. Reproduced with permission. [108] Copyright 2015, Nature Publishing Group. h) The partial cross section of the memory cell in the 1 Mb embedded RRAM macro. i) The zoom-in image of the memristive cell. Reproduced with permission. [109] Copyright 2017, IEEE.

1D1R Cell and Crossbar Array
Similar to 1S1R, the 1D1R structure consists of a diode and a unipolar memristor. Such cells could achieve a footprint of 4F², like that of 1R or 1S1R, and stacking may further increase the structure density to n/4F². [111][112][113][114] Due to the rectifying function of the diode, the reading error could be avoided as the current mainly passes through the selected memory cell itself. [115][116][117] Thus, 1D1R crossbar arrays feature better 3D stacking ability thanks to the simple structure and CMOS process compatibility of the diode selectors.
The International Technology Roadmap for Semiconductors (ITRS) also suggested that the combination of a diode and transistor with a resistor in a single chip is indispensable for the prevention of this undesired sneak-path current issue. [118] The architecture of 1D1R or 1T1R can improve reading accessibility in an integrated memory array structure, [112,119–121] whereas the 1D1R architecture is preferred in terms of integration because it occupies less area, and the design and fabrication of 1D1R devices are simpler than those of 1T1R devices.
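The 4F² footprint and the n/4F² density quoted for stacked crossbars translate into cell counts per unit area with simple arithmetic. The sketch below assumes that F is the minimum feature size in nanometres, that n is the number of stacked layers, and that peripheral circuitry is ignored; the function name and the example values are hypothetical.

```python
def cells_per_mm2(feature_nm, layers=1):
    """Ideal cell density of a stacked crossbar, n / (4 F^2), per mm^2.

    feature_nm: minimum feature size F in nanometres (assumed meaning).
    layers:     number of stacked crossbar layers n.
    Peripheral circuits and routing overhead are not included.
    """
    if feature_nm <= 0 or layers < 1:
        raise ValueError("feature_nm must be > 0 and layers >= 1")
    cell_area_nm2 = 4.0 * feature_nm ** 2
    nm2_per_mm2 = 1e12  # (1e6 nm)^2
    return layers * nm2_per_mm2 / cell_area_nm2


if __name__ == "__main__":
    # Hypothetical example: F = 25 nm with one and two stacked layers.
    for n in (1, 2):
        print(n, f"{cells_per_mm2(25, n):.3g} cells per mm^2")
```

Because the estimate excludes sense amplifiers, decoders, and drivers, the densities of real chips are lower than this ideal figure.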
Based on the types of materials for fabricating diodes, the reported 1D1R could be classified as Si-based diodes, [122][123][124] organic diodes, and oxide diodes. Each of them has its own advantages and disadvantages. For example, Si-based diodes require a high-temperature process for dopant activation or enhanced contact properties, risking the rest of the fabrication processes, particularly that of memristors. Organic diodes could not be fully compatible with conventional semiconductor processes due to their vulnerability to high-temperature treatment. [125][126][127][128] Oxide-based diodes have no CMOS compatibility issue. They can also be fabricated with relatively low-temperature processes; [114,123,124,129–133] for example, Yoon et al. reported a 1D1R crossbar array, shown in Figure 5a, using physical vapor deposition methods at low temperature. [134] The top-view and cross-sectional scanning electron microscopy (SEM) images are shown in Figure 5b, showing the device structure consisting of Ti/TiO2/Pt/SiOx/Pt. The corresponding initial I–V curve of the fabricated 1D1R device is shown in Figure 5c, and its rectification ratio at V = 2 V is around 4 × 10⁵. The endurance test with set/reset/read voltages at 8/15/2 V, respectively, is shown in Figure 5d as well. However, this 1D1R configuration has not fully met the requirements of large rectification, high on/off resistance ratios, and low power consumption.
So far, some large-capacity 1D1R memristive arrays based on oxide diodes have been reported. For example, Kawahara et al. from Panasonic reported an 8 Mb RRAM macro made of two-layer 3D-stacked 1D1R crossbars using 180 nm technology. Each 1D1R cell consists of an Ir/Ta2O5/TaOx/TaN RRAM paired with a bidirectional TaN/SiNx/TaN diode, with a writing throughput up to 443 Mb s⁻¹. [135] The density of the storage can be further increased with an advanced technology node. Hsieh et al. demonstrated a three-layer 1D1R RRAM crossbar using the TSMC 28 nm HKMG CMOS Cu line process. The material stack of the RRAM is Ta/TaN/TaON/Cu, which is paired with a TaOx diode, as shown in Figure 5e. [136] Liu et al. unveiled a 32 Gb 1D1R RRAM test chip, which is one of the largest capacity RRAM chips developed so far. The chip has two-layer stacked metal oxide RRAM and diodes, fabricated using the 24 nm technology of SanDisk and Toshiba. [137] However, due to the rectifying characteristic of the diode, almost all 1D1R arrays use unipolar memristors, because bipolar memristors demand both positive and negative voltage polarities for switching. [116,138–141] Further, the device performance of bipolar memristors is generally better and more reliable compared with unipolar memristors. [142,143] Another factor is that the diode cannot provide self-compliance without a complicated device structure, like the structure of Ni/AlOy/n+-Si-TiN/HfOx/Ni reported by Liu et al. [144] Thus, the development of high-density-integrated 1D1R is greatly limited. Li et al. reported that the integrated structure of a Ni/TiOx/Ti diode and a Pt/HfO2/Cu bipolar RRAM cell could demonstrate a self-compliance bipolar resistive switching behavior to suppress the undesired sneak current in a crossbar array, [145] which paves the way to designing a highly integrated 1D1R crossbar array without the inherent obstacles of 1D1R.
Thus, high forward current density, a high rectifying ratio, low-temperature fabrication, and easy integration with the memory cell are the key requirements that should guide further diode design.
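The benefit of a large rectification ratio can be illustrated with a rough estimate. In a sneak path that crosses three cells, the middle cell is traversed from bit line to word line, that is, against the forward direction of its diode. The toy model below is an assumption made for illustration only: it uses a lumped worst-case network of (n − 1), (n − 1)², and (n − 1) parallel cells in series, treats each 1D1R cell as a linear resistor whose reverse resistance equals the forward resistance multiplied by the rectification ratio, and ignores wire resistance. Real diodes are strongly nonlinear, so the numbers are indicative at best.

```python
def parallel(r_a, r_b):
    """Resistance of two resistors connected in parallel."""
    return r_a * r_b / (r_a + r_b)


def rectified_sneak_resistance(n, r_forward, rectification_ratio):
    """Lumped sneak resistance of an n x n 1D1R array (toy model).

    The middle group of (n-1)^2 cells is assumed reverse biased, so its
    per-cell resistance is r_forward * rectification_ratio.
    """
    if n < 2:
        raise ValueError("need at least a 2 x 2 array")
    m = n - 1
    r_reverse = r_forward * rectification_ratio
    return r_forward / m + r_reverse / (m * m) + r_forward / m


def apparent_on_off_ratio(n, r_lrs, r_hrs, rectification_ratio):
    """Terminal resistance ratio (selected cell HRS vs LRS)."""
    r_sneak = rectified_sneak_resistance(n, r_lrs, rectification_ratio)
    return parallel(r_hrs, r_sneak) / parallel(r_lrs, r_sneak)


if __name__ == "__main__":
    # Hypothetical cell values; 4e5 is the ratio quoted above at 2 V.
    for ratio in (1.0, 1e2, 4e5):
        print(ratio, round(apparent_on_off_ratio(64, 1e4, 1e6, ratio), 2))
```

With a rectification ratio of 1 the model reduces to a plain 1R array, and the apparent on/off ratio of a 64 × 64 array collapses toward 1; a ratio of several 10⁵ restores a usable margin in this simplified picture.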

1BJT1R Cell and Crossbar Array
BJTs have been widely reported as the selecting devices for PCM crossbar arrays. Seravalli and Villa et al. demonstrated a 1 Gb PCM test chip based on 1BJT1R crossbar arrays. The chip is manufactured using a 45 nm process of Numonyx. Each cell has a vertical PNP–BJT selector and a Ge2Sb2Te5 PCM cell. The chip offers a 266 Mb s⁻¹ read throughput and a 9 Mb s⁻¹ write throughput. [146,147] For the RRAM, due to the limitations of CMOS processes and the planar structures of transistors, it is difficult to utilize metal–oxide–semiconductor field-effect transistors (MOSFETs) to satisfy all requirements of low-voltage operation, high scalability, and large current drivability with one single cell. Hua et al. reported a new logic-compatible BJT with a vertically formed stack underneath the resistive stacked film of TiN/Ti/HfO2/TiN as a high-performance current driver and bit-cell selector, as shown in Figure 6a. [148] The corresponding 3D RRAM array arrangement with the BJT structure is shown in Figure 6b. The shallow and tiny n-type lightly doped domain (NLDD) acts as the bit line in connection with the RRAM film, and the very thin and self-aligned p-pocket implant works as the word line (Figure 6c). Such a new 3D RRAM cell could be easily implemented in advanced CMOS logic platforms for ultrahigh-density and very-low-voltage non-volatile memory (NVM) applications due to its area-saving device structure and efficient operation driven by the high-gain BJT, with a low voltage of 2 V for the reset and 1.5 V for the set processes (Figure 6d).
Figure 5. a) The zoom-in schematic shows the device structure of one memory cell including Ti/TiO2/SiOx/Pt. b) SEM images showing the top view and cross-sectional view of the fabricated 1D1R device. c) The representative I–V curves of the fabricated 1D1R device. d) Endurance performance of the fabricated 1D1R device. The set, reset, and read voltages are 8, 15, and 2 V, respectively. Reproduced with permission. [134] Copyright 2018, Wiley-VCH. e) Illustration of large-scale industrial crossbar arrays. Cross-sectional SEM view of 28 nm TaON-based cross-point 3D via RRAM and the zoom-in TEM image of a 3D via RRAM (30 nm × 30 nm) in (e) with a stacked TaOx diode in 28 nm Cu single damascene process. Reproduced with permission. [136] Copyright 2013, IEEE.

CRS Memory Cell and Crossbar Array
CRS provides another way to avoid sneak-path current without extra access elements, at the cost of doubling the number of memristors. Each CRS cell usually has two bipolar memristors connected antiserially, in a back-to-back way. [149][150][151][152] As they share a common electrode, when one of the memristors is programmed into the LRS, the other will be programmed into a high-resistance state (HRS). [149] To obtain a stable operating window, a series resistor is normally required to introduce an asymmetry between the set and reset voltages of the devices, making a level read operation possible, as shown in Figure 7a. [150] So far, most CRS cells reported previously could be classified into two groups: 1) CRS using two symmetric memory cells. Lee et al. exhibited a CRS cell in an oxide-based RRAM device based on the inverse material order (Pt/ZrOx/HfOx/metal/HfOx/ZrOx/Pt) of two symmetric memory cells, [153] where the oxygen ion motion between the ZrOx and HfOx oxides contributed to resistive switching. Wang et al. reported a CRS device consisting of two symmetric memory cells based on a Ti/TiOx/Cu/TiOx/Ti structure, as shown in Figure 7b. [154] Other symmetrically connected pairs of memory cells have been demonstrated, like Pt/BTO/LSMO/BTO/Pt, [155] Au/a-C/CNT/a-C/Au, [156] and Pt/TiOx/TiOy/TiOx/Pt. [157] 2) CRS using two asymmetric memory cells. As the former type, with two identical memory cells connected, usually has fixed operation voltages and thus limited operation voltage windows, Lee et al. demonstrated a CRS cell with a structure of W/ZrOx/HfOx/TiN connected with TiN/Ir/TiOx/TiN, consisting of two asymmetric memory cells, as shown in Figure 7c. [158] The set/reset switching is positive/negative for the HfOx-based memory cell, which is opposite to the switching of the TiOx-based memory device.
Both of them show larger reset voltage than the set voltage, and a wide voltage-operating window in the positive-bias region has been achieved from the superimposed IÀV feature of two merged cells. Similar results have been observed in Al/Al 2 O 3 /Au/GO/ITO [159] and ITO/GO/graphene/GO/Al. [152] Although the CRS with two antiserially connected memory cells can effectively solve the sneak-path current, the integration complexity due to extra fabrication steps, rapid degradation of the common active internal electrode, etc. prohibits the implementation of large-scale CRS crossbar memory. A potential solution Reproduced with permission. [148] Copyright 2010, IEEE.
www.advancedsciencenews.com www.advintellsyst.com is a truly single memristor instead of two that can exhibit CRS. Nardi et al. proposed a CRS device based on a single memory device with the structure of TiN/HfO x /TiN. [160] However, CRS could only be observed with a uniform Hf concentration profile within the HfOx active layer. [160] Yang et al. have reported the CRS in Pd/Ta 2 O 5Àx /TaO y /Pd memory cells with two designed different stoichiometric TaO x layers: an oxygen-rich layer and an oxygen-deficient layer, and the exchange of oxygen vacancies between two layers with the gradient of oxygen composition plays a vital role in the implementation of CRS (Figure 7d). [161] Similar structures have also been reported in Au/BaTiO 3 / NiO/Pt, [162] W/Nb 2 O 5Àx /NbO y /Pt, [163] Al/GO/ITO, [164] IrO x / GdO x /Al 2 O 3 /TiN, [165] Pt/HfAlO x /TiN, [166] Pt/HfO x /TiN, [167] and Pt/TiO 2Àx /TiN x O y /TiN, [168] etc. Although there are many preliminary works on different CRS cells, several issues should be addressed before developing a high-density CRS RRAM array. In CRS, the read operation for one of the HRS involves a set transition, which requires a solution to limit the high programming current. Although the proper operation of a CRS crossbar memory array could be ensured by connecting each memory cell in series to a selector/transistor, [138,[169][170][171][172] that defeated the motivation of CRS that is selector free. A typical approach is to embed a "series resistor" into the CRS memory cell, which would limit the increase in current with the formation of a conducting filament in the switching layer. [173][174][175][176] Tappertzhofen et al. reported a novel method to realize a nondestructive readout based on a CRS cell consisting of two memory cells with similar switching properties and distinguishably different capacities. [177] Another issue is the narrow read voltage window of CRS. 
To the best of our knowledge, most of the reported RRAM devices with CRS characteristics exhibit a narrow read margin (≈0.5 V), such as Pt/SiO2/GeSe/Cu/SiO2/Pt, [178] Pd/Ta2O5−x/TaOy/Pd, [161] W/Nb2O5−x/NbOy/Pt, [163] TiN/HfOx/TiN, [160] Pt/ZrOx/HfOx/TiN/HfOx/ZrOx/TiN, [153] and W/ZrOx/HfOx/TiN/Ir/ZrOx/TiN. [158] To address this limitation, Zhang et al. proposed a new approach with an ITO/HfOx/TiN memristor to enlarge the difference between the set and reset voltages, relying on the inherent asymmetry in the O-ion exchange processes between interfaces that arises from the different reactivities of the metal electrodes. [179] This work solves the key challenge of demonstrating array-level CRS.

SRC and Crossbar Array
The aforementioned solutions, which alleviate the sneak-path current issue using an additional selector, diode, or transistor, increase the complexity and cost of the fabrication process, raise the read/write voltage, degrade the stability of the memory, and limit scaling because of the complicated device structures. Self-rectifying resistive memory could avoid these issues without extra rectifying devices.
The typical structure of a self-rectifying RRAM is metal–insulator–insulator–metal (MIIM) or MIM. A large work function difference between the top and bottom electrodes is essential for the asymmetric effective barriers seen at the two electrodes, which enable the rectifying feature. So far, self-rectifying memory devices with such bilayer device structures have been intensively studied, for example, NiSi/HfOx/TiN, [180] Ge/HfOx/Ni, [181] He-LiNbO3/Pt/SiO2/LiNbO3, [182] Pt/Ta2O5/HfO2−x/TiN, [183] Ni/HfO2/SiO2/Si diode, [184] Pt/TaOx/n-Si, [185] Al/MoOx/Pt, [186] (ITO)/InGaZnO/ITO, [187] Pt/HfO2−x/TiN, [188] Pt/amorphous In–Ga–Zn–O (a-IGZO)/TaOx/Al2O3/W, [189] Ti/SiOxNy/AlN/Pt, [190] Pd/HfO2/WOx/W, [191] Ag/a-Si/p+-Si, [192] Au/ZrO2:nc-Au/n+ Si, [193] Au/Li–ZnO/ZnO/Pt, [194] Ni/SiN/HfO2/Si, [195] Pd/HfO2/TaOx/Ta, [196] Ni/Al2O3/p-Al doped GaN (p-AlGaN), [197] Si3N4/SiO2/Si, [198] Pt/Ta2O5/HfO2−x/Hf, [199] Ti/GaOx/NbOx/Pt, [200] and Ti/NiOx/Al2O3/Pt. [201] Li et al. reported a p-Si/SiO2/n-Si memristor. The optical images and the cross-sectional transmission electron microscope (TEM) image are shown in Figure 8a–c, and the typical nonlinear I–V curve with unipolar behavior is shown in Figure 8d. Such a novel SRC exhibits repeatable unipolar resistance switching with a rectifying ratio of 10^5, an on/off ratio of 10^4 (Figure 8e), and a retention time of up to 2 × 10^5 s. [202] Moreover, the authors also demonstrated a 3D crossbar array of up to five layers of 100 nm memristors using fluid-supported silicon membranes and experimentally confirmed the successful suppression of both intra- and interlayer sneak-path currents through the built-in diodes. Kim et al. reported a forming-free memristive system based on a stacked Pt/NbOx/TiOy/NbOx/TiN structure with a 30 nm contact, showing programming currents as low as 10 nA and 1 pA for the set and reset switching, respectively. [203] The self-rectifying ratio is about 10^5. This work revealed that the programming power can be decreased to 8.0% of the power consumption of a conventional biasing scheme when the device is used in a 1000 × 1000 crossbar array with the asymmetric voltage scheme (AVS), and that the power consumption could possibly be reduced further to 0.31% of the reference value if the AVS is combined with a nonlinear selector. This kind of low-voltage operation gives the memristive device strong potential for low-power applications such as embedded memory in low-voltage or power-restricted chips.
To satisfy the strict requirements of SCM, Hsu et al. reported a forming-free and self-compliance bipolar Ta/TaOx/TiO2/Ti RRAM cell with extremely high endurance of over 10^12 cycles. [204] The self-rectification ratio achieved in this work could be up to 10^5, as required for ultrahigh-density 3D vertical RRAM. In addition, the multiple-level-per-cell capability, room-temperature processes, and fabrication-friendly materials demonstrated in this memristive system make it a promising candidate for high-density and high-performance SCM.
Normally, the growth of a bilayer dielectric structure increases the cost and complexity of manufacturing. Therefore, low-temperature compatible processes should be developed. Oh et al. reported a forming-free and self-compliance resistive switching device based on an Au/Ni/FeOx–GO/Si3N4/n+-Si structure with an excellent resistive switching ratio (greater than 10^4) and a rectification ratio higher than 10^4. [205] The solution-processed FeOx–GO active layer showed performance comparable with that of devices fabricated using vacuum deposition processes, making possible a lower fabrication cost for self-rectifying memory devices. Although the typical bilayer dielectric structure has been investigated successfully for developing self-rectifying resistive switching, developing a single material with concurrent high-performance switching and self-rectification would decrease the fabrication complexity and increase the integration level. Recently, Yao et al. reported an RRAM device based on a chiral metal–organic framework (MOF), FJU-23-H2O, with a switched hydrogen bond pathway within its channels, exhibiting an ultralow set voltage (≈0.2 V), a high ON/OFF ratio (≈10^5), and a high rectification ratio (≈10^5). [206] Its resistive switching behavior originates from the turning on/off of the switched hydrogen bond pathway under the stimulus of DC voltages. This work presents not only the first MOF with voltage-gated proton conduction but also the first single material showing both rectifying and resistive switching effects.
Figure 8 (caption, partial). e) The bias-voltage-dependent on/off conductance ratio and the rectifying ratio. f) Retention behavior tested at room temperature. The conductance states could be maintained for more than 2 × 10^5 s. Reproduced with permission. [204] Copyright 2017, Nature Publishing Group.

SSC and Crossbar Array
To date, most solutions like 1S1R, 1D1R, 1T1R, SRC, and CRS are achieved by connecting two MIM cells in series. Each solution has its unique advantage that cannot be combined with that of the alternatives, so none can completely resolve the sneak-path current issue. For example, 1) the 1S1R and 1D1R cells cannot be integrated at a high capacity due to complex fabrication (including etching issues), 2) the SRC cannot provide sufficiently low sneak currents, which are essential for large-scale integration, and 3) the CRS cell exhibits destructive read operation and high sneak currents due to its intrinsic device structure. [48] All the former solutions are stuck at an integration capacity of a megabit (10^6 bits). Indeed, a conceptually new memory cell has to be developed.
The concept of self-selective resistive switching in a single cell offers a new strategy to overcome the sneak-path current issue of a memory device in the crossbar array structure without additional stacking of active devices. By integrating two oxide layers as an insulating layer, it exhibits a selective functionality with an engineered nonlinearity. Other candidates, such as vanadium oxide (VOx) [207] with self-selective resistive switching for crossbar memory arrays, were demonstrated by Myungwoo et al., owing to the first-order MIT property. The nanoscale VOx device exhibited self-selective switching and memory switching after electroforming. Ma et al. reported other self-selective resistive switching memory cells with a thermally oxidized HfOx layer in combination with a sputtered Ta2O5 layer configured as an active stack, [208] which show a high on-state half-bias nonlinearity of ≈650, a sub-μA operating current, and high on/off ratios above 100×. Kwon et al. reported a selector-less memristor for high uniformity and low power consumption using structurally engineered nanoporous Ta2O5−x and achieved ultralow power consumption (≈2.7 × 10^−6 W). [209] Wang et al. utilized a VO2/TaOx bilayer structure to realize volatile threshold switching and multilevel nonvolatile resistive switching and applied such hybrid self-selective switching to a self-activation neural network. [210] Xu et al. reported a TiN/TiOx/HfO2/Ru self-selective device formed by a self-aligned technique, with an off-state leakage current as low as 0.1 pA and an operating current below 1 μA. [211] The LRS exhibits high nonlinearity (10^3). The programming and erasing speeds are 100 and 400 ns, respectively, and the endurance reaches 10^7 cycles. A 4 × 8 × 32 3D vertical RRAM array was further demonstrated with a sufficient read margin up to 10 Mb. Eight-layer 3D vertical RRAM with excellent scalability toward SCM was reported by Luo et al. from the same group. [212] This work successfully extended the SSC design into an eight-layer 3D array and explored the scaling limit of this architecture, with a 5 nm cell size and a 4 nm pitch in the vertical dimension demonstrated experimentally. Recently, Sun et al. realized fast and energy-efficient 2D self-selective memory cells using a high-quality van der Waals heterostructure of h-BN and graphene, as shown in Figure 9a, which is compatible with an integrated capacity of 10^12. [48] A current of 10 fA at a low voltage bias (<3 V) and an abrupt rise to a current of 10 mA at a high voltage bias were achieved in a stable memory device (Figure 9b). The atomically sharp and chemically inert interface between the h-BN and graphene layers created a rapid reading/writing process with a time constant of tens of nanoseconds (rising time: ≈50 ns and falling time: ≈15 ns), as shown in Figure 9c, outperforming the current flash memory technology. The origin of such memristive behavior is that Ag ions migrate through the h-BN layer during the memory operation and their further migration is blocked by the strongly bonded graphene; then, the boron vacancies contribute to the conductive path in another h-BN layer as the voltage is continuously increased. [48] The endurance and retention behaviors of the three resistance states involved are shown in Figure 9d,e, up to 10^6 switching cycles and 10^6 s, respectively. Such a conceptually new memory device based on a novel 2D heterostructure will open up a new research field of memory and neuromorphic computing based on low-dimensional nanomaterials.

Comparison of Various Architectures
In this part, we compare the strengths and weaknesses of each architecture. 1) The 1T1R architecture is compatible with basic operations for in-memory logic, machine learning, and neuromorphic computing, and features a mature process flow derived from DRAM technology. However, it has a relatively small device areal density due to the large footprint of planar FETs, and the device density is further limited by the difficulty of integrating 1T1Rs in 3D. 2) The 1BJT1R architecture is also compatible with basic operations for in-memory logic, machine learning, and neuromorphic computing; with vertical BJTs, it has a smaller footprint and a lower fabrication cost than planar FETs. However, BJT selectors have lower input impedance and current gain than FET selectors and tend to show a lower switching frequency. 3) The CRS architecture features a large device areal density when integrated in 3D and is also compatible with operations for in-memory logic. However, CRS reading may be destructive, incurring extra rewriting energy, and CRS suffers from integration complexity due to extra fabrication steps. It is also vulnerable to the rapid degradation of the common active internal electrode. 4) The SSC and 1D1R architectures both feature a large device areal density when integrated in 3D. In addition, 1D1R-based storage has been commercialized by Intel and Micron, branded as Optane memory. However, both SSC and 1D1R are less compatible with basic operations for in-memory logic, machine learning, and neuromorphic computing. 5) The SSC and 1S1R architectures feature a large device areal density when integrated in 3D. The bidirectional nonlinearity in their I–V characteristics allows them to work with bipolar memristors, but they face the same issue as SSC and 1D1R.
To clearly compare the performance of the architectures discussed in this Review, we summarize key parameters such as on current, on/off ratio, V_set/V_reset, polarity, operation temperature, retention, and endurance in Table 1.

Impact of Wire Resistance
In large crossbar arrays, the current passing through the metal wires would lead to significant voltage degradation, decreasing the voltage drop on the farthest cell in the crossbar array, and this finally results in write failure, which is also known as the "IR drop" issue. Such resistance affects both memory readout margin and the precision of vector-matrix multiplications. The latter poses a technical challenge to applications such as machine learning and signal processing in the analog domain.
To illustrate the impact of the wire resistance, Hu et al. use the mapping of a discrete cosine transformation matrix as an example and assume that the 64 × 64 discrete cosine transformation matrix is linearly mapped to the conductance of a memristor array in the range [0, 1 mS]. [213] If there is no wire resistance, the voltages are constant along the red row electrodes and blue column electrodes. The transformation from the forced voltages $V$ to the sensed currents $I$ is then $I = G_{\mathrm{target}} V$, where $G_{\mathrm{target}}$ is the conductance matrix of the memristor array.
Figure 9. Self-selective crossbar memory array based on van der Waals heterostructures. a) Schematic of the van der Waals heterostructure integrated with the crossbar memory array architecture. b) I–V curve of a typical memory cell in the memristor array. The four numbers represent four different resistance states of the memory cell. The selectivity of this one-body self-selective memory cell is 10^10, and the memory window is around 10^4. The Au electrode was kept connected to ground. c) The switching speed of the self-selective memory cell is about tens of nanoseconds. d) Endurance of the switching behavior of the three resistance states involved, with voltage pulse trains of 10^6 measurement cycles. e) Retention behavior of the three resistance states over 10^6 s. Reproduced with permission. [48] Copyright 2019, Nature Publishing Group.
In case the electrodes are of nonzero resistance, such as 1 Ω/block, the currents flowing through the electrodes produce voltage drops. As a result, a memristor that is far from the voltage-sourcing and/or current-sensing edge receives a reduced bias. The effect of the wire resistance can be absorbed into an effective relation $I = G_{\mathrm{eff}} V$, where $G_{\mathrm{eff}}$ is the effective conductance matrix, which clearly differs from $G_{\mathrm{target}}$, as shown in Figure 10, particularly for the memristors far from the voltage-sourcing and/or current-sensing edge. In addition, as shown in Figure 10, an increase in the wire resistance, for example, to 10 Ω/block, leads to a larger deviation between $G_{\mathrm{eff}}$ and $G_{\mathrm{target}}$, which further degrades the precision of the vector-matrix multiplication. The impact of the wire resistance can be tackled by engineering the conductance range of the memristors. For example, a large ratio between the wire and memristor conductance can reduce the voltage drops across the wires. In addition, circuit- and algorithm-level techniques have been invented to mitigate the impact of the wire resistance for machine learning. Hu et al. proposed a conversion method, based on numerically solving the Kirchhoff equations, to compute the actual memristor crossbar conductance matrix that approximates a target conductance matrix. [213] In addition, Jeong et al. developed a compact analytic compensation scheme that rescales each element of the sensed current vector by a constant. The scheme is based on the observation that the majority of the current deviation can be accounted for by a model assuming constant input voltage and conductance. [214] Liao et al. demonstrated diagonal matrix regression, where two diagonal matrices approximate the impact of the row and column wire resistance, which can balance the computational complexity and the accuracy of vector-matrix multiplication. [215] There are other circuit techniques to deal with the voltage drop issue, such as adding write drivers at both sides of the bitlines, as reported by Zhang et al. [216] Another factor is that the crossbar line capacitance could add both read/write delay time and extra current sneak paths, [48,217–219] which further degrade the performance of the memory array. Thus, in real applications where line resistance must be considered, the position of the selected cell has a significant influence on the voltage margin.
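As a rough illustration of how wire resistance starves the far cells of bias, the sketch below solves Kirchhoff's current law along a single row wire. It is a deliberately simplified one-dimensional model and not the conversion method of Hu et al.: the columns are assumed to be ideally grounded, every wire segment is assumed to have the same nonzero resistance, and the function name `row_voltages` and its parameters are hypothetical.

```python
def row_voltages(v_source, g_cells, r_wire):
    """Node voltages along one row wire driven from its left end.

    Assumptions (illustrative, not from the source): each cell connects its
    row node to an ideally grounded column, and every wire segment has the
    same resistance r_wire > 0.
    """
    n = len(g_cells)
    gw = 1.0 / r_wire  # conductance of one wire segment
    # Kirchhoff's current law at every node gives a tridiagonal system
    # with off-diagonal entries -gw.
    diag = [gw + (gw if i < n - 1 else 0.0) + g_cells[i] for i in range(n)]
    rhs = [gw * v_source] + [0.0] * (n - 1)
    # Thomas algorithm, forward elimination.
    for i in range(1, n):
        m = -gw / diag[i - 1]
        diag[i] -= m * (-gw)
        rhs[i] -= m * rhs[i - 1]
    # Back substitution.
    v = [0.0] * n
    v[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        v[i] = (rhs[i] + gw * v[i + 1]) / diag[i]
    return v


if __name__ == "__main__":
    for r in (1.0, 10.0):
        v = row_voltages(1.0, [1e-3] * 64, r)
        print(f"r_wire = {r} ohm: nearest cell {v[0]:.3f} V, farthest cell {v[-1]:.3f} V")
```

In this toy model the bias decays monotonically with distance from the driver, and a larger segment resistance lowers the bias on the farthest cell further, which is the qualitative trend described above for the full array.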

Applications in Machine Learning and Neuromorphic Computing
In addition to storage-class and embedded memory, 1R- and 1T1R-type resistive memory crossbars are frequently applied to machine learning and neuromorphic computing.
So far, 1R and 1T1R crossbars have been used for machine learning by hardware implementation of ANNs. In addition, they are also used in neuromorphic computing, or SNNs, which mimic how our brain works. As schematically shown in Figure 11, the SNN is a bioinspired neural network consisting of two types of building blocks, the neurons and the synapses.
Figure 10. The equivalent circuit of a memristor crossbar array with parasitic wire resistance. The color maps illustrate the effective conductance matrix G_eff that gradually deviates from the target conductance matrix G_target (discrete cosine transformation matrix mapped to [0, 1 mS]) with increasing wire resistance.
Figure 11. Illustration of 1R and 1T1R cells used as synapses in both SNNs and artificial neural networks (ANNs). In an SNN, the neurons communicate in spikes, which are modulated by the synapses interfacing the neurons. The neuron integrates incoming spikes and fires its own spike if the stimulation exceeds a threshold. In an ANN, the neurons and synapses are abstracted to nodes and arrows of computational graphs, representing weighted summation followed by activation and scalar–scalar multiplication, respectively. Reproduced with permission. [40] Copyright 2018, AAAS.
Synapses are junctions interfacing two neurons, which can modulate the signal transmission strength between neurons, forming the basis of our memory. Each neuron accumulates incoming spikes from upstream neurons through synapses. Once the stimulation exceeds a threshold, the neuron fires its own spike, or action potential, which propagates along its axon to reach the downstream neurons. Resistive 1R and 1T1R cells have been widely reported for their potential to serve as compact hardware synapses, by mapping the signal transmission strength to their conductance. [12,13,220–230] In addition, chemical synapses are able to change their connection strength depending on the history of signals transmitted through them. This could be replicated using the ionic or electronic switching dynamics of 1R or 1T1R resistive memory cells, which exhibit various short- and long-term synaptic plasticities. Such plasticity is the foundation of the learning capability of biological creatures. In contrast, an ANN is an abstraction of an SNN, essentially a computational graph where arrows usually represent scalar–scalar multiplications, whereas nodes stand for summation followed by nonlinear activation functions (see the left panel of Figure 11). The cascaded nonlinear transformations endow ANNs with the capability to approximate arbitrary functions, provided the size and depth of the network are sufficiently large. [231] As in SNNs, the 1R and 1T1R cells can serve as the synapses in ANNs. As the current flowing through a 1R or 1T1R cell is governed by Ohm's law, the multiplication of its conductance and voltage can be naturally mapped to the multiplication of the synaptic weight and the value of the upstream node. In addition, the summation can be automatically fulfilled by Kirchhoff's current law in crossbars, as will be discussed in the next paragraph.
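The mapping of multiplication to Ohm's law and summation to Kirchhoff's current law can be made concrete with a short sketch. It models an ideal crossbar only (no wire resistance, no sneak paths, perfectly linear cells), and the function name `crossbar_vmm` and the indexing convention are illustrative assumptions rather than anything prescribed by the works cited here.

```python
def crossbar_vmm(voltages, conductances):
    """Column currents of an ideal crossbar.

    conductances[i][j] is the cell joining input row i and output column j
    (an assumed convention); voltages[i] is the bias applied to row i.
    """
    n_rows = len(conductances)
    n_cols = len(conductances[0])
    currents = [0.0] * n_cols
    for j in range(n_cols):
        for i in range(n_rows):
            # Ohm's law gives the current of each cell; Kirchhoff's current
            # law adds the cell currents that share column wire j.
            currents[j] += conductances[i][j] * voltages[i]
    return currents


if __name__ == "__main__":
    g = [[1e-3, 2e-3],
         [2e-3, 0.0]]          # siemens
    v = [1.0, 0.5]             # volts
    print(crossbar_vmm(v, g))  # column currents in amperes
```

In hardware, all of these products and sums occur simultaneously in the analog domain; the nested loops here only spell out what the physics computes.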
Either an SNN or an ANN usually consists of a stack of assorted layers. Typical layer topologies that 1R and 1T1R crossbars have implemented include the fully connected layer, the convolutional layer, and the recurrent layer. As shown in Figure 12a, in a fully connected layer, each input neuron (node) is connected to all output neurons. Therefore, $\vec{y} = W\vec{x}$, where $\vec{x}$ and $\vec{y}$ are the vectors of input and output neurons, respectively. For simplicity, bias and activation are ignored here. $W$ denotes the weights of all the black arrows in the form of a matrix; for example, $W_{i,j}$ stands for the connection strength between the i-th input neuron and the j-th output neuron. Therefore, the weight matrix $W$ can be conveniently mapped to the conductance matrix of a 1R or 1T1R crossbar. By doing so, the vector-matrix multiplication (or weighted summation) is physically carried out by Ohm's law for multiplication and Kirchhoff's current law for summation in one computational cycle, regardless of the dimension of the matrix. This may offer a large throughput and efficiency boost over conventional digital systems, as the data are both stored and processed on the same resistive memory element, which avoids the frequent data shuttling between physically separated memory and processing units that incurs large latency and energy consumption in conventional digital hardware. [1,28,232–238] In addition to the fully connected layer, a convolutional layer, best known for its applications in computer vision, is shown in Figure 12b. The input, such as a 2D image, is scanned by a convolution window that is outlined by the green box. The subarray of the input falling within the window is multiplied element-wise with a set of kernels, followed by kernel-wise summation, which completes a stride of the convolution. As flattened kernels can be concatenated as a matrix and mapped to the conductance of a 1R or 1T1R crossbar, such a convolutional stride again becomes a vector-matrix multiplication that can be physically accelerated by crossbars, like a fully connected layer. Moreover, Figure 12c shows an LSTM layer, a widely used recurrent layer with nodes that connect to themselves via feedback loops. Such looped connections make a recurrent layer a dynamic system with an internal state that can remember historic inputs, with wide applications in temporal information processing. Here, each LSTM node consists of four gates, which add and remove information from its internal state at each time step. The vector-matrix multiplication involved in LSTM can be conveniently mapped to a 1R or 1T1R crossbar with two subarrays. One of the subarrays is multiplied with an external input vector at each time step, whereas the other subarray handles the recurrent input that depends on the output of the crossbar at the last time point.
Figure 12. Different topologies of neural network layers that have been implemented by 1R and 1T1R crossbars. a) Fully connected layer. In a fully connected layer, each input neuron connects to all output neurons. The output neuron vector is the multiplication between the input neuron vector and the weight matrix, which can be mapped to the conductance of a 1R or 1T1R crossbar. b) Convolutional layer. An input image is scanned by a convolution window. The pixels within the window are element-wise multiplied with a set of kernels before accumulation. The flattened kernels can be mapped to the conductance of a 1R or 1T1R crossbar. c) Recurrent layer. Here, a long short-term memory (LSTM) layer is used as an example. An LSTM node has an internal state that is updated by four gates. The vector-matrix multiplications of LSTM nodes can be physically implemented by two 1R or 1T1R subarrays, one for the external input and the other for the recurrent input.
The associated learning of the 1R and 1T1R crossbars can be offline, online, or a hybrid. As shown in Figure 13a, in the process of offline learning, the parameters/weights of a neural network are first learnt on an alternative computing system, such as a digital computer, before being converted to the conductance of 1Rs or 1T1Rs and physically programmed into the crossbars. The crossbar is then able to work with unseen data, that is, the inference dataset. This approach requires the least frequent programming of 1R or 1T1R crossbars, but it has difficulty adapting to hardware nonidealities, such as bad devices in the crossbar, and is unable to undertake learning in real time. As shown in Figure 13b, online learning refers to the process where the conductance of 1R and 1T1R crossbars is updated during the course of learning, which is considerably challenging as there are concurrent requirements on the programming linearity, precision, energy, and speed.
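The conversion step of offline learning, in which trained weights become conductances, can be sketched as a linear map onto a conductance window. The window [0, 1 mS] mirrors the range quoted earlier for the discrete cosine transformation example; the single-device-per-weight scheme, the function names, and the digital decoding step are assumptions made for illustration, and real systems may use other encodings.

```python
def weights_to_conductance(weights, g_min=0.0, g_max=1e-3):
    """Linearly map a weight matrix onto [g_min, g_max] (siemens).

    Returns the conductance matrix plus the scale and offset of the map
    g = scale * w + offset, which are needed to decode the outputs.
    """
    flat = [w for row in weights for w in row]
    w_min, w_max = min(flat), max(flat)
    if w_max == w_min:
        raise ValueError("weights must span a nonzero range")
    scale = (g_max - g_min) / (w_max - w_min)
    offset = g_min - w_min * scale
    g = [[scale * w + offset for w in row] for row in weights]
    return g, scale, offset


def decode_outputs(currents, voltages, scale, offset):
    """Recover the weighted sums from the sensed column currents.

    Because every conductance carries the same offset, each column current
    contains an extra term offset * sum(voltages) that is removed here.
    """
    v_sum = sum(voltages)
    return [(i - offset * v_sum) / scale for i in currents]


if __name__ == "__main__":
    w = [[-1.0, 0.0], [1.0, 0.5]]   # w[i][j]: input i to output j
    g, scale, offset = weights_to_conductance(w)
    v = [0.2, 0.1]
    currents = [sum(g[i][j] * v[i] for i in range(2)) for j in range(2)]
    print(decode_outputs(currents, v, scale, offset))
```

The offset removal is one simple way to represent signed weights with non-negative conductances; it is shown here because it keeps the example to one device per weight.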
The learning can also be classified according to the available information. For example, as shown in Figure 13c, the learning can be supervised with example input–output pairs, and the neural network will be able to learn a mapping between the input and output. In case the input data is not labeled, as shown in Figure 13d, the learning can be unsupervised, which discovers the internal structure of the dataset and is frequently used to cluster data. Figure 13e shows the scenario of reinforcement learning, where a learning agent interacts with an unknown environment. The agent receives some information about the environment (the so-called state) and a reward signal at each time point. The agent learns a strategy for applying actions to the environment that maximizes the accumulated reward signal. Such learning has triumphed over human players in games that humans were long believed to dominate. [239,240]
Figure 13. Different types of learning that have been implemented on 1R or 1T1R crossbars. a,b) In terms of where the neural network parameters are optimized, the learning can be offline, as shown in (a). The optimization is done on a digital platform before converting the parameters to conductance and programming the crossbar. In contrast, the learning can be online, as shown in (b), where the crossbar conductance is updated along the course of learning. c–e) In terms of the available information, the learning can be supervised, given data with paired labels, in which case the learning aims to find the mapping between them. Alternatively, the learning can be unsupervised if the input data is not labeled, which discovers the structure of the data, for example, by clustering them. Finally, the learning can be reinforcement learning, where an agent interacts with an unknown environment to find a strategy that maximizes the accumulated reward.
We would like to point out that different cell structures are mainly used to mitigate the sneak-path currents in reading and programming a single device. This may be less compatible with the parallel programming operations required by logic-in-memory, such as the IMPLY [241] and MAGIC [241] protocols, as well as the parallel reading used in vector-matrix multiplications [242][243][244] for both machine learning and neuromorphic computing. Thus, for data storage applications, we discuss the required performance one by one as follows.
ON/OFF ratio and/or Nonlinearity: The ON/OFF ratio or current–voltage nonlinearity of selecting devices dictates the storage capacity or the size of the memristor array. [245][246][247][248][249] An ideal selecting device would possess infinite resistance when it is unselected (e.g., biased at V_half-select) and zero resistance when it is selected (e.g., biased at V_select). In contrast, a small ON/OFF ratio will clearly impact both the read margin during reading [249] and the voltage/current delivery during programming. [247]
Retention: Threshold resistive switching selectors, such as those based on MIT, [82,250] ovonic switching, [251] and metal-filament formation/rupture, [51] feature a nonzero delay in relaxing their conductance back to the OFF state upon the cessation of selecting signals. Therefore, the retention time affects the read/write throughput, particularly if the reading or writing is conducted in a row-by-row or column-by-column fashion. Diode and tunneling [252] selectors ideally have zero retention, although, in reality, the time to establish the proper bias depends on the parasitic capacitance.
Endurance: As with retention, selectors based on threshold resistive switching usually exhibit finite endurance, that is, a limited number of switching cycles before the permanent breakdown of the dielectric layer, which limits the lifespan of the underlying data storage system. A record high endurance of 10^12 cycles has been demonstrated on NbO2 MIT selectors. [253] Up to 10^8 cycles have also been observed on ovonic [251] and metal-filament formation/rupture selectors. [51] In contrast, diodes and tunneling selectors ideally have no limit on their lifespan, as no resistive switching is needed.
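Returning to the ON/OFF ratio and nonlinearity criterion above, a back-of-the-envelope estimate shows why these figures bound the array size. The sketch assumes a V/2 biasing scheme, an n × n array without wire resistance, and the worst case in which all n − 1 half-selected cells on the sensed column sit in the LRS. Both function names and the normalization of the LRS current to 1 are hypothetical choices for illustration, not a model taken from the cited works.

```python
def half_bias_nonlinearity(i_select, i_half_select):
    """Ratio of the cell current at V_select to that at V_half-select."""
    return i_select / i_half_select


def normalized_read_margin(n, on_off_ratio, nonlinearity):
    """Worst-case (I_LRS - I_HRS) / I_total sensed on one column.

    Assumptions (illustrative only): V/2 scheme, no wire resistance, and
    n - 1 half-selected LRS cells on the sensed column, each passing
    I_LRS / nonlinearity.
    """
    i_lrs = 1.0                              # selected cell in the LRS
    i_hrs = i_lrs / on_off_ratio             # selected cell in the HRS
    i_sneak = (n - 1) * i_lrs / nonlinearity  # half-selected leakage
    return (i_lrs - i_hrs) / (i_lrs + i_sneak)


if __name__ == "__main__":
    for k in (10.0, 650.0, 1e5):
        print(k, normalized_read_margin(1024, 1e4, k))
```

Under these assumptions the margin shrinks as the array grows and recovers as the nonlinearity increases, which is the qualitative dependence described in the text.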

Example of 1R Crossbars
ANNs at UCSB: The team of Professor Dimitri Strukov was among the first to demonstrate fully connected and recurrent ANNs using RRAM 1R crossbars, which were applied to both offline and online supervised learning in pattern classification and optimization. Alibart et al. reported the first single-layer fully connected ANN made of TiO2−x RRAM crossbars to learn 3 × 3 binary patterns, via both offline and online supervised learning, [254] whereas a larger Al2O3/TiO2−x RRAM crossbar was built by Prezioso et al. to classify similar patterns. [242] A two-layer fully connected network was developed by Bayat et al. to classify 4 × 4 patterns with a crossbar of similar RRAMs, using offline supervised learning. The crossbar was paired with analog hidden neurons to get rid of the tedious analog–digital conversions. [255] In addition to fully connected ANNs, a restricted Boltzmann machine, a recurrent stochastic network, has been realized on a 20 × 20 RRAM 1R crossbar by Mahmoodi et al. [256] The key feature is the tunable stochasticity using external noisy current injection. As the amplitude of the injected noise can be correlated with the "thermal fluctuation" in an Ising model, a Hopfield network made of a 64 × 64 RRAM 1R crossbar was used by Mahmoodi et al. to implement stochastic simulated annealing, chaotic simulated annealing, as well as exponential annealing, which show faster convergence to the global energy minimum than the case without noise injection. [257]
ANNs at GIST: The team of Professor Byung-Geun Lee developed an RRAM 1R crossbar made of Pr0.7Ca0.3MnO3 (PCMO) RRAMs in collaboration with POSTECH. Using 192 PCMO cells, Park et al. implemented a single-layer fully connected ANN to classify electroencephalography signals via offline supervised learning. [258]
ANNs and SNNs at UMich: Professor Wei Lu's group developed various RRAM 1R crossbars that have pioneered many novel applications of ANNs and SNNs.
In terms of ANNs, dimensionality reduction was conducted by Choi et al. using online unsupervised learning on a TaOx RRAM 1R crossbar for principal component analysis of the breast cancer dataset. [259] A similar crossbar was used by Jeong et al. for the classification of the IRIS dataset, implementing unsupervised K-means clustering through online learning. [260] In addition, Sheridan et al. creatively found sparse representations via a locally competitive algorithm on an offline-learnt dictionary physically mapped to a 32 × 32 WOx RRAM 1R crossbar. [261] Moreover, Cai et al. developed the first integrated RRAM computing system, which comes with a 108 × 54 RRAM 1R crossbar array with on-chip sourcing and sensing circuitry as well as a reduced instruction set computer (RISC) processor built on a 180 nm technology node. [3] For optimization tasks, Shin et al. solved a 2D spin-glass problem by mapping the coupling matrix to TaOx RRAM crossbars. The total energy was minimized by flipping a randomly chosen spin if doing so reduced the total energy; otherwise, the flip was decided by a stochastic Cu-based RRAM.
In terms of SNNs, the liquid-state machine, a special SNN rooted in the concept of reservoir computing, has been demonstrated by Du et al., Moon et al., and Zhu et al., using the short-term memory of RRAM. Such systems have revealed their advantages in online supervised learning of temporal sequences, with applications in spoken number recognition, [262] chaotic series prediction, [263] and neural firing pattern classification. [25]
SNNs at Southampton: The group of Professor Themis Prodromakis devised a scheme to simulate synaptic plasticity using the switching dynamics of TiO2 RRAMs. Serb et al. demonstrated a simple fully connected SNN with hardware-encoded spike-timing-dependent plasticity (STDP) for online unsupervised learning of pattern clustering. [264]
ANNs from Polimi: Professor Daniele Ielmini's team implemented linear and logistic regressions for the first time with RRAM 1R crossbars. Sun et al. reported the training of both linear and logistic regressions on an RRAM 1R crossbar with feedback configuration, which can rapidly optimize the output layer of an ANN. [265]

7. Examples of 1T1R Crossbars
In terms of ANNs, Burr et al. first used 165 000 cells of a PCM 1T1R crossbar with an integrated peripheral circuit to build a three-layer fully connected ANN, which classified the Modified National Institute of Standards and Technology (MNIST) dataset using online supervised learning. [266] To resolve the programming linearity and symmetry challenges in online learning, Ambrogio et al. developed a novel hardware synapse by pairing PCM cells with three-transistor-one-capacitor structures, leading to accurate classification of the MNIST dataset with a four-layer fully connected ANN and of the CIFAR-10/100 datasets with a convolutional ANN. [243] Besides online learning, using a novel offline supervised learning scheme that includes noise injection and adaptive batch normalization, Joshi et al.
classified the CIFAR-10 and ImageNet datasets with a ResNet, showing that PCM 1T1R crossbars can handle even the very challenging ImageNet. [267] In addition to fully connected and convolutional networks, recurrent networks, such as LSTM, were used for offline supervised modeling of language, such as the Penn Treebank dataset, by Tsai et al. [268] Moreover, Karunaratne et al. reported hyperdimensional computing for one-shot supervised learning of language classification, where one PCM 1T1R crossbar stores the high-dimensional correspondents of low-dimensional symbols and computes n-grams using in-memory logic, whereas the other works as an associative memory for inverse Hamming distance. [269] PCM 1T1R crossbars have also been used to implement SNNs. Kim et al. reported a 256 × 256 2T1R crossbar built on 90 nm CMOS technology equipped with hardware-encoded leaky-integrate-and-fire (LIF) neurons and STDP-capable synapses for autoassociative memory. [270] An upgraded version, consisting of 1.4 Mb PCMs in 6T2R (a variant of 1T1R) units, was reported by Ishii et al. using the same technology node. It physically implemented STDP with asynchronous stochastic CMOS LIF neurons and experimentally implemented a spiking restricted Boltzmann machine for MNIST classification. [39] In addition, SNNs were used to detect spatiotemporal correlations by Pantazi et al. and Sebastian et al., using either a single-layer fully connected SNN on a PCM 1T1R crossbar [271] or PCM neurons in the same crossbar, [272] respectively. Furthermore, Wozniak et al. invented a spiking neural unit characterized by its internal integration dynamics, with applications in both ANNs and SNNs. A fully connected network on PCM 1T1R crossbars paired with such spiking neural units predicted music using online supervised learning.
[273]
ANN from ASU: Teaming up with Tsinghua, Professor Shimeng Yu reported a 16 Mb computing-in-memory macro that accommodates integrated TaOx/HfOx RRAM 1T1R crossbars and sourcing/sensing circuits using a 130 nm CMOS process, which conducted offline and online training of a fully connected ANN for MNIST classification. [274] In addition, convolutional kernels were simulated based on another computing-in-memory macro developed by Professor Jae-sun Seo's team. The chip consists of a 128 × 64 RRAM 1T1R crossbar with on-chip sourcing/sensing circuitry, as reported by Yin et al., showing high energy efficiency in classifying the CIFAR-10 dataset with offline supervised learning. [275]
ANNs and SNNs from Tsinghua: Professor Huaqiang Wu, Professor He Qian, Professor Jianshi Tang, and Professor Bin Gao's team explored various applications using ANNs and SNNs based on RRAM 1T1R crossbars.
For fully connected ANNs, Yao et al. used 1T1R crossbars made of HfAlyOx RRAMs to build a single-layer fully connected ANN to classify the Yale face database using online supervised learning. [276] They also teamed up with National Tsinghua in developing a computing-in-memory RRAM macro consisting of a 158.8 Kb 1T1R crossbar fabricated on a 130 nm process, using TaOx analog RRAM and achieving an energy efficiency of 78.4 tera operations per second per watt (TOPS/W) (1 bit input/output) in offline supervised learning of MNIST classification. The chip also features innovative sign-weighted 2T2R cells that can largely mitigate the impact of parasitic wire resistance. [277] Such fully connected networks, combined with RRAM crossbar-based finite impulse response (FIR) filters, can recognize epilepsy-related signals using offline supervised learning. [24] Besides supervised learning, Lin et al. demonstrated online unsupervised training of a generative adversarial network on a 1 Kb 1T1R crossbar to generate digits that resemble those of the MNIST dataset. [278] For convolutional ANNs, the same team also implemented supervised hybrid learning, a mixture of offline learning and online learning, on a LeNet-5 convolutional network to classify the MNIST dataset, with duplicated convolutional kernels that further speed up the convolution operation. [244] Recurrent network wise, Zhou et al. conducted image reconstruction with a Hopfield network implemented on a 128 × 8 1T1R crossbar. [279] Probabilistic models such as Bayesian neural networks have been realized on a 160 Kb RRAM crossbar by Lin et al., thanks to the tunable Gaussian distributions of the read noise of multiple RRAM cells, and were used to classify MNIST handwritten digits. [280] For SNNs, Li et al. experimentally developed a novel biorealistic SNN chip that possesses artificial dendrites made of TaOx/AlOδ RRAMs. These dendrites are paired with HfOx RRAM crossbar synapses and NbOx RRAM artificial somas.
The introduction of the dendrite enables hierarchical processing of postsynaptic signals in SNNs. [27] In addition, Liu et al. used RRAM crossbars to encode multichannel neural signals in parallel, exploiting the nonlinear resistive switching of RRAMs to extract the amplitude and variation of the inputs as conductance changes of RRAM 1T1R crossbars. [281]
ANNs and SNNs from HPE-UMass: Dr. John Paul Strachan and Dr. Miao Hu from HPE, together with Professor Joshua Yang and Professor Qiangfei Xia from UMass, have codeveloped a 128 × 64 RRAM 1T1R crossbar. The system has been used to implement offline and online learning in ANNs and SNNs, exploring different network topologies and types of learning.
ANN-wise, supervised and reinforcement learning have been implemented on fully connected networks. Hu et al. [282] and Li et al. [283] implemented single-layer and two-layer networks to classify the MNIST dataset, using offline and online supervised learning, respectively. In addition to supervised learning, Wang et al. demonstrated online reinforcement learning with three-layer fully connected networks on the same 1T1R crossbar to solve classical control problems, including cart-pole and mountain-car. [2] For convolutional networks, Wang et al. implemented a LeNet-5-like network that classified the MNIST dataset using online supervised learning. [284] Recurrent network wise, Li et al. [285] and Wang et al. [286] implemented LSTM and convolutional LSTM to classify human walking gait extracted from the USF-NIST gait dataset and small synthetic videos, respectively. For optimization tasks, Cai et al. used the intrinsic random telegraph noise as a random signal source in a similar RRAM 1T1R crossbar, where tuning the signal-to-noise ratio translates to a tunable temperature in simulated annealing. [286] Li et al. further downsized RRAMs to the nanoscale in a computing-in-memory macro using the TSMC 180 nm technology node. [287] To accelerate SNNs, Wang et al. developed diffusive memristors that feature spontaneous filament rupture due to minimization of interfacial energy. [13] Such devices have been integrated with 1T1R crossbars to perform autonomous online learning using simplified synaptic plasticity to cluster patterns [61] and used as spiking neurons in a liquid-state machine to classify MNIST. [288]
ANNs by Panasonic: Mochida et al. have developed two computing-in-memory RRAM macros, one with 2 Mb 1T1R crossbars and the other with 4 Mb, using 180 and 40 nm technology nodes, respectively. These macros classified the MNIST dataset while revealing an energy efficiency of up to 66.5 TOPS/W.
[289]
SNNs from Polimi: Professor Daniele Ielmini's group invented a novel solution to address the stochasticity of RRAM in reliably implementing a supervised variant of the STDP rule using RRAM 1T1Rs, as reported by Wang et al. The SNN powered by 1T1R synapses has been applied to spatiotemporal pattern detection and sound localization. [40]
ANNs from National Tsinghua: A series of computing-in-memory RRAM macros have been developed by the team of Professor Marvin Chang from National Tsinghua University using TSMC CMOS and RRAM technology, including a 1 Mb 1T1R crossbar macro using a 65 nm process, [290,291] a 1 Mb 1T1R crossbar macro using a 55 nm process, [292] and a 2 Mb 1T1R crossbar macro using a 22 nm process. [293] All the reported macros have been experimentally benchmarked in accelerating either fully connected ANNs or convolutional ANNs for pattern recognition via offline supervised learning, such as ResNet for the CIFAR-100 dataset, with a record-high energy efficiency of up to 121.38 TOPS/W (1 bit input) demonstrated. [293]
SNNs from Duke: Professor Hai Li and Professor Yiran Chen's team has pioneered architecture design and algorithms for resistive memory crossbars in machine learning and neuromorphic computing. [294,295] Recently, with joint efforts from National Tsinghua University, their team developed a 64 Kb RRAM macro based on TiN/Ti/HfO2/TiN RRAM crossbars built on a TSMC 150 nm process, as reported by Yan et al. [296] This macro has hardware spiking LIF neurons, which lead to an energy efficiency of 16.9 TOPS/W in offline supervised learning of classifying CIFAR-10 images.
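Several of the macros above pair crossbar synapses with LIF neurons. As a reminder of what such a neuron computes, the following is a minimal discrete-time sketch. It does not model any of the cited circuits: the function name `lif_spikes` and the parameters `leak`, `threshold`, and `v_reset` are hypothetical, and the input list stands in for the summed column current a crossbar would deliver at each time step.

```python
def lif_spikes(currents, leak=0.9, threshold=1.0, v_reset=0.0):
    """Discrete-time leaky-integrate-and-fire neuron.

    At each step the membrane potential decays by the factor `leak`,
    integrates the input, and emits a spike (1) followed by a reset
    once it reaches `threshold`.
    """
    v = 0.0
    spikes = []
    for i_in in currents:
        v = leak * v + i_in          # leak, then integrate the input
        if v >= threshold:
            spikes.append(1)         # fire
            v = v_reset              # reset the membrane potential
        else:
            spikes.append(0)
    return spikes


if __name__ == "__main__":
    # Without leakage, a constant input of 0.5 fires on every second step.
    print(lif_spikes([0.5] * 4, leak=1.0))
```

With a strong leak and a weak input the potential saturates below the threshold and the neuron stays silent, which is the filtering behavior that distinguishes a leaky neuron from a pure integrator.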
SNNs from CAS and Fudan: Professor Qi Liu, Professor Hangbing Lv, Professor Shibing Long, Professor Dashan Shang, Professor Ming Liu, and their colleagues have made important contributions to RRAM mechanisms, [297] electrical property engineering, [52,298,299] and novel material crossbars, [300] which have also led to innovations in SNNs based on 1T1R crossbars.
For example, Zhang et al. reported a single-layer ANN-to-SNN conversion enabled by compact NbO2 RRAM spiking neurons, which implemented rectified linear units (ReLUs). [301] The neurons are paired with a 640 × 10 RRAM 1T1R crossbar to classify the MNIST dataset using offline supervised learning. Besides offline training, Zhang et al. developed a hybrid analog-digital spiking neuron powered by Ag-RRAMs, which not only realized the LIF neural function but also enabled hardware-encoded synaptic plasticity in a two-layer fully hardware SNN that practiced online unsupervised learning for pattern clustering. [302] To further explore the efficiency of SNN, Zhang [303] Also, to make the SNN interact with the environment, the same group demonstrated an artificial spiking afferent nerve based on a NbO2 device for converting sensed analog signals to spiking frequency processed by the SNN, which paves the way to building a self-aware SNN machine. [26]
ANNs from NJU: Professor Feng Miao and Professor Shijun Liang's group invented an integrated sensing-processing system consisting of retinomorphic sensors made of a WSe2/h-BN/Al2O3 heterostructure and Pt/Ta/HfOx/Pt RRAM 1T1R crossbars, which implement a fully connected ANN and a recurrent ANN for letter recognition and object tracking. [304]
ANNs from UPenn and CEA-Leti: Professor Jing Li's team worked together with CEA-Leti on the development of liquid silicon, the codename of a hybrid digital-analog processor that contains HfO2 RRAM 1T1R crossbars built on a 130 nm CMOS process. As reported by Zha et al., the processor achieved an energy efficiency of 60.9 TOPS/W in conducting binary ANN inference. It also comes with a compilation framework that interfaces with high-level programming languages while optimizing hardware resources. [305] In addition to deterministic models, the stochastic programming of HfO2 crossbars has been used by Dalgaty et al.
to implement Markov chain Monte Carlo, specifically the Metropolis-Hastings algorithm. They physically sampled the posterior distribution of a Bayesian model using the conductance of the 1T1R crossbar, with applications in online reinforcement learning. [305]
ANNs and SNNs from Stanford: The work of Professor Philip Wong's team has a long-lasting impact on the advancement of PCM and RRAM technology, as well as their computing applications. [306,307] In terms of ANNs and 3D integration, Li et al. reported one-shot learning to classify European languages with high-dimensional computing, where multiplication-addition-permutation operations are experimentally carried out by four-layer 3D 1T1R crossbars. [308] In addition, the joint efforts between Professor Subhasish Mitra and Professor Philip Wong led to the birth of the first 3D nanosystem, which consists of a vertically stacked RRAM crossbar layer, carbon nanotube transistor layers, and a digital logic layer, interleaving sensing, computing, and data storage with dense connections across layers. [309] Yang et al. demonstrated the integration of 2D molybdenum disulfide (MoS2) transistors with RRAMs into a 1T1R memory cell, which has a low fabrication temperature and is suitable for monolithic 3D integration. [310] They have further integrated 2D MoS2 transistors with RRAMs into ternary content-addressable memory (TCAM) cells, which are suitable for parallel in-memory search of massive data. [311] Moreover, Feng et al. reported a fully printed flexible MoS2 memristive artificial synapse with femtojoule switching energy, showing its potential for energy-efficient neuromorphic computing, [312] and Chen et al. proposed an ideal memristive device based on 1T-phase MoS2 nanosheets, exhibiting a unique memristive behavior due to voltage-dependent resistance change. [313] In terms of recurrent SNNs, Eryilmaz et al.
reported a Hopfield network consisting of a 10 × 10 PCM 1T1R crossbar, which implemented Hebbian plasticity for associative learning of simple patterns. [314] In collaboration with National Tsinghua, the team reported a computing-in-memory RRAM macro built on the 130 nm technology node. A unique feature of this macro, as reported by Wan et al., is that there are 16 × 16 subcores, where each subcore possesses a 16 × 16 1T1R crossbar and an associated CMOS LIF neuron, on a reconfigurable communication fabric allowing flexible dataflow. It demonstrated an energy efficiency of 74 TMACS/W in implementing a restricted Boltzmann machine for image reconstruction. [315]
ANNs and SNNs from PKU: Professor Yuchao Yang and Professor Ru Huang's team and Professor Jinfeng Kang's team have not only advanced the resistive switching mechanisms [316,317] and materials, [317,318] but also ANNs and SNNs made of RRAM crossbars.
For fully connected ANNs, Jiang et al. reported a single-layer network that interfaces with a digital camera through an FPGA for offline supervised learning to recognize printed digits. [320] In addition, Zhou et al. developed a 1 Kb TaOx/HfOx RRAM crossbar using a 130 nm technology node, which can implement online supervised training of a binary multilayer fully connected ANN for MNIST recognition. [321] A novel feature of this binary network is its ability to mitigate RRAM stochasticity in encoding weights, as each weight is determined by comparing the conductances of a pair of 1T1R cells. The same crossbar has been applied to convolutional ANNs, as reported by Zhang et al., using a digital propagation module in addition to the RRAM crossbars and extra circuit-level techniques to mitigate the RRAM stochasticity. [322] For recurrent ANNs, Yang et al. devised a novel Hopfield network to conduct chaotic simulated annealing. The network is mapped to Ta/TaOx/Pt RRAM crossbars. A unique feature is that the diagonal RRAMs are programmed along the course of optimization, and the nonlinear conductance evolution increases the probability of finding the global optimum while achieving fast convergence, with applications in problems such as Max-cut. [323] In addition to ANNs, Duan et al. reported a fully RRAM-based SNN consisting of NbOx-based RRAM neurons with unique spatiotemporal integration capability and neural gain, which leads to online supervised learning of simple pattern classification and coincidence detection. [324]
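The pairwise weight encoding mentioned above for the binary network can be illustrated with a short sketch. It is only an illustration under stated assumptions, not the scheme of Zhou et al.: the function names, the nominal conductances `g_on` and `g_off`, and the Gaussian spread `sigma` are hypothetical. The point it shows is that reading a weight as the sign of a conductance difference tolerates substantial cell-to-cell variation, because only the ordering of the two conductances matters.

```python
import random


def program_pair(weight, rng, g_on=100e-6, g_off=10e-6, sigma=0.1):
    """Store a binary weight (+1 or -1) in a pair of cells.

    The cell that should be 'on' gets a noisy high conductance and the
    other a noisy low conductance (relative Gaussian spread sigma).
    Returns (g_pos, g_neg) in siemens.
    """
    high = g_on * (1 + rng.gauss(0, sigma))
    low = g_off * (1 + rng.gauss(0, sigma))
    return (high, low) if weight > 0 else (low, high)


def read_weight(g_pos, g_neg):
    """Recover the binary weight by comparing the two conductances."""
    return 1 if g_pos > g_neg else -1


def binary_dot(inputs, pairs):
    """Dot product of +/-1 inputs with the weights read from the pairs."""
    return sum(x * read_weight(gp, gn) for x, (gp, gn) in zip(inputs, pairs))


if __name__ == "__main__":
    rng = random.Random(1)
    weights = [rng.choice((-1, 1)) for _ in range(8)]
    pairs = [program_pair(w, rng) for w in weights]
    print(weights)
    print([read_weight(gp, gn) for gp, gn in pairs])
```

A single-cell encoding would instead need an absolute reference level, which is harder to keep valid when the programmed conductances fluctuate.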

Conclusions and Perspective
Memristive devices represent a promising solution for next-generation SCM due to their simple device structure, excellent scalability, fast programming, large program/erase endurance, long retention, and good compatibility with the CMOS process. To address the sneak-path current issue, different unit cell designs including 1S1R, 1T1R, 1D1R, 1BJT1R, CRS, SRC, and SSC have been systematically surveyed. Each unit cell design has its own limitations and cannot simultaneously offer all the aforementioned merits of resistive memory. For example, 1T1R and 1BJT1R lose the advantage of high-density crossbar arrays because of the additional space required for the transistor and the complicated high-temperature fabrication processes. CRS inevitably results in a destructive reading issue. 1D1R and SRC can only be paired with unipolar memories in most cases, limiting their applications. 1S1R needs further optimization of nonlinearity, on/off ratio, etc. Therefore, the search for novel material systems, device structures, and electrical operation schemes to fully unleash the potential of resistive switching memory is of ultimate importance for high-density storage memories.
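To give a sense of scale for the sneak-path issue in a selector-free array, the sketch below evaluates a common lumped worst-case estimate. The assumptions are loud ones: an n × n passive 1R array, unselected lines left floating, every unselected cell in the low-resistance state, and zero wire resistance. Under these assumptions the sneak network reduces to three groups of cells in series, containing n − 1, (n − 1)² and n − 1 cells in parallel, and it shunts the selected cell. The function names and the example resistances are hypothetical.

```python
def parallel(a, b):
    """Equivalent resistance of two resistors in parallel."""
    return a * b / (a + b)


def sneak_resistance(n, r_lrs):
    """Worst-case lumped sneak resistance of an n x n passive array.

    Three stages in series: (n-1) cells on the selected word line,
    (n-1)^2 cells between unselected lines, and (n-1) cells on the
    selected bit line, each stage being cells in parallel.
    """
    m = n - 1
    return r_lrs / m + r_lrs / (m * m) + r_lrs / m


def apparent_ratio(n, r_lrs, r_hrs):
    """Apparent HRS/LRS resistance ratio seen by the sense circuit."""
    r_sneak = sneak_resistance(n, r_lrs)
    return parallel(r_hrs, r_sneak) / parallel(r_lrs, r_sneak)


if __name__ == "__main__":
    for n in (2, 8, 64):
        print(n, round(apparent_ratio(n, r_lrs=1e4, r_hrs=1e6), 3))
```

Even with an intrinsic on/off ratio of 100, the apparent ratio in this estimate collapses toward 1 as the array grows, which is why the selector-based unit cells surveyed above are needed for large arrays.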
Meanwhile, the same set of electrical properties of memristors is critical for in-memory machine learning and neuromorphic computing, which have the potential to solve the von Neumann bottleneck and the scaling issue of transistors. 1Rs or 1T1Rs have been used as building blocks to physically implement hardware ANNs and SNNs. 1R crossbar arrays possess better scalability than 1T1R crossbar arrays, although programming is usually more expensive in terms of time and energy due to the presence of sneak-path currents. In contrast, transistors in 1T1R crossbar arrays can impose current compliance, which benefits the forming process and analog programming of resistive switches, improving the array yield. Moreover, transistors together with memristors have implemented complicated synaptic plasticity on a large scale. These advantages have led to the flourishing of 1T1R crossbar array-based computing.
However, the high energy consumption due to the high current, the larger-than-expected cell size due to the transistors, and device stochasticity are the main obstacles that hinder the commercialization of this technology. To address these issues, novel resistive switching materials such as low-dimensional materials, new device structures for synapses and neurons, and innovative circuit and algorithm designs are promising routes toward making memristive crossbars the next transformative computing technology.