Back‐End‐of‐Line Integration of Synaptic Weights using HfO2/ZrO2 Nanolaminates

In artificial neural networks, the "synaptic weights" connecting the neurons are adjusted during training. Beyond silicon, functionalizing the back-end-of-line (BEOL) of CMOS circuits with novel materials is a key enabler for deploying neural network accelerators. The hardware implementation of the synaptic weights requires linear and reprogrammable resistive elements. In ferroelectric tunnel junctions, the resistance is programmed by controlling the configuration of the ferroelectric domains with electrical pulses. From ferroelectric HfZrO4 to HfO2–ZrO2 nanolaminates (NL), the crystallization temperature drops below the upper limit of 400 °C required for CMOS BEOL. The device footprint is reduced, and the maximum-to-minimum conductance ratio increases from 7 to 32. Operated with pulses in the ultra-fast (20 ns) and biological (500 µs) timescales, the synaptic plasticity exhibits several regimes. Dynamic hysteresis mode characterization after up to 10¹¹ switching cycles indicates the coexistence of ferroelectric and non-ferroelectric effects such as defect rearrangement. Temperature-dependent transport measurements in the Ohmic (linear) regime support these conclusions. Multi-level resistive switching is achieved in HfO2–ZrO2 NL co-integrated with CMOS in the BEOL. 1T-1R operation is demonstrated, paving the way for hardware implementation of synaptic weights for in-memory neuromorphic computing.


Introduction
DOI: 10.1002/aelm.202300649
Information technology is facing an urgent challenge, as global data traffic increases faster than the performance of conventional central processing units. In particular, the data transfer between the processor and the memory limits the scaling of compute performance, a limitation known as the "Von Neumann bottleneck." [1] By collocating memory and processing, neuromorphic architectures emerged as an efficient solution for processing large datasets. [2] In particular, when high precision is not required, the vector-matrix multiplications executed at each layer of deep neural networks can be implemented in the analog domain: the vector is encoded in voltage signals, which drop across a crossbar array of non-volatile, reprogrammable resistive memories. [3] By analogy with the brain, the latter are the "synaptic weights" of the neural network. Provided that the current through each element is a linear function of the voltage (Ohmic behavior), and according to Ohm's and Kirchhoff's laws, measuring the output currents provides the sum of the individual multiplications performed at each resistive memory element.
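The analog vector-matrix multiplication described above can be illustrated with a minimal sketch. It assumes an ideal crossbar (perfectly Ohmic elements, no wire resistance, no sneak paths); the function name and the numerical values are hypothetical and are not taken from the measurements reported here.

```python
def crossbar_column_currents(row_voltages, conductances):
    """Ideal crossbar read-out: I_j = sum_i V_i * G_ij.

    row_voltages: input vector encoded as voltages (V), one per row.
    conductances: conductances[i][j] in siemens (row i, column j),
                  i.e. the synaptic weights.
    """
    n_cols = len(conductances[0])
    currents = [0.0] * n_cols
    for v_i, row in zip(row_voltages, conductances):
        for j, g_ij in enumerate(row):
            # Ohm's law gives the current through each element;
            # Kirchhoff's current law sums them along each column.
            currents[j] += v_i * g_ij
    return currents


# Hypothetical 2 x 2 array, for illustration only.
read_voltages = [0.10, 0.05]          # volts
weights = [[1e-6, 2e-6],              # siemens
           [3e-6, 4e-6]]
column_currents = crossbar_column_currents(read_voltages, weights)
print(column_currents)
```

Each output current is one element of the product between the voltage vector and the conductance matrix, which is why a linear (Ohmic) read is needed for the result to be correct.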
Ferroelectric materials emerged as interesting candidates for the hardware realization of neuromorphic architectures. [4] The hardware primitives for neuromorphic hardware build on three memory concepts based on ferroelectric materials, namely ferroelectric field-effect transistors (FeFETs), ferroelectric random access memories (FeRAMs), and ferroelectric tunnel junctions (FTJs). [5] FeFETs have three terminals: the ferroelectric field effect is used to deplete or accumulate carriers in a channel. FeRAMs [6] consist of ferroelectric capacitors. The state is stored in the configuration of the ferroelectric domains. FeRAMs have been commercially available since 1993, but function with a destructive read, leading to reliability limitations. In ordinary usage, the term "FTJ" does not only refer to ferroelectric layers across which electrons actually tunnel. It can refer more broadly to two-terminal, ferroelectric, memristive structures in which the state (or the "synaptic weight" for neuromorphic applications) is read non-destructively by measuring the resistance. [7] The ferroelectric switching occurs only above a certain threshold, which confers excellent retention and limits crosstalk in passive and active crossbars. [8] It also makes ferroelectric synapses compatible with bio-inspired schemes such as spike-timing-dependent plasticity (STDP) [9][10][11] or paired-pulse facilitation/depression. [12] Finally, ferroelectric switching is a purely field-driven mechanism, conferring excellent endurance potential (>10¹⁴ cycles) [13] for the memories. In neuromorphic circuits, ferroelectric materials are not only considered for the implementation of synaptic functionality: during a switching event, the displacement current emulates the integrate-and-fire function of neurons. [14]
The interest in ferroelectric materials for non-volatile memories and neuromorphic computing is further nourished by the Complementary Metal-Oxide-Semiconductor (CMOS) compatibility of hafnia-based ferroelectrics. [15] The first portion of integrated circuit manufacturing is known as the front-end-of-line (FEOL). FEOL FeFETs based on 28 nm technology are available from the GlobalFoundries CMOS facility: on such scaled devices, the number of ferroelectric domains is small and synaptic weights of 2 bits/cell are typically reported. [16] Integrating the synaptic weights in the back-end-of-line (BEOL) instead relaxes some constraints on the footprint. BEOL integration allows for larger devices, and consequently a larger number of ferroelectric domains. It also provides more flexibility in the choice of materials, thereby constituting an ideal platform for research on novel materials. Finally, it would allow functionalizing the upper levels of the integrated circuits, with gains in connectivity and circuit footprint. Several concepts of synaptic weights compatible with BEOL integration exist, such as ferroelectric/dielectric double-layer tunnel junctions, [17] ferroelectric field-effect transistors with an oxide channel, [18,19] or the "FeMFET." [20] In this last concept, a ferroelectric capacitor is connected by a plug to the dielectric capacitor of a Metal-Oxide-Semiconductor Field Effect Transistor.
There are, however, some challenges specific to BEOL integration: to avoid damage to FEOL components and interconnects, the process temperature must be lower than ≈400 °C. For ferroelectrics, the most critical step in terms of thermal budget is the crystallization. Zr doping favors crystallization at lower temperatures compared to other dopants such as Si. [21] Most examples of BEOL-compatible ferroelectric devices are made from a 10 nm thick hafnia-based layer: the crystallization is performed at 400 °C or less by rapid thermal annealing, [22,23] nanosecond laser annealing, [24] or millisecond flash lamp annealing. [25] BEOL integration of FTJs based on 10 nm HfO2:Si/1.5 nm Al2O3 "double-layer" technology was recently achieved by Grenouillet et al., demonstrating multi-level memory windows in scaled capacitors having an area of 0.36 μm². [26]
Hafnia-based ferroelectrics and their fluorite structures are not only interesting for their CMOS compatibility. [27] In contrast with classical perovskites, the ferroelectric, orthorhombic Pca2₁ phase in hafnia is promoted by surface energies. [28] In 1 nm thick films, not only was ferroelectricity observed, [29] but the negative differential capacitance effect was also reported. [30] Reducing the ferroelectric film thickness in ferroelectric memristors has three main technological advantages. i) Smaller voltages are required to fully switch the ferroelectric polarization. A large polarization of 2P_r = 108 μC cm⁻² was recently demonstrated in 4 nm thick HfZrO4 capacitors operated at 1.5 V, [31] and multi-level resistive switching was obtained below 1 V as the thickness becomes smaller than 3 nm. [32] ii) The resistance of synaptic weights based on ultra-thin films can on one hand vary linearly with the reading bias and on the other hand non-linearly with the programming bias. Indeed, linearity during reading (Ohmic behavior) is required for multi-level multiply-and-accumulate operations, [33] whereas during programming a rectifying behavior is beneficial to prevent crosstalk. [34] iii) Finally, reducing the thickness of the ferroelectric layer is a path to meet requirements in terms of available area and minimal read currents, in particular in the "OFF" or "high resistive state" (HRS). Lancaster et al. further underline the importance of increasing current densities in FTJ-based neuromorphic circuits, rather than focusing on the On/Off ratio alone. [35] However, the thermal budget required for the crystallization of ultra-thin films increases as the thickness decreases, [36] which has so far prevented the demonstration of two-terminal, multi-level HfZrO4 synaptic weights effectively integrated into the BEOL of CMOS.
In this work, the above-mentioned challenges are tackled by partitioning the HfZrO4 layer into nanolaminates (NL) of HfO2 and ZrO2 thin films. Atomic layer deposition (ALD) allows precise control of the oxide layer thicknesses, with a growth rate below 5 Å per cycle. Ferroelectricity was recently reported in HfO2/ZrO2 bilayers [37] and nanolaminates. [38] An epitaxial relationship between Hf-rich and Zr-rich nanolayers was observed in ALD-deposited nanolaminates, [39] also referred to as "superlattices." Hafnia-based superlattices were investigated to tailor the distribution of coercive fields for non-volatile memory applications, [40] and to increase the domain density. [41] In general, ferroelectric oxide superlattices are still relatively unexplored but are an exciting playground for the investigation of exotic phenomena [42] such as topological effects. [43]
The manuscript is organized as follows. First, the synaptic weight functionality of the nanolaminates on silicon is electrically characterized and compared to standard HfZrO4-based devices. Second, the BEOL integration of the synaptic weights based on the nanolaminates is demonstrated. Multi-level resistive switching is shown in 1R and 1T-1R configurations. Finally, the mechanisms contributing to the resistive switching are investigated.

From Solid Solutions to Nanolaminates
In view of their BEOL integration, the fabrication of HfZrO4-based synaptic weights on silicon was previously demonstrated. [44] The two-terminal devices consisted of a ferroelectric HfZrO4/WOx bilayer between two TiN electrodes. In this bilayer, the asymmetric energy band diagram of the ferroelectric junction (and as a result the conductance) varies depending on the direction of the ferroelectric polarization (toward or away from the WOx layer). The weight is thus stored in the configuration of the ferroelectric domains. The choice of WOx as an interlayer is motivated by two aspects. First, in contrast with double-layer FTJs based on interlayers such as Al2O3, using a transition metal oxide avoids reliability issues due to charge trapping in the dielectric. Second, compared to other metallic or semiconducting oxides, WOx has the advantage of being BEOL compatible, as various technologies use W metal for the interconnects. TiOx was also evaluated as an interlayer, but the fabricated devices required a wake-up process and exhibited a limited On/Off ratio. [45] The TiN/HfZrO4/WOx/TiN stack was deposited by ALD, a method enabling conformal deposition, suitable for BEOL integration on pre-processed wafers. The amorphous, as-deposited HfZrO4 was crystallized into the ferroelectric phase by millisecond flash lamp annealing at only 400 °C. Later, the developed process was applied to an actual CMOS wafer (XFAB 180 nm technology). On this substrate, the 3.5 nm thick HfZrO4 film could not crystallize at 400 °C. A thicker HfZrO4 film of 5 nm requires a lower crystallization temperature (375 °C on silicon, [46] compared to 400 °C for 3.5 nm films), and this option was evaluated. However, the read currents of the scaled devices based on 5 nm HfZrO4 on CMOS were below the target value of 10 pA. Therefore, a novel route for the fabrication of BEOL-compatible ferroelectric synaptic weights was investigated.
Figure 1 (caption, partial): As sketched in (c), after each pulse of amplitude V_w and duration t_w = 500 μs, the resistance R_100mV is measured by a DC read from −100 to 100 mV. For the same WOx interlayer thickness (60 ALD cycles), the On/Off ratio increases from 7 for the SS to 18 for the NL (red squares). By decreasing the number of ALD cycles for WOx to 45, the On/Off ratio further increases to 32 (red circles).
The crystallization temperature of ZrO2 is lower than that of HfO2, [47] and ferroelectricity was observed in nanoscale ZrO2 films grown by ALD at 300 °C. [48] Recently, ferro- and antiferroelectric orders were demonstrated in as-grown ZrO2-HfO2 superlattices deposited at only 270 °C. [49] Inspired by these results, the functionality of TiN/ZrO2-HfO2 NL/WOx/TiN synaptic weights was investigated. It is compared to TiN/HfZrO4 solid solution (SS)/WOx/TiN devices. In Figure 1a, the synaptic plasticity is characterized in circular devices (diameter: 100 μm) for three stacks, schematized in Figure 1b. The programming scheme is represented in Figure 1c. The weight is programmed with a voltage pulse of duration t_w = 500 μs and increasing amplitude V_w (upper panel in Figure 1a) on the top electrode. After each programming pulse, the resistance R_100mV is read at a DC bias of 100 mV. The reference (blue diamonds) was fabricated using 60 ALD cycles for WOx and 30 supercycles for the HfZrO4 solid solution, each alternating one cycle of HfO2 and two cycles of ZrO2. The HfZrO4 solid solution was then replaced by a nanolaminate of five supercycles, each alternating five cycles of HfO2 and ten cycles of ZrO2 (open red squares). The estimated as-deposited thickness is 5 nm for the solid solution and 4 nm for the nanolaminate. Despite this difference, the devices exhibited a similar resistance R_100mV in the "Off" state (R_OFF), that is, after being programmed with 4.4 V. As the read current scales with the area and is limited by the sensitivity of the electronics, R_OFF determines the footprint of the devices in the circuit. The resistance in the "On" state is significantly lower for the nanolaminate, with an On/Off ratio increasing from 7 to 18, two desirable features for neuromorphic applications. [35] For the same ZrO2-HfO2 nanolaminate, R_OFF and the On/Off ratio are further improved by reducing the number of cycles for WOx to 45 (red circles). Solid solution and NL devices of the same oxide thicknesses could not be directly compared, as they required different programming biases. Current-voltage characteristics represented in Figure S1, Supporting Information, however, indicate an improved R_OFF for the solid solution compared to the NL, but a smaller On/Off ratio.
Figure 2 (caption, partial): After each pulse of amplitude V_w and duration t_w = 500 μs, the resistance R_100mV is measured by a DC read from −100 to 100 mV. Below a threshold of V_w = 300 mV, R_100mV is constant. For pulses of increasing V_w (ii, iii), R_100mV increases by a factor On/Off = 26. Further pulses of positive but decreasing amplitude are then applied: in sector (iv), R_100mV increases due to contributions that are probably related to a mechanism other than ferroelectricity. These contributions are reversible: the cycle-to-cycle variation, represented in the right panel, is overall ±5%.
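As a small worked example of the deposition recipes quoted above, the sketch below counts the oxide ALD cycles of each stack and derives an average thickness per cycle from the estimated as-deposited thicknesses (5 nm and 4 nm). The per-cycle values are only averages implied by those estimates, not measured growth rates, and the function name is hypothetical.

```python
def total_oxide_cycles(supercycles, hfo2_cycles, zro2_cycles):
    """Total number of ALD cycles in a stack of identical supercycles."""
    return supercycles * (hfo2_cycles + zro2_cycles)


# Solid solution: 30 supercycles of (1 x HfO2 + 2 x ZrO2).
ss_cycles = total_oxide_cycles(30, 1, 2)
# Nanolaminate: 5 supercycles of (5 x HfO2 + 10 x ZrO2).
nl_cycles = total_oxide_cycles(5, 5, 10)

# Average thickness per cycle implied by the estimated thicknesses (nm).
ss_nm_per_cycle = 5.0 / ss_cycles
nl_nm_per_cycle = 4.0 / nl_cycles

print(ss_cycles, nl_cycles)
print(round(ss_nm_per_cycle, 4), round(nl_nm_per_cycle, 4))
```

Both stacks keep the same 1:2 ratio of HfO2 to ZrO2 cycles; only the way the cycles are grouped differs, which is what turns the solid solution into a nanolaminate.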

Cumulative Contribution Under Long Pulses
From Figure 1, we observe three regimes for the "synaptic depression" (i.e., the change from the On to the Off state). For V_w increasing from 0 to 300 mV, V_w is below the threshold and R_100mV is constant. For V_w increasing from 300 mV to 2 V, the depression is steeper than for V_w increasing from 2 to 4.4 V. For the "synaptic potentiation" (pulses of negative amplitude), the threshold is also 300 mV, followed by a steep decrease in the resistance and a saturation. We further investigate the existence of different regimes by applying a sequence of pulses with increasing amplitude from −3.2 to 3.6 V, followed by pulses of decreasing amplitude from 3.6 to −3.2 V, with a step of 100 mV. The sequence is shown in the upper panel of Figure 2. Again, the resistance R_100mV was measured after each programming pulse (black circles). In addition to the three regimes above, we observe a fourth regime (iv) in which R_100mV continues to slightly increase upon pulses of decreasing, positive amplitude. The effect is reversible: the cycle-to-cycle variation is overall ±5%. For pulses of 2 μs, this fourth regime is not visible, as R_100mV remains constant upon pulses of decreasing, positive amplitude (Figure S2, Supporting Information).
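The amplitude sequence described above can be generated as in the following sketch. It works in integer millivolts to avoid floating-point rounding, and it assumes that the maximum amplitude (3.6 V) is applied only once at the apex of the sweep; the function name is hypothetical, and the actual instrument programming is not described in the text.

```python
def triangular_amplitude_sequence_mv(v_min_mv, v_max_mv, step_mv):
    """Pulse amplitudes (mV) rising from v_min to v_max, then falling back.

    Assumption: the apex value v_max is applied once, not repeated.
    """
    rising = list(range(v_min_mv, v_max_mv + step_mv, step_mv))
    falling = list(range(v_max_mv - step_mv, v_min_mv - step_mv, -step_mv))
    return rising + falling


# Sequence quoted in the text: -3.2 V -> 3.6 V -> -3.2 V in 100 mV steps.
sequence_mv = triangular_amplitude_sequence_mv(-3200, 3600, 100)
print(len(sequence_mv), sequence_mv[0], max(sequence_mv), sequence_mv[-1])
```

After each amplitude in the list, a read at low bias (±100 mV in the text) would follow; that read step is left out of the sketch.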
We further lower the pulse duration t_w to the limit of our pulse generator. Ultra-fast programming is achieved in the synaptic weights, as t_w reaches 20 ns (Figure 3). R_100mV is measured after pulses of amplitude V_w increasing from 0 to 6 V and decreasing from 0 to −4 V. We note that for the three t_w investigated (500 μs, 2 μs, 20 ns), the variation of R_100mV exhibits a clear saturation in the ON state, but not in the OFF state, despite applying a larger bias in the positive polarity. As observed in Figure 3, for this short t_w, only two regimes are observed: i) a constant R_100mV below the threshold of 2 V, and ii) a steep depression as V_w increases from 2 to 6 V. This observation is addressed in the "Discussion" section. Similar to the results of Figure 1, the NL stack exhibits a smaller R_OFF and a larger On/Off ratio than the SS measured in the same conditions (Figure 3).

Retention in the Long Pulse Regime
The retention of the weights programmed with long pulses (t_w = 5e-3 s) was investigated using ten distinct devices. Devices #0-8 were first set in the low resistive state (LRS) by a DC voltage sweep from 2.4 to −1.6 V. Device #9 was kept pristine. After each pulse of amplitude V_write, the resistance R at V_read = 100 mV is measured by an I-V sweep from −100 to 100 mV. On device #0, a full R-V_write loop was performed: V_write increases from 0 to V_max = 2.4 V in steps of 100 mV, then decreases to −1.6 V and increases again to 0. On devices #1 to #7, V_write increases to an intermediate maximal value V_max, then decreases but remains positive. As seen in Figure S3a, Supporting Information, for devices #1 (V_max = 2 V), #6, and #7 (V_max = 2.4 V), the resistance keeps increasing upon pulses of decreasing, positive amplitude. For devices #2 (V_max = 1.6 V), #3 (V_max = 1.2 V), #4 (V_max = 0.9 V), and #5 (V_max = 0.4 V), the resistance stays constant upon pulses of positive, decreasing amplitude. To quantify the retention of these devices programmed under different conditions, their resistances were measured occasionally over 3 weeks. The results are represented as a function of the time elapsed between the programming and the measurement in Figure S3b, Supporting Information. The resistance is measured by an I-V sweep from −100 to 100 mV, visualized in Figure S3c, Supporting Information, for all the devices and dates. We observed that for the first group, programmed with a large, positive pulse, the resistance slightly drifts toward higher values: device #1: +7% change from the first to the last data point, #6: +10%, #7: +10%. Their resistance after 3 weeks exceeds the resistance of the device that was kept pristine (#9, −4%). This change of resistance in the pristine device could be caused by the read measurements, despite the small bias of ±100 mV. Devices #0 and #8, programmed in the LRS with a positive bias followed by a negative bias (−1.6 V), see their resistance increase by +7% and +9%. The devices programmed with a moderate, positive pulse see their resistance vary by: #2: −5%, #3: 4%, #4: 4%, and #5: 7%, showing that the intermediate configurations have overall better retention than the LRS and the HRS. Regardless of the programmed state, the devices exhibit good retention properties, with well-defined states maintained after several weeks.
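The drift figures quoted above are relative changes between the first and the last retention data point. A minimal sketch of that calculation follows; the function name and the resistance values in the example are hypothetical, chosen only to reproduce a +7% drift.

```python
def relative_drift_percent(r_first_ohm, r_last_ohm):
    """Relative resistance change (%) from the first to the last read."""
    return 100.0 * (r_last_ohm - r_first_ohm) / r_first_ohm


# Hypothetical example: a device read at 1.00 MOhm right after programming
# and at 1.07 MOhm three weeks later has drifted by about +7%.
drift = relative_drift_percent(1.00e6, 1.07e6)
print(round(drift, 2))
```

A positive value means a drift toward higher resistance, as reported for the devices programmed with a large positive pulse.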

Resistive Switching Dynamics
To further investigate the dual dependence of the resistive switching on the amplitude and on the duration of the pulses, resistance hysteresis loops were measured with various pulse durations, represented in Figure S4a, Supporting Information. The device was circular with a diameter of 60 μm. After each pulse of amplitude V_write and duration t_w, the resistance at V_read = 100 mV is measured by a DC sweep from −100 to 100 mV. The step for V_write is 100 mV. The maximum voltage was adjusted to avoid the breakdown of the device. For each t_w, three loops were measured to ensure that the device was in steady operation, and only the third loop is represented. The switching branches of the hysteresis loop (for pulses of decreasing and negative amplitude, and pulses of increasing and positive amplitude) are then differentiated (Figure S4b, Supporting Information). In Figure S4c, Supporting Information, the logarithm of the pulse duration t_w is represented as a function of the inverse of V_C, defined as the voltage for which the derivative is maximal. For the negative bias, the relation is linear, following the Merz law describing switching dynamics in ferroelectrics. For positive bias, a larger slope is observed as t_w exceeds one microsecond, confirming the existence of two regimes for this polarity.
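The linear relation between the logarithm of the pulse duration and 1/V_C can be checked with a simple least-squares fit, as in the sketch below. It assumes the Merz law in the form t_w = t_0 · exp(V_a / |V_C|), where the activation voltage V_a and the prefactor t_0 are fit parameters. The data are synthetic, generated from hypothetical values of V_a and t_0, and do not reproduce the measurements of Figure S4.

```python
import math


def linear_fit(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def merz_fit(pulse_durations_s, switching_voltages_v):
    """Fit ln(t_w) against 1/|V_C|; returns (V_a in volts, t_0 in seconds)."""
    xs = [1.0 / abs(v) for v in switching_voltages_v]
    ys = [math.log(t) for t in pulse_durations_s]
    slope, intercept = linear_fit(xs, ys)
    return slope, math.exp(intercept)


# Synthetic data from hypothetical parameters (not measured values).
TRUE_VA, TRUE_T0 = 10.0, 1e-9
v_c = [-1.0, -1.5, -2.0, -3.0]
t_w = [TRUE_T0 * math.exp(TRUE_VA / abs(v)) for v in v_c]
va_fit, t0_fit = merz_fit(t_w, v_c)
print(va_fit, t0_fit)
```

A change of slope within one polarity, as reported for positive bias above one microsecond, would show up as two segments that cannot be described by a single (V_a, t_0) pair.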

Two Programming Schemes: Increasing Amplitude or Increasing Duration
The programming of the synaptic weight was also achieved using pulses of constant amplitude V_w but of increasing duration t_w, as sketched in Figure S5a, Supporting Information. The resistance R_100mV was measured after each programming pulse (black circles, Figure S5b, Supporting Information). The change in the resistance for the "potentiation" (from the Off to the On state) is more gradual than for the increasing amplitude scheme, but the On/Off ratio is smaller. Both schemes require programming voltages below 5 V.
To compare the two schemes, we feed the Neurosim MLP+ Simulator [50] with our experimental results. In particular, the model requires "non-linearity" parameters for the long-term potentiation (LTP) and depression (LTD), which are estimated using an exponential fit, as described in ref. [51]. The fits for the increasing V_w scheme and the increasing t_w scheme can be found in Figure S5c,d, Supporting Information. The predicted learning accuracy on the MNIST dataset of a multilayer perceptron with 400, 100, and 10 neurons in the input, hidden, and output layers is represented in Figure 4. The complete list of parameters used for the simulation can be found in the Supporting Information. The increasing amplitude (yellow circles) and the increasing duration (blue squares) schemes both reach an accuracy of 85%; however, the increasing amplitude scheme requires fewer epochs. The use of an exponential fit for the potentiation data (blue symbols in Figure S5c,d, Supporting Information) is not optimal. We nevertheless verified that the simulation results were mostly governed by the On/Off ratio and by the "worst" non-linearity parameter, that is, that of the depression data (red symbols).
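To make the role of a "non-linearity" parameter concrete, the sketch below evaluates one commonly used exponential conductance-update model for potentiation, G(P) = B · (1 − exp(−P/A)) + G_min with B = (G_max − G_min) / (1 − exp(−P_max/A)). Whether this is exactly the form used in ref. [51] is an assumption, and the parameter values (conductance range, number of pulses, A) are hypothetical; only the conductance ratio of 32 echoes the On/Off ratio reported for the nanolaminate.

```python
import math


def ltp_conductance(pulse_number, max_pulses, g_min, g_max, a):
    """Conductance after `pulse_number` identical potentiation pulses.

    `a` sets the non-linearity: a small `a` gives a fast initial change
    followed by saturation, a large `a` approaches a linear update.
    (Assumed exponential form; see the hedging in the lead-in.)
    """
    b = (g_max - g_min) / (1.0 - math.exp(-max_pulses / a))
    return b * (1.0 - math.exp(-pulse_number / a)) + g_min


# Hypothetical parameters: normalized conductances with a ratio of 32.
G_MIN, G_MAX, P_MAX = 1.0, 32.0, 50
for a in (5.0, 10.0, 1000.0):
    half_way = ltp_conductance(P_MAX // 2, P_MAX, G_MIN, G_MAX, a)
    print(a, round(half_way, 2))
```

For a perfectly linear device the conductance after half of the pulses would sit midway between G_min and G_max; the further the printed value is from that midpoint, the stronger the non-linearity that the network training has to tolerate.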

Back-End-of-Line Integration of Synaptic Weights
The TiN/ZrO2-HfO2 NL/WOx/TiN stacks were integrated into the BEOL of XFAB 180 nm technology. The simplified process flow for the definition of the devices is found in Figure S6, Supporting Information. Optical lithography was used to pattern the layers. The devices were measured using access pads. The contacting vias on the as-received wafer have a dimension of 260 × 260 nm², whereas the junction area is 215 μm². Figure 5a represents the current-voltage characteristics of a BEOL-integrated synaptic weight after DC programming with V_w = −2 V (ON state, yellow squares) and with V_w = 1.8 V (OFF state, black squares). The read currents are well above the target value of 10 pA. After programming the devices in the LRS, we observed that the onset of the resistive switching occurred at a DC bias of 100 mV. Consequently, the reading bias was lowered to ±75 mV. In Figure 5b, the read resistance R_read is represented against the programming voltage V_w for V_read = −75 mV (filled diamonds) and +75 mV (open circles), demonstrating multi-level resistive switching.
In another test structure, the top electrode of a BEOL-integrated synaptic weight is connected to the drain of an NMOS transistor. In Figure 6, the multi-level 1T-1R operation of the co-integrated elements is shown. Before each test, the synaptic element is programmed in the ON state. In Figure 6a, the current I_N1 through the synaptic weight is represented as a function of the programming bias V_PL, while the source of the NMOS (BL) is grounded. For a gate voltage V_WL = 0.5 V (red curve) on the NMOS, the current through the synaptic weight increases with V_PL and saturates at 0.3 nA. The value of 0.3 nA corresponds to the I_d-V_G characteristics of the transistor, represented in Figure 6b. It confirms that the latter is functional and acts as a current compliance on the synaptic element. In Figure 6c, the potential V_N1 at the drain of the NMOS is represented as a function of V_PL. As the transistor limits the current, the potential at the node N1 rises (drain of the transistor, see Figure 6d). As V_PL decreases from 2 to 0 V, no hysteresis is observed (Figure 6b,c, red curves).
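The rise of the node potential under current compliance can be pictured with a deliberately simplified model, sketched below. It treats the synaptic element as a fixed linear resistor and the NMOS as an ideal current limiter with no voltage drop below saturation. Both are assumptions: the real element is rectifying at large bias and switches during the sweep. The resistance value is hypothetical; only the 0.3 nA saturation current comes from the text.

```python
def one_t_one_r_operating_point(v_pl, r_synapse_ohm, i_sat_a):
    """Current through the synapse and potential at node N1 (v_pl >= 0).

    Idealized model: below saturation the transistor drops no voltage;
    at saturation it pins the current to i_sat_a and node N1 rises.
    """
    i_unlimited = v_pl / r_synapse_ohm
    i_n1 = min(i_unlimited, i_sat_a)
    v_n1 = v_pl - i_n1 * r_synapse_ohm
    return i_n1, v_n1


R_SYNAPSE = 1e9      # hypothetical 1 GOhm element
I_SAT = 0.3e-9       # 0.3 nA compliance, as quoted for V_WL = 0.5 V
for v_pl in (0.1, 0.2, 1.0, 2.0):
    i_n1, v_n1 = one_t_one_r_operating_point(v_pl, R_SYNAPSE, I_SAT)
    print(v_pl, i_n1, round(v_n1, 3))
```

In this picture the voltage across the synaptic element is capped at I_sat · R, so a higher gate voltage (higher compliance) exposes the element to a larger programming voltage.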
For V_WL = 0.6 V (Figure 6b, orange curve), in addition to a saturation of I_N1 at 5 nA, a hysteresis is observed, indicating that the synaptic element partially switched. For this regime, a memory window of 45 mV opens in the V_N1-V_PL representation (Figure 6c, orange curve), which under this read condition also depends on the selected gate voltage of the NMOS. As V_WL increases (green, blue, and purple curves), the compliance current during the V_PL sweeps to 2 V increases. As a result, the synaptic weight is further programmed from the ON toward the OFF state, and a memory window of up to 200 mV in the V_N1-V_PL representation can be expected for a read voltage of 0.5 V applied to the gate of the NMOS. The co-integration of BEOL HfO2/ZrO2 synaptic weights and CMOS in this configuration is, for example, the building block of a 1T-1R array, paving the way to the hardware implementation of neuromorphic computing. [52]

Discussion: Resistive Switching in ZrO2-HfO2 Nanolaminates/WOx Structures
Further electrical characterization was performed on the nanolaminate stack on silicon. First, a ferroelectric analyzer was used to cycle eleven distinct devices. Triangular pulses of amplitude 3 V were applied with a frequency of 500 kHz (Figure 7a). Nine times per decade of cycles, a dynamic hysteresis mode (DHM) measurement (3 V, 500 kHz) was executed to measure the DHM remanent polarization (DHM P_r). It is calculated from the current-voltage measurements (Figure 7b), during four bipolar triangular excitation signals, as explained in ref. [53]. The contribution of the leakage currents dominates, resulting in a DHM P_r orders of magnitude larger than the actual ferroelectric polarization. In such leaky thin films, the latter cannot be quantified using the PUND technique. [53] The endurance test shows good reproducibility from device to device and no dielectric breakdown after 10¹¹ cycles. The effect of the endurance test on the device functionality is shown in Figure S7, Supporting Information. The devices were programmed in the low (resp. high) resistive state, LRS (resp. HRS), with −3 V DC (resp. +3 V DC). The resistance-area product was measured by a DC sweep from −100 to 100 mV. A large On/Off ratio (>20) is maintained for devices cycled 10⁷ times. However, it drops to 2 for the device cycled 10¹¹ times, along with a decrease of the OFF resistance by a factor of 390.
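As an order-of-magnitude check on the endurance test, the sketch below computes the pure cycling time needed to reach 10¹¹ cycles at 500 kHz and lists log-spaced measurement checkpoints. The choice of checkpoints at 1, 2, ..., 9 times each power of ten is an assumption about what "nine times per decade" means, and the time spent on the DHM measurements themselves is ignored.

```python
def cycling_time_hours(n_cycles, frequency_hz):
    """Time spent cycling (hours), excluding any measurement overhead."""
    return n_cycles / frequency_hz / 3600.0


def decade_checkpoints(last_decade):
    """Cycle counts 1..9 x 10^d for d < last_decade, plus 10^last_decade.

    Assumed reading of 'nine measurements per decade of cycles'.
    """
    points = [m * 10 ** d for d in range(last_decade) for m in range(1, 10)]
    points.append(10 ** last_decade)
    return points


hours = cycling_time_hours(1e11, 5e5)
checkpoints = decade_checkpoints(11)
print(round(hours, 1), len(checkpoints), checkpoints[-1])
```

Under these assumptions the cycling alone takes a little over two days per device, which helps explain why such endurance data are usually collected on a limited number of devices.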
From Figure 7b, we observe that the pristine cycle (black curve) shows two local maxima for the positive bias, which could indicate the presence of an as-grown antiferroelectric phase, as observed in HfZrO4-ZrO2 nanolaminates. [54] From 5 × 10⁵ to 10¹¹ cycles, we observe the presence of one local maximum for each polarity, characteristic of ferroelectric materials. The voltage at which the negative maximum is observed shifts from −0.85 to −0.36 V upon cycling, whereas no shift is observed for the positive branch. This could indicate a different voltage drop across the WOx interlayer before and after cycling. The stoichiometry of the WOx interlayer and the possible presence of an antiferroelectric phase are currently under study.
From Figure 7b, we also see that the ferroelectric switching for the positive branch clearly occurs below 2 V. Earlier, we underlined that no saturation of the resistive switching was observed upon the application of a higher bias. We also observed one (resp. two) additional regime(s) for the depression, as the pulse width increased from 20 ns to 2 μs (resp. 500 μs). Based on the ferroelectric characterization, this is understood as follows: for short pulses, the resistive switching is purely dominated by electrostatic effects, caused by the switching of ferroelectric domains. As the pulse width and the pulse amplitude increase, mechanisms other than ferroelectricity contribute. As observed by Halter et al. in HZO/WOx ferroelectric field-effect transistors, [25] it could be linked to oxygen exchanges between the HfO2/ZrO2 nanolaminate and WOx. In the proposed stack, the ALD growth of the oxide layers is enhanced by an oxygen plasma. In HfO2/ZrO2 superlattices, Zhao et al. reported on the effect of plasma time on the dominance of the ferroelectric over the antiferroelectric character, underlining the crucial role of oxygen vacancies in functionality. [55]
Last, we investigate the origin of the resistive switching by means of temperature-dependent transport measurements. First, the device is heated to 80 °C and thermalized for 5 min. A DC sweep from 2 to −2 to 2 V is applied to set the device in the OFF state, or HRS. Then a high-resolution I-V sweep is taken between −100 and 100 mV, with a step of 5 mV. A DC sweep from −2 to 2 to −2 V is applied to set the device in the ON state, or LRS. Again, a high-resolution I-V sweep is taken between −100 and 100 mV. The experiment is repeated for decreasing temperatures. The logarithm of the corresponding current densities J as a function of Log(V) is reported in Figure 8a. For each temperature, the experimental data are fitted by a linear regression in the range [0, 50 mV]. At small bias, the conduction through the synaptic weights is linear (Ohmic), ideal for implementing vector-matrix multiplications. At a larger bias, a rectifying behavior appears (Figure 8b). Direct tunneling conduction also exhibits a linear regime for small bias; this mechanism was considered but discarded as the dominant mechanism for two reasons. First, as seen in Figure 8a, a large change in the current density (≈2 decades) is observed for a moderate temperature change (60 K), which is unexpected: direct tunneling conduction is not thermally activated. Second, fitting the experimental data with a Brinkman model, as proposed by Gruverman et al. [56] (see Figure S8, Supporting Information), required assuming a thickness of only 3 nm for the insulating layer, which is 25% thinner than the expected thickness. These figures are consistent with the observation of direct tunneling current in 2.8 nm thick HfZrO4 films deposited by sputtering in high vacuum conditions. [57] In insulators, the Ohmic conduction is due to the drift of a small number of mobile electrons in the material's conduction band. [58] In ALD thin films, they can originate from hydrogen incorporation.
[59] Ohmic conduction is analytically described by

$$J = \sigma E = q \mu N_C E \exp\left[-\frac{E_C - E_F}{kT}\right] \qquad (1)$$

or, writing the field as $E = V/t$ for an applied voltage $V$,

$$J = \frac{q \mu N_C}{t}\, V \exp\left[-\frac{E_C - E_F}{kT}\right] \qquad (2)$$

where J is the current density, σ the electrical conductivity, μ the electron mobility, q the electronic charge, N_C the effective density of states of the conduction band, t the sample thickness, E_C − E_F the energy difference between the conduction band and the Fermi level, k the Boltzmann constant, T the absolute temperature, and E the electric field across the ferroelectric layer.
In Figure 8c, the intercepts of the linear regressions obtained from Figure 8a are plotted as a function of the inverse of the temperature (1/T) for the LRS (blue symbols) and the HRS (red symbols). Again, the data are fitted by linear functions: y = a·x + b. Assuming that μN_C is independent of the temperature, we see from Equation (2) that the slope is set by the energy barrier, a = −(E_C − E_F)/k (up to a factor ln 10 if decimal logarithms are used), and that the intercept is Log(qμN_C/t) = b. We measure a change in the energy barrier from 0.26 eV in the LRS to 0.41 eV in the HRS, which could originate from the change in the electrostatic screening of the polarization charges. In addition, the prefactor qμN_C/t decreases by more than one order of magnitude from the LRS to the HRS. This contribution could point to a modification of the bulk properties of the insulating stack, supporting the scenario of oxygen exchanges between the NL and the WOx interlayer.
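The extraction of the energy barrier from the temperature dependence can be sketched as an Arrhenius fit. The example below assumes the thermally activated Ohmic form ln(J/V) = b − (E_C − E_F)/(kT), uses natural logarithms, and generates synthetic data from a hypothetical prefactor with a barrier of 0.26 eV (the LRS value quoted above). The temperature points are also hypothetical; only the 80 °C upper bound and the 60 K span come from the text.

```python
import math

K_B_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def linear_fit(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def energy_barrier_ev(temperatures_k, ln_intercepts):
    """Barrier E_C - E_F (eV) from ln-intercepts plotted against 1/T."""
    inverse_t = [1.0 / t for t in temperatures_k]
    slope, _ = linear_fit(inverse_t, ln_intercepts)
    return -slope * K_B_EV_PER_K


# Synthetic intercepts: hypothetical prefactor, 0.26 eV barrier.
temps = [293.15, 313.15, 333.15, 353.15]
LN_PREFACTOR = 5.0
synthetic = [LN_PREFACTOR - 0.26 / (K_B_EV_PER_K * t) for t in temps]
barrier = energy_barrier_ev(temps, synthetic)
print(round(barrier, 3))
```

If decimal logarithms are used for the intercepts, as the "Log" notation suggests, the slope must additionally be multiplied by ln 10 before converting it to an energy.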
Based on the results and discussion, the following model is proposed. After applying a negative bias on the top electrode, the polarization points toward the latter. The WOx is rich in oxygen. For thick WOx (Figure 1), the resistance of the WOx compares to that of the HZO when the device is in the LRS. After applying pulses of short duration or moderate amplitude, the polarization partially switches, resulting in a change in the energy of the conduction band in HZO. This effect depends on the polarization, and, as observed in fatigued devices (Figure S7, Supporting Information), on the leakage. For larger, positive pulses, oxygen ions move from the WOx into the HZO-NL, filling the vacancies responsible for the LRS. In this configuration, the WOx layer is more resistive than in the LRS, but the resistivity of the device is dominated by the HZO-NL. The oxygen transfer from WOx to HZO is assisted by the ferroelectric domain switching: the effect is remanent, and the resistance remains constant in time until a sufficiently large negative field pushes the oxygen back into the WOx layer. This change in resistance is localized at the interface with WOx; therefore, it is more visible in thin HZO layers.

Conclusion
From silicon to the back-end of integrated circuits, the window for the stabilization of ferroelectric thin films narrows. In earlier work, FTJ structures intended to be used as ferroelectric synaptic weights based on solid solutions of HfZrO4 (in combination with WOx) showed promising properties but turned out to be difficult to co-integrate with CMOS. Instead, in this work, ZrO2-HfO2 nanolaminates were investigated. The synaptic functionality was improved: the larger read currents allow a smaller footprint, and the On/Off ratio increased from 7 to 32. The synaptic weights can be programmed using pulses of varying amplitude or duration; neural network simulations predicted a similar learning accuracy for both schemes, but a faster learning rate for the constant duration/increasing amplitude scheme. The nanolaminate stack crystallizes at a lower temperature than the solid solution. Two-terminal structures were successfully added to the back-end of XFAB 180 nm technology, exhibiting multi-level resistive switching. The co-integration with CMOS was demonstrated with 1T-1R operation, in a design connecting the drain of an NMOS to the synaptic element. The synaptic plasticity showed different regimes depending on the pulse duration, from ultra-fast operation (20 ns pulses) to biological timescales (500 μs). Ferroelectric and temperature-dependent transport measurements indicate a cumulative and reversible contribution of ferroelectric field effects and defect rearrangement.

Experimental Section
Device Preparation: The as-received CMOS wafer was treated by a 100 W oxygen plasma for 2 min. The active stack was then deposited by plasma-enhanced atomic layer deposition (PE-ALD) on the CMOS wafer and on the conductive Si++ test substrates: 20 nm of TiN was deposited at 300 °C with tetrakis(dimethylamino)titanium and N2 as precursors. 60 or 45 cycles of WOx were deposited at 375 °C with (BuN)2W(NMe2)2 and O2. Then, for the solid solution, 30 supercycles were deposited at 300 °C, alternating one cycle with tetrakis(ethylmethylamino)hafnium(IV) and O2, and two cycles with bis(methylcyclopentadienyl)(methyl)(methoxy)zirconium(IV) and O2. For the nanolaminates, five supercycles were deposited at 300 °C, alternating five cycles with tetrakis(ethylmethylamino)hafnium(IV) and O2, and ten cycles with bis(methylcyclopentadienyl)(methyl)(methoxy)zirconium(IV) and O2. Ten additional nanometers of TiN were deposited. The crystallization was performed with the millisecond flash lamp annealing technique: [60] the sample was preheated to 400 °C for the solid solution, and 375 °C for the nanolaminate. Then a 20 ms long energy pulse of 90 J cm−2 was applied. A 50 nm thick W metal electrode was then deposited by sputtering. The top electrode, defining the area of the junction, was patterned by optical lithography and reactive ion etching (RIE) of the W and top TiN layers. Using this method, the HZO layer acted as an etch stop. For the synaptic weights co-integrated to the CMOS, the bottom electrode was then defined by optical lithography and ion beam etching of the SL, WOx, and TiN layers. A 100 nm thick SiO2 passivation layer was deposited at 300 °C by plasma-enhanced chemical vapor deposition (PECVD). Vias to the device's top electrode and top electrode contact were defined by optical lithography. The SiO2 layer was etched by RIE, and then the SL and the WOx were etched by ion beam etching, exposing the TiN layer to air. The etch was immediately followed by the sputtering of 100 nm of W (second metal layer, M2). Metal lines were then defined by optical lithography and etching by RIE.
Materials Characterization: The growth rate for as-deposited HfO2 and ZrO2 was calibrated using ellipsometry. A Si wafer was thermally oxidized and diced into chips. On one chip, 100 cycles of HfO2 were deposited. On the other chip, 200 cycles of ZrO2 were deposited. The experimental data (solid lines in Figure S9, Supporting Information) was fitted with a WVASE model. For the HfO2/SiO2//Si stack (a), the layers were 18.6 and 231 nm, respectively: five ALD cycles corresponded to 0.93 nm. For the ZrO2/SiO2//Si stack (b), the layers were 19.4 and 231 nm, respectively: ten ALD cycles corresponded to 0.97 nm.
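The calibration arithmetic above can be reproduced with a short sketch. It derives the growth per cycle from the ellipsometry thicknesses and combines it with the supercycle recipes given under Device Preparation to obtain nominal stack thicknesses. This assumes that the as-deposited growth rates measured on SiO2 add linearly in the stack and that crystallization does not change the thickness, neither of which is verified here. The function names are hypothetical.

```python
def growth_per_cycle(thickness_nm: float, cycles: int) -> float:
    """Average growth per ALD cycle (nm/cycle) from a calibration film."""
    return thickness_nm / cycles


# Ellipsometry calibration: 100 cycles of HfO2 and 200 cycles of ZrO2.
GPC_HFO2 = growth_per_cycle(18.6, 100)  # 0.186 nm per cycle
GPC_ZRO2 = growth_per_cycle(19.4, 200)  # 0.097 nm per cycle


def stack_thickness(supercycles: int, hf_cycles: int, zr_cycles: int) -> float:
    """Nominal thickness (nm) of a stack built from identical supercycles.

    Assumption: the calibrated growth rates add linearly across sublayers.
    """
    return supercycles * (hf_cycles * GPC_HFO2 + zr_cycles * GPC_ZRO2)


# Sublayer thicknesses quoted in the text.
hf_sublayer_nm = 5 * GPC_HFO2    # five HfO2 cycles
zr_sublayer_nm = 10 * GPC_ZRO2   # ten ZrO2 cycles

# Recipes: nanolaminate = 5 x (5 Hf + 10 Zr); solid solution = 30 x (1 Hf + 2 Zr).
nl_nm = stack_thickness(5, 5, 10)
ss_nm = stack_thickness(30, 1, 2)

print(f"HfO2 sublayer: {hf_sublayer_nm:.2f} nm, ZrO2 sublayer: {zr_sublayer_nm:.2f} nm")
print(f"Nominal nanolaminate: {nl_nm:.1f} nm, nominal solid solution: {ss_nm:.1f} nm")
```

These are nominal values only. The thickness of the crystallized films on WOx may differ from this linear estimate.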
Electrical Characterization: Electrical measurements were performed on an Agilent B1500A semiconductor analyzer with a B1530A waveform generator/fast measurement unit (WGFMU). Write pulses were generated by the remote-sense and switch unit (RSU) module of the WGFMU close to the probe and applied to the top electrode while the bottom electrode was grounded. During programming, the current flowing through the device was not measured. The smallest current measurement range of the WGFMU was 1 μA, so the WGFMU could not be used for measuring the device resistance after programming (10-100 nA). Instead, the device resistance was measured at V = ±100 mV with a high-resolution source measurement unit (SMU) at the top electrode while the bottom electrode was grounded. Ferroelectric characterization (DHM and Fatigue mode) was performed on an AixACCT TFA2000.

Figure 1. a) Plasticity in synaptic weights with HfZrO4 solid solution (SS, blue curve) and HfO2/ZrO2 nanolaminates (NL, red curves). The atomic layer deposited (ALD) stacks are schematized in (b). As sketched in (c), after each pulse of amplitude V_w and duration t_w = 5e-4 s, the resistance R_100mV is measured by a DC read from −100 to 100 mV. For the same WOx interlayer thickness (60 ALD cycles), the On/Off ratio increases from 7 for the SS to 18 for the NL (red squares). By decreasing the number of ALD cycles for WOx to 45, the On/Off ratio further increases to 32 (red circles).

Figure 2. After each pulse of amplitude V_w and duration t_w = 5e-4 s, the resistance R_100mV is measured by a DC read from −100 to 100 mV. Below a threshold of V_w = 300 mV, R_100mV is constant. For pulses of increasing V_w (ii, iii), R_100mV increases by a factor On/Off = 26. Further pulses of positive, but decreasing, amplitude are applied: in sector (iv), R_100mV increases due to contributions that are probably related to a mechanism other than ferroelectricity. These contributions are reversible: the cycle-to-cycle variation, represented in the right panel, is overall ±5%.

Figure 3. Resistance R_100mV measured after programming with V_w and t_w = 20 ns. Ultra-fast switching is achieved in HfZrO4 solid solution (SS, blue curve) and HfO2/ZrO2 nanolaminate (NL, red curves), but with a larger On/Off ratio in the latter case. The pulse width of 20 ns is the resolution limit.

Figure 4. Results from the Neurosim MLP+ simulator: the accuracy of an ANN with 400, 100, and 10 neurons in the input, hidden, and output layers is estimated (MNIST dataset) as a function of the number of epochs, for the increasing amplitude (yellow circles) and increasing duration (blue squares) schemes.

Figure 5. a) DC I-V read after programming the BEOL integrated synaptic weight in the ON state (yellow squares) and the OFF state (black squares), above the 10 pA target. b) After programming the device with an amplitude V_w, the read resistance is measured at −75 mV (filled diamonds) and +75 mV (open circles): multi-level resistive switching is obtained.

Figure 6. Multi-level, 1T-1R operation. a) Current I_N1 through the synaptic weight, as a function of the programming voltage V_PL, for various gate voltages V_WL on the NMOS transistor. b) Drain current-gate voltage characteristics of the NMOS, showing that the compliance observed in (a) originates from the latter, giving rise c) to a potential V_N1 at the source of the NMOS.

Figure 7. Endurance tests under high stress (3 V, 5e5 Hz) on eleven devices. a) The remanent polarization DHM P_r is measured using the dynamic hysteresis mode method, showing no dielectric breakdown after 1e11 cycles. b) DHM P_r is calculated from the current DHM I, to which the leakage current and the displacement current contribute. The first loop (black curve) shows two current peaks.

Figure 8. Temperature-dependent measurements. a) Current density J as a function of the voltage after setting the synaptic weight in the LRS (circles) and the HRS (diamonds). At small bias, linear regressions in the Log(J)-Log(V) representation have a slope of 1 (Ohmic conduction). b) At higher bias, the synaptic weights have a rectifying behavior. c) Arrhenius plot of the intercepts of the linear regressions obtained in (a) for the LRS (blue symbols) and the HRS (red symbols).