Deep Emulators for Differentiation, Forecasting, and Parametrization in Earth Science Simulators

To understand and predict large, complex, and chaotic systems, Earth scientists build simulators from physical laws. Simulators generalize better to new scenarios, require fewer tunable parameters, and are more interpretable than nonphysical deep learning, but procedures for obtaining their derivatives with respect to their inputs are often unavailable. These missing derivatives limit the application of many important tools for forecasting, model tuning, sensitivity analysis, or subgrid‐scale parametrization. Here, we propose to overcome this limitation with deep emulator networks that learn to calculate the missing derivatives. By training directly on simulation data without analyzing source code or equations, this approach supports simulators in any programming language on any hardware without specialized routines for each case. To demonstrate the effectiveness of our approach, we train emulators on complete or partial system states of the chaotic Lorenz‐96 simulator and evaluate the accuracy of their dynamics and derivatives as a function of integration time and training data set size. We further demonstrate that emulator‐derived derivatives enable accurate 4D‐Var data assimilation and closed‐loop training of parametrizations. These results provide a basis for further combining the parsimony and generality of physical models with the power and flexibility of machine learning.

models (Eyring, Bony, et al., 2016) provide derivatives, and even advanced operational models use approximations and lower resolutions to do so (Trémolet, 2004).
Three main strategies presently exist for obtaining simulator derivatives. First, for simple simulators with low-dimensional system states, finite differences have been used (Smith et al., 1985), but they are inefficient, numerically fragile, and scale poorly. Second, gradient calculation routines have been implemented for specific simulators such as numerical weather prediction systems (Wedi et al., 2015; Weng & Liu, 2003), and some tools exist for automating this task (Bischof et al., 1992). These "by-hand" routines can be efficient and stable but are cumbersome to maintain, require immense effort for realistic Earth science simulators, and limit experimentation with new models, parametrizations, or discretizations. Third, the entire simulator can be reimplemented in an automatic differentiation ("autodiff") framework (Linnainmaa, 1976), which uses a fixed set of numerical operations with known derivatives. While this is a promising long-term strategy (Hu et al., 2019; Rackauckas & Nie, 2017), it has not been demonstrated for realistic simulators and would discard decades of software development.
Here, we pursue a different approach for calculating missing derivatives: "translating" the simulator into an autodiff environment as an emulator, a deep neural network (NN) that learns from simulations. Such emulators have been studied in Earth science for their potential to provide faster and computationally cheaper numerical simulations, sensitivity analysis, and model uncertainty estimates (Dueben & Bauer, 2018; Fablet et al., 2018; Reichstein et al., 2019). We show that the trained emulator's autodiff derivatives closely match the missing derivatives of the original simulator, without requiring gradient routines or autodiff support for simulator code. This approach can leverage the mature autodiff environments developed and tested by the machine learning (ML) community (Abadi et al., 2016; Bezanson et al., 2017; Hu et al., 2019; Paszke et al., 2019; Rackauckas et al., 2018).
We develop and test our approach using the Lorenz-96 model (L96) (Lorenz, 1996), a system of nonlinear differential equations modeling atmospheric chaos and a standard benchmark for data analysis and ML tasks in Earth science (Dueben & Bauer, 2018). L96 is a good test case for our purposes because it can be precisely differentiated without ML to measure the accuracy of emulator derivatives. We systematically investigate how accuracy depends on integration time and training data set size and show how emulators can learn from small system state fragments. We also demonstrate how designing the emulator's architecture to match the simulator's mathematical structure improves accuracy and data efficiency (Bocquet et al., 2019). Our work incorporates insights and techniques from previous studies using NNs to predict or emulate Earth science simulators (Chattopadhyay et al., 2019; Dueben & Bauer, 2018; Fablet et al., 2018; Scher & Messori, 2019) but is, to our knowledge, the first to estimate derivatives through simulation-trained emulators.
We further demonstrate the utility of emulator derivatives for two important downstream applications in Earth science. We first test strong-constraint 4D-Var (Lorenc, 1986), a simulation-based data assimilation technique (Carrassi et al., 2018; Ide et al., 1997) that estimates system states from noisy and incomplete observations. It is used in state-of-the-art weather prediction systems (Wedi et al., 2015) but requires the simulator's derivatives (Courtier & Rabier, 1997; Errico, 1997). We show that emulator derivatives can produce 4D-Var forecasts as accurately as true simulator derivatives.
We also apply emulator derivatives for learning parametrizations, corrective terms approximating physical, chemical, or biological processes not spatially resolved in the system state (Kain & Fritsch, 1993; Stensrud, 2009). Designing these requires extensive work and domain expertise (Gross et al., 2018; Hourdin et al., 2016), and replacements based on deep learning have garnered considerable attention (Karpatne et al., 2017; Reichstein et al., 2019; Schneider et al., 2017). Without available simulator derivatives, parametrizations are trained on their immediate inputs and desired outputs before being coupled to the simulator (Brenowitz et al., 2020; Gross et al., 2018; Seifert & Rasp, 2020). This "offline" training mode cannot account for simulator-parametrization interactions at runtime, potentially leading to unrealistic behavior and numerical instabilities (Rasp, 2020; Yuval & O'Gorman, 2020). We show that emulator derivatives allow highly accurate parametrizations to be learned in an online, "solver-in-the-loop" mode that was previously restricted to a narrow set of simulators with available derivatives (Obiols-Sales et al., 2020; Ramadhan et al., 2020; Sanchez-Gonzalez et al., 2020; Um et al., 2020).

Simulators
We consider a simulator with time-varying system state $x \in \mathbb{R}^K$ (e.g., pressure or temperature across a spatial grid). The simulator uses an explicit, fixed-time-step numerical scheme to integrate tendencies $f$ that are functions of the current state $x_t$, resulting in a deterministic state update $\mathcal{M}$:

$$x_{t+\Delta} = \mathcal{M}(x_t)$$

with step-size Δ > 0. The integration scheme can be forward Euler ($\mathcal{M}(x) = x + \Delta f(x)$) or a higher-order method (cf., Supporting Information Text S2). A simulation is a state sequence $\{x(t_0), x(t_0 + \Delta), \ldots, x(t_0 + n\Delta), \ldots, x(t_0 + N\Delta)\}$ generated by repeated application of $\mathcal{M}$. For fixed step-size, we will denote the state sequence as $\{x_0, x_1, \ldots, x_n, \ldots, x_N\}$ with $x(t_0 + n\Delta) = x_n$.
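The relation between tendencies, state update, and simulation can be illustrated with a minimal sketch. It assumes a forward-Euler update and a toy tendency f(x) = -x; the names `euler_step`, `simulate`, and `linear_decay` are hypothetical and are not taken from the authors' code.

```python
from typing import Callable, List

State = List[float]


def euler_step(f: Callable[[State], State], x: State, dt: float) -> State:
    """One forward-Euler state update: x + dt * f(x)."""
    tendencies = f(x)
    return [xi + dt * fi for xi, fi in zip(x, tendencies)]


def simulate(f: Callable[[State], State], x0: State, dt: float,
             n_steps: int) -> List[State]:
    """Return the state sequence [x_0, x_1, ..., x_N] from repeated updates."""
    states = [list(x0)]
    for _ in range(n_steps):
        states.append(euler_step(f, states[-1], dt))
    return states


def linear_decay(x: State) -> State:
    """Toy tendency f(x) = -x, standing in for a real simulator's tendencies."""
    return [-xi for xi in x]


# Each step multiplies the state by (1 - dt), so x_n = 0.9**n * x_0 here.
trajectory = simulate(linear_decay, x0=[1.0, 2.0], dt=0.1, n_steps=3)
```

A higher-order scheme such as RK4 would replace only `euler_step`; the simulation loop stays the same.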

Problem Statement
Our task is to obtain simulator derivatives. Given an initial state $x_0$ and time delay $n\Delta$, we must calculate derivatives of the future state $x_n$ with respect to $x_0$, given by the input-output Jacobian matrix of the iterated state update $\mathcal{M}^n$:

$$J_{\mathcal{M}}(n, x_0) = \frac{\partial x_n}{\partial x_0} = \frac{\partial \mathcal{M}^n(x_0)}{\partial x_0}.$$

We assume at least one simulation is available, and the integration scheme and Δ are known. However, to facilitate flexible application to Earth science models without extensive analysis of governing equations, discretizations, or source code, we do not assume code or formulas are available for $f$, or that we can freely evaluate $f$ or $\mathcal{M}$ on new inputs (see Section 4). Thus, we must differentiate the unknown state-update function based on a fixed data set it has generated.
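As a worked illustration (a standard identity, not reproduced from the original equations), the Jacobian of the n-fold iterated update factorizes by the chain rule into a product of single-step Jacobians evaluated along the trajectory. The symbols $\mathcal{M}$ for the state update and $J_{\mathcal{M}}$ for its iterated Jacobian are notational assumptions here.

```latex
J_{\mathcal{M}}(n, x_0)
  = \frac{\partial x_n}{\partial x_0}
  = \left.\frac{\partial \mathcal{M}(x)}{\partial x}\right|_{x = x_{n-1}}
    \cdots
    \left.\frac{\partial \mathcal{M}(x)}{\partial x}\right|_{x = x_{1}}
    \left.\frac{\partial \mathcal{M}(x)}{\partial x}\right|_{x = x_{0}},
  \qquad x_{i+1} = \mathcal{M}(x_i).
```

Each factor is a K × K matrix, so the product depends on the whole trajectory $x_0, \ldots, x_{n-1}$ and not only on the initial state.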

Emulators
Our overall strategy is to estimate simulator Jacobians using emulator Jacobians. Essentially, a NN emulator learns to reproduce the simulator's dynamics through training on simulation data, and autodiff tools are used to provide the derivatives of emulator outputs with respect to the inputs.
For a simulator with a K-dimensional system state, our emulator is a NN with K inputs and outputs, trainable parameters ϕ, and input-output function $f_\phi$ that estimates the simulator tendencies $f$. While we have not assumed knowledge of $f(x)$ for any simulation state, we can plug our emulator into the simulator's integration scheme and compare the resulting state update $\mathcal{M}_\phi$ to simulations. For example, for forward Euler, $\mathcal{M}_\phi(x) = x + \Delta f_\phi(x)$. We train the emulator (details in Supporting Information Text S3) by minimizing an objective function $\mathcal{L}_{\mathrm{DYN}}(\phi)$ that compares emulated state updates with the simulated states $x_n^s$ at time step n in simulation s. After training $f_\phi$, we plug it into an integration scheme to obtain $\mathcal{M}_\phi$. We then use autodiff to minimize $\mathcal{L}_{\mathrm{DYN}}$, applying the chain rule in Equation 3 efficiently through autodiff backpropagation.

Model Architectures for Emulation
When we have no knowledge about the simulator beyond the above assumptions, fully connected deep NNs provide a generic, flexible option for $f_\phi$. However, partial knowledge about the tendency functions $f$ or the space of system states $x$ can often be easily obtained from the simulator's description or documentation without extensive analysis of its equations or code.
In the present work, our emulator architectures incorporate knowledge about three common properties in simulators of physical systems. First, when the elements of the system state are spatially structured by assigning each element a location on a regular spatial grid, we use convolutions to increase efficiency, accuracy, and scalability. Second, when the simulator tendencies exhibit local dependence, such that $f(x)_k$ depends only on the system state in a local neighborhood $U_k$ around $x_k$, we impose the same structure on the emulator. To do this, we limit the number of convolutions with kernel width greater than 1, so that the remaining convolutions operate independently at each grid point. Third, we explore the use of specialized activation functions that mimic mathematical operations used by the simulator (Figure S1). Full details of all networks are provided in Supporting Information Text S4, and code is available at github.com/m-dml/emulator_L96/.

Experiments
We trained emulators on simulation data, evaluated their accuracy, and tested them in downstream applications using the L96 simulator, a common benchmark model in Earth science.

Test Case: L96 Simulator
We generated simulations from the one-level Lorenz-96 simulator (Lorenz, 1996). This simulator exhibits wave-like patterns that interact nonlinearly while moving persistently clockwise through the spatially structured, 1-D, periodic system state (Figure 1a).

Figure 1. […] for both simulator and trained emulator and the initial state from (a). (c) Sensitivity analysis demonstrating emulator differentiability over multiple time steps. Partial derivatives $\partial x(1)_{20} / \partial x(t)_k$ of the system state at location 20 and time t = 1 au (red cross in (a)), with respect to all locations k and at previous times t. Differences (right) between simulator and emulator derivatives increase with the number of differentiated time steps.
Each element of the state's tendency is a function of the current state $x_k = [x(t)]_k$ at that location k, together with up to two neighboring state values in either direction:

$$\frac{dx_k}{dt} = f(x)_k = -x_{k-1}\,(x_{k-2} - x_{k+1}) - x_k + F, \qquad (5)$$

where F is a static parameter of the simulator and we use the periodic boundary conditions $x_{K+1} = x_1$, $x_0 = x_K$, and $x_{-1} = x_{K-1}$. We used a step-size of Δ = 0.05 in a fourth-order Runge-Kutta integration scheme, so that calculating the full state $x_{n+1}$ from $x_n$ requires four recursive applications of $f$. Except where otherwise noted, we used K = 40 and F = 8, which is known to result in chaotic dynamics with a leading Lyapunov exponent of $\lambda_1 \approx 1.67$.
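For concreteness, the simulator described here can be sketched in a few lines of plain Python. The sketch assumes the standard one-level Lorenz-96 form $f(x)_k = -x_{k-1}(x_{k-2} - x_{k+1}) - x_k + F$, which is consistent with the neighborhood $x_{k-2:k+1}$ stated in the text, and the classical RK4 scheme; the function names are hypothetical and this is not the authors' implementation.

```python
from typing import List


def l96_tendency(x: List[float], forcing: float = 8.0) -> List[float]:
    """One-level Lorenz-96 tendencies with periodic boundaries."""
    K = len(x)
    # Negative indices wrap x[k-1] and x[k-2]; the modulo wraps x[k+1].
    return [
        -x[k - 1] * (x[k - 2] - x[(k + 1) % K]) - x[k] + forcing
        for k in range(K)
    ]


def rk4_step(x: List[float], dt: float = 0.05,
             forcing: float = 8.0) -> List[float]:
    """One classical fourth-order Runge-Kutta update (four tendency calls)."""
    def shifted(base: List[float], slope: List[float],
                scale: float) -> List[float]:
        return [b + scale * s for b, s in zip(base, slope)]

    k1 = l96_tendency(x, forcing)
    k2 = l96_tendency(shifted(x, k1, 0.5 * dt), forcing)
    k3 = l96_tendency(shifted(x, k2, 0.5 * dt), forcing)
    k4 = l96_tendency(shifted(x, k3, dt), forcing)
    return [
        xi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for xi, a, b, c, d in zip(x, k1, k2, k3, k4)
    ]


# The uniform state x_k = F has zero tendency, so it is a fixed point.
fixed_point = rk4_step([8.0] * 40)
small_example = l96_tendency([1.0, 2.0, 3.0, 4.0])
```

The fixed point at $x_k = F$ is unstable for F = 8, so simulations are usually started from a slightly perturbed state.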
While the purpose of our emulation framework is to provide unavailable simulator derivatives, for testing and measuring accuracy we can obtain the Jacobians $J_{\mathcal{M}}(n, x_0)$ of the L96 simulator through hand-written routines (details in Supporting Information Text S2).
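To give a flavor of what a hand-written derivative routine involves, the sketch below differentiates only the L96 tendencies (not the full RK4 state update used in the paper) and checks the result against central finite differences. It assumes the standard one-level form $f(x)_k = -x_{k-1}(x_{k-2} - x_{k+1}) - x_k + F$ and K ≥ 4; all function names are hypothetical.

```python
import math
from typing import Callable, List


def l96_tendency(x: List[float], forcing: float = 8.0) -> List[float]:
    """One-level Lorenz-96 tendencies with periodic boundaries."""
    K = len(x)
    return [
        -x[k - 1] * (x[k - 2] - x[(k + 1) % K]) - x[k] + forcing
        for k in range(K)
    ]


def l96_tendency_jacobian(x: List[float]) -> List[List[float]]:
    """Hand-derived Jacobian with entry [k][j] = d f(x)_k / d x_j (K >= 4)."""
    K = len(x)
    jac = [[0.0] * K for _ in range(K)]
    for k in range(K):
        jac[k][(k - 2) % K] += -x[k - 1]
        jac[k][(k - 1) % K] += x[(k + 1) % K] - x[k - 2]
        jac[k][k] += -1.0
        jac[k][(k + 1) % K] += x[k - 1]
    return jac


def finite_difference_jacobian(f: Callable[[List[float]], List[float]],
                               x: List[float],
                               eps: float = 1e-5) -> List[List[float]]:
    """Central finite differences: one pair of evaluations per input."""
    K = len(x)
    jac = [[0.0] * K for _ in range(K)]
    for j in range(K):
        x_plus, x_minus = list(x), list(x)
        x_plus[j] += eps
        x_minus[j] -= eps
        f_plus, f_minus = f(x_plus), f(x_minus)
        for k in range(K):
            jac[k][j] = (f_plus[k] - f_minus[k]) / (2.0 * eps)
    return jac


x = [8.0 + math.sin(k) for k in range(8)]
analytic = l96_tendency_jacobian(x)
numeric = finite_difference_jacobian(l96_tendency, x)
max_abs_diff = max(
    abs(a - n)
    for row_a, row_n in zip(analytic, numeric)
    for a, n in zip(row_a, row_n)
)
```

The finite-difference version needs 2K tendency evaluations for a single Jacobian, which illustrates why this strategy scales poorly with state dimension.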

Emulation of L96 Dynamics
We trained emulators to estimate the L96 tendency function (Equation 5) for arbitrary system states $x$. In designing our emulator architectures, we sought to exploit high-level conceptual properties of the simulator (Section 2.4), analogous to those that can be easily identified in more complex and realistic simulators, without tailoring our network's components excessively to the simulator. First, we used periodic 1-D convolutions to capture the system states' spatial structure. Second, to capture the local dependence, we used 3 × 1 convolutions in the first two layers, with all subsequent layers consisting of 1 × 1 convolutions that operate independently across activation channels at each spatial location. We used eight-layer networks with ReLU activations (details in Supporting Information Text S4).
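The effect of stacking two width-3 periodic convolutions can be checked with a small sketch. It uses a single channel and all-ones weights purely for illustration (the actual networks are multi-channel and trained), and the function name `periodic_conv1d` is hypothetical. An impulse spreads to five grid points after two layers, so by symmetry each output location sees inputs at offsets -2 to +2, which covers the L96 neighborhood $x_{k-2:k+1}$.

```python
from typing import List


def periodic_conv1d(x: List[float], kernel: List[float]) -> List[float]:
    """Cross-correlate x with an odd-width kernel on a periodic 1-D grid."""
    K, half = len(x), len(kernel) // 2
    return [
        sum(w * x[(k + j - half) % K] for j, w in enumerate(kernel))
        for k in range(K)
    ]


ones = [1.0, 1.0, 1.0]

# Impulse in the interior: two width-3 layers spread it over 5 locations.
impulse = [0.0] * 10
impulse[5] = 1.0
after_one = periodic_conv1d(impulse, ones)
after_two = periodic_conv1d(after_one, ones)
support = [k for k, v in enumerate(after_two) if v != 0.0]

# Impulse at the boundary: the periodic wrap-around reaches the last index.
wrapped = periodic_conv1d([1.0, 0.0, 0.0, 0.0, 0.0, 0.0], ones)
```

Subsequent 1 × 1 layers act pointwise and therefore leave this receptive field unchanged.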
We also experimented with additional specializations of the emulator's connectivity structure and activation functions to more closely match the mathematical operations of Equation 5. These specializations strongly improved performance, reduced network depth and trainable parameter count, and decreased the amount of training data required (see Supporting Information Text S5, and also Bocquet et al., 2019), but we chose not to use them for our main results since such problem-tailored architectures might not be easy to identify for more complex and realistic simulators.

Accuracy of Emulations
After using L96 simulations to train the emulator, we plugged the emulator's estimate $f_\phi$ of the tendencies into the RK4 integration scheme to define a state update $\mathcal{M}_\phi$ (Bocquet et al., 2019; Wang & Lin, 1998). Starting with an initial state for a simulation not included as emulator training data (Figure 1a, top), we applied $\mathcal{M}_\phi$ iteratively to generate an emulation (or "rollout"), a new system state sequence resembling a simulation (Figure 1a, middle). Visual inspection of emulations showed a close match to the original simulation for about one Lyapunov time, beyond which the system states generated by the emulator continued to strongly resemble simulations in their amplitude, smoothness, continuity, and wave-like appearance, but with different wave positions and amplitudes. The eventual appearance of this mismatch is inevitable even when the emulator's errors approach machine precision, since L96 is a chaotic system. Nonetheless, emulations were stable for over $10^5$ time steps and resulted in the same distribution of system state values as the original L96 simulation (see Figure S3).
We further quantified these errors over multiple emulations as a function of integration time (Figure 2a) for networks trained on different data set sizes (N = 1,200, 4,000, or 12,000 time steps). Initial errors were much smaller than typical state values but grew over about 3 Lyapunov times until the emulation no longer matched the simulation better than a randomly chosen system state. Emulator errors on both rollouts and single state updates decreased as a function of training data set size (Figure 2b).

Accuracy of Emulator Derivatives
We next considered the task originally motivating emulator development: the estimation of simulator derivatives. We used the autodiff capabilities of NNs to differentiate emulator outputs with respect to their inputs and used the results as estimates of simulator derivatives. Because we have chosen a simple test case where simulator derivatives can be easily implemented by hand, we were able to compare emulator derivatives directly to their target values. To evaluate the accuracy of emulator-derived derivatives across longer simulations, we calculated simulator and emulator derivatives of the system state at one location and time with respect to all locations up to 20 time steps in the past. The spatial and temporal structure of this dependence of future on past states was visually similar when comparing emulator and simulator derivatives (Figure 1c). As expected from the chaotic nature of L96, the size of errors relative to the derivatives grew with increasing time in the past. Due to the limited-range local dependencies built into our emulators $\mathcal{M}_\phi$, the size of the region with nonzero derivatives grew linearly over time into the past.

Training on Partial System States
A considerable challenge in training NNs on Earth science simulations is posed by the sheer size of their system states. For example, a global latitude/longitude grid at 0.25° spacing with 100 vertical levels contains over $10^8$ locations, each of which can store a dozen or more physical quantities. In such cases, training on complete system states poses major difficulties for machine learning frameworks and the computing hardware that supports them (Kurth et al., 2018). However, this limitation can be overcome by training the emulator on partial system states. We take advantage of the fact that L96 dynamics exhibit local dependence, a property shared by many other simulators. For example, many Earth system models combine location-specific physical effects coupled to fluid dynamics (Gross et al., 2018). When these dynamics are integrated explicitly with discretization stencils for spatial derivatives, updates exhibit local dependence (Zängl et al., 2015).
NONNENMACHER AND GREENBERG 10.1029/2021MS002554

In the specific case of L96, the time derivative $f(x)_k$ of the kth location in the L96 system state depends only on $x_{k-2:k+1}$, a neighborhood of four values, while for a fourth-order Runge-Kutta solver the state update $\mathcal{M}(x)_k$ depends on 13 values (i.e., $x_{k-8:k+4}$). By imposing the same local dependence on the emulator computing $f_\phi(x)$ through convolutional network layers, the problem of learning a function with K inputs and K outputs reduces to the much easier problem of learning a function with 13 inputs and 1 output. We investigate how to exploit this for emulator training on partial system states in Supporting Information Text S6.
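The 13-value neighborhood can be verified numerically. The sketch below perturbs each input of one RK4 step in turn and records which perturbations change the output at location k. It assumes the standard one-level Lorenz-96 tendencies and the classical RK4 scheme with Δ = 0.05; the test state and function names are hypothetical choices for illustration.

```python
import math
from typing import List


def l96_tendency(x: List[float], forcing: float = 8.0) -> List[float]:
    """One-level Lorenz-96 tendencies with periodic boundaries."""
    K = len(x)
    return [
        -x[k - 1] * (x[k - 2] - x[(k + 1) % K]) - x[k] + forcing
        for k in range(K)
    ]


def rk4_step(x: List[float], dt: float = 0.05) -> List[float]:
    """One classical fourth-order Runge-Kutta update."""
    def shifted(slope: List[float], scale: float) -> List[float]:
        return [b + scale * s for b, s in zip(x, slope)]

    k1 = l96_tendency(x)
    k2 = l96_tendency(shifted(k1, 0.5 * dt))
    k3 = l96_tendency(shifted(k2, 0.5 * dt))
    k4 = l96_tendency(shifted(k3, dt))
    return [
        xi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for xi, a, b, c, d in zip(x, k1, k2, k3, k4)
    ]


def dependence_offsets(k: int, K: int = 40, eps: float = 1e-3) -> List[int]:
    """Offsets j - k of all inputs x_j that influence output k of one step."""
    x = [8.0 + math.sin(i) for i in range(K)]  # arbitrary non-uniform state
    reference = rk4_step(x)[k]
    offsets = []
    for j in range(K):
        perturbed = list(x)
        perturbed[j] += eps
        if rk4_step(perturbed)[k] != reference:
            # Map the periodic index difference into the range [-K/2, K/2).
            offsets.append((j - k + K // 2) % K - K // 2)
    return sorted(offsets)


offsets = dependence_offsets(k=20)
```

Each of the four tendency evaluations widens the stencil by two locations on one side and one on the other, giving offsets from -8 to +4.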

4D-Var Data Assimilation
To further test the accuracy and usefulness of emulator-derived derivatives, we applied them to data assimilation (Apte et al., 2008), aiming to identify the L96 system state sequence most consistent with noisy and incomplete observations. Observations $y_n$ consist of 10 randomly selected locations from each $x_n$ with Gaussian noise (Figure 3a). We carried out data assimilation using the strong-constraint 4D-Var algorithm. Making use of the fact that all future states depend deterministically on $x_0$, gradient-based numerical optimization is used to reduce prediction error for each $y_n$ while regularizing with a prior distribution (see Supporting Information Text S8). Concretely, we seek to minimize the loss $\mathcal{L}_{\mathrm{DA}}(x_0)$ (Equation 6), where observation error covariance matrices $R_n$, background error covariance matrix $B$, and the background state $x_f$ are known, and $\mathcal{M}^n$ is known with available derivatives.
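The structure of a strong-constraint 4D-Var objective can be sketched as follows. This is the textbook form under simplifying assumptions that are not taken from the paper: diagonal covariances $R_n = \sigma_{obs}^2 I$ and $B = \sigma_b^2 I$, observations given as directly observed grid indices, and a factor of 1/2 on each term. The function `fourdvar_cost`, the observation layout, and `toy_step` are hypothetical, and the sketch only evaluates the cost; the gradient with respect to $x_0$ would come from autodiff or a hand-written routine.

```python
from typing import Callable, List, Sequence, Tuple

State = List[float]
# (time step n, observed grid indices, observed values): hypothetical layout
Observation = Tuple[int, Sequence[int], Sequence[float]]


def fourdvar_cost(x0: State,
                  step: Callable[[State], State],
                  observations: Sequence[Observation],
                  x_background: State,
                  obs_var: float,
                  background_var: float) -> float:
    """Strong-constraint 4D-Var cost with diagonal R_n and B (assumed)."""
    # Background (prior) term: penalizes deviation of x0 from x_background.
    cost = 0.5 * sum(
        (a - b) ** 2 for a, b in zip(x0, x_background)
    ) / background_var
    x, n_current = list(x0), 0
    for n, indices, values in sorted(observations, key=lambda obs: obs[0]):
        while n_current < n:  # roll the model forward to the observation time
            x = step(x)
            n_current += 1
        # Observation term: misfit at the observed locations only.
        cost += 0.5 * sum(
            (x[i] - y) ** 2 for i, y in zip(indices, values)
        ) / obs_var
    return cost


def toy_step(x: State) -> State:
    """Toy state update (forward Euler for f(x) = -x with step 0.1)."""
    return [0.9 * xi for xi in x]


truth = [1.0, 2.0, 3.0]
x1 = toy_step(truth)
x2 = toy_step(x1)
obs = [(1, [0, 2], [x1[0], x1[2]]), (2, [1], [x2[1]])]
cost_truth = fourdvar_cost(truth, toy_step, obs, truth, 1.0, 1.0)
cost_wrong = fourdvar_cost([1.5, 2.0, 3.0], toy_step, obs, truth, 1.0, 1.0)
```

Because every model state is a deterministic function of `x0`, the cost depends on `x0` alone, which is what makes gradient-based minimization over the initial state possible.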
Several approaches to minimizing the loss (Equation 6) […] is unavailable for many important simulators. We therefore investigated the utility of emulation-based 4D-Var, replacing $\mathcal{M}$ by a trained emulator $\mathcal{M}_\phi$ and calculating $\nabla \mathcal{L}_{\mathrm{DA}}$ in an autodiff environment. For L96, for which true simulator derivatives are available, we were able to directly compare emulator versus simulator derivatives using the same data assimilation task, algorithm, and data. Our emulators target $f$ instead of $\mathcal{M}$, and hence they are not tied to a specific step length Δ, so we reused an emulator previously described in Section 3.2 and trained on 1,200 time steps with Δ = 0.05.
4D-Var data assimilation recovered system state trajectories visually resembling true system states (Figure 3a) throughout the period of available observations, both when using state updates and derivatives from the true simulator (Figure 3c) and when using those from the trained emulator (Figure 3d). Errors within the observation window decreased as a function of window length (Figure 3f) and increased as expected when forecasting further into the future beyond the last observation (Figure 3g). Remarkably, we observed no difference whatsoever in forecast quality when using the emulator and its derivatives instead of the true simulator, for observation windows up to 72Δ and consecutive forecast windows up to 80Δ = 1 au (Figures 3f and 3g). One arbitrary time unit corresponds to roughly 5 days of "weather" (Lorenz, 1996). These results demonstrate that emulation can provide missing derivatives for 4D-Var data assimilation without degrading forecast accuracy and that emulators learned for one value of Δ can be successfully applied with another.

Parametrization Learning
We further tested our emulators and their derivatives in a parametrization learning task. In Earth science, a critical and ubiquitous question is how we can account for physical effects at spatial scales below the grid spacing of our system state. For example, how can we account for convective processes involving moisture and temperature variables z at spatial scales below 100 m, when computational limits impose a 100 km spacing on our atmospheric system state? A parametrization adds a corrective term with free parameters ψ to the coarse-scale tendencies $f(x)$ to mimic the effects of coupling to fine-scale variables, so that the corrected tendencies approximate $g(x,z)_x$, the tendency of the coarse variables in the full coupled model. The corrective term is typically chosen to have local structure, meaning that for every location k, its value depends only on $x_k$.
In parametrization learning, we seek the ψ for which Equation 8 […]

Figure 3. (a) […] In terms of predictability, 1 time unit corresponds to roughly 5 days of "weather." (b) Twenty-five percent of each system state is randomly observed with additive Gaussian noise. We aim to determine the state trajectory from (a) within the integration window (analysis, orange) and beyond the last measurement (forecast, purple). (c) 4D-Var reconstruction of system states from the observations in (b), using 4D-Var with the original simulator and a hand-implemented gradient calculation routine (top), with reconstruction error over time (bottom). (d) As in (c), but using state update and derivatives from an emulator trained on a separate simulation data set. (e) 4D-Var reconstruction of the initial state $x(t_0)$ from the example in (a) and (b) using the emulator. (f) Mean ± 1 SD of RMSEs for $x_0$ as a function of integration window lengths. (g) Mean ± 1 SD of forecast RMSEs as a function of time beyond the integration window, with window length 0.8. RMSE, root-mean-square error.

[…] To address this, the parametrized model can be numerically integrated during learning to define a state update for the resulting dynamics. This "solver-in-the-loop" mode accounts for coupling effects and requires only coarse variables for training, halving storage requirements. However, minimizing $\mathcal{L}_{\mathrm{PAR}}$ over multiple time steps (or one Runge-Kutta step) requires simulator derivatives.
To test emulator derivatives in a solver-in-the-loop parametrization learning task, we used a two-level L96 model with coarse and fine variables (Lorenz, 1996) (Equation 10), with Δ = 0.01, K = 36, J = 10, F = b = c = 10, h = 1, and […]. This model has been previously used as a test for parametrization learning and is simple enough that parametrizations trained offline provide a numerically stable, reasonably accurate baseline in coupled simulations (Crommelin & Vanden-Eijnden, 2008; Orrell, 2003; Pawar & San, 2020; Rasp, 2020). Parametrizing out z in Equation 10, we have […]. To learn an L96 parametrization, we first trained an emulator $f_\phi$ on 1,200 time steps of a coarse-only L96 model (Equation 5), fixed ϕ, and substituted $f_\phi$ for $f$ in Equations 8 and 9. We then minimized $\mathcal{L}_{\mathrm{PAR}}(\psi)$ with $r_{\max} = 10$ using autodiff on 500 time steps of fine-scale simulations as training data, with 20% held out for validation. For evaluation, we coupled the trained parametrization to the true dynamical model (Equation 12).
This "solver-in-the-loop"-trained L96 parametrization closely matched the coarse-scale variables of a two-level L96 simulation (Figure 4a). The learned parametrization also closely resembled coarse two-level L96 variables in terms of root-mean-square error (Figure 4b) and power spectral density (Figure 4c). For L96, offline- and solver-in-the-loop-trained parametrizations were strikingly similar (Figure 4d) despite different losses and training data. We emphasize that our goal was not to improve existing learned parametrizations for L96, which is sufficiently simple that offline training is adequate (Rasp, 2020). Rather, these results demonstrate that emulator derivatives allow solver-in-the-loop parametrization training without compromising accuracy.

Discussion
We trained differentiable emulators on full or partial states of the simple but chaotic L96 system and applied them for data assimilation and parametrization learning. Emulation extends gradient-based reasoning and analysis to important simulators lacking derivatives. By training on simulator outputs alone, we avoid painstaking analysis of formulas or simulation code, which can be complex and idiosyncratic for simulators of weather, climate, and other important Earth system phenomena.

Related Work
Learning and correcting dynamics. Our results build on a growing literature describing networks that learn system dynamics (Grzeszczuk et al., 1998; Sanchez-Gonzalez et al., 2020), arising from ordinary differential equations (Chen et al., 2019; Fablet et al., 2018) and PDEs (Long et al., 2018; Rudy et al., 2019), some of which have applied numerical integration schemes during training (Wang & Lin, 1998). Several studies have used ML to learn parametrizations for L96 (Dueben & Bauer, 2018; Gagne et al., 2020; Watson, 2019) and other Earth science models but have either used offline training or employed Ensemble Kalman filtering and related approaches that do not require derivatives (Brajard et al., 2021; Pawar & San, 2020; Rasp, 2020).
A related line of research expands the task of parametrization learning to include the full dynamical model and learns updates directly from noisy and incomplete observations without a simulator, effectively combining emulator learning and data assimilation into a single task (Bocquet, 2012; Bocquet et al., 2020; Brajard et al., 2020). Long et al. (2018) approach this task by learning a PDE together with discretized differential operators from data. Fablet et al. (2020) learn dynamics while solving a weak-constraint 4D-Var problem, where minor violations of the learned dynamics are allowed, and train a second network to solve the resulting optimization problem. Farchi et al. (2020) learn an ML-based correction to an existing simulator based on noisy observations but rely on available simulator derivatives. Like these studies and most Earth science simulators, we used time discretization, but network outputs can also be integrated in continuous time, resulting in a neural ordinary differential equation (Chen et al., 2019).
While building on these previous studies, our work is to our knowledge the first to design and evaluate emulators as derivative estimators, train them on partial system states, or systematically measure the accuracy and performance of derivatives in downstream tasks. However, many of these approaches can be combined with ours.

Figure 4. […] in comparison to a reference trained "offline" (see text). Blue dots show a subset of training data for offline parametrization training, which requires access to the unresolved fine-scale variables. NN, neural network; RMSE, root-mean-square error.
Unsupervised methods. At the opposite end of the data versus physics spectrum, a new class of unsupervised methods train networks to solve PDEs by constructing an objective function directly from symbolic equations, without requiring any simulations for training (Raissi et al., 2019; Sirignano & Spiliopoulos, 2018). These approaches use autodiff to calculate spatial and temporal derivatives and avoid discretization in space or time but cannot change initial conditions after training. Wandel et al. (2020) instead construct an objective function from discretized PDEs and can generalize to new initial conditions. These methods address a fundamentally different task, require careful attention to model equations, and must be validated by numerical integration.

Future Outlook
While this and other studies successfully trained emulators on simple models such as L96, it is less clear whether they can be scaled up to more complex models, larger grids, additional spatial dimensions, and more interacting variables. Key challenges include interactions across multiple space and time scales, strongly nonlinear effects, and the coupling of fluid dynamics to in-place physics. While emulation learning remains an open challenge for more complex models, a number of strategies may prove useful in these scaling up efforts.
We found that domain-specific architectural features such as spatial structure and locality help reduce the size of the function space in which we are searching for an optimal emulator. Learning tendencies within a higher-order integration scheme (Fablet et al., 2018; Wang & Lin, 1998) provides stronger locality constraints than learning state updates and reduces numerical instability (Scher & Messori, 2019). Location-dependent effects (e.g., Coriolis force) could also be captured by providing the grid coordinates of each location as input channels.
A concern when applying trained emulators with unfamiliar initial conditions or optimizing $\mathcal{L}_{\mathrm{DA}}$ in Equation 6 is that the input to the network may not resemble the training data, possibly degrading performance. Proposed solutions include penalizing deviations from a desired region of the space of system states (Ren et al., 2020) and adding noise to inputs during training (Sanchez-Gonzalez et al., 2020). For data assimilation, we found the regularization provided by the prior $p(x_0 | x_f)$ to mitigate this problem.
Weather and climate models typically have modular structure with multiple interacting components, but optimizing a specific physical parametrization would also require derivatives for the dynamical core and other parametrizations. One solution to this problem could be building modular emulators that mimic simulator subroutines for local dynamics and in-place physics. In these cases, it might be possible to train some parts of the emulator using simulator runs with certain physical parametrizations turned off.

Data Availability Statement
Code, simulated data, and results of our study can be found at Zenodo (https://doi.org/10.5281/zenodo.4638267). Simulation data for Lorenz-96 were generated with code from https://github.com/m-dml/L96sim. Code for emulator definition and training, as well as all numerical experiments, is also found at https://github.com/m-dml/emulator_L96.

Acknowledgments
M. Nonnenmacher and D. Greenberg were supported by the Helmholtz AI initiative. We thank Johanna Baehr, Bedartha Goswami, and Vadim Zinchenko for comments on the manuscript. Open access funding enabled and organized by Projekt DEAL.