An efficient and robust method for parameterized nonintrusive reduced‐order modeling

A method of constructing parameterized nonintrusive reduced‐order models (NIROMs) is given. The approach relies on a geometrical interpretation of NIROM, requires only a single layer of interpolation to be applied for both system state and parametric dependence of the model and is applicable to systems characterized by any number of parameters spanning arbitrary orders of magnitude. The method is applied to three representative test problems and evaluated in terms of accuracy and speed, showing good performance.

is fully data-driven and therefore independent of any physical model source code. Additionally, the method naturally avoids the issues of instability and linear inefficiency of classical reduced order models. Up to date applications of NIROM include modeling of the Navier-Stokes equations, 1-3 blasting, 4 fluid-structure interactions, 5 shallow water equations, 6 and different aspects of petroleum reservoir modeling. [7][8][9][10] In principle, any multidimensional interpolation method allowing for irregular sample structure can be employed for NIROM construction. Examples of such include the radial basis function (RBF) or polyharmonic spline (PS) methods, least-squares fits, Kriging, and artificial neural networks (ANNs). Most recently, particular emphasis has been put on the latter, namely machine learning-based interpolation approaches. [11][12][13][14] While ANNs are shown to be versatile, their structure is typically a black box, meaning that little to no information can be extracted from the interpolant itself. Moreover, the interpolation process relies on a large number of data samples which are not guaranteed to be reproduced exactly. The same problem of nonexact input reproduction arises if least-squares fit methods are employed, practically implying that the resulting NIROM is not able to reproduce the simulation data it was trained on. On the other hand, Kriging reproduces the data samples exactly, but typically uses a single (Gaussian) radial function and often requires a costly optimization problem to be solved for the optimal data fit. 15 Due to the reasons listed above we choose to investigate the much simpler, fully explicit and computationally cheaper techniques, namely RBF and PS. Their benefits include exact reproduction of the input data, a large degree of freedom via the choice of the RBF and a relatively very low interpolant construction and evaluation cost.
Applicability of NIROM to parameterized systems in which the dynamics is governed by a collection of fixed properties varying between realizations has only been studied very recently. Up to authors' best knowledge the only fairly general method of constructing parameterized RBF/PS-based NIROMs has been proposed by Xiao et al 7 in the context of porous media flow. However, the method was not analyzed in depth and applied to a single test problem with limited performance measures used. This manuscript proposes a more robust, consistent, and general approach together with its geometrical interpretation and more extensive performance evaluation. The design is motivated by first examining the two-stage nature of the method used in Reference 7 via geometrical representation of the two-stage interpolation process. Subsequently, a severe issue of limited communication between data samples is exposed and addressed. Additionally, a mechanism allowing for accurately modeling parameters with values spanning multiple orders of magnitude is also provided.
The rest of this work is organized as follows. In Section 2 the generic framework for parameterized NIROM is given together with its geometrical interpretation. In Section 3.1 the RBF-based model construction approach given previously by Xiao et al is presented and its limitations, error sources and inconsistencies are discussed. Subsequently, in Section 3.2 a new method addressing those issues explicitly is introduced. The method relies on an automatic variable rescaling procedure and enables a single interpolation process to be sufficient for capturing both system state and parameterization influence on the dynamics. In Section 4 the new method is applied to three parameterized test problems and evaluated with respect to accuracy and speed. Discussion and conclusions follow in Sections 5 and 6, respectively.

NIROM OVERVIEW
NIROM can be understood as a proxy for a high-fidelity forward numerical model (simulator). Generally, the role of any simulator is to take a discretized state of the system modeled and propagate it forward in time by a time step Δt. A discretized state of the system at time t can be represented as a high-dimensional vector, here denoted u(t), with the number of its entries being equal to the number of mesh nodes/grid cells. A series of such dicretized states (referred to as data snapshots) separate in time by a fixed time interval Δt is denoted as U where A forward numerical model can therefore be understood as a function F satisfying where p encodes the time-invariant parameters of the system such as fluid/domain properties (viscosity, density, diffusivity, etc.).
Intuitively, Equation (2) says that given the current system state u(t), the time step Δt and the system parameters p the simulator is capable of predicting the future state of the system at time t + Δt. In our notation the problem is expressed in an explicit fashion, but in practice this is hardly ever the case. In numerical simulation evaluating u(t + Δt) is often formulated in an implicit way and associated with solving a large sparse linear system. Relevant examples include discrete approximations of solutions to elliptic equations for pressure in Navier-Stokes or Darcy flow simulations. As a consequence, (2) is a high-level symbolic representation that is true for some F encapsulating all components of a numerical simulator. The main objective of NIROM is constructing a computationally cheap explicit proxy for F just from a set of simulation data. Low cost and speed of the method are achieved by employing a dimensionality reduction process and a cheap interpolation method, as discussed below.
First, the high-dimensional state of the system u(t) is represented in an approximate low-dimensional form (t). The mapping is generally arbitrary, as long as dim( (t)) ≪ dim(u(t)) and the reverse mapping can be provided. In practice, probably the most common choice of such dimensionality reduction is the proper orthogonal decomposition (POD), which is known to be the optimal reduction process with respect to the L 2 norm. In our notation POD can be briefly summarized as follows. Given a snapshot matrix U we first construct the mean data snapshotūū and the mean-shifted data arrayŨŨ giving rise to the so-called kernel matrix K where N is the dimensionality (length) of each data snapshot. Next, since K is square and positive-semidefinite it has an associated eigenvector basis and without loss of generality it is assumed that 2 j are given in descending order. Subsequently, the reduced dimension d (typically with d ≪ N) is defined as the smallest positive integer satisfying where ≪ 1 is a small user-specified tolerance. After d is identified, the transformation matrix V is defined as and the mapping (3) is given by while the inverse mapping is given by We point out that applying the series of transformations u(t) → (t) → u(t) (ie, compression followed by decompression) will produce a data snapshot different from the original u(t) if it is not spanned by the columns of V. However, in most practical applications the discrepancy is minimal and can be minimized by decreasing the threshold parameter . With the above formulation POD has been applied in a variety of applications in NIROM and other contexts; for more details the reader is kindly referred to Reference 16.
A low-dimensional representation (t) gives rise to a new simulator-defining functionF operating in a lower number of dimensions and satisfying (t + Δt) =F( (t); Δt; p). (13) SinceF takes inputs that are significantly lower dimensional than those of F the former is generally much cheaper and therefore faster to evaluate, but may still rely on a fairly expensive problem such as solution of a dense linear system. In order to address this, NIROM aims at locally approximatingF in a cheap and fully explicit fashion. This is achieved by first providing a collection of simulation snapshots U and reducing them to (0), (Δt), (2Δt), … ; assuming that (13) holds, we write ((n + 1)Δt) =F( (nΔt); Δt; p). (14) In the above the time step Δt is assumed to be constant and may be dropped for simplicity of notation. Since the samples of the reduced system state variables are extracted from U (known), the only unknown in the above equation iŝ F itself which is approximated via multidimensional interpolation methods such as the RBF. The interpolant is fully explicit and should be computationally much cheaper than the function it is approximating. It is clear that the NIROM approach is beneficial only if the computational cost of building and running the proxy is much smaller than the cost of running the high-fidelity model. In practical applications this is almost universally guaranteed, especially for high grid resolutions. It should be pointed out thatF approximated in this fashion can in principle be used to propagate system states that are different that the ones used for interpolation (training), assuming that the surrogate is sufficiently accurate.

PARAMETERIZED RBF-BASED NIROM
In this section we discuss the previous approach to parameterized RBF-based NIROM and point out its limitations by interpreting the construction procedure geometrically. Subsequently, a new method mitigating the aforementioned limitations is proposed.

Previous approach
The RBF-based parameterized NIROM construction procedure was first introduced by Xiao et al 7 in order to study a parameterized porous media flow problem. Generally, NIROM can be understood as a process of constructing a surrogate function f (⋅;⋅), which we refer to as the propagator, satisfying where n is a representation of the state of the system at time nΔt, Δt is a fixed time step for the model specified a priori and p is a vector of parameters characterizing the system. Typically, a low-dimensional representation of the system state is constructed via POD. Clearly for the problem to be truly parameterized we require p to be allowed to vary, otherwise the dependence on p can be dropped. F I G U R E 1 Visualization of the two-stage interpolation process. A collection of function samples (dots) representing the training data snapshots are given at a number of fixed parameter levels p (a). In the first step interpolation is performed separately at each level of p (b), so that f is fully defined at each constant parameterization. Practically, this step constructs a separate nonparametric nonintrusive reduced-order model, characterized by f s , for each individual dynamical system with a fixed sampled parameterization p s and satisfying f s (⋅) = f (⋅,p s ). In the second step (c) the interpolation of f s ( ) with respect to p can be performed at any fixed value of . The latter has to be supplied a priori and is treated as a constant at this stage [Colour figure can be viewed at wileyonlinelibrary.com] It is clear that for a parameterized f (⋅,⋅) multiple training simulations characterized by different parameterizations must be provided. We denote the number of such simulations to be N s with the corresponding compatibility condition where n s is the system state in training run s at time instance nΔt and p s is the parameterization associated with simulation s. In 7 the process of constructing the propagator is split into two steps via the RBF interpolation method 2. For a generic system state to be propagated and its parameterization p first evaluate for all s. Afterward, construct (via the RBF method) the function g(⋅) satisfying where p s is the parameterization corresponding to f s . Finally evaluate which is the estimate of the system state at the next time instance The collection of surrogates {f s } N s s=1 can be constructed a priori and stored as a collection of RBF coefficients. For a generic system state , N s values of f s ( ) are cheaply evaluated first (18) and their values are stored. Next, a new surrogate dependent on parameterization only is constructed locally (for this specific ) from {(p s , f s ( ))} N s s=1 and evaluated at p. Intuitively, the method just described separates the interpolation process into two processes applicable to two disjoint subsets of the total number of dimensions and requires a parameter-dependent interpolant to be constructed and evaluated for each system state modeled. A schematic representation of this procedure is given in Figure 1.  (1,2) and sensitivities of the interpolant to p (1,2) the radial basis function method may be severely ill-conditioned. A more uniform sample structure (B) is desirable, with the ideal case being a regular square grid of samples It can be observed that the method discussed suffers from a severe problem of miscommunication in the parametric space, precisely due to two split interpolation processes. In order to illustrate this, consider a toy problem where ∈ R 2 , p ∈ R 1 as in Figure 1. Let three uniformly spaced values of p, p 1,2,3 , be given in an ascending order, and let f 1,2,3 (⋅;⋅) be sampled at positions 1,2,3 with 1 = 3 , and || 2 − 1 || ≫ ||p 2 − p 1 || (Figure 2A). It is expected that f 2 ( 1 ) = f ( 1 ; p 2 ) will be severely inaccurate if no samples of it are provided in the vicinity of 1 at the parameterization level p 2 . However, this inaccuracy can in principle be lowered significantly if the information from the adjacent layers 1 and 3 could influence the interpolant at layer 2. The method discussed does not allow for the information to propagate across layers in this fashion. Moreover, if the error at position 1 in layer 2 is large, the second-stage interpolation across layers may be close to non-smooth, which can further lower the accuracy at other adjacent layers ( Figure 2B).
We point out that that the unified notation ||⋅|| is used throughout this work for the Euclidean norm in arbitrary number of dimensions. Intuitively, the inequality || 2 − 1 || ≫ ||p 2 − p 1 || indicates that the Euclidean distance between the two reduced representations of system states is much greater than the separation between the (hyper)planes in which evolves corresponding to different parameterizations p.
We also observe that a naive RBF approach applied to the parameterization vector p is another source of inconsistencies and potential errors. We demonstrate this by considering a two-dimensional parameter vector p = [p (1) ,p (2) ] T with parameter ranges where p (1,2) may represent two physical parameters spanning different orders of magnitude. It follows that ||p || ≈|p (1) |, and since the RBF method relies on pairwise distances between datapoints this implies that the influence of p (2) on the interpolant is negligible. However, if the system is sensitive to the values of p (2) then the resulting interpolant will necessarily fail to capture this dependency and the interpolation problem becomes severely ill-conditioned ( Figure 3).

New approach
Our approach mitigates the two issues discussed in the previous section, namely the problems of miscommunication and non-uniform data scaling. This is achieved by renormalizing the parameterization p and the samples n s in an appropriate fashion and allowing any data sample n s to influence the value of f (⋅;⋅) at any position in the ( ; p) space. First, instead of employing the ansatz (15), we modify it to be which is accurate up to zeroth order in the time interval Δt. Second, the normalization is applied to the system states . We write A s to be the array of discretized system states where s = 1, … ,N s , N s is the number of training runs provided and N s t is the number of system state snapshots in training run s. For each s we compute and then evaluate It is clear that Δ is the mean magnitude of f (⋅;⋅) defined as in (22). All system state data is then renormalized via and modeled in such nondimensional form with the typical value of f (⋅;⋅) being normalized to unity. Next, the parameterization p is normalized. The parameter vector p is split into a number P of separate, and potentially multidimensional, parameters An appropriate splitting has either P = dim(p) with each {p (i) } P i=1 being one-dimensional or separates quantities with different physical meaning (eg, density, viscosity). In order to illustrate the two alternatives consider the parameterization where g x,y,z are three gravity components, is the fluid viscosity and is the fluid density. The first (trivial) type of splitting separates p into five independent parameters. The second type of splitting is and takes into account the fact that g = [g x ,g y ,g z ] can be treated as a single vector parameter with all entries representing the same physical quantity.
For i = 1, … ,P each set {p (i) s } N s s=1 is rescaled separately. This is achieved by first computing the mean separation Δ (i) p between the nearest neighbors in each set as follows and then renormalizing Finally, for the interpolation process we define the notion of distance d(⋅;⋅) between two normalized state-parameter pairs as the above definition is used in conjunction with the RBF interpolation method and applied to the state-parameter pairs extracted from the collection of training data. The resulting structure of the data samples is schematically visualized in Figure 3B. We point out that the rescaling factors Δ , Δ (i) p can be premultiplied by an additional factor of order O(1) before the normalization process. Such treatment will emphasize the influence (importance) of the corresponding variable/parameter if the factor is >1 and make the problem less sensitive to the corresponding variable for factors <1.
The new approach can be therefore summarized as follows from all available combinations (s,n) by employing the joint state-parameter notion of distance (32) in the RBF method 3. Given a generic state-parameterization pair ( , p) appropriately renormalize the inputs by Δ , Δ (i) p and propagate one step forward in time by employing the RBF/PS surrogate with the notion of distance (32) We point out that our method is conceptually much simpler than its predecessor since it relies on just a single (universal) surrogate being evaluated once per simulation time step. The model is fully constructed a priori and no local surrogates need to be constructed or evaluated, which leads to a much lower computational cost and a significant speedup, as will be demonstrated later.

RESULTS
In this section the method proposed is tested and evaluated on three test problems. A comparison with the previous method is also provided. The generic workflow for each test problem can be summarized as follows 1. Set the simulation time T and the time step Δt (output time step for the high-fidelity model and the forward time step for NIROM). Set the ranges of all model parameters and define the box parametric domain. 2. Specify a number of points in the parametric domain corresponding to an ensemble of parameterizations for the system modeled. In this work the regular (uniform) grid sampling is used. 3. For each parameter combination specified in the previous step generate a number of high-fidelity snapshot data arrays (training runs) with potentially different initial conditions. 4. Generate a NIROM from the training runs via the two different methods discussed. In this work POD with = 10 −3 (9) was employed for the reduction process.

One-dimensional advection-diffusion
First, for validation purposes, we model the one-dimensional advection-diffusion equation on [0, 1] with two Dirichlet boundary conditions where U, are the (constant) advection velocity and diffusivity, respectively. The time interval Δt between two consecutive snapshots was set to 1/60 and the RBF used for interpolation was (r) = r 3  At each point in the (U, ) parametric space the accuracy is estimated as the time-averaged relative L 2 error averaged over 20 random initial conditions. More precisely, given a parameter combination [U 0 , 0 ] we first generate 20 appropriate random initial conditions and project them onto the reduced space to generate their 20 reduced representations (0). Each full-resolution initial state is then propagated via the high-fidelity model while each reduced initial condition is propagated via NIROM and backprojected onto the full resolution grid. For each simulation the time-averaged relative L 2 error is then computed, and averaged over 20 realizations. A representative example of the high-fidelity and NIROM solutions is given in Figure 4, and the relative error map in the (U, ) space is shown in Figure 5.

Two-dimensional porous media flow
Next, the method was applied to two porous media flow problems. The governing equations for the underlying dynamics are where i = 1, … ,M is the phase index, q is the fluid flux, g is the gravity vector, , , k are fluid density, viscosity, and permeability, respectively, Q is the source term and is the rock porosity. In this work M = 2 immiscible phases (eg, water and oil) were modeled. It is assumed that both phases are incompressible (hence no equation of state is required) and that capillary effects are negligible leading to p being the same in all phases (p 1 = p 2 ). Both of these assumptions make the model simpler and therefore computationally cheaper without affecting the objective of this study. The fluid permeability k is a function of phase saturation according to Corey's model with all critical and residual saturation values set to 0.0 and all Corey exponents set to 2.0. Each porous media flow problem is modeled on a rectangular domain with no flow domain boundaries. Additionally, the nondimensional notion of time (pore volume injected, P/V) was used, so that the volumetric flux Q = 1 corresponds to the amount of fluid injected over unit time is equal to the pore volume of the computational domain. Equations (35) were solved by an in-house finite-difference implicit pressure explicit saturation porous media flow simulator 17 with first-and second-order accuracies in time and space, respectively. The P/V time step Δt = 0.02 is enforced to be constant between each pair of consecutive data snapshots and is employed by both high-fidelity model output and NIROM.

Asymmetric four-spot injection
The first case modeled was a two-dimensional system with a single injector and three producers ( . We point out that arbitrary (in particular nondimensional) units of length can be used due to the nondimensional notion of time. The mesh resolution was set to 50 × 50. Initially the system is fully saturated with the first phase (oil), hence the initial condition is common across all simulations of interest. The second phase (water) is injected at a constant volumetric rate Q = 1 and the pressure boundary condition P = 0 is specified at each producer. Three parameters were varied in this model, namely two gravity components g x ,g y ∈ [− 1.0, 1.0] and the water to oil viscosity ratio ∈ [0.5, 1.1]. The water to oil density ratio was set to 1.2. The ranges just specified ensure that a nontrivial range of physical behaviors are modeled, as shown in Figure 7.
The resulting 3D parametric space was sampled in a regular fashion with equal resolution 3,4,5 along each dimension. Consequently, the corresponding NIROMs were constructed based on 27, 64, and 125 training runs. The RBF used in conjunction with the PS method was (r) = r. Each model was tested against the high-fidelity simulator on a set of 1000 random, uniformly distributed and unseen parametric combinations. The notion of error was again the relative time-averaged L 2 discrepancy between the NIROM and high-fidelity runs. The histograms of the relative error are given in Figure 8 and the histograms of the NIROM vs high-fidelity model speedup are given in Figure 9.

Four barriers case
The second multiphase flow problem investigated is a reconstruction of the Edward's model studied in References 7,18. In this example the 2D square domain exhibits a degree of heterogeneity by including four barriers ( Figure 10), with each barrier characterized by the isotropic permeability value between 10 −4 and 10 2 relatively to the red background. We point out that the permeability range used in this study is much wider than in Reference 7 and leads to a nontrivial range of flow patterns (Figure 11), not present if the original range is used.

DISCUSSION
NIROM captured the dynamics of the advection-diffusion validation problem with high accuracy. The relative error is typically below 10 −3 , except for very low diffusivities . Such behavior can be expected, since in practice diffusion has a significant impact on the shape of the long-time solution profile. In particular, low allows for maintaining higher gradients in the solution profile which are more problematic to capture with a reduced POD representation. A significant

F I G U R E 13
Nonintrusive reduced-order model problem simulation speedup histograms for the barrier test cases in comparison with the high-fidelity model (corresponding numbers of training runs as in Figure 12) source of error in the advection-diffusion problem is the random and non-smooth nature of the initial condition u(x;t).
The low-rank POD representation of the system is unable to capture the fine details of the initial condition, in particular high-frequency variations. However, those are dissipated relatively quickly due to nonzero diffusion and do not affect the long-time solution profile, which is captured correctly by NIROM. One of the main advantages of NIROM, namely the speedup in comparison with the high-fidelity model, is practically nonexistent in the advection-diffusion study. The reason is the fact that the forward model is very cheap to start with; in practice, NIROM would be applied to much more computationally expensive models.
Two subsequent test cases were more representative of real-life applications of NIROM, in particular since the parameter ranges were chosen so that physically significant variations in the dynamics were present in the training data. In such scenarios our method significantly outperformed that of Xiao et al both in terms of accuracy and speedup, with the latter expected due to a simpler model evaluation process. Very low accuracy of the latter method can be attributed to the miscommunication issue discussed previously and explicitly addressed in our formulation. The typical value of the NIROM approximation error was observed to be around 10% while enjoying the speedup of up to three orders of magnitude in comparison to the high-fidelity model. Naturally, as the number of training runs increases the corresponding speedup is decreased since the underlying data-driven model becomes larger in size. It should be pointed out that the latter two test problems were characterized by fixed initial conditions across all training and test simulations. The effect of varying the initial condition can be included with the model in the same fashion as for the advection-diffusion problem, but generally requires a larger number of training runs to be provided and is not a crucial aspect of this study.
The normalization procedure is another reason for increased stability and accuracy of our approach. Despite the parameters p being of the same size and having the same span, their rescaling had to be performed to match the scaling of the system state data and construct a set of more suitable samples for the RBF method. Data normalization still had a non-trivial effect, especially in the two porous media flow examples, even when all parameters in our examples spanned the same orders of magnitude (near-equal Δ (i) p , all of order O (1)). This is because the system state normalization factor Δ was of order O (10), meaning that without the normalization the sample structure would be too squeezed along the parameter dimensions resulting in a sub-optimal and highly non-uniform sample structure.
It can also be observed that the average approximation error does not always behave intuitively. More precisely, in Figure 8 the error histogram gets skewed toward right (larger error) as the amount of training data increases. This is in contrast with the intuitive expectation where a larger amount of data points used for interpolation should increase the overall accuracy and consequently decrease the average error.
Finally, our choice of the reduction method (POD) is not mandatory. Alternative reduction processes such as dynamic mode decomposition 6 are potentially viable but it should be noted that they will necessarily be less efficient at capturing the dynamics. In practical terms, a higher-dimensional reduced representation will be required for the same level of accuracy.

CONCLUSIONS
In this work a new and simple method of constructing parameterized NIROMs via radial basis interpolation methods was presented. A previous method by Xiao et al was first discussed and the miscommunication problem and its consequences were pointed out. Subsequently, two improvements were proposed, namely the rescaling procedure and a single-layer interpolation approach. The resulting method can handle an arbitrary number of parameters spanning a range of orders of magnitude. Additionally, the sensitivity of the model to state variables or parameters can be controlled easily by rescaling the normalization factors Δ , Δ (i) p . Any radial basis method can be used for the interpolation process, including the RBF, PS, or Kriging. The approach was tested on three parameterized problems, the one-dimensional advection-diffusion equation and two porous media flow examples with different kinds of model parameterizations. NIROM provided speed ups up to three orders of magnitude in comparison to the high-fidelity model at the cost of relative errors of order 10%. The NIROM implementation used in this work used MATLAB programming language for model construction and propagation. Due to its interpreted nature even higher speed ups are expected if a compiled language (eg, Fortran, C) is to be used. Our method outperformed the previous approach both in terms of speedup and accuracy.
It was observed that in some cases the approximation error increased with the amount of training data provided. It was concluded that the reason behind such behavior is the nonsmooth nature of the field f propagating the system state forward in time. A more detailed analysis of this phenomenon, a method for predicting it just from training data and a detailed analysis of suitability of NIROM in the context of reservoir engineering will be addressed in our future work.