Adaptive approximation of nonlinear eigenproblems by minimal rational interpolation

We describe a strategy for solving nonlinear eigenproblems numerically. Our approach is based on the approximation of a vector-valued function, defined as solution of a non-homogeneous version of the eigenproblem. This approximation step is carried out via the minimal rational interpolation method. Notably, an adaptive sampling approach is employed: the expensive data needed for the approximation is gathered at locations that are optimally chosen by following a greedy error indicator. This allows the algorithm to employ computational resources only where"most of the information"on not-yet-approximated eigenvalues can be found. Then, through a post-processing of the surrogate, the sought-after eigenvalues and eigenvectors are recovered. Numerical examples are used to showcase the effectiveness of the method.


Introduction
Let T : C → C n×n be a matrix-valued function, with n a (large) integer.We are interested in identifying eigenpairs (λ, w λ ) ∈ C × C n such that T(λ)w λ = 0, with w λ = 0. ( More precisely, we seek all the eigenpairs whose eigenvalues λ lie in a certain region Z ⊂ C of the complex plane, e.g., in a half-plane, on a line segment, or inside a Bernstein ellipse.Problems of this kind appear in several branches of science and engineering.For instance, λ might denote the frequency of vibration when studying the time-harmonic behavior of mechanical or electrical systems.In such cases, the eigenvector w λ provides information on the resonating modes of the system at hand.Many algorithms have been proposed for finding eigenpairs satisfying (1).Among these, we mention methods based on (affine or rational) approximations of T (•), which then cast an approximated version of (1) as an augmented linear generalized eigenproblem A w λ = λB w λ .See, e.g., [4].One of the main drawbacks of this class of methods is that the linearized eigenproblem might be rather large, especially when the matrix-valued function T is "hard to linearize", i.e., when many terms are necessary in its affine/rational approximation.On the other hand, we can find approaches based on integration of T(•) −1 , or of T(•) −1 V, with V ∈ C n×m a tall and skinny matrix (m n), over the contour ∂Z.See, e.g., [2].Such methods rely on the fact that, under standard assumptions (see, e.g., Theorem 1), eigenvalues of (1) coincide with poles of T(•) −1 .Here, one of the main difficulties is finding a reliable quadrature formula to approximate the necessary contour integrals.Notably, one should strive to reduce the number of quadrature points, since an expensive evaluation of T(•) −1 V (i.e., the assembly of T(•) and the solution of m linear systems of size n) is needed at each of them.
In this work, we follow a logic similar to that used in contour-based methods: given a vector v ∈ C n , we consider the vector-valued function 1 u : C z → T(z) −1 v ∈ C n and approximate it using the minimal rational interpolation method [7].This yields a vector-valued rational function u that approximates u over the region of interest Z.As we will see in Theorem 1, the poles of u can be related to the eigenvalues of (1), and the corresponding residues to the eigenvectors.As such, poles and residues of the surrogate u, which are much simpler to compute than those of u (see Section 2), provide approximations of the eigenpairs of interest.
In our approach, the accuracy of the approximation of the eigenpairs is, in a loose sense, inherited from the accuracy with which u approximates u.For this reason, we aim at obtaining the "best" (in some sense to be specified) approximation u, given a computational budget.More precisely, we assume that we can only evaluate u a fixed number of times.
In order to achieve our claimed "optimality of approximation" for a given computational budget, we endeavor to place the sample points (i.e., the locations z 1 , z 2 , . . .∈ C where u is to be evaluated) in such a way as to maximize the "amount of information" that each sample provides.For this, we use the adaptive version of minimal rational interpolation, originally introduced in [8].The method is presented in Section 2 and, in summary, consists of an incremental strategy for adding new sample points, each chosen in a "greedy" way based on the previous ones, until the computational budget is exhausted.
Before proceeding further, we remark that, due to its non-intrusive nature, cf.Section 4, our method may also be applied in settings more general than the one presented above.For instance, we might use it even for nonlinear eigenproblems with eigenvector nonlinearities, e.g., T(λ, w λ )w λ = 0, with T : C × C n → C n×n , cf. [5].Of course, this is feasible only under the assumption that an (approximate) solver is available for the non-homogeneous nonlinear version of the problem, i.e., Otherwise stated, one must be able to evaluate u(z) (the solution of the above non-homogeneous problem) at any value of z.

Approximation method
We start by describing the strategy for approximating a generic function u : C → C n , given the samples {(z j , u(z j ))} S j=1 , with z 1 , . . ., z S being given distinct complex points (although our method can be modified to account for point confluence), using minimal rational interpolation.We only provide the bare algorithm here, referring to [3,7] for more details.
We seek a rational approximation u : C → C n that interpolates the data exactly.For this, we rely on the extremely compact barycentric format: which achieves interpolation of the data for all (nonzero) complex coefficients q 1 , . . ., q S .The minimal rational interpolant of the data is defined as a rational function as in (2) whose barycentric coefficients q 1 , . . ., q S satisfy (q 1 , . . ., q S ) = arg min The last step follows by a simple scaling argument q j → q j S j =1 q j .The bar • and the subscript • H denote complex conjugation and conjugate transposition, respectively.We remark that, since (2), this can be interpreted as minimizing the value of u 2 at infinity.
The minimization in ( 3) is a quadratic programming problem, whose first-order optimality (KKT) conditions read Gq = λ1, Note that λ ∈ C is a Lagrange multiplier, introduced to enforce the linear constraint.An explicit solution of (3) can be obtained by left-multiplying 2 the first KKT condition by G −1 (thus finding q as a function of λ) and then plugging the corresponding value of q in the second KKT condition to find λ.This yields q = G −1 1 1 H G −1 1 as optimal solution.

Eigenpair-finding by post-processing of minimal rational interpolant
Once a minimal rational interpolant has been computed, we obtain the eigenpairs of interest by a poleand residue-finding step, which we detail here.Given a rational function u in barycentric form, see (2), its poles can be found as the (at most) S − 1 finite eigenvalues of a (S + 1) × (S + 1) generalized eigenproblem in arrowhead form, cf., e.g., [6].More specifically, we have the characterization: for any finite λ ∈ C, Now, assume that λ ∈ C is a pole of u with order N > 0, i.e., a (finite) eigenvalue of ( 4) with multiplicity N .We seek the corresponding N residues r 1 , . . ., r N , yielding the expansion for all z in a punctured neighborhood of λ, with s an analytic function.If N = 1 (the "simple" case), the single corresponding residue of u can then be computed as If N > 1, one first finds the highest-order residue r N using a formula similar to (6), and then recursively computes the other residues by deflation.We skip the details here.Each pole-residue pair (λ, r k ) satisfying ( 5) provides an approximation of an eigenpair of (1).See Theorem 1.

Adaptive sampling
In the field of model reduction, it is customary to drive the addition of new sample points by an error indicator ρ : Z → R, with Z ⊂ C being the region over which an approximation of the target u is sought.More precisely, the following greedy sampling loop is performed: Here, the function u that we approximate is the solution of the non-homogeneous eigenproblem T(z)u(z) = v, cf. the definition of u in Section 1.For this kind of problems, it is common to employ the residual norm as indicator, i.e., ρ(z) = T(z) u(z) − v 2 .However, its evaluation at a not-yet-seen point z may, in general, be expensive, due to the need to assemble T(z), to perform a matrix-vector product, and to compute a high-dimensional norm.Accordingly, the identification of the next sample point z may also be rather expensive.
To solve this issue, we rely on a different indicator.To motivate our choice of ρ, we first consider a simplified framework, namely, we assume that the target eigenproblem is linear : T(z) = T 0 + zT 1 , with 2 Inverting the Gramian matrix G may be numerically unstable.A more robust formulation for minimal rational interpolation also exists, where the linear constraint j q j = 1 in (3) is replaced by the Euclidean normalization j q j 2 = 1.In this case, the closed-form solution to the problem is: given the eigendecomposition G = VΛV H , with the diagonal elements of Λ in non-decreasing order, the optimal q is the first column of V. See [3] for more details.
T 0 , T 1 ∈ C n×n .In such setting, it was shown in [7] that, as long as the surrogate u is an interpolatory rational function u = n/d as in (2), the residual norm has the simplified form Since the constant C in (7) does not depend on z, one can find z = arg max z∈Z T(z) u(z) − v 2 only by looking at the surrogate denominator d.Taking a qualitative viewpoint, this way of finding the next sample point can be interpreted as follows: we add the next sample point by maximizing its distance from the already explored points (where information is already available) while minimizing its distance from the poles of the surrogates (where more information is needed to improve the accuracy of the eigenpair approximation).See Fig. 1 for an illustration.If the problem is more general, namely, if T(z) depends nonlinearly on z, then (7) is no longer true, since the residual norm may depend on z in a more complicated way.However, we can still choose ρ = |d(•)| −1 as heuristic error indicator, and then use it to drive the greedy sampling loop.
Remark 1.Our proposed greedy sampling driven by |d(•)| −1 has some affinity to Leja-Bagby (or similar) sampling schemes.See, e.g., [5].However, the Leja-Bagby scheme chooses the next sample point by using information on the current sample set only.On the other hand, our approach, in the spirit of greedy model reduction, uses information on the current sample set but also on the current surrogate u itself.

Theoretical considerations
We devote this section to two topics.The first one is related to our claim that pole-residue pairs of u(•) = T(•) −1 v provide information on eigenpairs of (1).We motivate this by the following particular version of Keldysh's theorem.See, e.g., [2].
Theorem 1.Let T : C → C n×n be analytic and assume that λ ∈ C is an eigenvalue, i.e., there exists a solution of (1) for such λ.For any v ∈ C n , there exist an integer N > 0, vectors and an analytic function s : Z → C n , such that All vectors {v k } N k=1 ⊂ C n solve (1) with eigenvalue λ.From this result, we conclude that the eigenpair approximation described in Section 2 is sensible.Indeed, we are simply using surrogate quantities as approximations of the corresponding exact ones in (8).
To conclude this section, we would like to briefly discuss why we choose minimal rational interpolation for the construction of the approximation u.Other interpolatory methods, like the (set-valued) Loewner framework [3] or AAA [6] use a barycentric form like (2) but rely on least-squares formulations to find the coefficients q 1 , . . ., q S .In this respect, minimal rational interpolation has the clear advantage of needing fewer sample points than its competitors, when employed to build a surrogate of certain degrees of n and d: only S samples are needed to approximate S − 1 eigenpairs, whereas the other above-mentioned methods require approximately twice as many samples to achieve the same degree.
However, this efficiency comes at a price: from a theoretical viewpoint, the convergence results for minimal rational interpolation have restrictive assumptions.Indeed, the theory developed in [7] is limited to the case of simple poles (N = 1 in Theorem 1) with linearly independent residues.In particular, this implies that minimal rational interpolants might struggle with some kinds on nonlinearities.For instance, in a nonlinear eigenproblem of the form sin(λ)w λ = 0, minimal rational interpolation might not be able to identify more than a single eigenvalue, since all the eigenvectors of the problem are collinear.A more in-depth numerical investigation of these issues is in progress, and is outside the scope of the present paper.

Numerical results
In this section, we apply our proposed method to numerically solve a nonlinear eigeproblem stemming from an application in acoustics.
A plane pressure wave with wavenumber k = 10 (all quantities are adimensional) impinges on the opening of a cavity, modeling a bottle-like Helmholtz resonator.The shape of the resonator is parametric, with the parameter z ∈ Z = [1,3] denoting the width of the neck of the resonator.The walls of the resonator are assumed to be sound-soft.We are interested in finding the values of z (i.e., the geometric configurations) for which a resonance is observed.In layman's terms, we want to determine which resonator shapes "make a sound" in response to the given impinging wave.
We base our mathematical model for this problem on the Helmholtz equation at wavenumber k, cast inside a 2D-section of the resonator Ω(z) ⊂ R 2 (we assume that the problem is invariant with respect to the third space dimension).See Fig. 2 for a depiction of two sample geometric configurations.We specify our choice of parametrization for Ω in Appendix A. The boundary of the domain, namely, ∂Ω(z), is partitioned into inlet (where a Neumann boundary condition is imposed) and wall (where a Dirichlet boundary condition is imposed).The resulting problem reads The solution field w((x, y), z) is a function of the space coordinates (x, y) ∈ Ω(z), but also of the geometric parameter z.It represents the pressure intensity at a given point, for a given resonator geometry.
In order to handle the parametric geometry of the domain, we follow a mapping approach: we fix Ω := Ω(2) as reference domain, introduce a nonlinear z-dependent bijective mapping φ z : Ω ( x, y) → (x, y) ∈ Ω(z), and cast (9) in Ω .By setting w(( x, y), z) := w(φ z ( x, y), z), we obtain Note that the change of variables leads to the appearance of the coefficients a 1 , . . ., a 4 , which are related to the Jacobian of φ z .In general, they may all be nonzero fields depending non-linearly on both z and (x, y).The expressions of such coefficients are provided in Appendix A. We introduce a triangulation of the domain Ω , and we approximate (10) using piecewise-quadratic finite elements.We employ FEniCS [9] in our implementation.This turns the PDE into a linear system of the form Tw = 0, where T = T(z) ∈ C n×n is a sparse symmetric indefinite matrix with z-dependent entries and w = w(z) ∈ C n is a vector whose entries are nodal values of the finite-element approximation of w.For our discretization, n = 52344.The task is now to find values of z ∈ [1,3] for which the linear system has non-trivial solutions.Since z enters T through the nonlinear coefficients a 1 , . . ., a 4 , this is a nonlinear eigenproblem like the one in (1).
In order to numerically approximate the eigenpairs, we introduce a vector v ∈ C n , which we obtain by imposing non-homogeneous Neumann boundary conditions 3 .Using the notation from Section 1, we then define the vector-valued function u(z) = T(z) −1 v.We then run our proposed eigensolver.We select only the endpoints of Z, namely, z 1 = 1 and z 2 = 3, as initial sample set.We allocate a computational budget of 20 samples, i.e., 20 "solves" of the problem at different values of z, allowing the heuristic indicator from Section 2.2 to drive the selection of the sample points z 3 , . . ., z 20 .We display the results of the greedy minimal rational approximation method in Fig. 3.We observe a fairly good fit between u and u.Moreover, the heuristic estimator seems to capture rather well the locations where the approximation error is still large.
Concerning the approximation of the eigenpairs, our method identifies 11 eigenvalues within the region Z, as shown in Fig. 4 (left).In addition, 8 more eigenvalues are identified outside Z, but we do not show them here.A comparison of Fig. 3 (left) and Fig. 4 (left) shows that the sample points are (loosely) placed around the (a priori unknown) eigenvalues.
As a way to assess the quality of the approximation, we evaluate the norm of the residual of the homogeneous eigenproblem, namely, T( λ) w λ 2 , where ( λ, w λ ) is an approximated eigenpair (with normalized eigenvector), obtained from a pole-residue pair of the rational surrogate u.In Fig. 4 (left) we display the magnitude of such residuals.We can observe that the residual norms range between 10 −11 and 10 −3 , in a way that is somewhat consistent with the error curve in Fig. 3.This is perhaps slightly surprising, since the latter curve depicts the approximation error for the non-homogeneous problem.
As an additional numerical test, we double the computational budget and repeat our experiment.We show the corresponding approximation results in Fig. 5.We note that the approximation error (for the non-homogeneous problem) decreases by about 5 orders of magnitude.In Fig. 4 (right) we show the results pertaining to the eigenpairs and to the residual of the homogeneous eigenproblem.We observe two effects: an overall decrease of the norm of the residual (again, by around 5 orders of magnitude) and the appearance of (seemingly) spurious eigenpairs.The latter is a rather serious issue, which could be solved (or, at least, weakened) by introducing a "filtering" step in post-processing, with the aim of eliminating multiple approximations of the same eigenvalue.
values of the finite-element solution of the Helmholtz equation corresponding to the imposed non-homogeneous boundary conditions.However, it is not necessary for the application of our technique: in fact, a random (e.g., Gaussian) vector would have served us perfectly well, if not better.

Figure 1 :
Figure 1: Simple rational residual indicator ρ(z) = |d(z)| −1 in two test cases.The sample set consists of the points {−1, 1} in both plots.The greedy sampling loop, by maximizing the indicator, will place the next sample point at the location denoted by the vertical dashed line.Note that, in the left plot, the surrogate has a pole inside the interval Z = [−1, 1].The greedy indicator suggests taking a sample there.

Figure 2 :
Figure 2: Left: mesh on reference domain (coarsened to make the mesh visible).Center and right: solutions to the non-homogeneous problem for z = 2 and z = 1.28.The solution for z = 1.28 has been remapped from Ω to Ω(z = 1.28) for illustration purposes.Note the scale.

Figure 3 :
Figure 3: Left plot: exact u 2 (black dots) and approximated u 2 (blue line) using 20 greedily selected samples.The 20 sample points are denoted by the red dots at the bottom.Right plot: relative approximation error at 101 validation points (blue line).The greedy indicator ρ(z) = |d(z)| −1 (on a finer grid of 501 points) is also shown as a dotted line.