Parametric model order reduction and its application to inverse analysis of large nonlinear coupled cardiac problems

Predictive high-fidelity finite element simulations of human cardiac mechanics co\-mmon\-ly require a large number of structural degrees of freedom. Additionally, these models are often coupled with lumped-parameter models of hemodynamics. High computational demands, however, slow down model calibration and therefore limit the use of cardiac simulations in clinical practice. As cardiac models rely on several patient-specific parameters, just one solution corresponding to one specific parameter set does not at all meet clinical demands. Moreover, while solving the nonlinear problem, 90\% of the computation time is spent solving linear systems of equations. We propose a novel approach to reduce only the structural dimension of the monolithically coupled structure-windkessel system by projection onto a lower-dimensional subspace. We obtain a good approximation of the displacement field as well as of key scalar cardiac outputs even with very few reduced degrees of freedom while achieving considerable speedups. For subspace generation, we use proper orthogonal decomposition of displacement snapshots. To incorporate changes in the parameter set into our reduced order model, we provide a comparison of subspace interpolation methods. We further show how projection-based model order reduction can be easily integrated into a gradient-based optimization and demonstrate its performance in a real-world multivariate inverse analysis scenario. Using the presented projection-based model order reduction approach can significantly speed up model personalization and could be used for many-query tasks in a clinical setting.


INTRODUCTION
Cardiac solid mechanics simulations consist of solving large-deformation, materially nonlinear, elastodynamic coupled boundary-value problems. As the exact fluid dynamics of blood within the heart are usually not needed, the structural model is commonly coupled to lumped-parameter fluid models which provide the pressure to the endocardial wall 1 . These so-called windkessel models are then coupled to cardiac solid mechanics 2 . For a comprehensive review of models currently utilized in cardiac mechanics the reader is referred to 3 .

Model order reduction
The needed huge number of degrees of freedom (DOFs) and other challenges of solving coupled nonlinear problems make the solution of cardiac models computationally expensive and limit the models' use in clinical practice. For example, using a single node with 24 cores, a simulation of one heartbeat, which takes about one second in reality, takes about one day to compute with our high-fidelity four chamber model 4 . The potential to reduce computation time motivates the use of reduced order models (ROMs). In the following, different strategies in reduced order modeling are reviewed.
An important category of cardiac ROMs are "bottom up" ROMs. For these models, the same system of differential equations as for the full order model (FOM) is solved, but on a simplified analytical geometry. The displacements are commonly parameterized by only one scalar degree of freedom (DOF). These models are thus referred to as 0D models. Examples in this category include monoventricular cylindrical 5 , spherical 6 , or prolate spheroid 7 or biventricular 8 geometries. These models allow extremely fast evaluation, with computation times less than one second. Their results are, however, only lumped quantities which usually need an extra correction step in order to predict the solution of a corresponding patient-specific 3D model.
Another approach of MOR in biomechanics is the use of coarsely discretized geometries, see e.g. 9,10 . Coarsely discretized models are easy to implement, since the computational framework is identical to the one of the FOM. The disadvantage of using coarsely discretized geometries is that there is no exact control over the approximation quality and important features of the FOM might not be preserved by the ROM.
A third category of cardiac ROMs are "top down" ROMs, which are utilized in this work. These ROMs make a model computationally less expensive by reducing the dimension of the problem, starting from the FOM. For example in cardiac electrophysiology, approximated lax pairs for propagating wave fronts were proposed in 11,12 . A local reduced basis method for parameterized cardiac electrophysiology was recently introduced in 13 . Reduced basis methods were proposed for general large deformation, material nonlinear finite element simulations 14,15 . A framework for linear coupled multiphysics problems was introduced in 16 .
For large-scale finite element simulation, about 90 % of the time is spent iteratively solving linear systems of equations. This proportion motivates the use of model order reduction by projection, where the full linear system is projected onto a much smaller dimensional subspace while preserving the model's most relevant features. The solution of the FOM is then approximated by a solution in the reduced space with a ROM. A popular method to generate such subspaces is proper orthogonal decomposition (POD), which is purely observation-based and independent of the underlying physics of the model. The snapshots, in our case transient observations of displacements, can be obtained from a FOM simulation of one heartbeat.
There are only few examples where POD has been applied to cardiac problems. The reduction of a patient-specific biventricular cardiac model using POD is described in 17 . A quasi-static cardiac model was reduced using POD in 18 and combined with hyperreduction techniques. However, analysis was only carried out using an idealized ellipsoidal left ventricular geometry with few DOFs. While this is very instructive, results for speedup and accuracy of the ROM are not conclusive for real-world cardiac problems. Furthermore, both references used a purely structural cardiac model with prescribed blood pressures.
Cardiac models rely on a large set of patient-specific parameters, describing constitutive behavior, hemodynamics, boundary conditions, or local fiber orientation. In order not to rely on a FOM simulation for each new ROM simulation, which would render the ROM simulation useless, the reduced subspace must be able to adapt to a changing parameter set. This adaption requires parametric model order reduction (pMOR). Among many global and local pMOR techniques, various subspace interpolation methods have been proposed in the past 19 . Specifically, a popular method using a Grassmann manifold was proposed in 20 and illustrated with a large coupled aeroelastic model of a fighter jet. The method proposed in 17 uses direct interpolation of parameter-dependent solutions instead of interpolating subspaces. Furthermore, a global pMOR approach using a global basis over the whole parameter range is employed in 18 .
The performance of POD in realistic coupled simulations of cardiac contraction is yet unknown. We demonstrate in this work the performance of POD applied to a patient-specific cardiac geometry with about 850 000 structural DOFs. To the best of the authors' knowledge, POD for large nonlinear models has only been applied to single fields, e.g. structural mechanics or fluid dynamics, separately 21 . In this work, we consider for the first time the case of a POD-reduced 3D structural model that is monolithically coupled to a 0D windkessel model, where we only reduce the structural dimension of the problem. Additionally, we compare several subspace interpolation methods for cardiac contraction. In these parametric simulations, we vary the contractility parameter controlling maximum active tension of the myofibers in our model, as it is the most influential parameter for cardiac function and commonly calibrated to experiments.

Inverse analysis
Many of the cardiac model parameters depend on a patient's physiology and are a priori unknown, as invasive experiments cannot be carried out on living human subjects. A predictive patient-specific cardiac model is thus subject to an iterative process termed inverse analysis. In this context, the simulation of one heartbeat with given parameters can be regarded as the forward problem. The reverted task of matching the parameters to given observations from the patient-specific heartbeat is then the inverse problem. Common clinical measurements of cardiac kinematics are displacement data extracted from cine or tagged magnetic resonance imaging (MRI), representing an Eulerian and Lagrangian description of motion, respectively. Other measurements include blood pressure or electrocardiograms. As cardiac mechanics simulations pose an expensive forward problem, repeated evaluation during inverse analysis has incredible computational demands. Furthermore, algorithms for inverse analysis commonly scale linearly with the number of parameters. Inverse analysis is thus a promising application of reduced order modeling.
During gradient-based optimization, the adjoint method offers computationally inexpensive gradient calculation. For example in 22 , regional contractility was estimated from short axis cine MRI. Using the adjoint method, ischemic regions in cardiac electrophysiology were identified in 23 . Most recently in 24 , passive material parameters and active fiber shortening was estimated for a biventricular geometry from ventricular volume and regional strain.
Gradient-free inverse analysis for cardiac problems was demonstrated in 25 , where regional cardiac contractility was estimated from cine MRI using the unscented Kalman filter (UKF). The reduced order UKF was further applied in 26,27 to estimate boundary condition parameters of the aorta for a fluid-structure-interaction problem. Other examples of gradient-free inverse analysis include 28 , where left-ventricular active and passive material parameters were estimated from 3D tagged MRI using a parameter sweep.
There are some examples, where reduced order modeling has been combined with inverse analysis in biomechanics. For arterial hemodynamic fluid-structure-interaction problems, an inverse analysis with uncertainty quantification was performed in 29 using a reduced basis method. There are however few references for cardiac solid models. In 17 , the reduced order UKF was applied to estimate cardiac contractility in a healthy and an infarcted region. The forward simulations were carried out using POD, thus converging to a different solution than using the FOM only. In 10 a multifidelity approach was proposed to calibrate hemodynamical and structural parameters of a cardiac model to ventricular pressure measurements. Here, a Levenberg-Marquardt-based optimization uses evaluations switching between a 3D FOM, a coarsly discretized version of the 3D FOM, and a 2D surrogate model. Another multifidelity approach was used in 30 between a 3D FOM and a 0D surrogate model.
Using coarsely discretized or surrogate models does however not guarantee that the most important features of the FOM are preserved. These surrogate models further lack the ability of pMOR to inherently "learn" from evaluations of the FOM to become more precise throughout the optimization. Instead, they require an additional mapping between FOM and surrogate model solutions. Most importantly, using 2D or 0D surrogate models during inverse analysis, the heart can only be tuned to scalar measurements. However, a calibration to spatial measurements from cine or tagged MRI might be desired in many applications, e.g. when detecting infarcted regions 25 . In this work, we thus propose a novel method of how a ROM can be integrated into any optimization-based inverse analysis leading to considerable savings in CPU time, and demonstrate its performance in a real-world multivariate inverse analysis scenario.
The remainder of this work is structured as follows. In section 2, we introduce the full order elastodynamic and hemodynamic models. We derive a reduced formulation for the monolithically coupled system in section 3 and review several pROM subspace interpolation methods. In numerical experiments in section 4, we demonstrate the accuracy and speedup of our ROM and show its response to parametric variations. Furthermore, we propose in section 5 a pMOR-based method for inverse analysis and analyze its performance with our four-chamber heart model. We close with a conclusion and future perspectives in section 6.

3D-0D COUPLED CARDIOVASCULAR MODELING
In this section we give a brief overview of our full order model (FOM) which is composed of a 3D elastodynamical model, see section 2.1, coupled to a 0D hemodynamical model, see section 2.2. We further provide insights into the numerical solution of the model in section 2.3. For a more detailed description, the reader is referred to 4 .

3D elastodynamical model
We follow the classic approach of nonlinear large deformation continuum mechanics to model the elastodynamic problem of 3D cardiac contraction. We define the reference configuration and the current configuration , which are connected by the displacements = − . We calculate the deformation gradient , the right Cauchy-Green tensor , and the Green-Lagrange strain tensor Balance of momentum, a Neumann windkessel coupling condition with left ventricular pressure , omni-directional springdashpot boundary conditions, and pericardial boundary conditions yield the weak form of the 3D elastodynamic boundary value problem in the reference configuration with density 0 , accelerations̈ , virtual displacements and strains and , respectively, the Jacobian = det of the deformation gradient, the second Piola-Kirchhoff stress tensor , the reference surface normal , and spring stiffness , and viscosity , for vessel and epicardial surface, respectively. We define three surfaces for the imposition of boundary conditions endo 0 , vess 0 , and epi 0 at the left endocardium, the outside of the great vessels, and the epicardium respectively. At the cut-offs of the great vessels we apply homogeneous Dirichlet boundary conditions. We define different nonlinear materials for adipose tissue, ventricular myocardium, atria, aorta, and the pulmonary artery. The geometry and the different materials are shown in figure 1 . Each is composed of a hyperelastic and a viscous contribution depending on the rate of Green-Lagrange strains. Quasi-incompressibility is enforced by a penalty potential. The ventricular myocardium is modeled with an isotropic Mooney-Rivlin material and has an additive active stress component act modeling the contraction of myofiber bundles in reference fiber direction 0 with active stress ∈ [0, [ and the function | ( )| + = max( ( ); 0). The contractility parameter controls the upper limit of the active stress component. The prescribed activation function ( ) is with steepness = 0.005 s and descending and ascending sigmoid functions + and − , respectively. The indicator function ∈ ]0, 1[ indicates ventricular systole. The times sys and dias model the onset of systole and diastole, respectively. The times sys and dias , and maximum and minimum myocyte activation rates max and min , respectively, are calibrated to match the timing of ventricular systole as observed in cine MRI, as demonstrated in section 5. We discretize displacements and virtual displacements arising in the weak form (2) using quadratic basis functions on each tetrahedral finite element. Assembly of the discretized problem leads to the matrix notation of the spatially semi-discrete residual of the full order structural model with mass matrix , nonlinear force vector , and nodal displacements, velocities, and accelerations ,̇ , and̈ respectively. We discretize the boundary value problem in time using a combintation of Newmark's method 31 and the generalized-scheme 32 to obtain the time and space discrete structural residual S ( +1 , , +1 ) at time step + 1.

0D hemodynamical model
We couple the left ventricular 3D structural model to a 0D lumped-parameter windkessel model of the circulatory system. We utilize in this work a four element windkessel, using resistances , compliances , and an inertance . Pressures at different parts of the model are denoted by . We distinguish between a proximal part and a distal part of the aorta. The atrial pressure ( ) is prescribed to simulate atrial systole. The venous pressure ref is kept constant. We model the atrioventricular and semilunar valves with a smooth diode-like behavior by nonlinear resistances ∶= ( , ) and ∶= ( , )  respectively, depending on a sigmoid function . This yields the set of differential equations which are coupled to the 3D structural model by the ventricular pressure and the change in ventricular volumė , depending on the structural displacements . The vector of primary variables yields = [ , , , ] T , including the flux through the inertance . We discretize the set of windkessel equations (6) in time with the one-step-scheme 33 . This yields the discrete windkessel residual 0D ( +1 , +1 ) at time step + 1.

Solving the coupled problem
We solve the coupled 3D-0D model with the structural and windkessel residuals S and 0D , respectively, for the displacements and windkessel variables at time step + 1 with the Newton-Raphson method linearizing the residuals in iteration . The solution is converged if with the structural and windkessel residual and increment tolerances S res , 0D res , S inc , and 0D inc , respectively. Note that the coupled model in (7) is independent of the concrete formulation of the structural and windkessel models from sections 2.1 and 2.2 respectively. It is valid for any arbitrary residuals S and 0D . For details of the model used here, again see 4 .

NONLINEAR PARAMETRIC MODEL ORDER REDUCTION
The 3D-0D cardiovascular model described above represents a large-scale, nonlinear, parametrized, and monolithically coupled model, featuring multiple sources of nonlinear system behavior and depending on several model parameters. Firstly, our model contains geometric nonlinearities, due to the use of the Green-Lagrange strain tensor ( ). Secondly, the utilized material laws for myocardial tissue induce material nonlinearity. The third and last source of nonlinearity is given by the nonlinear coupling between the structural and the hemodynamical model due to the Neumann windkessel boundary condition, acting in direction of the current normal vector of the endocardium. Furthermore, our model depends on many parameters = 1 , … , T ∈ Ω ⊂ ℝ , classified in different categories. For instance, there exist parameters describing the constitutive behavior of the used materials (stiffness, viscosity, and incompressibility parameters), the additive active stress component act (e.g. the contractility , max , min , sys , dias ), the hemodynamics (e.g. resistances , compliances , inertance ), as well as the boundary conditions for the outside of the great vessels and the epicardium (spring stiffnesses , and dashpot viscosities , ). Thus, our discrete nonlinear parametrized FOM reads The use of a 0D lumped-parameter windkessel model, instead of e.g. a 3D fluid dynamics model of the heart chambers and arteries, already simplifies the computational complexity of the coupled system. However, the numerical analysis of the present model still demands a high computational effort due to the large number of structural DOFs. While this is no problem for a few standard forward simulations, it is extremely challenging -and might even prohibit -fast model calibration, inverse analysis, and clinical applications. Therefore, our aim is to employ projection-based model order reduction to obtain a cardiovascular reduced order model (ROM) that accurately approximates the original model with substantially less DOFs and, consequently, less numerical effort. To this end, in section 3.1 we first apply the classical projection-based model order reduction framework to the structural component of our cardiovascular problem and further describe the numerical solution of the coupled ROM.
A suitable strategy to compute the required projection matrix for a fixed parameter set is then explained in section 3.2.
Afterwards, different subspace interpolation techniques are presented in section 3.3, in order to compute a parametric reduced order model (pROM) for any new parameter set. Finally, some implementation details are given in section 3.4.

Cardiovascular reduced order model
As mentioned before, the high dimension of the FOM comes from the discretization of the 3D elastodynamical model, whereas the 0D model only contributes few, in our case four, windkessel DOFs in . Consequently, we only apply model order reduction to the structural component while the 0D hemodynamical model remains unchanged. Our aim is to approximate the high dimensional structural solution ∈ ℝ using a linear combination of ≪ basis vectors, with being the number of DOFs of the full elastodynamical model. With the set of basis vectors ∈ ℝ contained in a projection matrix = 1 ... ∈ ℝ × and the vector of the reduced model's DOFs r ∈ ℝ , the approximation is To obtain a square system, we project the structural residual onto the space spanned by T yielding the spatially semi-discrete reduced residual S semi,r ∈ ℝ S semi,r = T S semi ( ̈ r , ̇ r , r , , ).
This represents the Galerkin projection of the full structural residual S semi onto the subspace  spanned by the vectors in . After discretization in time, we obtain the Newton-Raphson update + 1 at time step + 1 of the coupled reduced order model by linearizing the spatially and time discrete form of the residual. Through the chain rule of differentiation, the left column entries of the block Jacobian matrix are right-multiplied by . The structural block T S ∕ is now of reduced dimension ℝ × . Note that the windkessel Jacobian 0D ∕ and windkessel degrees of freedom remain unchanged. For the off-diagonal coupling blocks 0D ∕ and S ∕ only the structural dimension is right and left multiplied with the projection matrix or its transpose, respectively. The original (7) and reduced (12) block-Jacobian are visualized in figure 2 .

S S 0D 0D
Structural MOR Visualization of the Jacobian during projection-based model order reduction of the structural dimension of the block matrix system in (7) to (12). The diagonal structural and windkessel blocks are colored yellow and green, respectively. Off-diagonal coupling blocks are shaded. Note that the dimension of the diagonal windkessel block remains unchanged.
The update step of a Newton iteration is carried out by extrapolating the reduced DOFs r to the full order displacement vector with the projection matrix by Note that the full order structural residual S and the full order block-Jacobian are always evaluated using the full order displacements . It is only after their full evaluation and assembly that their dimensions are reduced by projection. The convergence check is carried out with the reduced residual and reduced displacement increment The convergence criteria for the 0D windkessel model remain unchanged. As the coupled full order model (7) in section 2.3, the coupled ROM in (12) is valid for any full order structural and windkessel residual S and 0D , respectively.

Subspace computation via POD
In this work, we use the method of Proper Orthogonal Decomposition (POD) to compute the reduced basis required for the projection-based reduction of our full problem. POD 34,35 is a straightforward and very well-known nonlinear model reduction approach, which relies on so-called snapshots, i.e. discrete-time observations of the solution of our FOM for a fixed parameter set , to construct the basis . Given s ≪ snapshots gained from a numerical simulation of the FOM sample point, we obtain the snapshot matrix The goal of POD is to construct a basis for an optimal approximation of the solution manifold spanned by the snapshot matrix.
In other words, the aim is to generate a basis that optimally approximates the information gathered in the snapshots. Therefore, we perform a singular value decomposition (SVD) of the snapshot matrix with the orthogonal matrices ∈ ℝ × and ∈ ℝ s × s containing the left and right singular vectors, respectively, stored column-wise. The diagonal matrix features all s singular values sorted in descending order on its main diagonal. We now select the first singular vectors from the columns of the left singular matrix corresponding to the largest singular values in to obtain the basis vectors of the projection matrix . The singular values are frequently used to define the Relative Information Content (RIC) This measure allows to select an appropriate basis dimension such that ( ) ≥ 1 − POD for a given small tolerance POD . 18 The approximation error made by selecting < s basis vectors can be quantified by the sum of the squared truncated singular values Note that this technique provides an optimal basis for the approximation of the snapshot matrix in a least-squares sense. 36,37,38 Thus, the efficiency of POD and the basis quality crucially depends on the selection of snapshots, which is required to represent the model's dynamics behavior sufficiently. Further note that POD requires the expensive numerical simulation of the full forward model, in general for many parameter sets, to collect representative snapshots. Nevertheless, this data-driven approach is very well applicable for the reduction of any nonlinear system.

Interpolation of subspaces
The cardiac model described in section 2 relies on many patient-specific parameters, describing e.g. constitutive behavior, hemodynamics, boundary conditions, or local fiber orientation. Consequently, a repeated model evaluation and simulation for many different values of the parameters is indispensable to personalize the model. The aim of parametric model order reduction (pMOR) is to find a reduced cardiovascular model that preserves the parameter-dependency, thus allowing a variation of any of the parameters directly in the reduced model without having to repeat the whole reduction process each time. The parametric reduced model can be then used e.g. for patient-specific parameter estimation or uncertainty quantification purposes.
To efficiently reduce the parametric cardiovascular model, we decompose the pMOR procedure into an offline and online stage. In the offline phase, the parametrized full order model with parameters = 1 , … , T ∈ ℝ is first simulated for several parameter sample points , = 1, … , , and then corresponding local projection bases ( ) are computed via POD from the data obtained. In the online phase, the projection matrix ( * ) for a new parameter value * is generated by interpolating between the precomputed subspaces. In this paper, different subspace interpolation techniques are examined, which will be explained in the following. To do so, we suppose that local basis matrices 1 = ( 1 ), … , = ( ) ∈ ℝ × spanning the subspaces ( 1 ), … , ( ) have been computed in the offline phase from the snapshot matrices ( 1 ), … , ( ) ∈ ℝ × s at the sample points 1 , … , . Each basis matrix ( ) is composed of the vectors ( ) =1 . For the interpolation, appropriate weighting functions ( * ) should be selected to compute the interpolated basis ( * ) in the online phase. Basically, any multivariate interpolation method could be used for this purpose: e.g. polynomial interpolation (Lagrange polynomials), piecewise polynomial interpolation (splines), radial basis functions (RBF), Kriging interpolation (Gaussian regression), inverse distance weighting (IDW) based on nearest-neighbor interpolation or even sparse grid interpolation. 19 For simplicity, in this paper we consider the special case of piecewise linear interpolation. We compare in this work four interpolation methods: weighted concatenation of bases, weighted concatenation of snapshots, adjusted direct basis interpolation, and basis interpolation on a Grassman manifold. A detailed mathematical description of the subspace interpolation methods is given in appendix A.

Implementation details
The coupled FOM and ROM in (7) and (12), respectively, are solved using our in-house parallel high-performance finite element software package BACI 39 . The code is implemented in C++ making use of the Trilinos library 40 . To the solve FOM's large linear system in (7) we use a parallel iterative GMRES solver with 2 × 2 block SIMPLE-like preconditioning. For the ROM's (a) Reference configuration.

FIGURE 3
Visualized displacements in four-chamber-view. Displacements increase from blue to red regions.
small linear system in (12) we use a serial direct solver. All preliminary calculations, i.e. singular value decompositions and the interpolation of subspaces, are performed in MATLAB (Release 2017b, The MathWorks, Inc., Natick, MA, USA).

NUMERICAL RESULTS AND DISCUSSION
In this section we present results for the approximation of the FOM simulation with ROM simulations. We distinguish here between model order reduction and parametric model order reduction. For a fixed parameter set, using the contractility = 280 kPa, we explore the approximation qualities of POD in section 4.1. Afterwards, we analyze the approximation quality with respect to a changing contractility in section 4.2. We employ a four chamber geometry obtained in vivo from a 33 year old healthy female volunteer. The imaging data was acquired at King's College London, UK using a Philips Achieva 1.5T magnetic resonance imaging (MRI) scanner. The geometry was meshed using Gmsh 41 with a resolution of 2 mm, yielding 282 288 nodes and 167 232 quadratic tetrahedral elements, totalling = 846 864 structural degrees of freedom (DOFs). Additionally, we have four windkessel DOFs. The cut geometry is displayed in figure 3 a.

Model order reduction
In this section we demonstrate the reducibility of our coupled hemodynamical-structural simulation model of a cardiac cycle and compare computational costs. Following analysis of the heart's eigenmodes in section 4.

POD-modes of the heart
To study the reducibility of our cardiac model, we analyze the decay of the singular values compared to the first one. This gives a measure of relative importance of the modes selected by POD. In figure 4 a we show the normalized singular value ∕ 1 of mode . For modes < 50 there is a fast decay in relative importance, indicating good reducibility. There is a plateau for

Approximation quality
To quantify the overall approximation quality of ROM simulations, we calculate a spatial error compared to the FOM solution. We define here the spatial ∞,∞ -error with ROM ( ) and FOM ( ) as nodal displacements at node at time step of ROM and FOM respectively. The spatial ∞,∞ -error thus gives the highest displacement error at any node at any time step and is an upper bound for all spatial approximation errors. The ∞,∞ -error is shown in figure 4 b depending on the number of reduced modes . It is clearly evident that the approximation error strongly decreases, when more modes are used for the approximation. Remarkably, even for the very low number of 10 modes we obtain a solution whose largest approximation error at any node at any time step is below 1 mm, which is the order of magnitude of our MRI resolution from which the geometry was obtained. Furthermore, using ROM simulations with a reduced order of > 300 does not yield significant improvements in terms of accuracy. This is in agreement with the decay of the normalized singular values in figure 4 a, where modes > 300 contain little more information than the preceding ones. For many medical applications, it is not necessary to calculate an accurate spatial displacement field. There are rather a couple of scalar quantities which are used in clinical practice, e.g. as a cardiac performance indicator or for the prediction of disease progression. Such a quantity is the ejection fraction which is calculated from left or right ventricular volume. To evaluate the approximation of the EF by a ROM simulation, we compare in figure

Speedup
For performance measurements, we ran all FOM and ROM simulations on a single node of our Linux cluster. One node features 64 GB of RAM and two Intel Xeon E5-2680 "Haswell" processors, each equipped with 12 cores operating at a frequency of 2.5 GHz. In figure 6 we give a brief overview of the numerical performance of the FOM simulation. We show in figure 6 a the number of Newton iterations in each time step. The number of Newton iterations is between three and nine. It is elevated to five during ventricular systole and rises to nine at end-systole, where the aortic valve closes. The number of linear solver iterations of each Newton iteration at a given time step is shown in figure 6 b. The number of linear iterations is between 20 and 60 and shows similar trends as the number of Newton iterations. This performance is reasonable and assures a good basis to which ROM simulations can be compared to. In the following, we compare exclusively simulation time and exclude time for creating the projection matrix . It is calculated once in a preliminary step using the same hardware and requires only about one minute.  The computation time of ROM simulations with various reduced orders is compared to the total FOM simulation time in figure 7 . Firstly, we show in figure 7 a the speedup factor of ROM over FOM simulations. The effect of POD is evident for ROM simulations between = 500 and = 10, where we achieve a speedup of ≈ 5 and ≈ 13 over the FOM simulation, respectively. Note that while we achieve high speedups we did not lower hardware demands. The RAM consumption has actually increased slightly for ROM simulations, since we need to additionally store the projection matrix . This again motivates to pursue hyper-reduction in future studies.
We distinguish in figure 7 b between three components of total computation time. Component "Linear system" includes the time required for the multiplications of the projection matrix with the blocks of the Jacobian matrix in (12) as well as the time to solve the reduced linear system. This component strongly depends on the reduced order as it scales with the complexity of the matrix-matrix multiplications, which is the main time contributor. The solution time of the reduced linear system itself is negligible. Component "Evaluate elements" contains time spent during element evaluation to assemble the block Jacobian matrix and the right-hand side. As expected, this component is independent of , since we still build the full system before projecting it to the -dimensional subspace. Component "Other" sums up all other time spent during the simulation, e.g. file input and output or general overhead, and is also independent of .
In figure 7 c we show the relative distribution of simulation time for FOM, ROM500, and ROM10. For the 21 hours spent during a FOM simulation, 92% of simulation time are spent solving the linear system. This large proportion shows the potential for savings using MOR with POD. Reducing and solving the linear system in ROM10 only makes up 4% of the simulation time. However, the new bottleneck is now the element evaluation at 63% of the simulation time. This, again, motivates the use of hyper-reduction methods, such as the discrete empirical interpolation method (DEIM) 42 , for ROMs with very few degrees of freedom.

Parametric model order reduction
In this section we provide a quantitative comparison of several subspace interpolation methods introduced in section 3.3 for parametric model order reduction (pMOR) to demonstrate the ability to evaluate the ROM simulations at parameter sets without prior FOM knowledge. We vary the contractility in (3), controlling the upper limit of the myocardium's active stress in fiber direction. It is a key parameter of cardiac contraction and has a major influence on elastodynamics as well as on several scalar cardiac measures. It is commonly calibrated to match the end-systolic volume of the left ventricle as measured in cine MRI.
In this work, we vary the contractility ∈ [280 kPa, 430 kPa], as this range produces FOM simulation results that are in agreement with cine MRI. We use = 300 for all ROM simulations in this section, since it was shown in figure 4 b that no further improvements are made in approximation quality for > 300. In section 4.2.1 we demonstrate the approximation quality with respect to the spatial displacement field. However, in many clinical applications a full solution of the displacements is not needed. We therefore show the approximation quality of pMOR with respect to scalar cardiac quantities of clinical significance in section 4.2.2.

Approximation of displacements
In figure 8 we compare the spatial ∞,∞ -error for a varying contractility. We compare pMOR simulations using snapshots of FOM simulations from one, two, and four -sample points in figures 8 a, 8 b, and 8 c, respectively. As a reference, we additionally show the error obtained from direct ROM simulations, i.e. ROM simulations where we use a projection matrix ( ) from the corresponding FOM evaluation for each . This information is however not used in the pMOR solutions displayed here and would not be available in a typical MOR scenario, as it would render MOR useless. The direct ROM approximation error is mostly independent of the choice for parameter . We show in figure 8 a the MOR approximation with varying and a constant projection matrix ( 1 ) which was obtained from a single FOM simulation with sample point 1 = 355 kPa. Technically, this would not be considered in pMOR, since the projection matrix is not adapted to the parameter set. It can be observed that MOR simulations with ≠ 1 provide reasonable results with a spatial error below 1 mm using ( 1 ). However, with an increasing range of the parameter interval, the additional effort of subspace interpolation becomes advantageous. The approximation error of MOR simulations using two and four sample points are displayed in figures 8 b and 8 c, respectively. Both studies show similar results. The error is highest between sample points and approaches the error of the direct ROM simulations close to the sample points. The pMOR approximation errors are reduced when using a finer resolution of sample points. The subspace interpolation with the largest spatial error is obtained by the Grassmannian manifold and concatenation of bases (CoB) methods, coinciding in the middle between two sample points. An error of one order of magnitude smaller is achieved by using the concatenation of snapshots (CoS) and the adjusted direct interpolation methods, staying well below a spatial ∞,∞ -error of 1 mm.
The reason for the good performance of the adjusted direct interpolation method is due to the POD of both snapshot matrices of the left and right sample point yielding modes that allow for a distinctive pairing according to the Modal Assurance Criterion. However, the same subspace can be spanned by two sets of orthogonal basis vectors which are not necessarily linearly dependent on each other. If the information of a mode related to the left sample point is scattered among various modes corresponding to the right sample point, the direct interpolation method will most probably yield considerably worse results. The CoS method is in our case far superior to the CoB method. As outlined in section 3.3, the CoS method allows for a direct usage of the snapshots at the sample points, while the CoB method uses the projection matrices computed at the sample points. The projection matrices, as compared to the snapshots, contain no information about the relative importance of the modes. The snapshots thus contain more information about the dynamics of the system, enabling the CoS method to select a more suitably interpolated subspace than the CoB method. Note, however, that the CoS method can only be applied when using it in combination with an observation-based reduction technique like POD.

Approximation of scalar cardiac quantities
In figure 9 we show scalar output quantities of our cardiac model, evaluated for the same MOR simulations as in figure 8 using one, two, and four sample points in figures 9 a, 9 b, and 9 c, respectively. For each study we evaluate ejection fraction (EF), maximum left ventricular pressure (LVP), and maximum left atrioventricular plane displacement (LAVPD) and compare them to the the FOM results. All three output quantities are important determinants of cardiac viability. They are also chosen because they allow us to study the approximation of different outputs of our coupled 3D-0D elasto-hemodynamical model with pMOR. EF, as defined in (22), is an integral value of the spatial displacement field, LVP is an output of the 0D windkessel model, and LAVPD is the average of a small subset of nodal directional displacements. For one sample point we again compare the FOM solution in figure 9 a to the solution of a ROM using a constant projection matrix. The parameter dependence of the three cardiac quantities on is reproduced well by the constant ROM simulations. As expected, the deviations from the FOM solution are largest at evaluations furthest away from the sample point. However, the accuracy might still be sufficient for many applications. The cardiac quantities using two and four sample points are shown in figures 9 b and 9 c, respectively. For clarity, we show here only the results of the subspace interpolation methods which performed best and worst in figure 8 , i.e. the CoS and Grassmann method, respectively. The outputs oscillate visibly between sample points when using the Grassmann interpolation method, improving as the resolution of sample points is refined. As in figure 8 , the CoS method performs well, leading to a good approximation of the scalar cardiac quantities between sample points.

APPLICATION TO INVERSE ANALYSIS
Models of cardiac elasto-hemodynamics commonly depend on a large set of parameters which need to be calibrated to patientspecific measurements using inverse analysis. This procedure is however computationally very expensive, as it requires many evaluations of the model (7), which will be denoted forward model in the following. We propose in this section a novel approach for gradient-based inverse analysis utilizing the parametric reduced order model (pROM) developed in section 4.2. We replace the FOM forward evaluations typically required for the finite differences to obtain the gradient by pMOR forward evaluations (12). We illustrate this method using a simple Levenberg-Marquardt (LM) algorithm 43,44 in section 5.1. The method however is applicable to any gradient-based optimization. In section 5.2 we demonstrate its performance on a typical inverse analysis scenario using the cardiac forward problem introduced in section 2.

Parameter estimation based on reduced models
Given normalized model outputs ( ) ∈ ℝ of a FOM depending on normalized parameters ∈ ℝ and ≥ normalized measurements we aim to minimize the squared sum of residuals to obtain the optimal set of parameterŝ . The Jacobian of the residual vector with respect to the parameter vector is ∈ ℝ × . At the optimum̂ , the gradient ∇ = T = vanishes and the Hessian ∇ 2 > is positive definite. Using the LM algorithm, we obtain the iterative procedure with at iteration +1 with damping parameter . The LM algorithm approximates the Hessian as ∇ 2 ≈ T . The damping parameter should tend to zero as the parameter set approaches the optimal solution̂ . For → ∞ we approach the steepest descent method, for = 0 we obtain the Gauss-Newton method. In general, for a nonlinear model the analytical derivatives of the model evaluations with respect to the parameters required for the Jacobian matrix are not easily available. The columns of the Jacobian matrix are thus typically approximated by finite differences The gradient evaluation vector Δ is built from the -th component of a finite distance vector ∈ ℝ and the -th unit vector ∈ ℝ in the direction of each parameter. The sign in (27) is chosen for each parameter so that the evaluation with parameter set Δ is within the range of all previously evaluated parameter sets.
Calculating the approximated Jacobian matrix requires + 1 evaluations of our forward model which is computationally expensive in case of a large number of parameters . The pROM introduced in section 4.2 is very accurate for parameters in the proximity of the sampled parameter sets, as was shown in section 4.2.1 for the cardiac contractility parameter . More importantly, the pMOR evaluations shown in section 4.2.2 were not only able to accurately predict FOM scalar cardiac quantities but also their slope with respect to a changing contractility from only two FOM sample points. We therefore propose to use pROM evaluations of in (27) instead of FOM evaluations. The iterative procedure for the inverse analysis is sketched in algorithm 1. Note that while using this approach, we still find a local minimum of the objective function in (23) with respect to the FOM.
Further note that with the strategy introduced in algorithm 1 the + 1 gradient evaluations using the pROM simulations can only be computed after a complete evaluation of the FOM simulation. Considering a scenario of infinite available computing resources, our strategy would actually slightly increase computation time over the standard approach of using the FOM for all evaluations, as all + 1 model evaluations can here be run in parallel. However, considering the more likely scenario where computing resources are just sufficient to calculate one or few FOM simulations at a time, the strategy outline in algorithm 1 leads to considerable time savings especially for a large number of parameters .
Algorithm 1 can be combined with any subspace interpolation method in step 7. We use here the weighted concatenation of snapshots method (CoS) introduced in section A.2, as it performed best in the experiments in section 4.2 and is easily applicable Algorithm 1 Inverse analysis with pMOR-gradient 1: initialize 0 , 0 2: = 0 3: while convergence criterion from (26) not fulfilled do 4: evaluate FOM ( ) and calculate residual from (23) 5: store snapshots ( ) 6: for = 0, … , do 7: build reduced basis ( Δ ) from (28) 8: evaluate pROM ( Δ ) 9: end for 10: calculate Jacobian from (27) 11: update parameter vector +1 from (25) 12: ← + 1 13: end while 14: return̂ = to high dimensions of . For the weights of the snapshot matrix for gradient evaluation , we use a simple inverse distance weighting between two evaluation points with normalized distances 1 , 2 and weights 1 , 2 for the current evaluation and the next closest evaluation , respectively. Since at the beginning of the iteration ≪ | − |, the weight 1 of the current snapshot matrix ( ) is always close to one, whereas the weight 2 is close to zero. This can be interpreted that we "enrich" the snapshots of the current iteration with snapshots from a previous iteration to represent parametric dependence. As the optimization converges and the changes in parameters are close to the step size of the finite differences, the weights 1 and 2 equalize. For the first iteration of the optimization we rely on standard MOR evaluations using the constant projection matrix from the first FOM evaluation.
In the following, we give an equation for the speedup of gradient-based inverse analysis achieved by using pROM evaluations for the calculation of the Jacobian with respect to CPU time. Note that actual computation time depends on parallelization of model evaluations. We compare CPU time required to achieve convergence after iterations for a model with gradients calculated from pROM and FOM forward model evaluations, denoted by superscript pROM and FOM, respectively. We do not include the time spent during subspace generation, as it is negligibly small compared to pROM and FOM evaluation time. The total CPU times are where is the time required for a single forward evaluation. It can be observed from (30) that the number of parameters only scales the pROM evaluation time but not the FOM evaluation time. Using the speedup of a single pROM evaluation over a FOM evaluation, we obtain the total speedup for the inverse problem with respect to CPU time as As will be shown in section 5.2, the first factor is close to one as the number of iterations is comparable for both approaches when using a reasonably large number of reduced modes . The second factor approaches in the case of many parameters. Note that in practice there is a trade-off between the two factors. Choosing a very low-dimensional reduced model with few degrees of freedom results in a high single call speedup but may increase the number of iterations for the inverse analysis,    as the tangents are now approximated worse than with a higher . Further note that, after their respective number of iterations, both approximations achieve the same convergence criterion, which is always evaluated using the FOM.
Other variants of algorithm 1 are feasible, e.g. replacing all FOM evaluations by pROM approximations as the inverse analysis algorithm converges closer to the optimum. Such algorithms however require more advanced strategies to switch between both model evaluations. The algorithm presented here demonstrates the most simple and straightforward approach of including a pROM within a finite difference gradient-based inverse analysis.

Numerical results
We demonstrate the ability of the inverse analysis method proposed in section 5.1 to accurately and efficiently estimate parameters for a real-world cardiac estimation problem. We consider the case of a cardiac simulation which we want to calibrate to a given volume curve, i.e. measurements of left ventricular volume over time during one cardiac cycle. We have no prior solutions of our FOM and thus need to build our projection matrices from scratch starting at the first iteration of algorithm 1.
We choose the solution displayed in figure 5 a of a forward FOM simulation as our ground truth. As parameters we choose contractility from (3) and myofiber activation rate max , myofiber deactivation rate min , onset of ventricular systole sys , and onset ventricular diastole dias from (4). We thus estimate all parameters necessary to determine the shape of the input function of our model, i.e. the active stress over time ( ). The parameters , max , and min control cardiac output. However, due to their large variation they are commonly calibrated to a given patient 25 . These parameters are interconnected with the timing parameters sys and dias . The non-normalized parameters at the start of the inverse analysis and of the ground truth are listed in table 1 . We initialize the damping parameter 0 = 0.1. We further choose the number of reduced modes = 300 as it offers a good trade-off between accuracy and speedup.
In figure 10 we display the performance of the pROM inverse analysis using algorithm 1 compared to the standard approach where the gradients are evaluated using the FOM only. Figure 10 a shows the decay of the objective function from (23). We define a convergence criterion ∕ 0 < 10 −5 , which is achieved at FOM = pROM = 7. In figure 10 b we compare the development of the gradient of the objective function with respect to the parameters. As we consider a synthetic case in the absence of noise, both objective function and gradient should approach zero as → ∞. As both measures are non-monotonically decreasing, this indicates a non-smooth optimization problem. However, the pROM300-gradient optimization is in excellent agreement with the FOM-gradient optimization. The start, ground truth, and the converged solutions of both methods after seven iterations are shown in figure 11 . Here, the activation function, i.e. the input of our model, and the volume, i.e. the output of our model, are shown in figures 11 a and 11 b, respectively. It can be observed that both optimization methods match well with ground truth data for the given convergence criterion. The convergence of the five parameters relative to their initial values is shown in figure 12 for both methods. Additionally, the iteration where the convergence criterion is achieved is indicated. Both methods show a similar trend towards the optimal parameters. With a single evaluation speedup of ≈ 7.1, we obtain an overall speedup in CPU time of the pROM300 method over the FOM method of ≈ 3.3, since FOM ∕ ROM = 1 and = 5 in our case.evaluate the gradients reduces CPU time by 69% while achieving the same accuracy.

CONCLUSIONS
In this work we proposed a new projection-based reduced order model for coupled structure-windkessel cardiac models, where we solely reduced the large structural dimension. Specifically, we used a nonlinear large deformation cardiac finite element model with pericardial boundary conditions. For subspace generation, we employed proper orthogonal decomposition (POD) applied to displacement snapshots of the full order model (FOM). We demonstrated the accuracy and speedup of the reduced order model (ROM) for a range of reduced dimensions ∈ {10, … , 500}. In that range, the approximation error was found to be between 2 ⋅ 10 −1 mm and 1 ⋅ 10 −4 mm, which is well below the resolution of state of the art cardiac imaging employed in current clinical practice. For these simulations, we achieved speedups between 13 and 5 over our FOM. For highly reduced models, it was shown that the new bottleneck in simulation time is element evaluation. This motivates the inclusion of hyper-reduction methods, such as the discrete empirical interpolation method (DEIM) 42 or the energy conserving mesh sampling and weighting method (ECSW) 21 , in future research. As the kinematics of a patient-specific heart can already be observed in motion MRI, it might be conceivable to incorporate this displacement information in the reduced space.
There exist many potential applications of MOR in cardiac many-query settings. One example is the task of obtaining a physiological periodic state, i.e. where left and right ventricular output per cardiac cycle match. In these scenarios, a cardiac simulation with constant parameters is run for multiple cycles, until the change from one cycle to the next is below a given tolerance. In 10 it was reported that in some cases more than ten cycles were necessary until converge to a periodic state. After simulating one FOM cycle and calculating the projection matrix, all preceding cycles could be run using a ROM since the shape of the cardiac contraction will be similar to the first cycle. In this use case, MOR can lead to drastic time savings, especially as the individual cardiac cycles cannot be run in parallel.
We further compared four different methods of parametric model order reduction (pMOR) to allow ROM evaluations at parameter sets without prior FOM knowledge. The pMOR methods were evaluated by varying cardiac contractility, an important determinant of cardiac performance. The weighted concatenation of snapshots method was found to approximate the displacements of the FOM best for this example. Additionally, we showed that the clinically important scalar cardiac quantities ejection fraction, maximum left ventricular pressure, and left atrioventricular plane displacement are also well approximated using pMOR. Next to model calibration and design exploration, a possible application of cardiac pMOR could be multifidelity uncertainty quantification 9 .
Finally, we introduced a novel method to include pMOR into a finite difference gradient-based inverse analysis. Using the Levenberg-Marquardt algorithm as an example, we proposed to use the FOM for all objective function evaluations and pMOR for all gradient evaluations, based on snapshots from the current and previous iterations. Using synthetic data in a real-world inverse analysis scenario, we demonstrated that pMOR-gradient-based optimization shows the same convergence properties as FOM-gradient-based optimization while achieving considerable CPU time savings. This method can be incorporated easily into existing optimization frameworks and could even be combined with commercial solvers for the structural problem. Using the inverse analysis approach proposed here has the advantage that we still calculate a full displacement field in each evaluation of the forward model. We can thus evaluate any spatial quantity, which is not possible when using 2D, 1D, or 0D surrogate models for the 3D structural model. In future research, this will allow us to compute a spatial approximation error with respect to cine or tagged MRI to estimate patient-specific parameters from clinical observations. To further improve pMOR-based gradient evaluation, we will include pMOR-gradients in a taylored optimization algorithm that benefits from multiple cheap gradient and second-derivative evaluations to speed up convergence.

A SUBSPACE INTERPOLATION METHODS
In this appendix, we give a thorough mathematical description of the subspace interpolation methods utilized in this work. For a comprehensive review on these and other methods of parametric model order reduction, the reader is referred to 19 .

A.1 Weighted concatenation of bases
A common and straightforward approach to obtain a global basis matrix from the precomputed local bases ( 1 ), … , ( ) is given by the method "concatenation of bases". With this technique, the local bases are at first simply concatenated side-byside, followed by a SVD of the resulting matrix to compute the global basis . This technique can be extended by introducing the weighting functions ( * ) in the concatenation of bases, in order to compute a parameter-dependent interpolated basis ( * ) which takes the distance of the new query point * with respect to the sample points 1 , … , into account. To this end, the matrices ( 1 ), … , ( ) are first weighted with weights ( * ) and concatenated afterwards. Then, the SVD of the concatenated matrix̃ ( * ) ( * ) = 1 ( * ) ( 1 ), … , ( * ) ( ) =̃ ( * )̃ ( * )̃ ( * ) T ∈ ℝ × ⋅ (A1) is performed. The interpolated basis ( * ) is finally constructed by considering the first left singular vectors ̃ ( * ) =1 that best represent the weighted and concatenated matrix̃ ( * ): ( * ) = ̃ 1 ( * ), … ,̃ ( * ) ∈ ℝ × . (A2) Please note that the described weighting procedure is purely optional. The advantage of the weighted approach is that subspaces near the interpolation point * are favored and stronger considered than subspaces describing the dynamics for far-distant sample points. However, this extended technique requires more computational effort than the classical concatenation approach, since a SVD has to be performed for every new * to compute the parameter-dependent interpolated basis ( * ).

Special case of two precomputed bases and one parameter
In order to make the afore explained method more clear, we now briefly present the special case of two precomputed bases ( = 2) and one single parameter ( = 1). Let us assume that bases ( 1 ) and ( 2 ) have been computed at the parameter sample points 1 and 2 , and that the new parameter value * lies between these two samples. Suppose that we choose the reference basis e.g. as = ( 2 ). Then, the vectors * ( ) ( 1 ) that fulfill * ( ) = argmax MAC ( 1 ), ( 2 ) for = 1, … , are selected to be combined with the vectors ( 2 ). The interpolation reads providing that a linear interpolation is employed.

A.4 Basis interpolation on a Grassmannian manifold
As discussed before, a direct interpolation of the local bases is not meaningful, since they span different subspaces. In addition to the afore explained adjustment of the bases before interpolation, one may also interpolate the underlying subspaces on a tangent space of a manifold. The method proposed by Amsallem and Farhat 20 constructs a basis matrix ( * ) for a new parameter point * by interpolating the subspaces corresponding to the bases { ( )} =1 on the tangent space to the Grassmannian manifold  (ℝ ). The first step of the approach consists in choosing a local basis matrix 0 for the reference point  0 ∈  (ℝ ), at which the tangent space   0 to the manifold  (ℝ ) is constructed. Afterwards, all subspaces ( ) spanned by the local bases ( ) are mapped onto this tangent space by the so-called logarithmic mapping: span( ) = Log  0 ( ) ∈   0 . This is done basically by computing thin SVDs In order to compute the orthonormal basis ( * ) for a new parameter point * , the matrices ( ) =1 are first interpolated using the weights ( * ) to obtain * = ( * ) = ∑ =1 ( * ) ( ).
The special case of two precomputed bases ( = 2) and one single parameter ( = 1) is extensively described in 20 .