Bayesian optimisation for efficient parameter inference in a cardiac mechanics model of the left ventricle

Abstract We consider parameter inference in cardio-mechanic models of the left ventricle, in particular the one based on the Holzapfel-Ogden (HO) constitutive law, using clinical in vivo data. The equations underlying these models do not admit closed-form solutions and hence need to be solved numerically. These numerical procedures are computationally expensive, making the run times associated with numerical optimisation or sampling excessive for the uptake of the models in clinical practice. To address this issue, we adopt the framework of Bayesian optimisation (BO), an efficient statistical technique for global optimisation. BO seeks the optimum of an unknown black-box function by sequentially training a statistical surrogate model and using it to select the next query point by leveraging the associated exploration-exploitation trade-off. To guarantee that the estimates based on the in vivo data are realistic also for high pressures, which are unobservable in vivo, we include a penalty term based on a previously published empirical law developed using ex vivo data. Two case studies based on real data demonstrate that the proposed BO procedure outperforms the state-of-the-art inference algorithm for the HO constitutive law.


| INTRODUCTION
The last decade has seen impressive progress in the mathematical modelling of soft-tissue mechanics. 1,2 In particular, the specific application to cardiac mechanics 3 promises to improve our understanding of cardiac physiology and pathology. A recent case-control study 4 involving patients diagnosed with myocardial infarction (MI) and healthy volunteers has demonstrated that the biophysical parameters of a cardio-mechanic model of the left ventricle (LV) of the heart can be non-invasively estimated from cardiac magnetic resonance (CMR) scans, and that the parameters thus inferred allow for an accurate classification of subject disease status. However, the cardio-mechanic equations do not admit a closed-form solution and require computationally expensive numerical procedures based on the finite element method (FEM). 5 These procedures have to be repeated iteratively as part of a numerical optimisation or sampling procedure, leading to potentially excessive computational run times that discourage uptake of the methodology in clinical practice.
We aim to address this difficulty by adopting the framework of Bayesian optimisation (BO), which has been specifically designed for optimisation problems where each evaluation of the objective function is expensive and the number of such evaluations has to be kept as low as possible. The main idea is to sequentially approximate the unknown objective function by a statistical model and use this model to identify the next query point, typically according to some exploration-versus-exploitation trade-off criterion. In the literature the said statistical model is typically referred to as a surrogate function, 6,7 and Gaussian processes 8 are commonly used in this context. (Surrogates are also called approximations, metamodels, response surface models or emulators. 9 We reserve the latter term for pre-trained approximations based on considerably larger numbers of points compared to those used in BO.) The trade-off criterion is quantified by the so-called acquisition function. Several strategies on how to specify the details of BO, in particular the acquisition function, have been proposed in the literature. 6 In this study, we use a popular criterion called Expected Improvement, 10 as it admits a closed-form expression for Gaussian process surrogates, is conceptually simple and performs well for a range of problems. 11 To date, there have been only a few applications of BO to physiology and clinical decision support, notably published only recently. BO has been applied to improve personalised time series models for patient monitoring 12 and to efficient human-in-the-loop optimisation of wearable devices, for example powered prosthetics for walking assistance 13 or soft exosuits. 14 BO has also allowed for rapidly mapping residual network function in stroke 15 and for an efficient design of transcatheter aortic valve leaflets and optimisation of their geometry. 16 BO has also been used in the context of computer-aided (or robot-assisted) surgery. 17,18
Another strand of literature has applied BO to efficient parameter inference for finite-element problems, for example in haemodynamics, 19 for learning the material properties of atherosclerotic carotid arteries, 20 or in cardiology. 21 The latter study of Fan et al. 21 is most closely related to our work, as it combined finite element modelling and BO to characterise exercise-induced myocardial growth by inferring two unknown growth parameters. The main difference from our work is that Fan et al. 21 focused on parameters affecting myocardial growth, whereas we focus on the constitutive parameters determining the passive filling process in diastole.
Myocardial stiffness is considered an important factor in heart failure patients with preserved ejection fraction, 22,23 which is thought to be caused by myocyte stiffening and interstitial fibrosis. 24 Consequently, robust and fast estimation of myocardial passive stiffness is a key step towards improved patient characterisation and treatment planning, and it remains a great challenge in the cardiac modelling community despite decades of continuous effort. 25,26,27,28,29,30,31 The purpose of the present paper is to adapt the BO framework to the problem of inferring passive myocardial properties of healthy human LVs from in vivo measurements. In this study, the passive myocardial properties are characterised by an invariant-based hyperelastic material model proposed by Holzapfel and Ogden 32 (HO model), which has been widely used in cardiac modelling communities. 33,4,34 The first attempt to address parameter estimation, also known as the inverse problem, in this context was made by Gao et al. 28 These authors proposed an algorithmic, multi-step procedure to infer the parameters of the HO model, in which each step optimises a subset of parameters, possibly with a different objective function. The choice of the particular subsets was based on expert knowledge of the myocardial properties of the LV. Our proposed BO approach is conceptually different, as BO iterations are run with respect to an a priori chosen parametrisation and objective function, thus requiring little expert knowledge.
To the best of our knowledge, we are the first to introduce BO into cardiac mechanics for inferring unknown constitutive parameters. Our work contributes both to the global optimisation literature and to the application-specific literature. Our first main contribution is the development of a BO framework for the challenging problem of parameter inference in the HO model. Specifically, we propose modifications of the two components of BO: the target function and the acquisition function. The former modification consists in augmenting the target function by incorporating prior knowledge of the end-diastolic volume-pressure relationship from an empirical law established by Klotz et al. 35 based on ex vivo data. This allows us to obtain more realistic myocardial properties not only at physiologically normal pressures but also at high pressures, usually not observed in vivo. Generally, the myocardial response to increasing pressure is nonlinear: it appears linear at the typical low pressures observed in vivo in healthy subjects, and becomes exponential at high pressures, such as those observed, for example, in patients with hypertension. It can be challenging to infer the parameters that characterise these nonlinear stiffening effects based solely on in vivo data.
Our second main contribution is adapting the standard Expected Improvement criterion 10 to the challenges of inference in the HO model. First, following previous studies, 36,37 we allow for unknown constraints in the parameter space by training a classifier based on a probit model with Gaussian process priors. Unknown constraints are related to those parameter regions that violate the physiological assumptions of cardio-mechanic models; such violations manifest themselves as crashes of the associated forward simulator. Second, in addition to the standard approach of developing a surrogate for the objective function as a whole, we develop an alternative paradigm, based on a partial error surrogate, in which we develop surrogates for the individual squared error terms of the target function. This is motivated by a previous study 38 in the context of parameter inference in the HO model, which reported a substantial improvement in accuracy in the emulation framework resulting from emulating a vector of outputs of the cardio-mechanic model. Our proposed approach, based on surrogates for the squared error terms instead of the outputs, has the advantage of being directly applicable to the Expected Improvement acquisition function without requiring any approximation.
Our third main contribution relates to two empirical studies. First, we demonstrate that our proposed BO framework outperforms the state-of-the-art algorithm of Gao et al. 28 for parameter estimation in cardio-mechanic models. Second, we show how incorporating prior knowledge about the volume-pressure relationship (based on an empirical law established by Klotz et al. 35 using ex vivo data) allows for physiologically more realistic parameter estimation, consistent with the LV's characteristics in high-pressure regimes, relevant for example for patients with hypertension or patients with diastolic heart failure. 22 We note that in this paper we are concerned with Bayesian optimisation, which should not be confused with Bayesian inference. Bayesian inference is a statistical approach that quantifies uncertainty about model parameters via posterior distributions. Bayesian optimisation, on the other hand, is a computationally efficient global optimisation method. The word "Bayesian" in "Bayesian optimisation" refers to the fact that methods from nonparametric Bayesian statistics are used to approximate the objective function and quantify the uncertainty of the approximation. As opposed to Bayesian inference, uncertainty quantification in Bayesian optimisation is not an end in itself, but a means to another end: optimally trading off exploration and exploitation in the quest for a computationally efficient optimisation path. In fact, Bayesian optimisation is conceptually closer to classical, or frequentist, statistics than to Bayesian statistics, as the frequentist approach is based on obtaining point estimates by optimising an underlying loss function.
The structure of this paper is as follows. Section 2 provides necessary details of the cardio-mechanic model of interest and introduces the state-of-the-art inference method. 28 We discuss BO methodology, including the proposed extensions, in Section 3. Our three empirical studies are presented in Section 4. Section 5 concludes the paper and presents the outline for further research.

| CARDIO-MECHANIC MODEL OF THE LEFT VENTRICLE IN DIASTOLE
Biomechanical modelling of the LV in diastole is concerned with the process of diastolic filling, starting at early diastole and finishing at the end of diastole, which can be considered a purely passive response. The nonlinear passive response of the myocardium means that it is soft at lower pressures but becomes progressively stiffer at higher pressures because of gradually engaged collagen fibres. 32 Several models of LV biomechanics have been proposed in the literature; see the recent review by Chabiniok et al. 2 In this work, we follow previous studies 33,28 and focus on the constitutive law proposed by Holzapfel and Ogden 32 to describe the passive myocardial response, in other words the myocardial stiffness. The adopted HO model is an invariant-based incompressible hyperelastic constitutive law which takes into account the layered myofibre structure of the myocardium. Studies have demonstrated that it realistically captures the key properties of the myocardium while being relatively easy to implement numerically. 39 For these reasons it has been widely used in cardiac modelling communities. 40,4,3 The incompressible HO model is based on the following strain energy function:

\Psi = \frac{a}{2b}\exp[b(I_1 - 3)] + \sum_{i \in \{f,s\}} \frac{a_i}{2b_i}\left\{\exp[b_i(I_{4i} - 1)^2] - 1\right\} + \frac{a_{fs}}{2b_{fs}}\left[\exp(b_{fs} I_{8fs}^2) - 1\right], \quad (1)

where I_i, i \in \{1, 4f, 4s, 8fs\}, are invariants describing the deformation (corresponding to the matrix and fibre structure of the myocardium), and a, b, a_f, b_f, a_s, b_s, a_{fs}, b_{fs} are the eight constitutive parameters, which we collect in the vector ϑ. The invariants in (1) are given as

I_1 = \mathrm{tr}(\mathbf{C}), \quad I_{4f} = \mathbf{f}_0 \cdot (\mathbf{C}\mathbf{f}_0), \quad I_{4s} = \mathbf{s}_0 \cdot (\mathbf{C}\mathbf{s}_0), \quad I_{8fs} = \mathbf{f}_0 \cdot (\mathbf{C}\mathbf{s}_0), \quad (2)

where \mathbf{C} = \mathbf{F}^T\mathbf{F} is the right Cauchy-Green deformation tensor, with \mathbf{F} being the deformation gradient, while \mathbf{f}_0 and \mathbf{s}_0 are the myocyte and sheet directions, respectively, determined via a rule-based approach; details can be found in Wang et al. 33 Note that \mathbf{f}_0 and \mathbf{s}_0 are initial values and hence known before running the forward simulator.
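To make the constitutive law concrete, the strain energy and its invariants can be evaluated for a given deformation gradient as in the following sketch (NumPy; the tension-only switch on the fibre terms is a common implementation convention, and all numerical values in the example are illustrative rather than taken from the paper):

```python
import numpy as np

def ho_strain_energy(F, f0, s0, a, b, af, bf, a_s, bs, afs, bfs):
    """Holzapfel-Ogden strain energy for a single deformation gradient F.

    Invariants: I1 = tr(C), I4f = f0.(C f0), I4s = s0.(C s0),
    I8fs = f0.(C s0), with C = F^T F the right Cauchy-Green tensor.
    """
    C = F.T @ F
    I1 = np.trace(C)
    I4f = f0 @ C @ f0
    I4s = s0 @ C @ s0
    I8fs = f0 @ C @ s0
    # Isotropic matrix term.
    psi = a / (2 * b) * np.exp(b * (I1 - 3))
    # Fibre and sheet terms; letting them contribute only in extension
    # (I4 > 1) is a common implementation convention.
    for ai, bi, I4 in ((af, bf, I4f), (a_s, bs, I4s)):
        psi += ai / (2 * bi) * (np.exp(bi * max(I4 - 1, 0.0) ** 2) - 1)
    # Fibre-sheet shear term.
    psi += afs / (2 * bfs) * (np.exp(bfs * I8fs ** 2) - 1)
    return psi
```

At the undeformed reference (F equal to the identity, orthonormal fibre directions) all exponential terms reduce to constants and only the isotropic term a/(2b) remains, so stretching the mesh strictly increases the stored energy.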
Given that the myocardium is incompressible, the Cauchy stress tensor can be determined from (1) as

\boldsymbol{\sigma} = 2\mathbf{F}\frac{\partial \Psi}{\partial \mathbf{C}}\mathbf{F}^T - p\mathbf{I}, \quad (3)

in which p is a Lagrange multiplier enforcing the incompressibility constraint and \mathbf{I} is the identity matrix. The eight parameters in ϑ have the following interpretations: a and b describe the matrix response; a_f and b_f describe the contribution of the fibres along the myocytes; a_s and b_s represent the contribution of the fibres along the sheet direction; a_{fs} and b_{fs} account for the shear effects in the fibre-sheet plane. Interested readers are referred to Holzapfel and Ogden 32 for a detailed explanation of the HO model. The quasi-static cardio-mechanic model in diastole is given by 41

\nabla \cdot \boldsymbol{\sigma} + \mathbf{b} = \mathbf{0} \ \text{in} \ \Omega, \quad \boldsymbol{\sigma}\mathbf{n} = -p_{\mathrm{endo}}\mathbf{n} \ \text{on} \ \Gamma_N, \quad \mathbf{u} = \mathbf{0} \ \text{on} \ \Gamma_D, \quad (4)

where \mathbf{b} is the body force density per unit volume, p_{\mathrm{endo}} represents the LV cavity pressure, \mathbf{u} is the vector of boundary displacements, \Omega is the computational domain defined by the LV mesh with boundary \partial\Omega, \mathbf{n} is the outward normal of \partial\Omega, and \Gamma_N and \Gamma_D are the Neumann and Dirichlet boundaries, respectively, with \Gamma_N \cup \Gamma_D = \partial\Omega and \Gamma_N \cap \Gamma_D = \emptyset. Solving (4) gives the LV dynamics in response to the applied LV blood pressure, from which relevant quantities of interest can be readily extracted, such as the end-diastolic LV volume and circumferential strains. Figure 1 illustrates the essential components of the cardio-mechanic model in diastole. The boundary value problem (4) does not admit a closed-form solution, so numerical solutions are necessary. The standard approach to solving biomechanical models is the finite element method (FEM). 5 In this work we use the nonlinear finite element software Abaqus FEA (Simulia, Providence, RI, USA) to solve the discretised cardio-mechanics model.
For a given parameter vector ϑ, the LV mesh (that is, the discretised LV geometry at early diastole) and the desired end-diastolic LV pressure, the Abaqus forward simulator determines the deformed mesh at the end of diastole by balancing the internal and external mechanical energies under the prescribed boundary conditions, see Figure 1. The deformed mesh is then used to extract segmental circumferential strains at the corresponding segments defined in four short-axis CMR cine images, following the clinical convention specified by the American Heart Association (AHA), 42 as well as to compute the cavity volume at the end of diastole. Figure 2 illustrates an example of the reconstructed LV geometry derived from in vivo cine images with manually defined wall boundaries.
FIGURE 1 Inference pipeline in cardiac mechanic models

Details of the LV geometry reconstruction, including the definition of circumferential segmental strains, can be found in Gao et al. 4
The parameter vector ϑ describes the myocardial material properties and hence determines its mechanical responses through the HO model. Table 1 presents two sets of reference values of ϑ used in the literature. However, a subject-specific value of ϑ can neither be directly observed nor measured in vivo. Thus, we are interested in inferring it using routinely available non-invasive in vivo data obtained from CMR images. Figure 2, left, presents an instance of such a scan together with manually annotated LV boundaries for the epicardium (red) and endocardium (blue). To the best of our knowledge, Gao et al. 28 were the first to address the issue of parameter inference in the HO model from in vivo clinical measurements (CMR scans). We discuss their algorithm, together with its updated version, 43 in Section 2.1. Still, parameter estimation for the HO model using in vivo data remains a challenging problem due to the potential non-identifiability of the constitutive parameters and their strong correlation, 28,38 as well as the high cost of solving the cardio-mechanics model numerically.

| Existing estimation methods
Gao et al. 28 proposed a three-step algorithm based on matching the responses of the forward simulator associated with the HO model with the in vivo measurements obtained from CMR scans (circumferential segmental strains and the end-diastolic cavity volume). We refer to their approach as the HGO algorithm. The key idea behind the HGO algorithm is not to estimate ϑ directly but to rescale a reference value ϑ_0 taken from an earlier study. 33 Moreover, each step rescales a different subset of HO parameters.
Algorithm 1 presents the steps of the HGO algorithm, where the scaling scalars C_a, C_b and C_3 are defined in the following way. Scalars C_a and C_b rescale the parameters in two parameter groups, a_{\mathrm{group}} = \{a, a_f, a_s, a_{fs}\} and b_{\mathrm{group}} = \{b, b_f, b_s, b_{fs}\}, while scalar C_3 rescales a and a_{fs} via a = C_3 a^{(2)} and a_{fs} = C_3 a^{(2)}_{fs}, where a^{(2)} and a^{(2)}_{fs} are the values of a and a_{fs}, respectively, optimised in the second step of the algorithm. The two objective functions matching the simulated values, which depend on the constitutive parameter vector ϑ and the geometry \mathcal{H}, to the measurements are defined as

f_{O1}(\vartheta, \mathcal{H}) = (V(\vartheta, \mathcal{H}) - V^*)^2, \quad (5)

f_{O2}(\vartheta, \mathcal{H}) = (V(\vartheta, \mathcal{H}) - V^*)^2 + \sum_{i=1}^{K} (\varepsilon_i(\vartheta, \mathcal{H}) - \varepsilon^*_i)^2, \quad (6)

where V^* and \varepsilon^*_i, i = 1, \ldots, K = 24, are the measurements of the LV cavity volume and the 24 circumferential strains, respectively (the 24 circumferential strains are measured in 4 short-axis slices from the base to the mid-ventricle, with 6 regions per slice), while V(\vartheta, \mathcal{H}) and \varepsilon_i(\vartheta, \mathcal{H}) are the corresponding values obtained from the forward simulator using the parameter vector ϑ and the LV geometry \mathcal{H}.
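A minimal sketch of the strain-and-volume mismatch objective, assuming the sum-of-squares form above (in practice each evaluation of the simulated volume and strains requires a full FEM solve in Abaqus):

```python
import numpy as np

def f_o2(v_sim, strains_sim, v_meas, strains_meas):
    """Sum-of-squares mismatch between forward-simulator outputs and the
    CMR measurements: one cavity-volume term plus K = 24 strain terms."""
    strain_err = np.asarray(strains_sim) - np.asarray(strains_meas)
    return (v_sim - v_meas) ** 2 + float(np.sum(strain_err ** 2))
```

The objective is zero exactly when the simulator reproduces the measured volume and all 24 segmental strains, and each additional mismatch contributes additively.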

| Updated algorithm
Gao et al. 43 have developed an updated version of the original HGO algorithm, 28 which we present in Algorithm 2. The main difference between the two versions is that the new one includes an additional step based on refining C_a and C_b with f_{O2,Klotz}, which we discuss in Section 2.2. In short, f_{O2,Klotz} augments f_{O2} with an additional term which, for given parameters ϑ and LV geometry \mathcal{H}, quantifies the discrepancy between the LV volume predicted at a pressure of 30 mmHg and, since no in vivo data are observed at such a high pressure, the prediction from the so-called Klotz-curve. 35 This aims at inferring parameters which not only agree with the in vivo data observed at low pressures, but also give realistic predictions in high-pressure scenarios. Another difference between Algorithms 1 and 2 can be observed in the last step: the updated version matches a and b, while the original algorithm optimises a and a_{fs}. There are two reasons for this modification. First, the sensitivity of a_{fs} is much lower compared to a and b, 28 and a_{fs} is more related to the shearing response, which is not directly linked to the measured circumferential strains. 32 Second, a and b are two highly sensitive parameters directly related to the extracellular matrix response. Thus, in the final step a and b are updated to match the measured data.

| Parameter reduction
The eight parameters in ϑ are highly correlated and not identifiable from the limited data available from CMR scans. 28 Therefore, we follow Davies et al. 38 and use a parametrisation reduced to a 4-dimensional manifold,

\vartheta = \vartheta(\theta), \quad (7)

where \theta = (\theta_1, \theta_2, \theta_3, \theta_4)^T is the vector of scalings of the reference parameters \vartheta_0 = (a_0, b_0, a_{f0}, b_{f0}, a_{s0}, b_{s0}, a_{fs0}, b_{fs0})^T. Wang et al. 33 proposed a popular reference parameter vector, used in several later studies. 28,38 However, those values were based on ex vivo data, making them potentially of limited relevance to our problem (inference based on in vivo data). Hence, we use different values for \vartheta_0, based on in vivo population-wide averages. 4 The only exception is the implementation of the HGO algorithm, 28 which originally uses the earlier reference values. 33 We have implemented this algorithm using both reference parameter vectors. 33,4

ALGORITHM 1 Original HGO algorithm of Gao et al. 28
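To illustrate the idea of such a scaled parametrisation, the sketch below maps a 4-dimensional scaling vector onto the eight HO parameters. Both the reference values and the grouping of parameters used here are hypothetical placeholders; the paper's actual reference vector and manifold may differ:

```python
import numpy as np

# Reference HO parameters, ordered (a, b, a_f, b_f, a_s, b_s, a_fs, b_fs).
# The numbers are PLACEHOLDERS, not the population-wide averages of the paper.
THETA0 = np.array([0.22, 1.62, 2.43, 1.83, 0.56, 0.77, 0.39, 1.70])

def reduce_params(theta):
    """Map the 4-dimensional scaling vector theta onto the 8 HO parameters.

    The grouping used here (theta_1, theta_2 scale the matrix and shear
    parameters; theta_3, theta_4 the fibre and sheet ones) is a HYPOTHETICAL
    illustration of a scaling manifold, not the paper's actual grouping.
    """
    t1, t2, t3, t4 = theta
    scales = np.array([t1, t2, t3, t4, t3, t4, t1, t2])
    return scales * THETA0
```

The point of the construction is that a unit scaling vector recovers the reference parameters exactly, while the optimiser only has to search a 4-dimensional space.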

| Klotz-curve
In vivo data are collected at physiologically normal (low) ventricular pressures. These pressures, however, are not recorded in healthy volunteers due to the invasive nature of the procedure. We assume a population-based end-diastolic (ED) pressure of 8 mmHg. 28 Working with "low-pressure data" implies that we can only estimate those parameters of the HO model that describe myocardial properties at such low pressures. However, parameters leading to similar low-pressure behaviour (in terms of the implied stress-strain relationships) may result in very different behaviour at high pressures, such as 30 mmHg. These high pressures are of interest as they may reveal LV stiffness with impaired relaxation, which characterises diastolic heart failure. 22 Hence, we want to avoid parameters that, while inferred from in vivo pressures, exhibit physiologically unrealistic behaviour at high pressures.
However, high-pressure in vivo volume measurements are not available; therefore we propose to predict them using the empirical law found by Klotz et al., 35 which we will refer to as the Klotz-curve. This empirical relationship is valid for a wide range of hearts, including healthy and diseased (with congestive heart failure or left ventricular assist device support) human hearts as well as animal (canine and rat) hearts. The normalised ED volume \tilde{V}^* is defined as 35

\tilde{V}^* = \frac{V^* - V_0}{V_{30} - V_0}, \quad (8)

where V^* is the measured unnormalised volume at pressure P^*, V_0 is the zero-pressure volume called the load-free volume, and V_{30} is the volume at 30 mmHg. An empirical law based on population-wide ex vivo data is then established 35 relating the normalised LV volume to the corresponding measurement pressure P^*, with the specific form

P^* = A \, (\tilde{V}^*)^B, \quad (9)

where A and B are parameters. In our case V^* is the in vivo measured volume at pressure P^* = 8 mmHg. Combining (9) and (8) gives

V^{Kl}_{30} = V_0 + (V^* - V_0) \left(\frac{A}{P^*}\right)^{1/B}, \quad (10)

where V^{Kl}_{30} denotes V_{30} predicted using the Klotz-curve. Note that the ED pressure-volume relationship in (9), as well as the derived formula for the high-pressure ED volume (10), are based on normalised volumes: volume normalisation was crucial to obtain a widely-holding pressure-volume relationship. 35 Note that formula (10) depends on the load-free volume V_0, which can only be measured ex vivo. With only in vivo measurements available, we need to make an assumption regarding how to estimate it. We use the early-diastolic volume as a proxy for the load-free volume, since the LV pressure is lowest at early diastole. 28,44 Thus, we predict V^{Kl}_{30} using the approximated \hat{V}_0 and the in vivo measured volume:

\hat{V}^{Kl}_{30} = \hat{V}_0 + (V^*_8 - \hat{V}_0) \left(\frac{\hat{A}}{8}\right)^{1/\hat{B}},

where \hat{A} = 27.78 and \hat{B} = 2.76 are the estimates from Klotz et al. 35
We use the Klotz-curve predicted high-pressure volume \hat{V}^{Kl}_{30} to add a high-pressure penalty term to the objective function (6) as follows:

f_{O2,Klotz}(\vartheta, \mathcal{H}) = (V_8(\vartheta, \mathcal{H}) - V^*_8)^2 + \sum_{i=1}^{K} (\varepsilon_i(\vartheta, \mathcal{H}) - \varepsilon^*_i)^2 + (V_{30}(\vartheta, \mathcal{H}) - \hat{V}^{Kl}_{30})^2,

where V_8 and V_{30} are the LV cavity volumes from the forward simulator at 8 mmHg and 30 mmHg, respectively; \varepsilon_i, i = 1, \ldots, K = 24, are the regional circumferential strains from the forward simulator at 8 mmHg; and V^*_8 is the in vivo measured LV cavity volume (at the normal pressure of 8 mmHg).
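The Klotz-based prediction amounts to inverting the empirical power law at the measured pressure; a minimal sketch (with the published estimates Â = 27.78 and B̂ = 2.76 as defaults):

```python
def klotz_v30(v8, v0, A=27.78, B=2.76, p=8.0):
    """Predict the ED cavity volume at 30 mmHg from the Klotz curve.

    Inverting P = A * Vtilde**B at the measured pressure p gives the
    normalised volume Vtilde = (p / A)**(1 / B), and since
    Vtilde = (v8 - v0) / (V30 - v0), solving for V30 yields the prediction.
    """
    vtilde = (p / A) ** (1.0 / B)
    return v0 + (v8 - v0) / vtilde
```

A round-trip check is a useful sanity test: fixing a true V30, generating the 8 mmHg volume from the Klotz law, and predicting back recovers V30.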

| BAYESIAN OPTIMISATION
Suppose we are interested in finding the global optimum of a smooth function f(x): \mathcal{X} \subseteq \mathbb{R}^d \to \mathbb{R}. We will refer to f as the objective or target function and, without loss of generality, assume it is a loss function to be minimised. Further, suppose computing f at any given x is expensive and/or time-consuming, so that we want to find the optimiser x^* = \arg\min_{x} f(x) in as few steps as possible. BO proceeds in an iterative fashion, at each iteration t selecting an input x_t and querying the black-box function y_t = f(x_t). Input selection is done by optimising the so-called acquisition function (AF), which is based on an approximation of f by a surrogate and models the desired trade-off between exploration (searching for new promising regions) and exploitation (increasing the resolution around beneficial regions already discovered).
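The generic BO loop can be sketched as follows, with `propose_next` standing in (as a hypothetical interface) for surrogate fitting plus inner optimisation of the acquisition function:

```python
import numpy as np

def bayes_opt(objective, propose_next, x_init, n_iter):
    """Generic BO loop: fit a surrogate to the data gathered so far,
    maximise the acquisition function to pick the next query point,
    evaluate the expensive objective there, and repeat.

    `propose_next(X, y)` is a hypothetical stand-in for surrogate fitting
    plus inner optimisation of the acquisition function.
    """
    X = [np.asarray(x) for x in x_init]
    y = [objective(x) for x in X]
    for _ in range(n_iter):
        x_next = propose_next(np.array(X), np.array(y))
        X.append(np.asarray(x_next))
        y.append(objective(x_next))
    best = int(np.argmin(y))  # minimisation convention, as in the text
    return X[best], y[best]
```

Usage only requires plugging in the expensive objective and a proposal routine; the loop itself never needs gradients of f.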

| Gaussian process surrogates
Typically, the AF is based on a surrogate model of the objective function f, with Gaussian processes (GPs) being a popular choice for such a surrogate. As opposed to alternative nonlinear methods, such as splines and neural networks, GPs allow for exact interpolation rather than regression, which is what is conceptually required for deterministic objective functions, such as ours based on a deterministic cardio-mechanic model. However, for the technical reasons discussed below, restating the interpolation problem as a regression problem is more practical, so below we focus on the GP regression setting.
In BO, GP regression is trained on a sequentially expanding set of inputs X_t = \{x_1, \ldots, x_t\}. We refer to Rasmussen and Williams 8 for an extensive treatment of GPs. GP regression is a nonparametric model in which a GP prior is placed over the unknown function of interest to represent our initial beliefs about its smoothness. The target values at the points x_j, j = 1, \ldots, t, observed up to iteration t are given by

y_j = f(x_j) + \varepsilon_j, \quad \varepsilon_j \overset{iid}{\sim} N(0, \sigma^2), \quad (12)

where N(\mu, \sigma^2) denotes the Gaussian distribution with mean \mu and variance \sigma^2 (and iid stands for independently and identically distributed). Specification (12) is a general one, allowing for noisy observations. Our forward simulator of interest is deterministic, so mathematically there is no noise (\sigma^2 = 0). However, including a non-trivial noise term \varepsilon_j is still convenient for at least two reasons. First, in practice there may be numerical noise due to the finite precision of the numerical solution of the forward problem. Second, when working with GPs it is common practice to allow for a small jitter (or nugget) term (e.g. 10^{-9}), which is added to the diagonal of the training covariance matrix for numerical stability of the matrix inversion. Effectively, jitter is imposed noise, which slightly dilutes the informativeness of our deterministic data.
In (12) the latent values f(x_j) follow a GP prior,

f(x) \sim GP(\mu(x), k(x, x')),

where \mu(x) and k(x, x') are the prior mean and the covariance function (kernel) of the process f, respectively. Hence, it is assumed that the responses y_j are conditionally independent given the latent values f(x_j). Below we adopt the standard assumption that \mu(x) = 0, so that the latent process f(x) is fully specified by its kernel function. The kernel can be specified in many different ways, including numerous standard functional forms. 8 We consider the Matérn 3/2 kernel with automatic relevance determination (ARD), given as

k(x, x') = \sigma_k^2 \left(1 + \sqrt{3}\, r\right) \exp\left(-\sqrt{3}\, r\right), \quad r = \sqrt{\sum_{i=1}^{d} \frac{(x_i - x'_i)^2}{l_i^2}},

where l_i, i = 1, \ldots, d, is the length scale parameter for the ith input variable, and ARD refers to allowing a different length scale for each dimension. Thus our GP regression model in (12) is parametrised by a vector of hyperparameters \phi = (\sigma^2, \sigma_k^2, l_1, \ldots, l_d)^T, so that we consider the observation noise variance \sigma^2 a hyperparameter. 8 The Matérn 3/2 kernel produces continuous trajectories with continuous derivatives while at the same time providing a considerable degree of roughness (it is once differentiable), allowing us to model functions with cliffs and plateaus. Although this choice of kernel rules out standard second-order optimisation methods, like quasi-Newton, we note that second-order methods are of limited use in the presence of extensive multimodality (which is a typical problem when optimising AFs, see Section 3.2.1). Moreover, based on our experimentation, this low-order Matérn class leads to numerically more stable covariance matrix inversion than a higher-order Matérn class. For the collected inputs X_t we obtain the GP prior over the function values f_t = (f(x_1), \ldots, f(x_t))^T,

f_t \mid X_t \sim N(0, K_t), \quad y_t \mid f_t \sim N(f_t, \sigma^2 I),

where K_t is the t \times t kernel matrix with entries (K_t)_{ij} = k(x_i, x_j) and I is the identity matrix, and marginalising over the latent variables f_t gives the formula for the marginal likelihood

y_t \mid X_t \sim N(0, K_t + \sigma^2 I).

Under the adopted Gaussian observation model (12) the conditional posterior distribution of the latent function value at a test input x also becomes Gaussian, N(\mu(x), k(x)), with

\mu(x) = \mathbf{k}_t(x)^T (K_t + \sigma^2 I)^{-1} y_t, \quad k(x) = k(x, x) - \mathbf{k}_t(x)^T (K_t + \sigma^2 I)^{-1} \mathbf{k}_t(x),

where \mathbf{k}_t(x) = (k(x, x_1), \ldots, k(x, x_t))^T.
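For illustration, the posterior mean and variance under a zero-mean GP with a Matérn 3/2 ARD kernel and a jittered diagonal can be computed as below (a self-contained sketch, not the implementation used in the paper):

```python
import numpy as np

def matern32_ard(X1, X2, sigma_k2, lengthscales):
    """Matern 3/2 kernel with one length scale per input dimension (ARD)."""
    d = (X1[:, None, :] - X2[None, :, :]) / lengthscales
    r = np.sqrt(np.sum(d ** 2, axis=-1))
    return sigma_k2 * (1 + np.sqrt(3) * r) * np.exp(-np.sqrt(3) * r)

def gp_posterior(X, y, Xstar, sigma2, sigma_k2, lengthscales, jitter=1e-9):
    """Posterior mean and variance of a zero-mean GP at test points Xstar."""
    K = matern32_ard(X, X, sigma_k2, lengthscales)
    K += (sigma2 + jitter) * np.eye(len(X))  # noise variance plus jitter
    Ks = matern32_ard(Xstar, X, sigma_k2, lengthscales)
    Kss = matern32_ard(Xstar, Xstar, sigma_k2, lengthscales)
    # Cholesky-based solves for numerical stability of the inversion.
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mean = Ks @ alpha
    v = np.linalg.solve(L, Ks.T)
    var = np.diag(Kss) - np.sum(v ** 2, axis=0)
    return mean, var
```

With a near-zero noise variance the posterior mean interpolates the training targets and the posterior variance collapses there, which is the behaviour exploited by BO for deterministic simulators.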

| Acquisition function
The performance of BO algorithms depends on the particular choice of AF. Various AFs have been proposed in the literature, 6 but the Expected Improvement (EI) 10 criterion remains one of the most popular choices. EI has gained popularity because it is conceptually simple (as opposed to, e.g., information-theoretic AFs), admits a closed-form expression for GP surrogates and performs well for a range of problems. 11 As its name suggests, EI quantifies the expected improvement over the current incumbent f^*, that is the best (lowest) value of f found so far, weighted by the probability of that improvement occurring. EI is defined as

EI(x) = \int \max(f^* - y, 0) \, p(y \mid x, \mathcal{D}) \, dy,

where \mathcal{D} denotes the set of inputs and outputs recorded so far and p(y \mid x, \mathcal{D}) is the predictive posterior output distribution given the chosen evaluation point x and the previously observed \mathcal{D}. With a GP surrogate, GP(\mu(x), k(x)), EI can be expressed 10,6 as

EI(x) = (f^* - \mu(x)) \, \Phi(z(x)) + \sqrt{k(x)} \, \phi(z(x)), \quad z(x) = \frac{f^* - \mu(x)}{\sqrt{k(x)}},

where \Phi and \phi are the CDF and PDF of the standard normal distribution, respectively. The EI is made up of two terms. The first term increases as the predictive mean \mu(x) decreases; the second term increases with the predictive uncertainty k(x). This shows how EI automatically balances exploitation and exploration.
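The closed-form EI for a Gaussian predictive distribution is a few lines of code (a sketch for the minimisation convention used here):

```python
from math import erf, exp, pi, sqrt

def expected_improvement(mu, var, f_best):
    """Closed-form EI for a Gaussian predictive N(mu, var) when minimising."""
    sd = sqrt(max(var, 1e-12))  # guard against a degenerate variance
    z = (f_best - mu) / sd
    Phi = 0.5 * (1 + erf(z / sqrt(2)))       # standard normal CDF
    phi = exp(-0.5 * z ** 2) / sqrt(2 * pi)  # standard normal PDF
    return (f_best - mu) * Phi + sd * phi
```

At a point whose predicted mean equals the incumbent, EI reduces to the uncertainty term alone, while points predicted to be worse than the incumbent retain a small but nonnegative EI through their uncertainty.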

| Inner optimisation
BO selects the next query point by maximising the AF, which is often referred to as inner optimisation, as it sits within the outer optimisation (optimising the unknown objective function f) of the BO algorithm. Even though inner optimisation does not require any computationally expensive forward simulations, it can be a challenging task. 7 This is often due to the observed multimodality of EI, which is typically peaked around several points and flat between them. A popular approach to overcome this problem is the multi-starts method, 45 which performs standard numerical optimisation, using for example quasi-Newton methods such as BFGS, multiple times from different initial points. However, a pilot study we conducted revealed that the multi-starts method performs poorly in our application, irrespective of the particular kernel choice (we tested several standard kernels, such as squared exponential, Matérn 5/2, Matérn 3/2 and neural network), often indicating suboptimal points and thus leading to unstable BO runs. That was the case even with an atypically large number of restarts (set to 1000 and drawn from a Latin hypercube design).
For these reasons we use a different approach based on a global optimisation algorithm called OQNLP 46 (or Global Search in its implementation in MATLAB's Global Optimisation toolbox, which we use), which led to very stable results. In short, OQNLP is a heuristic algorithm also based on multiple starting points. Those starting points, however, are not randomly (or quasi-randomly) selected, but rather calculated by the scatter search algorithm. 47 They are then scored based on the corresponding value of the objective function and, if present, constraint satisfaction. Next, gradient-based nonlinear programming (NLP) solvers are run from the starting point with the highest score. Thereafter, the remaining points are analysed, and those that are unlikely to improve the best local minimum found so far are rejected (in contrast to the standard multi-start approach, in which solvers are run from all the initial points). The filtering analysis takes into account the corresponding value of the objective function, constraint violations and basins of attraction. The latter means that points in the same basins of attraction as previously analysed points are excluded from further NLP optimisation. This feature, that is the analysis of basins of attraction, contributes to the efficiency of the OQNLP algorithm, as it allows for discarding points that would converge to already-found local optima.
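For contrast with OQNLP, a bare-bones multi-starts baseline of the kind discussed above can be sketched as follows (a toy finite-difference ascent rather than a quasi-Newton solver; purely illustrative):

```python
import numpy as np

def multistart_maximise(acq, bounds, n_starts=20, n_steps=200, lr=0.05, seed=0):
    """Multi-starts baseline for the inner optimisation: run a simple
    finite-difference gradient ascent from several random starts and keep
    the best acquisition value found. A toy stand-in for the
    OQNLP/GlobalSearch procedure used in the paper."""
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds, dtype=float).T
    d, eps = len(lo), 1e-6
    best_x, best_val = None, -np.inf
    for _ in range(n_starts):
        x = rng.uniform(lo, hi)
        for _ in range(n_steps):
            grad = np.array([(acq(x + eps * e) - acq(x - eps * e)) / (2 * eps)
                             for e in np.eye(d)])
            x = np.clip(x + lr * grad, lo, hi)  # stay within the box bounds
        val = acq(x)
        if val > best_val:
            best_x, best_val = x, val
    return best_x, best_val
```

On a strongly multimodal acquisition surface such a baseline only succeeds when one of the random starts lands in the right basin, which is exactly the weakness that motivates the basin-of-attraction filtering in OQNLP.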

| Unknown constraints
It may happen during BO iterations that the proposed point, in spite of maximising the AF based on the current surrogate, will violate the biomechanical assumptions behind the HO law (we refer the reader to Appendix F for an analysis of simulator crashes). Such a situation manifests itself as a crash of the associated forward simulator, which is inefficient as some computations are carried out in vain. The problem of a simulator failing to terminate or crashing was addressed in the literature 36,37 by weighting EI by the probability of the constraint being satisfied, P(x ∈ C), where P(x ∈ C) is the probability of x being a valid point not leading to a crash of the forward simulator. To model the probability of the unknown constraint being satisfied we use a GP classifier with a probit likelihood 36 with approximate inference carried out using expectation propagation. 48 Similarly as for the regression model, we use the Matérn 3/2 kernel. A probit likelihood is a conceptually sound approach; however, it means that we lose the analytic tractability of the standard EI function, as for non-Gaussian likelihoods approximate inference methods (such as the Laplace approximation, 49 variational inference 50 or expectation propagation 48 ) are necessary. 8 A simpler, though heuristic, classifier is provided by so-called label regression 51 (LR). LR is an ordinary linear regression model fitted to explain a binary outcome variable; its advantage is that P(x ∈ C) can be computed explicitly, yet it is based on a suboptimal probability model (a Gaussian, rather than a logit or probit model). LR has been successfully employed to account for hidden constraints in the context of BO applied to pulmonary circulation. 52 However, in our preliminary study we found that in our application LR led to unstable results, generally performing poorly.
We note that the probability of a crash occurring during a BO run depends on the LV geometry and the material parameters. For the investigated LV geometries crashes are not very frequent (see Sections 4.3 and 4.4); however, the reported numbers of crashes, if nonzero, are already based on training a classifier. In fact, our pilot study revealed that not allowing for any learning from past unsuccessful simulations can lead to repeatedly querying very similar "crash points", resulting in much higher numbers of crashes.
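As an illustrative sketch (not our implementation, which uses a probit likelihood with expectation propagation), the constraint-weighted acquisition can be assembled as follows. scikit-learn's GP classifier uses a logistic likelihood with a Laplace approximation, standing in for the probit/EP classifier, and the toy one-dimensional "simulator" with its crash region is invented for illustration.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import (GaussianProcessClassifier,
                                      GaussianProcessRegressor)
from sklearn.gaussian_process.kernels import Matern

# Hypothetical toy problem: the "simulator" crashes for x > 0.7, which plays
# the role of the unknown constraint region.
def simulator(x):
    return None if x > 0.7 else (x - 0.3) ** 2

X = np.linspace(0.02, 0.98, 15).reshape(-1, 1)
ok = np.array([simulator(x) is not None for x in X.ravel()])
y = np.array([simulator(x) if o else np.nan for x, o in zip(X.ravel(), ok)])

# Regression surrogate fitted on successful runs; crash classifier on all runs.
gp = GaussianProcessRegressor(kernel=Matern(nu=1.5),
                              normalize_y=True).fit(X[ok], y[ok])
clf = GaussianProcessClassifier(kernel=Matern(nu=1.5)).fit(X, ok.astype(int))

def ei_con(x_cand, y_best):
    """Closed-form EI weighted by the probability of constraint satisfaction."""
    mu, sd = gp.predict(x_cand, return_std=True)
    sd = np.maximum(sd, 1e-12)
    z = (y_best - mu) / sd
    ei = sd * (z * norm.cdf(z) + norm.pdf(z))
    return ei * clf.predict_proba(x_cand)[:, 1]

grid = np.linspace(0.0, 1.0, 200).reshape(-1, 1)
scores = ei_con(grid, np.nanmin(y))
x_next = float(grid[np.argmax(scores), 0])
```

Downweighting EI by the classifier's probability of a valid run steers the next query away from regions where past points crashed, which is how the classifier reduces the number of wasted simulations.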

| Partial error surrogates
Notice that both target functions, f_O2 and f_O2,Klotz, are sums of individual squared error terms. Thus, instead of developing a surrogate for the target as a whole, we propose to construct a surrogate for each error term separately. In the context of emulation 38 a similar approach led to a substantial improvement in accuracy by emulating the vector of outputs from the cardiac mechanics model. That approach is efficient in the context of emulation, where the developed emulator is meant to be used for a wide range of different subjects. In our context, however, using surrogate functions for the outputs would require us to approximate EI, as implied by the predictive means from the individual output surrogates. Considering surrogates for the individual error terms instead allows us to obtain a closed-form expression for EI. Note that considering partial error surrogates instead of output surrogates does not cause an efficiency loss, since in our case optimisation is carried out for subject-specific measurements known before running the computations (unlike for emulators).
We refer to the proposed approach as partial error surrogates. Let f = Σ_{i=1}^{K*} f^(i), where f is either f_O2 in (6), so that K* = K + 1 = 25, or f_O2,Klotz in (11), so that K* = K + 2 = 26, and each f^(i) refers to one of the squared error terms on the right-hand side of those formulae. We assume that the f^(i) are mutually independent and model each f^(i) using a GP regression parametrised by hyperparameters φ^(i). Allowing for correlated f^(i) would require computationally much more expensive multi-output GPs, 53,54,55 which would increase the computational overhead of each BO iteration, diminishing the posited "cheapness" of the surrogate compared to the forward simulator. Each partial GP regression has conditional posterior mean μ_t^(i) and variance k_t^(i), obtained by conditioning the latent variables for the ith output from the cardio-mechanic model on the observed values for that output. Then the conditional posterior mean μ_t of the full target f is just the sum of the conditional posterior means μ_t^(i) from the partial surrogates, and the conditional posterior variance k_t is the sum of the partial variances k_t^(i): μ_t = Σ_{i=1}^{K*} μ_t^(i) and k_t = Σ_{i=1}^{K*} k_t^(i).
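Under the independence assumption the construction reduces to fitting one GP per squared-error term and summing the posterior moments. The sketch below uses scikit-learn with a Matérn 3/2 kernel on an invented three-term toy objective; all names and the toy error terms are illustrative rather than part of our implementation.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

rng = np.random.default_rng(0)

# Invented toy objective: K* = 3 partial squared-error terms over a 2-D input.
def partial_errors(x):
    targets = np.array([[0.2, 0.2], [0.5, 0.8], [0.9, 0.4]])
    return np.array([np.sum((x - t) ** 2) for t in targets])

X = rng.uniform(0.0, 1.0, size=(30, 2))
E = np.array([partial_errors(x) for x in X])      # shape (n, K*)

# One independent Matern 3/2 GP per partial error, each with its own
# hyperparameters fitted by maximum likelihood.
gps = [GaussianProcessRegressor(kernel=Matern(nu=1.5), normalize_y=True,
                                alpha=1e-6).fit(X, E[:, k])
       for k in range(E.shape[1])]

def full_target_posterior(x_cand):
    """Posterior of f = sum_i f^(i): sum the partial means and variances."""
    mus, vs = [], []
    for g in gps:
        m, s = g.predict(x_cand, return_std=True)
        mus.append(m)
        vs.append(s ** 2)
    return np.sum(mus, axis=0), np.sum(vs, axis=0)

mu, var = full_target_posterior(np.array([[0.5, 0.5]]))
```

Because the summed posterior is again Gaussian, the usual closed-form EI applies to the full target unchanged, which is the point of working with partial errors rather than raw outputs.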

| Standardisation
Standardising inputs and outputs is common practice when working with GPs, aimed at ensuring numerical stability of the computations. In our case input standardisation is not an issue as all the partial surrogates use the same inputs. Predictions for the outputs, however, enter the formulae for the conditional posterior distribution (14), so we need to "de-standardise" the predictions to obtain correct partial moments. Let m_0^(i) and σ_0^(i) be the mean and standard deviation, respectively, used to standardise the ith partial squared errors (e.g. calculated on the initial design). If μ̃_t^(i) and k̃_t^(i) denote the posterior moments for the standardised ith partial squared error, then the corresponding moments for the original ith partial squared error are μ_t^(i) = σ_0^(i) μ̃_t^(i) + m_0^(i) and k_t^(i) = (σ_0^(i))^2 k̃_t^(i). The moments (14) used in the acquisition function are then given by the sums of these de-standardised partial moments.
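In code, the de-standardisation is a one-line affine map of the moments; the constants below are invented for illustration.

```python
import numpy as np

def destandardise_moments(mu_tilde, k_tilde, m0, s0):
    """Map the GP posterior moments of the standardised ith partial error back
    to the original scale: mu = s0 * mu_tilde + m0, k = s0**2 * k_tilde."""
    return s0 * mu_tilde + m0, (s0 ** 2) * k_tilde

# Invented standardisation constants, e.g. as computed on the initial design
m0, s0 = 2.0, 0.5
mu_std, k_std = np.array([0.4]), np.array([1.2])
mu, k = destandardise_moments(mu_std, k_std, m0, s0)
```

The mean is shifted and scaled while the variance is only scaled (by the square of σ_0^(i)), since adding a constant does not change a variance.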

| Emulator for high-pressure volume
Computing f_O2,Klotz in (11) requires running the forward simulator twice, at 8 mmHg and at 30 mmHg, doubling the computing time. Each of these runs may be unsuccessful, that is it may lead to a crash. However, the parameters leading to crashes may differ between the 8 and 30 mmHg cases. Hence, we would need two classifiers, one for the 8 mmHg case and one for the 30 mmHg case, and we could only update the GP surrogate for f_O2,Klotz when both cases terminate successfully.
To overcome these problems and to reduce the computational costs we propose to obtain the value of V_30 using an emulator rather than by running the forward simulator. Emulators of cardiac mechanics models often include constitutive parameters as inputs. 38 However, the outputs from the forward simulator depend not only on the constitutive parameters but also on the LV geometry. Since we want the V_30 emulator to approximate V_30 for a broad range of LV geometries (as opposed to developing a geometry-specific emulator 38 ), we want to include some information about the LV geometry as input. However, we cannot use the LV mesh directly as it is a high-dimensional object (with thousands of coordinates), and such a dimension is prohibitively high for any emulator. Our preliminary experimentation revealed that accurately predicting only the LV volume at 30 mmHg, and not the strains, does not require much detail about a subject's specific LV shape and can be done based on a single "summary statistic" of the LV mesh, namely an approximation V̂_0 to the load-free LV cavity volume V_0 (see Section 2.2), without a substantial loss in the accuracy of the emulator. We note that if we were also to emulate circumferential strains at 30 mmHg, accounting for more subtle features of the LV geometry would be necessary. To sum up, we train the V_30 emulator on the scaling parameters θ and the load-free volume V̂_0, so that it approximates the true function V_30 as a function of (θ, V̂_0).
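A minimal sketch of such an emulator could look as follows, assuming a hypothetical smooth stand-in for the simulator's V_30 response and invented parameter and volume ranges; our actual emulator is trained on forward-simulator output, not on this toy function.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

rng = np.random.default_rng(2)

# Hypothetical smooth stand-in for the simulator's V30 response as a function
# of the 4 scaling parameters theta and the approximate load-free volume V0_hat.
def fake_v30(theta, v0_hat):
    s = theta @ np.array([0.4, 0.3, 0.2, 0.1])
    return v0_hat * (1.0 + 0.5 / (1.0 + s))

n = 200
Theta = rng.uniform(0.1, 5.0, size=(n, 4))        # invented parameter ranges
V0 = rng.uniform(80.0, 160.0, size=(n, 1))        # invented volume range (ml)
X = np.hstack([Theta, V0])
y = np.array([fake_v30(t, v) for t, v in zip(Theta, V0.ravel())])

# One anisotropic length scale per input (theta_1..theta_4, V0_hat).
emulator = GaussianProcessRegressor(
    kernel=Matern(length_scale=np.ones(5), nu=1.5),
    normalize_y=True, alpha=1e-6)
emulator.fit(X, y)

v30_pred = emulator.predict(np.array([[1.0, 1.0, 1.0, 1.0, 120.0]]))[0]
```

Training on (θ, V̂_0) pairs from many geometries is what lets a single emulator replace the 30 mmHg simulator run across subjects, rather than requiring a geometry-specific model.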

| Final algorithm
Listings 3 and 4 summarise the initialisation and the iterations, respectively, of the BO algorithm with the constraint-weighted EI, the Klotz-curve objective (11), partial error surrogates and the V_30 emulator.

| APPLICATIONS
This section presents three applications of BO to the problem of parameter inference in the HO model of the LV. We start by comparing BO without the Klotz-curve prior with the original HGO algorithm, 28 which we refer to as the basic study. We demonstrate that BO attains a similar or lower value of the objective function f_O2 in considerably fewer iterations (Section 4.3). Next, in the Klotz-curve study we compare the versions of both algorithms that include the Klotz curve (Section 4.4). We show that also in this case BO typically outperforms the multi-step algorithm, even though the updated HGO algorithm 43 works better than its original version. In both these studies we consider the same four healthy volunteers (HVs), which we label HV A, HV B, HV C and HV D. Finally, in Appendix C we investigate the theoretical accuracy of the proposed method by means of a synthetic data study based on HV A from the basic study.

| Set-up
In all our applications we compare two variants of BO, with a target surrogate and with partial error surrogates (13) (see Sections 3.1 and 3.2.3). We use 500 BO iterations, run after 40 iterations corresponding to an initial design obtained with Latin hypercube sampling, 56 following the common rule of thumb 10 of using 10d points to initialise optimisation for a d-dimensional problem. We use a recent value for ϑ_0 (defined below (7)) based on in vivo data 4 for both BO and the HGO algorithm, except for the original HGO algorithm in the basic study, for which we use the reference value from Wang et al., 33 originally used in the HGO algorithm.

| Evaluation
In our analysis of the results we focus on three aspects. First, we consider the value of the objective function, as this is what both algorithms directly target. In the basic study this is f_O2 given in (6); in the two later studies it is f_O2,Klotz given in (11). Visually, we analyse the convergence of the incumbent, that is the best value of the objective function recorded so far, which is standard in the BO literature. 6,37 The convergence is measured with respect to invocations of the forward simulator, which are the main computational burden and which we want to limit. The physical computing time is a less reliable measure of computing costs as it necessarily depends on the machine used, the number of tasks performed in parallel, etc., and does not generalise to other mechanistic models with different computational costs of the forward problem. Note that the HGO algorithm is based on optimisation solvers implemented as "black boxes" (such as the trust-region-reflective algorithm or the sequential quadratic programming algorithm in MATLAB), so tracking the incumbent over the individual iterations is not directly available. Moreover, each step in HGO has its own stopping criterion determining the number of iterations carried out in that step. Therefore, for HGO we just report the final value of the objective y_min^HGO and the corresponding number of iterations i_min^HGO (defined as the sum of iterations performed over all steps).
Quantitatively, we compare BO and HGO based on the pairs (y_min, i_min) and (y(i_min^HGO), i(y_min^HGO)). The former pair is concerned with the lowest value attained by BO over all 540 iterations, y_min, and the number of iterations completed to obtain it, i_min. These values are compared with their HGO counterparts, y_min^HGO and i_min^HGO. The latter pair relates BO iterations to the final result from the HGO algorithm: y(i_min^HGO) is the value of a BO run after the number of iterations used by the HGO algorithm, i_min^HGO, while i(y_min^HGO) is the number of iterations required to achieve at least as good a performance as the HGO algorithm. If after 540 iterations the value of the objective obtained with BO is still higher than the final value from the HGO algorithm, we set i(y_min^HGO) = ∅. Table 2 summarises the notation used in both empirical studies.
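The incumbent curve and the comparison summaries above are simple to compute from a trace of objective evaluations; a sketch with an invented toy trace (function names and values are illustrative only):

```python
import numpy as np

def incumbent_curve(y_evals):
    """Best objective value recorded so far, per forward-simulator call."""
    return np.minimum.accumulate(np.asarray(y_evals, dtype=float))

def compare_to_hgo(y_evals, y_hgo_min, i_hgo_min):
    """BO-vs-HGO summaries: (y_min, i_min), y(i_min^HGO) and i(y_min^HGO);
    the last is None when BO never matches the HGO result."""
    inc = incumbent_curve(y_evals)
    y_min = inc[-1]
    i_min = int(np.argmin(y_evals)) + 1          # 1-indexed iteration count
    y_at_hgo = inc[min(i_hgo_min, len(inc)) - 1]
    hits = np.nonzero(inc <= y_hgo_min)[0]
    i_at_hgo = int(hits[0]) + 1 if hits.size else None
    return y_min, i_min, y_at_hgo, i_at_hgo

# Invented toy trace of 10 objective evaluations
trace = [5.0, 4.2, 4.6, 3.1, 3.5, 2.8, 2.9, 2.75, 2.8, 2.76]
summary = compare_to_hgo(trace, y_hgo_min=3.0, i_hgo_min=5)
```

Returning None plays the role of setting i(y_min^HGO) = ∅ when BO never reaches the HGO value within the iteration budget.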
Second, we analyse the stress-stretch curves corresponding to the final parameter estimates. We consider two virtual uni-axial stretch experiments, one along the myofibre direction (myocytes) and the other along the sheet direction (transmurally), often adopted in the literature for characterising the nonlinear myocardial properties. 28,39,57 Stretch-stress curves present the relationship between stress and stretch along two axes in the myocardium, along the myocyte and along the sheet direction. They provide insight into the predicted stiffness of the myocardium, with higher values of stress for a given stretch value indicating a stiffer material. They are of interest to cardiac mechanics specialists and clinicians because the parameters of the HO model cannot always be uniquely identified using experimental data, 28 and further because of the nonlinear myocardium response. The latter makes the differences in the predicted parameters not fully informative about the implied output differences. For the reproducibility of the stretch-stress curves, we report the underlying parameters in Appendix A.

ALGORITHM 3 Initialisation of Bayesian optimisation with partial error surrogates
Third, we consider the deformations of the 3D LV geometry implied by the final parameter estimates. The underlying rationale is that uni-axial stretch-stress curves provide only a simplified test and do not capture the full picture of the deformation of the whole LV geometry. Following a cardiac mechanics benchmark paper, 58 we illustrate the deformation of a line in the epicardial surface (see Figure 3), with a specific focus on two regions: the inflexion point near the middle ventricle and the apical region. In this section we discuss deformations at the standard pressure of 8 mmHg, while Appendix D provides additional figures related to deformation at 30 mmHg. The latter, much higher, pressure may be useful for comparing estimates as it leads to exaggerated differences in the implied deformations because of the intrinsic nonlinearity of the HO model.

ALGORITHM 4 Iterations of Bayesian optimisation with partial error surrogates
Both BO and HGO provide point estimates. Given an increasing awareness in the cardiac modelling community of the importance of uncertainty quantification and sensitivity analysis for model parameter estimates, 59,60,61 we discuss in Appendix E how to quantify uncertainty of BO estimates in a straightforward way. We then illustrate one of the available methods, the residual bootstrap approach, in a computational experiment.

| Basic study
Gao et al. 28 have established a benchmark for parameter inference in the HO model. Their algorithm (see Listing 1) does not involve the Klotz-curve prior discussed in Section 2.2; hence, for comparability, we first consider a version of BO without the Klotz-curve prior, with the objective function given by f_O2 from (6).
We appreciate that the HGO algorithm runs with respect to different parameters than the proposed BO framework, and that the number of parameters used by the two algorithms differs (in particular, in HGO the number of parameters over which optimisation is carried out may vary between the algorithm steps). This may raise concerns about the fairness of the BO-HGO comparison. Therefore, in Appendix G we present a comparison of BO against an established iterative optimisation scheme in the same 4-dimensional parameter space and demonstrate the superiority of BO also in this context.

Figure 4 presents the convergence of the incumbent for the four considered HVs. We observe that BO converges very quickly, typically in around 200 iterations after the initial design, so that the graph stays flat over the remaining iterations. The HGO algorithm typically requires more iterations to terminate. Moreover, in all the cases BO converges to a lower value of the objective function than HGO. In the most extreme case of HV C, the lowest value found on the BO initial design is lower than the final value for HGO. The two variants of BO, with a target surrogate and a partial error surrogate (13) (see Sections 3.1 and 3.2.3), generally perform comparably, except for HV D, for which BO with a partial error surrogate converges faster and to a better point than BO with a target surrogate.

F I G U R E 5 Basic study: stretch-stress curves for four LV geometries (HV A, HV B, HV C, HV D). Left: responses to stretches along the myocyte direction f_0; right: responses to stretches along the sheet direction s_0 (see (2)). Bayesian optimisation with a target surrogate (target) and a partial error surrogate (partial), together with the old version of the HGO algorithm (HGO old)

F I G U R E 6 Basic study: deformations of the LV geometry at 8 mmHg implied by the final estimates from different algorithms. Left panels show a deformed reference line in the epicardial surface, see the red path in the LV geometry in Figure 3; middle and right panels show details of the inflexion point and the apical region, respectively

The main quantitative insights from Figure 4 are summarised in Table 3. We can see that the values of y_min obtained with BO are always lower than their HGO counterparts. However, in most cases (except HV A and BO with a target surrogate) they are obtained using more iterations than HGO. A fairer comparison is hence provided by y(i_min^HGO): even after fewer iterations, BO still provides lower values of the objective function. This better performance of BO can alternatively be assessed using i(y_min^HGO), which shows that for HV B, HV C and HV D we record at least as good a performance of BO as for HGO using 4-7 times fewer iterations.
As mentioned at the beginning of this section, we focus on analysing stretch-stress curves and deformations of the LV geometry, presented in Figures 5 and 6, rather than parameter estimates explicitly (the latter are nevertheless reported in Appendix A). We observe that the HGO algorithm predicts a stiffer myocardium for high stretches along the myocyte direction, while for low stretches the responses from BO and HGO are typically very comparable. The only exception is HV C, for which HGO predicts much higher stiffness than BO along the whole range of stretches. On the other hand, for HV C both algorithms predict almost identical responses for stretches along the sheet direction. For the remaining subjects, sheet-direction stretches are associated with higher stiffness predicted by BO compared with HGO. The deformations of the LV geometry implied by the parameter estimates from the different algorithms are generally very similar. The biggest differences between BO and HGO can be observed in the apical region (especially for HV B), while both BO versions lead to comparable deformations.
We conclude that the original HGO algorithm is easily outperformed by both variants of BO. Moreover, the mechanical responses predicted by BO and HGO for high stretches (i.e. > 1.15) can differ substantially. This could be partially explained by the limitation of using in vivo measurements, which are less informative about stiffening effects under high stretch, as the myocardium mostly operates in the linear regime of the stress-stretch curve in vivo, in particular for healthy LVs with a maximum stretch below 1.2. 62 This confirms that inferring parameters characterising the nonlinear stiffening effects is challenging when based on in vivo data only. Therefore, introducing extra information about pressures higher than the normal end-diastolic pressure becomes necessary in order to characterise nonlinear responses at higher pressures. In the clinic, this could be done through stress CMR. 63 In our next application we will instead rely on the Klotz law discussed in Section 2.2, which provides an alternative way to capture the nonlinear relationship between the LV cavity volume and the loading pressure within a large range, that is up to pressures as high as 30 mmHg.

T A B L E 4 Klotz-curve study: convergence of the objective function for Bayesian optimisation and the updated HGO algorithm for four LV geometries (HV A, HV B, HV C, HV D)

| Klotz-curve prior study
In our analysis of the basic study we demonstrated that the proposed BO framework outperforms the original HGO algorithm 28 in terms of delivering lower values of the objective function using fewer invocations of the forward simulator. In this section we replace that benchmark with its updated version. 43 We can now use the full BO algorithm described in Listing 4, as the new HGO algorithm includes the Klotz-curve prior. This time we run each version of BO (with a target surrogate and a partial error surrogate (13), see Sections 3.1 and 3.2.3) three times (with independent runs based on the same initial design) to verify the robustness and stability of the procedure. Figure 7 shows that BO again outperforms the HGO algorithm in terms of achieving lower values of the objective function f_O2,Klotz in fewer iterations. We can also see that the updated HGO algorithm performs better relative to BO than the original one; for example, for HV C and HV D the new HGO algorithm converges closer to BO than in the basic study. Appendix B provides additional insights into the behaviour of the objective function f_O2,Klotz by decomposing it into the f_O2 component (i.e. the errors for the 8 mmHg volume and strains) and the Klotz component (i.e. the error for the 30 mmHg volume). Table 4 summarises the results related to the convergence of the objective function. Since in the present study the performance of the HGO algorithm is more comparable to that of BO, we decided to investigate whether the low values of the objective function for BO might be a consequence of a potential bias in the V_30 emulator. The latter is used as part of our framework to achieve computational savings, as discussed in Section 3.3, whereas the new HGO algorithm still uses the forward simulator to obtain the volume at 30 mmHg.
We thus ran the forward simulator for each run of BO at the points θ corresponding to y_min and y(i_min^HGO), to obtain the true values of V_30 at these points and hence the true values of f_O2,Klotz. We report the volumes in the columns indicated with V_30, with the E(·) values being the volumes from the emulator and the S(·) values being the volumes from the simulator. Similarly, we have two sets of objective function values, obtained with the simulator (with superscript S) and with the emulator. These results reveal that our V_30 emulator provides very accurate predictions, with hardly any differences in the high-pressure volumes, and consequently in the values of the objective function, compared to the simulator. Figure 8 illustrates this point using a Bland-Altman plot. The first four sets of points (from the left) show the differences between the simulator-based and emulator-based values of f_O2,Klotz against the average of the two values. Note that the scale of the y-axis is 10^-3. We can see that all the differences are indeed negligible, so comparing the HGO algorithm against emulator-based runs of BO is valid.
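The quantities behind such a Bland-Altman comparison are straightforward to compute; the sketch below uses invented toy values in place of the actual simulator- and emulator-based objective values.

```python
import numpy as np

def bland_altman(a, b):
    """Bland-Altman coordinates: per-point difference against per-point mean,
    plus the bias and the 95% limits of agreement (bias +/- 1.96 sd)."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    diff, mean = a - b, (a + b) / 2.0
    bias = diff.mean()
    sd = diff.std(ddof=1)
    return mean, diff, bias, (bias - 1.96 * sd, bias + 1.96 * sd)

# Invented toy values standing in for simulator- and emulator-based objectives
f_sim = [0.112, 0.095, 0.140, 0.121]
f_emu = [0.111, 0.096, 0.139, 0.122]
mean, diff, bias, loa = bland_altman(f_sim, f_emu)
```

Plotting `diff` against `mean`, with horizontal lines at the bias and the limits of agreement, reproduces the standard Bland-Altman layout used in Figure 8.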
F I G U R E 8 Klotz-curve study: Bland-Altman plot comparing the Klotz objective function with V_30 computed using the emulator and the forward simulator

F I G U R E 9 Klotz-curve study: stretch-stress curves for four LV geometries (HV A, HV B, HV C, HV D). Left: responses to stretches along the myocyte direction f_0; right: responses to stretches along the sheet direction s_0 (see (2)). Bayesian optimisation with a target surrogate (targ.) and a partial error surrogate (part.), three independent runs (v1, v2, v3) of each version

F I G U R E 1 0 Klotz-curve study: deformations of the LV geometry at 8 mmHg implied by the final estimates from different algorithms. Left panels show a deformed reference line in the epicardial surface, see the red path in the LV geometry in Figure 3; middle and right panels show details of the inflexion point and the apical region, respectively

The stretch-stress curves in Figure 9 show that including the Klotz-curve prior makes the results from BO and HGO more comparable. For all the curves, except HV C along the sheet direction, the differences in predictions for small stretches (observable in vivo) seen in the basic study are now, with the Klotz-curve prior, considerably reduced. Moreover, comparing for example the curves along the myocyte direction for HV B with those from Figure 5, we observe that the predicted stiffness for high stretches is now noticeably reduced. A similar observation can be made for the deformations of the LV geometry, see Figure 10. For instance, the difference in the apical region for HV B visible in Figure 6 has now disappeared.

| Surrogates performance and heterogeneity
Comparing the performance of target and partial error surrogates does not reveal a clear pattern. In fact, the preferred surrogate type turns out to be case-specific (depending on the LV geometry and the data), so that one cannot know a priori which surrogate to use. Therefore, for robustness and extra uncertainty quantification, we would suggest trying both approaches, preferably in parallel.
Finally, we note that we would expect partial error surrogates to be superior to target surrogates when there is considerable heterogeneity in the strain data, which would be the case for MI patients. Logically, in a perfectly homogeneous case the K partial surrogates would simply learn the same mapping, resulting in K identical surrogates. The more heterogeneity there is in the strain data, the greater the need for flexibility in modelling the individual errors.

| Acquisition function
The particular choice of the acquisition function is pivotal for the performance of BO. The Expected Improvement (EI) criterion applied in this paper has a number of advantages; most importantly, it admits a closed-form formula under Gaussian process surrogates and is widely used in the machine learning community. However, we note that the analytic tractability of EI becomes less advantageous in our setting, as we do not use plain EI as our AF but EI_con, in which EI is augmented with the probit-based probability of unknown constraint satisfaction, making the whole AF intractable. Moreover, EI may behave greedily. 64 The development of better specifications for acquisition functions is an active research area in the BO literature. 6 Two classes of alternative acquisition functions seem particularly relevant in our context. The first group consists of information-based policies, such as the Entropy Search 65,66 (ES) criterion, that aim to infer the location of the unknown minimum by focusing on the posterior distribution of the minimiser. When deciding where to query, ES takes into account what we expect to learn about the unknown minimum from evaluating at a particular point. This is different from EI, which evaluates where we believe the minimum is, without taking into account the effect of such a query on learning. ES is thus conceptually very appealing; however, it is more challenging to implement in practice than EI. The second group of potentially useful AFs relates to so-called portfolios of AFs. 67,6 As suggested by our empirical studies, it is unlikely that a globally preferable acquisition strategy can be developed, that is one performing best regardless of the given LV geometry and measurements. The portfolio approach considers multiple AFs and uses a meta-criterion (a high-level AF) to select the final query point from those proposed by the individual AFs.

| Computing time
Depending on the hardware used and the number of parallel tasks performed on a given cluster, computing time per BO iteration may vary considerably. For our studies, the mean computing time was approximately 9.5 minutes (on a typical Linux workstation, a dual 10-core Intel(R) Xeon(R) with a 2.50 GHz CPU and 64 GB RAM), while the median computing time was slightly over 7 minutes (the 10%, 25%, 75% and 90% quantiles were 5.3, 5.8, 10.8 and 16.9 minutes, respectively). In our studies we used 500 BO iterations, which at the median computing time corresponds to about 2.5 days to complete optimisation. However, we note that stopping early might be satisfactory in the clinic, as we observe rapid convergence of the objective function in the initial iterations. If we conservatively assume 200 BO iterations, that would correspond to about 1 day of optimisation at the median computing time.
Therefore, despite considerable efficiency gains compared to the HGO algorithm, BO is still likely to be too time-consuming to provide a viable tool for clinical practice if optimisation is to be performed independently for each subject, starting from scratch. Multi-task BO 68 could address this issue by leveraging prior knowledge acquired from searches over comparable domains, in our case from optimisation performed for previous subjects. Multi-task BO uses the data coming from different searches by processing them simultaneously and constructing a multi-task GP 55 to capture correlation between related tasks. This approach would allow the user to focus on parameter regions that have previously been found to be promising.
An alternative approach to fast parameter estimation would be to estimate parameters directly from clinical images, without first obtaining strains or displacements using cardiac models; however, to the best of our knowledge, no such studies exist. Cutting-edge physics-informed machine learning approaches may be the answer to that question. 69,70,71 We note that the goal of our group's research agenda is to make the whole inference pipeline automatic, that is to estimate the myocardial properties directly from CMR images. One of the stepping stones towards that goal was our work on an automatic framework for LV geometry prediction directly from CMR images based on convolutional neural networks. 72

| Biomechanics cardiac modelling
In this study, we have focused on the passive parameter estimation of in vivo hearts using only conventional cine CMR images. The main reason is that knowing the mechanical properties of myocardial tissue is essential for understanding the underlying mechanisms of various heart diseases, such as diastolic heart failure, 73 and it can further lead to personalised patient treatments, for example de-stiffening the heart. 74 In this study, we have adapted the framework of Bayesian optimisation to infer unknown parameters in the HO model, and it outperformed the multi-step algorithm of Gao et al. 28 Future studies shall take into account the stiffness heterogeneity and the material properties of different constituents, which will provide a closer link to cardiac pathology than global stiffness estimation. We also expect that the procedure developed in this study can be readily applied to other inverse problems, in nonlinear mechanics in general or indeed for any parametric mathematical model, by simply updating the loss function according to the available measurements.
A standalone single-ventricle model during diastolic filling is chosen here, including only the essential features, such as subject-specific geometry, layered myofibre architecture, etc. Such a standalone LV model is still widely used in the cardiac modelling community. 75,30,34 It shall be mentioned that the LV model could be further improved by including the right ventricle, the valves and the pericardium with more realistic boundary conditions, and by coupling it with blood flow and electrophysiology. Such a complex model will inevitably introduce extra unknown parameters, which make the inference problem more challenging. For example, Strocchi et al. 76 used spatially varying Robin boundary conditions to represent the pericardium in a four-chamber heart model; their results showed that even though the pericardium can significantly affect ventricular motion, it has little effect on local strains and wall thickness. The basal plane of the LV model is fixed because of the lack of measured cardiac motion on the basal plane in our 2D cine CMR imaging dataset. This could be overcome by 3D tagged CMR imaging, 29 although it requires complex processing and extra scan time that is not routinely available.
Studies have demonstrated the high descriptive and predictive capabilities of the HO model in modelling cardiac dynamics. 33,39,34 Due to the poor identifiability of some parameters in the original HO model, 32 a few reduced HO models have been proposed in the context of parameter inference, such as the reduced formulation with transverse isotropy used in Asner et al. 29 For example, using a reduced HO model with only the first two terms, Hadjicharalambous et al. 30 inversely estimated a and a_f while fixing b and b_f, and further proposed to learn the ratio a/a_f. Other studies instead introduced scaling parameters to simplify the parameter inference problem for the HO model. For example, Peirlinck et al. 34 introduced two scaling factors, one for the a parameters and one for the b parameters. This is the first step in the HGO algorithms. Considering that HO-type material models are still widely used in the cardiac modelling community, the Bayesian optimisation framework developed in this study can be readily extended to other HO-type strain energy functions, and further to other types of constitutive laws in soft tissue mechanics, such as Fung-type strain energy functions. 58 A further future research direction is to estimate spatially varying material stiffness for a given patient, especially for a patient with localised disease, such as myocardial infarction. It is believed that even in a healthy heart myocardial stiffness varies spatially. However, there is a lack of controlled experimental data to inform how heterogeneous stiffness is distributed in the myocardium. Should it be based on the widely used AHA-17 divisions or on other segmental divisions (free wall, septum, etc.)? Potentially, CMR T1 mapping may provide key information about interstitial fibrosis, 77 which could further inform myocardial stiffness.
For patients with myocardial infarction, experiments have demonstrated that the infarcted region in general has much higher stiffness than healthy myocardium. To model this, regionally varying stiffness is usually used when modelling myocardial infarction. 3 In a recent study, Balaban et al. 78 demonstrated that heterogeneous elastic material properties in infarcted hearts can be estimated using an adjoint-based data assimilation approach with a reduced HO model. Further validation studies with controlled experiments are needed. For patients with diastolic heart failure, 23,73 the heart remodels its function and structure more globally than in myocardial infarction, including the remodelling of myocardial stiffness; thus the global myocardial stiffness, as estimated in this study, remains relevant to those patients.
To sum up, this study should be considered a step forward for passive parameter identification in cardiac mechanics using the HO model. Still, many challenges need to be overcome before passive parameter identification for modelling cardiac mechanics can be addressed in full, such as detailed cardiac motion measurements, the unloaded residually-stressed geometry, micro-structure-informed spatial heterogeneity, interactions with active contraction and electrophysiology, and in vivo experiments. Interested readers are referred to recent reviews in computational cardiac mechanics. 1,79

| CONCLUSIONS
We have proposed an efficient Bayesian optimisation-based framework for parameter inference in a cardio-mechanic model of the left ventricle in diastole. We have demonstrated that BO converges to lower values of the two objective functions considered and requires fewer invocations of the associated forward simulator than established state-of-the-art iterative optimisation algorithms. We have also developed a new approach to minimising a target function given as a sum of error terms, based on approximating each of these terms individually via partial error surrogates. The findings of our empirical studies suggest that this approach may outperform the standard one based on target surrogates. However, which of the two variants is preferable in a specific case could be subject-dependent (depending on, e.g., the LV geometry).

APPENDIX A: RESULTS ON PARAMETERS
A.1 | Basic study
Table A1.

APPENDIX C: SYNTHETIC DATA STUDY
In this appendix, we report the results of a synthetic data study aimed at verifying the theoretical accuracy of the proposed method. We consider the geometry of HV A and generate ten synthetic data sets by perturbing the estimate for HV A corresponding to the lowest value of the objective function with Gaussian noise. This way we obtain synthetic data sets with known underlying ground-truth (GT) parameter values, which we then estimate with BO. Figure C2 presents the results from this experiment in terms of errors in the output space (by means of the objective function) and in the parameter space (by means of the difference between the GT value and its estimate). Figure C3 illustrates the convergence of the incumbent values. We can see that the convergence of the objective function is rapid and to very low values (compared with the real data studies in Sections 4.3 and 4.4), of order 10^-5. Moreover, we notice that on average the difference between the GT parameter values and their estimates is zero. As expected, θ_1 and θ_2 are estimated with high precision, while θ_3 and θ_4 are harder to identify, hence there is more variance in the corresponding estimation errors.

Figures 6 and 10, respectively. Applying a much higher pressure exaggerates the differences between the deformations implied by estimates from different algorithms. Still, the deformations implied by the BO and HGO estimates are generally consistent, with the most noticeable differences appearing in the apical region.

Recently, there has been an increasing awareness in the cardiac modelling community of the importance of uncertainty quantification and sensitivity analysis for model parameter estimates. 59,60,61 Below we describe three principled directions one can take to assess the uncertainty of parameter estimates: an asymptotic approach, a computational frequentist approach, and a computational Bayesian approach.
We also discuss the advantage of Gaussian processes (see Section 3.1) used in this work over polynomial chaos methods in the context of BO.

E.1 | Asymptotic approach
The asymptotic approach is based on the Fisher information inequality and the Cramér-Rao bound, whereby the variance of an estimated parameter is lower-bounded by the inverse of the negative second derivative of the log likelihood with respect to this parameter. For additive independent Gaussian noise, the log likelihood is just a rescaled version of our sum-of-squares objective function, so it can easily be obtained. To generalise this to more than one parameter, we need the diagonal elements of the inverse of the negative Hessian, that is, the matrix of negative second derivatives of the log likelihood. To compute the second derivatives, we can exploit the fact that our objective function is approximated by a Gaussian process, and that a Gaussian process is closed under differentiation: if the kernel function of the Gaussian process is twice differentiable, we obtain these derivatives in closed form. The advantage of this approach is that it is analytically tractable and fast. The disadvantage is that it is based on asymptotic assumptions, which may not hold for sparse data, with the consequence that the lower bound may not be tight. For that reason, we prefer computationally more expensive non-asymptotic methods.
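In symbols, the bound described above reads as follows (a standard statement of the Cramér-Rao inequality; the notation SSE for the sum-of-squares objective and σ² for the noise variance is ours, not from the main text):

```latex
% Cramér-Rao bound: variance of the i-th parameter estimate is bounded
% below by the corresponding diagonal element of the inverse Fisher
% information matrix (the negative expected Hessian of the log likelihood).
\operatorname{Var}\bigl(\hat{\theta}_i\bigr)
  \;\geq\; \bigl[\mathcal{I}(\theta)^{-1}\bigr]_{ii},
\qquad
\mathcal{I}(\theta) \;=\; -\,\mathbb{E}\bigl[\nabla_{\theta}^{2}\,\ell(\theta)\bigr].

% For additive independent Gaussian noise, the log likelihood is a
% rescaled sum-of-squares objective:
\ell(\theta) \;=\; -\frac{1}{2\sigma^{2}}\,\mathrm{SSE}(\theta) + \text{const}.
```

Since the Gaussian process surrogate of SSE(θ) has closed-form second derivatives for a twice-differentiable kernel, the required Hessian can be read off the emulator directly.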
FIGURE E6 Klotz-curve study: violin plots for the bootstrapped parameter estimates from run v1 with target surrogates

E.2 | Computational frequentist approach
This approach is based on a residual bootstrapping procedure 80 to estimate the parameter estimation uncertainty. Take the predictions (standardised circumferential strains and end-diastole LV volume) obtained from the final parameter vector estimate θ*, denoted ξ(θ*), and compute the residuals by comparison with the data. Let us call the set of residuals E. We then generate an ensemble of hypothetical data sets, D_1, …, D_K, by drawing residuals with replacement from E and adding these values to ξ(θ*). For each of these bootstrap replicas, D_1, …, D_K, we carry out a local iterative optimisation (e.g. using conjugate gradients), starting from θ*. This procedure returns a set of bootstrap estimates, {θ_1, θ_2, …, θ_K}, which can be used for uncertainty quantification. We use the median absolute deviation (MAD) as a simple summary statistic; this is a robust measure of variability that is less susceptible to outliers than the standard deviation. We multiply the MAD by a constant scale factor of k = 1.4826, which makes it a consistent estimator of the standard deviation of a normally distributed random variable. Figure E6 presents examples of bootstrap results by means of violin plots for BO estimates (with target surrogates, run v1) from the Klotz-curve study. Each figure is based on K = 25 bootstrap samples. Recall that the parameters come from the compact domain [0.1, 5]^4, hence any apparent truncation of a violin plot (e.g. θ_4 for HV C) is caused by the underlying parameter being estimated close to its boundary. We can see that the uncertainty in the parameter estimates for HV A is very small, as visualised by the hardly visible "violins", although we can observe a few outliers. In contrast, the uncertainty for HV B is much larger, as depicted by larger "violin" areas. Table E3 reports the associated MADs.
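The residual bootstrap described above can be sketched as follows. The linear `simulate` and the least-squares `refit` below are hypothetical stand-ins for the FEM forward simulator ξ(θ) and the local conjugate-gradient re-optimisation, used only to keep the sketch self-contained and runnable:

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(theta, x):
    # Hypothetical stand-in for the forward simulator xi(theta); a simple
    # linear model replaces the FEM solver in this sketch.
    return theta[0] + theta[1] * x

def refit(data, x):
    # Local re-estimation step; an ordinary least-squares fit stands in
    # for the iterative optimisation started from theta*.
    slope, intercept = np.polyfit(x, data, 1)
    return np.array([intercept, slope])

def residual_bootstrap(theta_star, x, data, K=25):
    pred = simulate(theta_star, x)
    residuals = data - pred                      # the set E
    estimates = []
    for _ in range(K):
        # Resample residuals with replacement to build a replica D_k,
        # then re-estimate the parameters on the replica.
        e = rng.choice(residuals, size=residuals.size, replace=True)
        estimates.append(refit(pred + e, x))
    return np.array(estimates)

def scaled_mad(samples, k=1.4826):
    # Median absolute deviation, scaled so that it consistently estimates
    # the standard deviation of a normally distributed variable.
    med = np.median(samples, axis=0)
    return k * np.median(np.abs(samples - med), axis=0)
```

A run over the K bootstrap estimates then reduces to `scaled_mad(residual_bootstrap(theta_star, x, data))`, giving one robust spread value per parameter.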

E.3 | Computational Bayesian approach
We can use the emulated objective function to approximate the log likelihood, define appropriate prior distributions for the parameters and apply standard Markov chain Monte Carlo techniques to approximately sample from the posterior distribution. The practical disadvantage is that this is an intrinsically sequential approach, whereas the optimisation runs for the bootstrap approach can be parallelised.
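As an illustration, a random-walk Metropolis sampler over the emulated log likelihood might look as follows. The quadratic `emulated_log_lik` is a hypothetical stand-in for the Gaussian-process emulator of the objective, and the uniform prior over [0.1, 5]^d mirrors the compact parameter domain used in this study:

```python
import numpy as np

rng = np.random.default_rng(1)

def emulated_log_lik(theta):
    # Stand-in for the GP-emulated log likelihood: for additive Gaussian
    # noise this is a negatively rescaled sum-of-squares objective. A
    # quadratic bowl centred at 2.0 keeps the sketch runnable.
    return -0.5 * np.sum((theta - 2.0) ** 2) / 0.25

def log_prior(theta, lo=0.1, hi=5.0):
    # Uniform prior on the compact domain [lo, hi]^d.
    return 0.0 if np.all((theta >= lo) & (theta <= hi)) else -np.inf

def metropolis(theta0, n_steps=5000, step=0.3):
    theta = np.asarray(theta0, dtype=float)
    lp = emulated_log_lik(theta) + log_prior(theta)
    samples = []
    for _ in range(n_steps):
        prop = theta + step * rng.standard_normal(theta.size)
        lp_prop = emulated_log_lik(prop) + log_prior(prop)
        # Metropolis accept/reject on the log scale.
        if np.log(rng.uniform()) < lp_prop - lp:
            theta, lp = prop, lp_prop
        samples.append(theta.copy())
    return np.array(samples)
```

Each MCMC step requires one evaluation of the emulator at the proposed point, which is why the approach is intrinsically sequential, in contrast to the embarrassingly parallel bootstrap runs.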

E.4 | Gaussian processes versus polynomial chaos
Polynomial chaos methods have been widely used as an alternative to Gaussian processes for emulation and surrogate modelling. 59,60,61 However, for Bayesian optimisation, emulating the objective function is not sufficient: we also need a reliable estimate of the variance of the approximated objective function, which enters the acquisition function (see Section 3.2). Figure 4 in Shahriari et al. 6 shows several examples of methods that emulate the objective function well but provide poor estimates of its variance. This leads to a poor trade-off between exploration and exploitation and is unsuitable for Bayesian optimisation. For that reason, we have used Gaussian processes rather than polynomial chaos expansions in our work.
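To see why the predictive variance matters, consider the expected improvement acquisition function for minimisation (a common choice shown here for illustration; we do not claim it is the exact acquisition used in Section 3.2). It weighs the surrogate mean `mu` against the predictive standard deviation `sigma`, so a surrogate with unreliable variance estimates skews the exploration-exploitation balance:

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, best):
    # EI for minimisation: E[max(best - f, 0)] under the surrogate's
    # Gaussian predictive distribution N(mu, sigma^2). Both the mean and
    # the predictive standard deviation enter the formula.
    sigma = np.maximum(sigma, 1e-12)   # guard against zero variance
    z = (best - mu) / sigma
    return (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)
```

At a point whose predicted mean equals the incumbent, EI is proportional to sigma alone: an emulator that underestimates its own uncertainty would assign such points almost no acquisition value and never explore them.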

TABLE E3 Klotz-curve study: median absolute deviations (MADs) for the bootstrapped parameter estimates from run v1 with target surrogates

APPENDIX F: SIMULATOR CRASHES
In our applications, crashes are caused by implausible parameter sets, by which we mean that the "crash" parameter sets result in an excessive distortion of the underlying LV geometry, for example when all parameters are close to their lower bounds, suggesting a very soft myocardium. The increased number of crashes for BO compared to HGO is a consequence of BO being a global optimisation technique, in which exploitation is balanced with exploration, while HGO performs mostly local, gradient-based steps and hence focuses on exploitation. From this point of view, and despite their name, crashes are not necessarily negative, provided an algorithm has a feature that accounts for them (like the classifier in our proposed BO framework, see Section 3.2.2), as they allow us to learn about the feasible parameter set. We emphasise again that the advantage of our global optimisation scheme is a wider exploration of the configuration space, so that our model becomes aware of these critical regimes. A local greedy optimisation scheme, on the other hand, is by construction "blinkered", that is, so focused on exploitation that it never explores these critical regimes. Figure F7 compares uni-axial stretch-stress responses along the sheet-normal direction between the crash points (red) and the BO estimates (black). We can see that the material stiffness along the sheet-normal direction is dramatically lower in crashed simulations than in the successful simulations corresponding to BO estimates; in other words, the myocardium is very soft along the sheet-normal direction in the crashed simulations. This would explain why we observed crashes for certain parameter sets.

FIGURE F7 Uni-axial stretch-stress responses along the sheet-normal direction for crash points (solid red) and the BO estimates (dashed black)

APPENDIX G: COMPARISON WITH A STATE-OF-THE-ART NUMERICAL OPTIMISER
We appreciate that the HGO algorithm runs with respect to different parameters than the proposed BO framework and that the number of parameters used by the two algorithms differs (in particular, in HGO the number of parameters with respect to which the optimisation is carried out may vary between the algorithm steps). This may raise concerns about the fairness of the BO-HGO comparison. Therefore, in this appendix, we present a comparison of BO against an established iterative optimisation scheme in the same 4-dimensional parameter space. Specifically, we consider the SQP (sequential quadratic programming) algorithm, a state-of-the-art gradient-based numerical optimiser, implemented in the fmincon function from MATLAB's Optimisation Toolbox. Since the performance of numerical optimisation depends on the initial point, we have run SQP from 4 different initial points: θ_0^(1) = [1, 1, 1, 1]^T and the three points closest to θ_0^(1) in the Latin hypercube initial design used to initialise BO. Figure G8 reveals that BO converges to lower values of the objective function for all the HVs. Moreover, BO typically requires fewer forward simulations to achieve the same level of accuracy as the numerical optimiser.
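Readers wishing to reproduce this kind of baseline without MATLAB can use SciPy's SLSQP, a sequential least-squares optimiser from the SQP family; the quadratic `objective` below is a hypothetical stand-in for the simulator-based misfit, chosen so the sketch runs without the FEM solver:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical stand-in for the sum-of-squares misfit returned by the
# forward simulator; its minimiser is placed inside the domain [0.1, 5]^4.
TARGET = np.array([1.5, 0.8, 2.2, 3.0])

def objective(theta):
    return float(np.sum((np.asarray(theta) - TARGET) ** 2))

bounds = [(0.1, 5.0)] * 4      # compact parameter domain [0.1, 5]^4
theta0 = np.ones(4)            # initial point theta_0^(1) = [1, 1, 1, 1]^T

# SciPy's SLSQP plays the role of MATLAB's fmincon SQP in this sketch.
result = minimize(objective, theta0, method="SLSQP", bounds=bounds)
```

As with fmincon, each SLSQP iteration calls the objective (and finite-difference gradients) several times, which is where the forward-simulation cost accumulates for the real problem.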
FIGURE G8 Basic study: comparison of the BO convergence (dashed lines; light green: target surrogates, dark green: partial error surrogates) with the final optima from the SQP (four red dots, one for each initial point) and HGO (black dot) algorithms