Bounding the solutions of parametric weakly coupled second-order semilinear parabolic partial differential equations

SUMMARY In this paper, two novel techniques for bounding the solutions of parametric weakly coupled second-order semilinear parabolic partial differential equations are developed. The first provides a theorem for constructing interval bounds, while the second provides a theorem for constructing lower bounds that are convex, and upper bounds that are concave, in the parameter. The convex/concave bounds can be significantly tighter than the interval bounds because of the wrapping effect suffered by interval analysis in dynamical systems. Both types of bounds are computationally cheap to construct, requiring the solution of auxiliary systems that are, respectively, twice and four times the size of the original system. An illustrative numerical example of bound construction, and of their use for deterministic global optimization within a simple serial branch-and-bound algorithm implemented numerically using interval arithmetic and a generalization of McCormick's relaxation technique, is presented. Problems within the important class of reaction-diffusion systems may be optimized with these tools. © 2016 The Authors. Optimal Control Applications and Methods published by John Wiley & Sons, Ltd.


INTRODUCTION
Reaction-diffusion systems can be modeled using semilinear parabolic partial differential equations (PDEs). An important and well-known example is the heat equation with a source term that is nonlinear in the temperature. One may be faced with the task of fitting such a model to experimental data by formulating an optimization problem. The resulting optimization problem is typically nonconvex, making it desirable to develop a global optimization method for problems involving this class of differential equations, to ensure that the best possible fit can be obtained and that the descriptive power of this important class of models can be robustly evaluated. Alternatively, one may be interested in finding the best possible (i.e., global) solution to a design problem involving this class of differential equations. Unlike stochastic global optimization methods, such as genetic algorithms and simulated annealing, deterministic global optimization using a branch-and-bound algorithm can guarantee, upon convergence, that the global optimum has been identified to within a finite tolerance, because the algorithm represents a constructive procedure for locating the global optimum. The critical component of such an algorithm is the construction of parametric bounds on the PDE solution.
The problem of developing parametric bounds for the solutions of parametric ordinary differential equations (ODEs) has received much attention in the literature.

PROBLEM FORMULATION

Denote the temporal domain by $T \equiv (0, t_f]$ and the temporospatial domain by $Q \equiv \Omega \times T$, denoting its closure by $\bar{Q}$. Let $\Gamma \equiv \partial\Omega \times T$, and let $p \in P \equiv [p^L, p^U]$ denote the parameter vector. Inequalities between vector functions are componentwise and pointwise in their domain. Denote by $C(\bar{Q})$ the space of functions continuous in $\bar{Q}$, and by $C^{m,l}(\bar{Q})$ the space of functions whose derivatives up to $m$th order with respect to $x$ and up to $l$th order with respect to $t$ are continuous in $\bar{Q}$. For each $p \in P$, and every $i \in I_u \equiv \{1, \ldots, n_u\}$, define an operator as follows:

$$L_i u_{i,p}(x,t) \equiv a_i(x,t)\,\frac{\partial^2 u_{i,p}}{\partial x^2}(x,t) + b_i(x,t)\,\frac{\partial u_{i,p}}{\partial x}(x,t), \quad \forall (x,t) \in Q. \quad (1)$$

Then, define an operator as follows:

$$\mathcal{L}_i u_{i,p}(x,t) \equiv \frac{\partial u_{i,p}}{\partial t}(x,t) - L_i u_{i,p}(x,t), \quad \forall (x,t) \in Q. \quad (2)$$

If, for every $i \in I_u$, $a_i$ is positive in $Q$, the operators $L_i$ and $\mathcal{L}_i$ are said to be elliptic and parabolic, respectively. Then, the coupled system of a finite number $n_u$ of equations is parabolic. This system is weakly coupled in the sense that the coupling source function $f$ does not depend on the state spatial derivatives. The dependence of the state variable $u_p \in \mathbb{R}^{n_u}$ on $p$ is denoted by the subscript to indicate that it is implicit. $B_i$ denotes a linear boundary operator of the form

$$B_i u_{i,p}(x,t) \equiv \alpha_i(x,t)\,\frac{\partial u_{i,p}}{\partial \nu}(x,t) + \beta_i(x,t)\, u_{i,p}(x,t), \quad \forall (x,t) \in \Gamma,$$

with $\frac{\partial u_{i,p}}{\partial \nu}(\cdot, t)$ for each $t \in T$ denoting the outward normal spatial derivative of $u_{i,p}(\cdot, t)$ on $\partial\Omega$, and

$$\alpha_i(x,t) \geq 0, \quad \beta_i(x,t) \geq 0, \quad \alpha_i(x,t) + \beta_i(x,t) > 0, \quad \forall (x,t) \in \Gamma.$$

We next briefly summarize some standard continuity and consistency assumptions required for the existence of a solution to (3). The coefficients of the operator $\mathcal{L}_i$ and the functions $h_i$ and $u_{i,0}$ are assumed to be Hölder continuous in their respective domains for all $i$. The coefficients of the operator $B_i$ are assumed to be continuous in their domain for all $i$.
The initial conditions $u_{i,0}$ satisfy the boundary conditions at $t = 0$ for all $i$. The functions $f_i$ are assumed to be Hölder continuous in $Q$ for all $i$. Additionally, some regularity is expected of the boundary surface $\partial\Omega$, in the sense that each of its points can be represented locally by a function satisfying a Hölder continuity condition. An exhaustive discussion of these well-studied existence conditions is outside the scope of this work, and we refer the interested reader to Section 2.1 of [10], for instance, for more details.
For notational brevity, it is convenient to rewrite (3) as follows:

$$\mathcal{L}_i u_{i,p}(x,t) = f_i\big(u_p(x,t), x, t, p\big), \quad \forall (x,t) \in Q,$$
$$B_i u_{i,p}(x,t) = h_i(x,t,p), \quad \forall (x,t) \in \Gamma, \qquad u_{i,p}(x,0) = u_{i,0}(x,p), \quad \forall x \in \Omega, \qquad \forall i \in I_u. \quad (6)$$

A simple serial branch-and-bound method
Here, a procedure for deterministic branch-and-bound global optimization is outlined, to motivate the constructions of parametric bounds on u p . The classic reference for branch-and-bound theory is [20].

Consider an optimization problem of the form:

$$\min_{p \in P} \hat{O}(p),$$

with the objective $\hat{O}$ depending on $p$ both directly and through the solution values $u_p(x,t)$ at the points $(x,t) \in \bar{Q}_M$. Here, $\hat{O}$ denotes an objective function, potentially nonconvex on $P$, for example, a least-squares or maximum-likelihood criterion in parameter estimation problems. Moreover, $\bar{Q}_M$ denotes a set of points in $\bar{Q}$ at which experimental measurements are taken. Standard optimization software (e.g., fmincon in MATLAB) can only yield a locally optimal solution $\hat{O}(p_{loc})$ in the vicinity of the initial guess. A branch-and-bound algorithm can determine the globally optimal solution $\hat{O}(p_{opt})$ to within some finite absolute convergence tolerance $\epsilon$, by recursively bounding the solution on progressively smaller subintervals of the parameter space.
A local optimizer can be used to obtain an upper bound by initializing it anywhere on the subinterval (another approach is simply to evaluate the objective function anywhere on the subinterval). The corresponding lower bound can be obtained in one of two ways. Assume that an interval bound $[u^L, u^U]$ on $u_p$, pointwise in $\bar{Q}$, is available (the construction of such bounds is addressed later in this paper). Standard interval arithmetic [3] can then be used to propagate it (along with $P$) through the objective to obtain a corresponding interval bound $[\hat{O}^L, \hat{O}^U]$ on $\hat{O}$. $\hat{O}^L$ can then be taken to lower bound $\hat{O}$, and hence its global minimum, on the subinterval. Alternatively, assume that a bound $[u^{CV}_p, u^{CC}_p]$ on $u_p$, pointwise in $\bar{Q}$, is available, with $u^{CV}_p$ being convex and $u^{CC}_p$ being concave on $P$ (the construction of such bounds is again addressed later in this paper). Then, a generalization of McCormick's relaxation technique can be applied to obtain an interval bound $[\hat{O}^{CV}(p), \hat{O}^{CC}(p)]$ on $\hat{O}(p)$ for each $p \in P$, with $\hat{O}^{CV}$ being convex and $\hat{O}^{CC}$ being concave on $P$. $\hat{O}^{CV}$ can then be locally optimized on the subinterval to yield a lower bound on the global minimum (this step uses the well-known fact that a local minimum of a convex function is its global minimum).
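To make the first (interval) lower-bounding route concrete, the following minimal sketch propagates a pointwise state bound through a least-squares objective using elementary interval arithmetic, in the spirit of [3]. It is an illustration only, not the paper's implementation: the `Interval` class, the three measurement points, the bound values, and the data are all hypothetical.

```python
# Minimal interval arithmetic, sufficient to propagate a pointwise state
# bound [uL, uU] through a least-squares objective sum_j (u_j - d_j)^2.
# A sketch only: real implementations would use a library such as INTLAB.

class Interval:
    def __init__(self, lo, hi):
        assert lo <= hi
        self.lo, self.hi = lo, hi

    def __sub__(self, c):
        # Subtract a scalar measurement from the interval.
        return Interval(self.lo - c, self.hi - c)

    def sq(self):
        # Tight interval square: accounts for 0 lying inside the interval.
        cands = (self.lo ** 2, self.hi ** 2)
        lo = 0.0 if self.lo <= 0.0 <= self.hi else min(cands)
        return Interval(lo, max(cands))

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

def objective_bounds(state_bounds, data):
    """Interval enclosure of sum_j (u_j - d_j)^2 given u_j in state_bounds[j]."""
    total = Interval(0.0, 0.0)
    for box, d in zip(state_bounds, data):
        total = total + (box - d).sq()
    return total

# Hypothetical state bounds at three measurement points, and hypothetical data.
bounds = [Interval(0.5, 1.5), Interval(1.0, 2.0), Interval(-0.5, 0.5)]
data = [1.0, 1.2, 0.1]
enc = objective_bounds(bounds, data)
# enc.lo is then a valid lower bound for the objective over the subinterval.
```

Any objective value attainable under the state bounds lies in [enc.lo, enc.hi], so enc.lo can serve as the subinterval lower bound in a branch-and-bound iteration.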
To aid visualization, the different types of bounds are illustrated in Figure 1 for a univariate objective function. A generalization of McCormick's relaxation technique will be used to construct $u^{CV}_p$ and $u^{CC}_p$ later in this paper and will be discussed in more detail at that time. If the objective lower bound on any subinterval is higher than the least upper bound known so far (LUB, or the incumbent solution), the global solution cannot exist on that subinterval, and it is excluded from further consideration. If the least lower bound on the remaining subintervals is not within $\epsilon$ of the LUB, one subinterval is bisected along a uniformly randomly selected parameter into two intervals, which are bounded and added to the active interval list. The process is initiated with $P$ and continued until the least lower bound is within $\epsilon$ of the LUB. At this point, an optimal solution is available as the parameter corresponding to the LUB (this solution, by construction, is known to be within $\epsilon$ of the global solution). A sample illustrative iteration of the procedure, with the convex bound employed to lower bound the objective on each subinterval, is shown in Figure 2. We see that we need to construct bounds for the PDE solution that are either constant in the parameter (i.e., an interval bound) or convex lower and concave upper bounds in the parameter (the latter are also typically referred to as convex and concave relaxations, or simply relaxations, in the global optimization literature).
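The loop just described can be sketched as follows on a one-dimensional toy problem. The objective $(p^2 - 1)^2$ and its naive interval-extension lower bound are hypothetical stand-ins for the PDE-constrained objective and its bounds; only the branch-and-bound structure (incumbent upper bound, least-lower-bound selection, bisection, fathoming) mirrors the text.

```python
import heapq

def f(p):
    """Hypothetical nonconvex objective standing in for the PDE objective."""
    return (p * p - 1.0) ** 2

def interval_lower_bound(a, b):
    """Naive interval extension of (p^2 - 1)^2 on [a, b]; a valid lower bound
    on f over [a, b] by inclusion monotonicity of interval arithmetic."""
    if a <= 0.0 <= b:
        sq_lo, sq_hi = 0.0, max(a * a, b * b)      # interval for p^2
    else:
        sq_lo, sq_hi = min(a * a, b * b), max(a * a, b * b)
    lo, hi = sq_lo - 1.0, sq_hi - 1.0              # interval for p^2 - 1
    if lo <= 0.0 <= hi:
        return 0.0                                 # squared interval contains 0
    return min(lo * lo, hi * hi)

def branch_and_bound(a, b, tol=1e-3, max_iter=10000):
    """Serial branch-and-bound: pop the subinterval with the least lower
    bound, bisect it, update the incumbent, and fathom losers."""
    incumbent = f(0.5 * (a + b))                   # upper bound: any point works
    heap = [(interval_lower_bound(a, b), a, b)]    # active subinterval list
    for _ in range(max_iter):
        if not heap:
            break
        lb, lo_, hi_ = heapq.heappop(heap)
        if lb > incumbent - tol:                   # converged to within tol
            break
        m = 0.5 * (lo_ + hi_)                      # bisect the subinterval
        for c0, c1 in ((lo_, m), (m, hi_)):
            incumbent = min(incumbent, f(0.5 * (c0 + c1)))
            child_lb = interval_lower_bound(c0, c1)
            if child_lb <= incumbent - tol:        # otherwise fathomed
                heapq.heappush(heap, (child_lb, c0, c1))
    return incumbent
```

On $P = [-2, 2]$, the true minimum is 0 (at $p = \pm 1$), and the sketch terminates with an incumbent within the tolerance of it.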

Background theorems
Here, the key theorems necessary to construct the bounds are collected. The following existence-comparison theorem is key for the construction of the interval bounds. Unless made explicit otherwise, the order relation $\leq$ should henceforth be taken to be componentwise and pointwise in the domain for vector functions.
We first restate a positivity lemma for parabolic operators from [10]. This lemma is a consequence of the maximum principle for parabolic operator (2).

Theorem 1
Let a function $u_p \in C^{2,1}(Q) \cap C(\bar{Q})$ be such that

$$\mathcal{L}_i u_{i,p}(x,t) + c(x,t)\, u_{i,p}(x,t) \geq 0, \quad \forall (x,t) \in Q, \qquad B_i u_{i,p}(x,t) \geq 0, \quad \forall (x,t) \in \Gamma, \qquad u_{i,p}(x,0) \geq 0, \quad \forall x \in \Omega, \quad \forall i \in I_u,$$

where $c \equiv c(x,t)$ is a bounded function in $Q$. Then, $u_p \geq 0$ in $\bar{Q}$.
Next, we restate a theorem due to Pao [10].

Theorem 2
Consider (6) for some $p \in P$, and assume that a pair of functions $v$ and $w$ in $C^{2,1}(Q) \cap C(\bar{Q})$, ordered as $v \leq w$, satisfies the following inequality for all $i \in I_u$:

$$\mathcal{L}_i w_i(x,t) - f_i(z, x, t, p) \geq 0 \geq \mathcal{L}_i v_i(x,t) - f_i(\tilde{z}, x, t, p), \quad \forall (x,t) \in Q, \quad \forall z, \tilde{z} \in [v, w] : z_i = w_i,\ \tilde{z}_i = v_i,$$
$$B_i w_i(x,t) \geq h_i(x,t,p) \geq B_i v_i(x,t), \quad \forall (x,t) \in \Gamma, \qquad w_i(x,0) \geq u_{i,0}(x,p) \geq v_i(x,0), \quad \forall x \in \Omega,$$

with it being assumed that each $f_i$ is Lipschitz in the state on $[v, w]$, that is,

$$\exists K_i \in \mathbb{R}_+ : \big|f_i(z, x, t, p) - f_i(\tilde{z}, x, t, p)\big| \leq K_i \max_{j \in I_u} |z_j - \tilde{z}_j|, \quad \forall z, \tilde{z} \in [v, w], \quad \forall (x,t) \in Q. \quad (10)$$

Then, there exists a unique solution $u_p$ to (6), and it is ordered as $v \leq u_p \leq w$.
Proof See Theorem 8.9.3 in [10]. The author of that work, Pao, proved the theorem by constructing monotone sequences of functions, with $w$ and $v$ as the respective initial iterates, that converge to $u_p$ from above and below, respectively.
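Pao's constructive scheme can be illustrated numerically on a single scalar equation. The sketch below is a demonstration under assumed data, not the paper's code: the source $f(u) = -u^3$, the domain, the grid, the explicit-Euler discretization, and the upper/lower solutions $w \equiv 1$, $v \equiv 0$ are all chosen for this illustration. Each sweep solves the linear PDE $z_t - z_{xx} + Kz = K u^k + f(u^k)$, producing iterates that decrease from $w$ and increase from $v$.

```python
import numpy as np

# Demonstration data (assumed, not from the paper): u_t - u_xx = f(u) = -u^3
# on x in (0, 1), t in (0, 0.05], with u(0, t) = u(1, t) = 0 and
# u(x, 0) = sin(pi x). Then v = 0 is a lower solution, w = 1 is an upper
# solution, and K = 3 bounds |f'| on [0, 1].
N, dt, t_f, K = 21, 1e-3, 0.05, 3.0
x = np.linspace(0.0, 1.0, N)
dx = x[1] - x[0]
u0 = np.sin(np.pi * x)
steps = int(round(t_f / dt))

def sweep(prev):
    """One monotone sweep: solve the linear PDE z_t - z_xx + K z = K*prev +
    f(prev) by explicit Euler, returning the space-time iterate z of shape
    (steps + 1, N)."""
    z = np.empty((steps + 1, N))
    z[0] = u0
    for n in range(steps):
        src = K * prev[n] - prev[n] ** 3                     # K u^k + f(u^k)
        lap = np.zeros(N)
        lap[1:-1] = (z[n, :-2] - 2.0 * z[n, 1:-1] + z[n, 2:]) / dx ** 2
        z[n + 1] = z[n] + dt * (lap - K * z[n] + src)
        z[n + 1, 0] = z[n + 1, -1] = 0.0                     # Dirichlet data
    return z

# Initialize the two sequences at the upper and lower solutions and iterate;
# the iterates sandwich the solution from above and below.
upper = np.ones((steps + 1, N))
lower = np.zeros((steps + 1, N))
for _ in range(3):
    upper, lower = sweep(upper), sweep(lower)
```

The time step satisfies the explicit stability restriction ($dt \leq dx^2/(2 + K\,dx^2)$), so the discrete update is monotone in its inputs and the ordering of the two sequences is preserved.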
Next, we prove an original theorem, utilizing the previous two theorems, that will be required to construct the interval bounds.

Theorem 3
Consider (6) for some $p \in P$, and assume that a pair of functions $v$ and $w$ in $C^{2,1}(Q) \cap C(\bar{Q})$ satisfies the following inequality for all $i \in I_u$:

$$\mathcal{L}_i w_i(x,t) - f_i(z, x, t, p) \geq 0 \geq \mathcal{L}_i v_i(x,t) - f_i(\tilde{z}, x, t, p), \quad \forall (x,t) \in Q, \quad \forall z, \tilde{z} \in [\min(v,w), \max(v,w)] : z_i = w_i,\ \tilde{z}_i = v_i,$$
$$B_i w_i(x,t) \geq h_i(x,t,p) \geq B_i v_i(x,t), \quad \forall (x,t) \in \Gamma, \qquad w_i(x,0) \geq u_{i,0}(x,p) \geq v_i(x,0), \quad \forall x \in \Omega, \quad (11)$$

with it being assumed that each $f_i$ is differentiable in the state on $(\min(v,w), \max(v,w))$ and Lipschitz in the state on $[\min(v,w), \max(v,w)]$ (condition (10)). Then, such a pair of functions is necessarily ordered as $v \leq w$. Moreover, there exists a unique solution $u_p$ to (6), and it is ordered as $v \leq u_p \leq w$.

Proof
First, we show that such a pair of functions is necessarily ordered as $v \leq w$. For this purpose, subtract the lower half of (11) from the upper half, and observe that the following inequality is directly implied for all $i \in I_u$:

$$\mathcal{L}_i (w_i - v_i)(x,t) \geq f_i(z, x, t, p)\big|_{z_i = w_i} - f_i(z, x, t, p)\big|_{z_i = v_i}, \quad \forall (x,t) \in Q, \quad \forall z \in [\min(v,w), \max(v,w)]. \quad (12)$$

Whenever $w_i = v_i$, the right-hand side of (12) is 0. Next, observe that whenever $w_i \neq v_i$, the mean value theorem can be applied to deduce the following inequality for all $i$:

$$\mathcal{L}_i (w_i - v_i)(x,t) \geq \frac{\partial f_i}{\partial z_i}(z, x, t, p)\Big|_{z_i = \eta}\, \big(w_i(x,t) - v_i(x,t)\big), \quad \forall (x,t) \in Q,$$

where $\eta$ is an intermediate value (pointwise in $Q$) between $w_i$ and $v_i$, for each $z$ in $[\min(v,w), \max(v,w)]$. It is well known that differentiable Lipschitz functions have bounded first derivatives (see any standard textbook on analysis, such as [21]). Thus, $\frac{\partial f_i}{\partial z_i}$ is bounded in $Q$ for each $z$ in $[\min(v,w), \max(v,w)]$. Then, by the positivity lemma for (6) (Theorem 1), it follows that $w_i - v_i \geq 0$ for all $i \in I_u$, that is, that $v \leq w$. Having shown this, the existence of a unique solution $u_p$ ordered as $v \leq u_p \leq w$ is a direct consequence of Theorem 2.

P. AZUNRE
The following theorem is key for the construction of the convex/concave relaxations.

Theorem 4
Consider the following pair of scalar PDEs, for $i \in \{1, 2\}$:

Assume that, for some $p \in P$, the following inequality holds:

Interval bound
We next construct an interval bound on the solution to (6).

Theorem 5
Consider a pair of functions $u^L$ and $u^U$ satisfying the following inequality for all $i \in I_u$:

$$\mathcal{L}_i u^U_i(x,t) - f_i(z, x, t, p) \geq 0 \geq \mathcal{L}_i u^L_i(x,t) - f_i(\tilde{z}, x, t, p), \quad \forall (x,t) \in Q, \quad \forall p \in P, \quad \forall z, \tilde{z} \in [u^L, u^U] : z_i = u^U_i,\ \tilde{z}_i = u^L_i,$$
$$B_i u^U_i(x,t) \geq h_i(x,t,p) \geq B_i u^L_i(x,t), \quad \forall (x,t) \in \Gamma, \qquad u^U_i(x,0) \geq u_{i,0}(x,p) \geq u^L_i(x,0), \quad \forall x \in \Omega, \quad \forall p \in P, \quad (16)$$

with it being assumed that each $f_i$ is differentiable in the state on $(u^L, u^U)$ and Lipschitz in the state on $[u^L, u^U]$, that is, $\exists K_i \in \mathbb{R}_+$ such that

$$\big|f_i(z, x, t, p) - f_i(\tilde{z}, x, t, p)\big| \leq K_i \max_{j \in I_u} |z_j - \tilde{z}_j|, \quad \forall z, \tilde{z} \in [u^L, u^U], \quad \forall (x,t) \in Q, \quad \forall p \in P.$$

Then, there exists a unique solution $u_p$ to (6) for each $p \in P$, ordered as $u^L \leq u_p \leq u^U$.

Proof
For each p 2 P , u L and u U satisfy the hypotheses of Theorem 3.
Consider the following simple example application of this theorem.

Example 1
Consider the following scalar PDE for some $p \in P$:

$$\frac{\partial u_p}{\partial t}(x,t) - \frac{\partial^2 u_p}{\partial x^2}(x,t) = e^{p^3}, \quad \forall (x,t) \in Q,$$

with the boundary and initial conditions not being parameter dependent. An auxiliary system satisfying (16) is obtained as follows:

$$\frac{\partial u^U}{\partial t}(x,t) - \frac{\partial^2 u^U}{\partial x^2}(x,t) = e^{(p^U)^3}, \qquad \frac{\partial u^L}{\partial t}(x,t) - \frac{\partial^2 u^L}{\partial x^2}(x,t) = e^{(p^L)^3}, \quad \forall (x,t) \in Q,$$

with the same initial and boundary conditions as the original PDE. Here, we have used the fact that the source function is monotonically increasing in $p$ to deduce that

$$e^{(p^L)^3} \leq e^{p^3} \leq e^{(p^U)^3}, \quad \forall p \in P.$$

In general, standard interval arithmetic can be used to obtain an interval bound for the range of the right-hand side, thereby obtaining a valid auxiliary system satisfying (16), with a variety of software tools (e.g., INTLAB for MATLAB [22]) being available to automate the process. Note, however, that Taylor bounding methods [6, 23] may also be used, provided the right-hand side source functions are sufficiently differentiable, as required by Taylor arithmetic.
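For Example 1, the auxiliary bounding systems are PDEs with constant sources and can be solved by any standard method. The sketch below is an illustration with assumed data: homogeneous Dirichlet boundary conditions, a zero initial condition, the parameter interval $P = [-1, 1]$, and an explicit-Euler method-of-lines discretization stand in for the paper's CVODES-based setup. It solves the original PDE at a sample parameter and the two auxiliary systems, and the resulting solutions are sandwiched as Theorem 5 predicts.

```python
import numpy as np

def solve_heat(source, N=21, t_f=0.1, dt=1e-3):
    """Method-of-lines solution of u_t - u_xx = source (a constant) on (0, 1),
    with assumed homogeneous Dirichlet data and zero initial condition."""
    x = np.linspace(0.0, 1.0, N)
    dx = x[1] - x[0]
    u = np.zeros(N)
    for _ in range(int(round(t_f / dt))):
        lap = np.zeros(N)
        lap[1:-1] = (u[:-2] - 2.0 * u[1:-1] + u[2:]) / dx ** 2
        u = u + dt * (lap + source)        # three-point centered stencil
        u[0] = u[-1] = 0.0                 # Dirichlet boundary data
    return u

pL, pU = -1.0, 1.0                          # assumed parameter interval P
p = 0.3                                     # a sample parameter in P
u_lo = solve_heat(np.exp(pL ** 3))          # auxiliary system, source e^{(p^L)^3}
u_hi = solve_heat(np.exp(pU ** 3))          # auxiliary system, source e^{(p^U)^3}
u_p = solve_heat(np.exp(p ** 3))            # original PDE at the sample p
```

Because the explicit scheme is monotone in the source term, the computed profiles inherit the ordering $u^L \leq u_p \leq u^U$ pointwise on the grid.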

Relaxations
In this subsection, it is assumed that an interval bound $[u^L, u^U]$ has been constructed as specified in the previous subsection (this also establishes the existence of a unique solution $u_p$ for each $p \in P$). We are interested in constructing convex and concave relaxations of each $u_{i,p}$ on $P$, that is, a pair of functions $u^{cv}_{i,p}$ and $u^{cc}_{i,p}$ that are respectively convex and concave on $P$ pointwise in $\bar{Q}$, and that are respectively lower and upper bounds on $u_{i,p}$ pointwise in $\bar{Q}$ for each $p \in P$. The following theorem is used to achieve this.

Theorem 6
For each $p \in P$, consider a pair of functions $u^{cv}_p$ and $u^{cc}_p$ satisfying the following equality for all $i \in I_u$:

$$\mathcal{L}_i u^{cv}_{i,p}(x,t) = f^{CV}_i\big(u^{cv}_p(x,t), u^{cc}_p(x,t), x, t, p\big), \qquad \mathcal{L}_i u^{cc}_{i,p}(x,t) = f^{CC}_i\big(u^{cv}_p(x,t), u^{cc}_p(x,t), x, t, p\big), \quad \forall (x,t) \in Q. \quad (21)$$

Here, the superscripts $CV$ and $CC$ should be taken to mean that these functions are respectively valid convex relaxations and concave relaxations of the original right-hand side functions, provided $u^{cv}_p$ and $u^{cc}_p$ are respectively valid convex and concave relaxations of $u_p$. Moreover, assume that there exist $A^{CV}_i, A^{CC}_i \in \mathbb{R}_+$ such that

$$\big|f^{CV}_i(y_1, y_2, x, t, p) - f^{CV}_i(y_3, y_4, x, t, p)\big| \leq A^{CV}_i \big(\|y_1 - y_3\|_\infty + \|y_2 - y_4\|_\infty\big), \quad \forall (y_1, y_2, y_3, y_4) \in \mathbb{R}^{n_u} \times \mathbb{R}^{n_u} \times \mathbb{R}^{n_u} \times \mathbb{R}^{n_u},$$
$$\big|f^{CC}_i(y_1, y_2, x, t, p) - f^{CC}_i(y_3, y_4, x, t, p)\big| \leq A^{CC}_i \big(\|y_1 - y_3\|_\infty + \|y_2 - y_4\|_\infty\big), \quad \forall (y_1, y_2, y_3, y_4) \in \mathbb{R}^{n_u} \times \mathbb{R}^{n_u} \times \mathbb{R}^{n_u} \times \mathbb{R}^{n_u}.$$

Then, $u^{cv}_p$ and $u^{cc}_p$ are valid convex relaxations and concave relaxations of $u_p$, respectively.

Proof
For each $p \in P$, under the assumed global Lipschitz continuity of each $f^{CV}_i$ and $f^{CC}_i$ in $u^{cv}_p$ and $u^{cc}_p$, the sequence in $C^{2,1}(Q) \cap C(\bar{Q})$ with successive iterates defined for all $i \in I_u$ by

$$\mathcal{L}_i u^{cv,k+1}_{i,p}(x,t) = f^{CV}_i\big(u^{cv,k}_p(x,t), u^{cc,k}_p(x,t), x, t, p\big), \qquad \mathcal{L}_i u^{cc,k+1}_{i,p}(x,t) = f^{CC}_i\big(u^{cv,k}_p(x,t), u^{cc,k}_p(x,t), x, t, p\big), \quad \forall (x,t) \in Q,$$

converges to the unique solution $(u^{cv}_p, u^{cc}_p)$ of (21) from any initial estimate in $C^{2,1}(Q) \cap C(\bar{Q})$; see Theorem 8.9.1 in [10] for a proof. The formal reason behind this is that the mapping between successive iterates is a contraction mapping on the Banach space $C^{2,1}(Q) \cap C(\bar{Q})$ [10]. For every $p \in P$, choose $u^{cv,0}_p$ and $u^{cc,0}_p$ to be $u^L$ and $u^U$, respectively. Assume that the following inequalities hold at step $k$ for any distinct parameter pair $p_1, p_2 \in P$ and any $\lambda \in (0, 1)$:

$$u^{cv,k}_p \leq u_p \leq u^{cc,k}_p, \quad \forall p \in P,$$
$$u^{cv,k}_{\lambda p_1 + (1-\lambda) p_2} \leq \lambda\, u^{cv,k}_{p_1} + (1-\lambda)\, u^{cv,k}_{p_2}, \qquad u^{cc,k}_{\lambda p_1 + (1-\lambda) p_2} \geq \lambda\, u^{cc,k}_{p_1} + (1-\lambda)\, u^{cc,k}_{p_2}.$$

Note that these are valid at $k = 0$. These inequalities capture the fact that $u^{cv,k}_p$ and $u^{cc,k}_p$ are valid relaxations of $u_p$, and hence that $f^{CV}_i$ and $f^{CC}_i$ are valid relaxations of their respective functions at step $k$. Simultaneously, consider the following sequence:

Consider the following simple analytic example to fix these ideas. The purpose of this example is to allow the reader to get a better sense of how Theorem 6 may be used in practice.

Example 2
Consider the following function:

$$g(p) = e^{p^3} + u_p, \quad \forall p \in P = [-1, 1].$$

Here, $u_p$ is a function whose dependence on $p$ is not known explicitly, but whose range is known to be bounded on $P$ by $[u^L, u^U]$ and whose relaxations are known to be $u^{cv}_p$ and $u^{cc}_p$. Truncate these relaxations to lie on $[u^L, u^U]$ as $u^{CV}_p = \max(u^{cv}_p, u^L)$ and $u^{CC}_p = \min(u^{cc}_p, u^U)$ (recalling that the max and min functions preserve convexity and concavity, respectively). We are interested in obtaining a convex relaxation of $g$ on $P$, denoted $g^{CV}$. Consider the finite recursive composition

$$v_1 = p, \quad v_2 = v_1^3, \quad v_3 = e^{v_2}, \quad v_4 = u_p, \quad v_5 = v_3 + v_4,$$

with $v_5$ representing the original function $g$. Use the specified interval for the interval-valued independent variable $p$ to specify

$$[v_1^L, v_1^U] = [-1, 1].$$

Because $v_2$ is monotonically increasing in $v_1$, deduce

$$[v_2^L, v_2^U] = [(-1)^3, 1^3] = [-1, 1].$$

The αBB relaxation rule [27] can be used to obtain the following relaxations for $v_2$:

$$v_2^{cv}(p) = p^3 + 3(p^2 - 1), \qquad v_2^{cc}(p) = p^3 - 3(p^2 - 1).$$

Truncate these to lie on $[v_2^L, v_2^U]$ (recalling that the max and min functions preserve convexity and concavity, respectively) as follows:

$$v_2^{CV}(p) = \max\big(-1,\ p^3 + 3(p^2 - 1)\big), \qquad v_2^{CC}(p) = \min\big(1,\ p^3 - 3(p^2 - 1)\big).$$

Because the exponential function is convex, its convex relaxation on its domain is also the exponential function. Moreover, because it is a monotonically increasing function, it attains its infimum at the lower bound of its domain. Then, McCormick's composition theorem can be applied to relax $v_3$ as follows:

$$v_3^{CV}(p) = e^{\operatorname{mid}\left(v_2^{CV}(p),\ v_2^{CC}(p),\ -1\right)}.$$

Here, $\operatorname{mid}$ is just the median of the three input values. Thus, the convex relaxation of $v_5$ is as follows:

$$v_5^{CV}(p) = v_3^{CV}(p) + \max\big(u^{cv}_p, u^L\big).$$

In other words, the convex relaxation of $g$ as a function of $p$, $g^{CV}$, is given by the following:

$$g^{CV}(p) = e^{\operatorname{mid}\left(\max\left(-1,\ p^3 + 3(p^2 - 1)\right),\ \min\left(1,\ p^3 - 3(p^2 - 1)\right),\ -1\right)} + \max\big(u^{cv}_p, u^L\big).$$

Finally, consider the following PDE (with the function we relaxed previously as the source term) for each $p \in P = [-1, 1]$:

$$\frac{\partial u_p}{\partial t}(x,t) - \frac{\partial^2 u_p}{\partial x^2}(x,t) = e^{p^3} + u_p(x,t), \quad \forall (x,t) \in Q,$$

with initial and boundary conditions that do not carry parameter dependence, and assume that an interval bound $[u^L, u^U]$ has already been constructed for $u_p$ using Theorem 5. The convex relaxation of $u_p$ for each $p \in P$ can then be obtained by solving the following PDE:

$$\frac{\partial u^{cv}_p}{\partial t}(x,t) - \frac{\partial^2 u^{cv}_p}{\partial x^2}(x,t) = e^{\operatorname{mid}\left(\max\left(-1,\ p^3 + 3(p^2 - 1)\right),\ \min\left(1,\ p^3 - 3(p^2 - 1)\right),\ -1\right)} + \max\big(u^{cv}_p(x,t), u^L(x,t)\big), \quad \forall (x,t) \in Q.$$
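The explicit part of the relaxation in Example 2 can be checked numerically. The sketch below implements the convex relaxation of the source component $h(p) = e^{p^3}$ on $P = [-1, 1]$ obtained above via αBB and McCormick composition; the $u_p$ term is omitted because its relaxation $u^{cv}_p$ is problem dependent. Underestimation and midpoint convexity are then verified on a sample grid.

```python
import math

def mid(a, b, c):
    """Median of three values, as used in McCormick's composition theorem."""
    return sorted((a, b, c))[1]

def h(p):
    """The explicit part of g from Example 2: h(p) = e^{p^3}."""
    return math.exp(p ** 3)

def h_cv(p):
    """Convex relaxation of e^{p^3} on P = [-1, 1]: alphaBB relaxations of
    p^3 (alpha = 3), truncated to [-1, 1], composed with the increasing
    convex exponential, which attains its infimum at -1."""
    v2_cv = max(-1.0, p ** 3 + 3.0 * (p ** 2 - 1.0))   # truncated underestimator
    v2_cc = min(1.0, p ** 3 - 3.0 * (p ** 2 - 1.0))    # truncated overestimator
    return math.exp(mid(v2_cv, v2_cc, -1.0))
```

On any grid of parameter values, `h_cv(p)` lies below `h(p)`, and its value at a midpoint lies below the average of its endpoint values, as convexity requires.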

The case of state spatial derivatives coupled to parameter dependence
When parameter dependence is coupled to the state spatial derivatives, that is, when the elliptic operator in (1) takes the following form for each $p \in P$:

$$L_i u_{i,p}(x,t) \equiv a_i(x,t,p)\,\frac{\partial^2 u_{i,p}}{\partial x^2}(x,t) + b_i(x,t,p)\,\frac{\partial u_{i,p}}{\partial x}(x,t), \quad \forall (x,t) \in Q,$$

the construction of constant bounds does not change much; that is, one solves the PDE system defined by the inequalities (41) for all $i \in I_u$ in place of (16). To verify that this is true, one need only recognize that, for each $p \in P$, $u^L$ and $u^U$ satisfy the hypotheses of Theorem 3. This case is important for reaction-diffusion systems because optimizing over diffusion coefficients is a very important use case for this class of problems. However, note that the development of separate convex/concave relaxations for this case is less important, because one can always perform a two-stage optimization: first over the diffusion coefficients using the system (41), and then over the parameters in the source term using the system (21), if nonquasimonotonicity calls for it. In the unlikely scenario that a reaction-diffusion system possesses a source term dependent on the diffusion coefficients (the author of this work is yet to see a physical example), the final chapter of the author's dissertation [28] conjectures a way to extend the system (21) to this case.

NUMERICAL DEMONSTRATION
First, a numerical note. Systems are solved using the method of lines, discretizing them on the spatial domain using the three-point centered finite difference scheme on a uniform grid of spatial nodes and integrating the resulting coupled ODE system forward in time using the C++ CVODES ODE solver [29]. McCormick relaxations for the right-hand sides are constructed by the open-source C++ library libMC [26]. Interval arithmetic is performed using a combination of libMC and INTLAB. The local optimizer used is the fmincon optimizer in MATLAB (which uses the trust-region-reflective method). All C++ code was linked to MATLAB using its MEX interface. Note that because the source function in the example is polynomial in the state, the hypotheses of Theorem 5 are easy to verify.
The following example involves a semilinear parabolic PDE in two variables, coupled through a nonquasimonotone function, so relaxations are constructed along with the interval bounds. It is a modified form of the ODE example in [5].
$$\frac{\partial u_{2,p}}{\partial t}(x,t) - \frac{\partial^2 u_{2,p}}{\partial x^2}(x,t) = p_2\, u_{1,p}(x,t)\, u_{2,p}(x,t) + u_{2,p}(x,t), \quad \forall (x,t) \in Q. \quad (44)$$

Figure 3. Objective and its bounds for the numerical example in (A) on the left (the objective is red, the interval lower bound is black, the interval upper bound is magenta, the convex lower bound is green, and the concave upper bound is blue). In (B) on the right, a slice (where $p_1 = p_2$) through the objective helps visualize its nonconvexity.

The initial conditions are specified as a line between the boundary values. $P$ is specified as $[1, 7] \times [1, 7]$, and $\bar{Q}_M$ is specified by the uniform spatial grid of 100 points on which the PDE is solved. The objective is visualized on $P$, along with its interval (constant) bounds and relaxations, in Figure 3. We see that all bounds are valid and that the relaxations are significantly tighter than the interval bounds.
Next, we numerically study the convergence of both types of bounds to the solution at the midpoint of $P$ in the Hausdorff metric, that is, the evolution of the widths of the bounds evaluated on the parameter interval $[p_{mpt} - \varepsilon_{HM},\ p_{mpt} + \varepsilon_{HM}]$ as $\varepsilon_{HM}$ varies from 3 to 0. Note that $p_{mpt}$ is just the midpoint of $P$, that is, $(4, 4)$. A similar study was carried out by Bompadre and Mitsos in [30] for McCormick models. The evolution of these quantities can be seen in Figure 4, which suggests linear convergence for the interval bounds and quadratic convergence for the relaxations constructed using generalized McCormick relaxations. Finally, we show a sample run of the branch-and-bound algorithm on the problem. The absolute convergence tolerance was chosen as $10^{-4}$, upper bounds were computed via local optimization of the original objective function, and lower bounds were computed via local optimization of the objective function's convex relaxation. We see in Figure 5 that the bounds converge sufficiently close within 14 iterations. The optimal solution is 1.0499, which is attained at $(1, 7)$.
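The empirical convergence orders reported above can be extracted from such a study by a log-log fit of bound width against $\varepsilon_{HM}$. The sketch below shows the estimator only; the width sequences are synthetic stand-ins (decaying linearly and quadratically by construction), not the measured Hausdorff distances of Figure 4.

```python
import math

def observed_order(eps, widths):
    """Least-squares slope of log(width) against log(eps): the empirical
    convergence order of a bound whose width behaves like C * eps^q."""
    xs = [math.log(e) for e in eps]
    ys = [math.log(w) for w in widths]
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    num = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    den = sum((x - xbar) ** 2 for x in xs)
    return num / den

# Synthetic width sequences (assumed, for illustration): linear decay mimics
# interval-bound behavior, quadratic decay mimics relaxation behavior.
eps = [3.0 / 2 ** k for k in range(6)]
w_interval = [0.7 * e for e in eps]
w_relax = [0.2 * e ** 2 for e in eps]
```

Applied to the two synthetic sequences, the estimator recovers orders of 1 and 2, matching the behavior the paper observes for the interval bounds and relaxations, respectively.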