Linear and nonlinear solvers for variational phase-field models of brittle fracture

The variational approach to fracture is effective for simulating the nucleation and propagation of complex crack patterns, but is computationally demanding. The model is a strongly nonlinear, non-convex variational inequality that requires the resolution of small length scales. The current standard algorithm for its solution, alternate minimization, is robust but converges slowly and demands the solution of large, ill-conditioned linear subproblems. In this paper, we propose several advances in the numerical solution of this model that improve its computational efficiency. We reformulate alternate minimization as a nonlinear Gauss-Seidel iteration and employ over-relaxation to accelerate its convergence; we compose this accelerated alternate minimization with Newton's method, to further reduce the time to solution; and we formulate efficient preconditioners for the solution of the linear subproblems arising in both alternate minimization and in Newton's method. We investigate the improvements in efficiency on several examples from the literature; the new solver is 5--6$\times$ faster on a majority of the test cases.


INTRODUCTION
Cracks may be regarded as surfaces across which the displacement field may be discontinuous. Fracture mechanics studies the nucleation and propagation of cracks inside a solid structure. Variational formulations recast this fundamental and difficult problem of solid mechanics as an optimization problem. The variational framework naturally leads to regularized phase-field formulations based on a smeared description of the discontinuities. These methods are attracting increasing interest in computational mechanics. The aim of our work is to propose several improvements in the linear and nonlinear solvers used in this framework.
The code for the algorithms proposed in this paper, and the thermal shock example of section 4.3, are included as supplementary material.

Variational formulation of fracture and gradient damage models
The variational approach to fracture proposed by Francfort and Marigo [1] formulates brittle fracture as the minimization of an energy functional that is the sum of the elastic energy of the cracked solid and the energy dissipated in the crack. The simplest fracture mechanics model, due to Griffith [2], assumes that the cracked solid Ω \ Γ is linear elastic and that the surface energy is proportional to the measure of the cracked surface Γ. The crack energy per unit area is the fracture toughness G_c, a material constant. In this case the energy functional to be minimized is

E(u, Γ) = ∫_{Ω\Γ} ½ A_0 ε(u) · ε(u) dx + G_c S(Γ),   (1)

where u is a vector-valued displacement field, ε(u) = sym(∇u) is the second-order tensor of linearised strains inside the material, A_0 is the fourth-order elasticity tensor of the uncracked solid, and S(Γ) is the Hausdorff surface measure of the crack set Γ. In a quasi-static time-discrete setting, given an initial crack set Γ_0, the cracked state of the solid can be found by incrementally solving the following unilateral minimization problem [3,1]:

(u_i, Γ_i) = arg min { E(u, Γ) : u ∈ C_{ū_i}(Ω \ Γ), Γ ⊇ Γ_{i−1} },   (2)

where C_ū(Ω) ≡ {u ∈ H¹(Ω, Rⁿ) : u = ū on ∂_ū Ω} is the space of admissible displacements, ∂_ū Ω is the part of the boundary where the Dirichlet conditions are prescribed, and H¹(Ω \ Γ, Rⁿ) denotes the Sobolev space of vector fields defined on Ω \ Γ with values in Rⁿ. The minimization problem above is labelled unilateral because the crack set cannot decrease in time. The problem is quasi-static and rate-independent, so that time enters only through the irreversibility constraint. The numerical solution of the free-discontinuity problem [4] above is prohibitive, because of the difficulty of discretizing the unknown crack set Γ, across which the displacement may jump.
To bypass this issue, Bourdin et al. [5] transposed to fracture mechanics a regularization strategy introduced by Ambrosio and Tortorelli [6] for free-discontinuity problems arising in image segmentation [7]. The regularized model approaches the solution of (2) by the solution of

arg min { E(u, α) : u ∈ C_ū(Ω), α ∈ D_{α_{i−1}} },   (4)

with the regularized energy functional

E(u, α) = ∫_Ω ½ a(α) A_0 ε(u) · ε(u) dx + (G_c / c_w) ∫_Ω ( w(α)/ℓ + ℓ |∇α|² ) dx,   (5)

where c_w = 4 ∫₀¹ √w(α) dα. In this formulation α is a smooth scalar field, which can be interpreted as damage, and ℓ is an additional parameter controlling the localization of α. With α_{i−1} the solution at the previous time step, and denoting by ∂_ᾱ Ω the part of the boundary where Dirichlet conditions ᾱ are prescribed on α, the admissible space for α is the convex cone imposing the unilateral box constraint

D_{α_{i−1}}(Ω) ≡ {α ∈ H¹(Ω, R) : α_{i−1} ≤ α ≤ 1 a.e. in Ω, α = ᾱ on ∂_ᾱ Ω}

[10,11]. The α field is a damage variable that modulates the elastic stiffness and introduces an energy dissipation. In this context, (5) can be regarded as a model per se for the evolution of damage in the material, and one can associate a physical meaning to the internal length ℓ and to evolutions following local minima of the functional (5). In particular, one can show that in quasi-static evolutions governed by a local minimality condition, the internal length ℓ acts as a constitutive parameter that controls the critical stress of the material before failure. We refer the reader to [11] and Section 4 for further details on this point. Whilst the original model of Bourdin et al. [5] assumes small deformations, isotropic materials and quasi-static evolution, and allows for interpenetration of crack lips in compression, recent contributions include extensions to dynamics [12,13,14], multiphysics couplings [15,16], anisotropic materials [17], large elastic deformations [18,19,20], cohesive fracture [21], compressive failure with unilateral contact at crack lips [22,23,24,25,26], plates and shells [27,28], and thin films [29].
Other works [11,10,30,31] discuss, by analytical and numerical investigations, how the choice of the functions a and w in (5) affects the properties of the solutions. On the basis of these results, recent numerical works [see e.g. 32] adopt the choice w(α) = α and a(α) = (1 − α)² + k, with k a small residual stiffness, which we also employ in the rest of this paper. We refer the reader to [11] for a comparative analysis of this model and the original model of Bourdin et al. [5].
In the remainder of this paper, we discuss the numerical solution of the minimization problem (4) after a standard finite element discretisation. We focus on the simplest model, neglecting the effect of geometrical non-linearities and the non-symmetric behaviour of fracture in traction and compression. More complex physical effects drastically modify the character of the numerical problems to be solved and require further problem-specific developments that are outside the scope of this work.

The first-order optimality conditions of (4) take the form of the variational inequality system

E_u(u, α)[v] = 0 for all admissible v,   E_α(u, α)[β − α] ≥ 0 for all admissible β ≥ α_{i−1},   (7)

where E_u and E_α denote the partial directional derivatives of the energy (5) with respect to u and α, and E_uu, E_uα and E_αα denote the corresponding second derivatives. Note that the bilinear form E_uu is akin to a standard linear elasticity problem, and that the bilinear form E_αα is akin to a Helmholtz problem. The most popular algorithm for the solution of the system (7) is the alternate minimization method proposed by Bourdin et al. [5]. This algorithm rests on the observation that while the minimization problem (4) is nonconvex, the functional is convex separately in u and in α when the other variable is fixed. Alternate minimization consists of alternately fixing u or α and solving the resulting smaller minimization problem, iterating until convergence. At each iteration before convergence the optimization subproblem has a unique solution with a lower energy, and thus the algorithm converges monotonically to a stationary point [33]. The algorithm is detailed in Algorithm 1.

Algorithm 1: Standard alternate minimization
Result: A stationary point of (4).
Given (u_{i−1}, α_{i−1}), the state at the previous loading step.
Set k ← 0, α⁰ ← α_{i−1}.
while not converged do
    u^{k+1} ← arg min { E(u, α^k) : u ∈ C_{ū_i}(Ω) }
    α^{k+1} ← arg min { E(u^{k+1}, α) : α ∈ D_{α_{i−1}}(Ω) }
    k ← k + 1
end
The first subproblem, finding the updated displacement given a fixed damage, involves solving a standard linear elasticity problem, but with a strongly spatially varying stiffness parameter. The second subproblem, finding the updated damage given a fixed displacement, involves solving a variational inequality where the Jacobian is a generalized Helmholtz problem, again with spatially varying coefficients. The standard termination criterion used in [5] is to stop when the change in the damage field drops below a certain tolerance. Another approach [26] is to stop based on a normalized change in the energy. Miehe et al. [34] perform a single alternate minimization iteration and propose the use of an adaptive time-stepping.
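The structure of Algorithm 1 can be illustrated on a toy problem. The sketch below (not the authors' implementation) applies alternate minimization to a two-element 1D bar with an AT1-type energy (a(α) = (1 − α)², w(α) = α), with the gradient term omitted for brevity; all parameter values are illustrative.

```python
import numpy as np

# Toy two-element 1D bar: elements of length h, end displacements 0 and t.
# Illustrative parameters (not from the paper).
E0, Gc, ell, h, t = 1.0, 1.0, 0.1, 0.5, 3.0

def strains(u1):
    """Element strains given the middle-node displacement u1."""
    return np.array([(u1 - 0.0) / h, (t - u1) / h])

def energy(u1, a):
    eps = strains(u1)
    return np.sum(0.5 * (1 - a)**2 * E0 * eps**2 * h + 3 * Gc * a * h / (8 * ell))

def solve_u(a):
    """Displacement step: linear 'elasticity' solve with the damage a frozen."""
    k = (1 - a)**2 * E0 / h                      # element stiffnesses
    return k[1] * t / (k[0] + k[1])              # equilibrium of the middle node

def solve_alpha(u1, a_prev):
    """Damage step: pointwise minimization over a in [a_prev, 1]
    (stationarity (1 - a) E0 eps^2 = 3 Gc / (8 ell), then clip)."""
    eps2 = np.maximum(strains(u1)**2, 1e-30)
    return np.clip(1 - 3 * Gc / (8 * ell * E0 * eps2), a_prev, 1.0)

def alternate_minimization(a_prev, tol=1e-8, maxit=1000):
    a = a_prev.copy()
    for it in range(1, maxit + 1):
        u1 = solve_u(a)
        a_new = solve_alpha(u1, a_prev)
        if np.max(np.abs(a_new - a)) < tol:      # change-in-damage criterion
            return u1, a_new, it
        a = a_new
    return u1, a, maxit
```

For the symmetric data above the loop converges in a couple of iterations to a homogeneous damage state; on large discretized problems the same iteration can take hundreds of steps, which motivates the accelerations of section 2.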
The main drawback of alternate minimization is its slow convergence rate. This motivates the development of alternative approaches using variants of Newton's method for variational inequalities [10], such as active set [35] or semismooth Newton methods [36]. Newton's method is quadratically convergent close to a solution, but its convergence is erratic when a poor initial guess is supplied [37,38]. Numerical experience indicates that Newton's method alone does not converge unless extremely small continuation steps are taken. Recent attempts to address these convergence issues include the use of continuation methods [39] or globalization devices such as line searches and trust regions [40]. Moreover, Newton-type methods result in a large system of linear equations to be solved at each iteration; in prior work direct methods have been employed, limiting the scalability of the approach.

In this work we make several contributions to the solution of (4). First, we cheaply accelerate alternate minimization by interpreting it as a nonlinear Gauss-Seidel method and applying over-relaxation. Second, we further reduce the time to solution by composing alternate minimization with an active-set Newton method, in such a way that the composition inherits the robustness of alternate minimization and the asymptotic quadratic convergence of Newton-type methods. Third, we design scalable linear solvers for both the alternate minimization subproblems and the coupled Jacobians arising in active-set Newton methods.
Following Bourdin et al. [5], we spatially discretize the problem with standard piecewise linear finite elements on unstructured simplicial meshes [41]. This discretization converges to local minimizers of the Ambrosio-Tortorelli functional [42]. Alternatives proposed in the literature, but not considered here, include isogeometric approaches [13]. Adaptive remeshing is a valuable method of improving computational efficiency [43]. The presence of thin localisation bands renders anisotropic remeshing strategies [44] particularly attractive. Our work on the linear and nonlinear solvers is potentially synergistic with these other efforts to improve computational efficiency.
The paper is organised as follows. Section 2 presents our improved nonlinear solver. The underlying linear solvers and preconditioners are discussed in section 3. Section 4 introduces three fundamental test problems that we use to assess the performance of the solvers. The results of the corresponding numerical experiments are reported in section 5. Finally, we conclude in section 6.

NONLINEAR SOLVERS
In this section we propose several improvements to the nonlinear solver employed for the minimization of the regularized energy functional (4). The first improvement is to reinterpret alternate minimization as a nonlinear Gauss-Seidel iteration: this naturally suggests employing an over-relaxed Gauss-Seidel approach, which we discuss in section 2.1. This over-relaxation greatly reduces the number of iterations required for convergence, with minimal computational overhead. The second improvement is to use alternate minimization as a preconditioner for Newton's method [45], as discussed in section 2.3. By combining these, our solver enjoys the robust convergence of (over-relaxed) alternate minimization and the rapid convergence of Newton's method. Alternate minimization is used to drive the approximation within the basin of convergence of Newton's method; once this is achieved, Newton's method takes over and solves the nonlinear problem to convergence in a handful of iterations. As our numerical experiments in section 5 demonstrate, this strategy is faster than relying on alternate minimization alone, even with over-relaxation.

Over-relaxed alternate minimization
In the block Gauss-Seidel relaxation method for linear systems, the solution variables are partitioned into subsets; at each iteration, some variables are frozen and a linear subproblem is solved for the remaining free variables, and the updated values are then used in the solution of the next subset. Similarly, a nonlinear block Gauss-Seidel relaxation first solves a nonlinear subproblem for one subset of the variables, then uses those updated values to solve for the next subset, and so on [46]. Alternate minimization is precisely a nonlinear block Gauss-Seidel method that iterates between the displacement and damage variables. Just as over-relaxation can accelerate linear Gauss-Seidel [47], it can also accelerate nonlinear Gauss-Seidel [46]. We therefore augment the standard alternate minimization algorithm with a simple over-relaxation approach, Algorithm 2. The states before and after each alternate minimization substep are compared to determine the update direction, and over-relaxation is applied along that direction with relaxation parameter ω. In the damage step, the bound constraint on α is enforced during the line search: if a step with the current relaxation parameter would be infeasible, the algorithm sets the parameter to the midpoint of [1, ω], and repeats this recursively until the update to α remains feasible. (The question of infeasibility does not arise for ω < 1.)

The literature on over-relaxation methods is vast, and we briefly summarise the main points here. In linear successive over-relaxation (SOR) applied to a matrix A, the convergence depends on the spectral radius ρ(M_ω) of the SOR iteration matrix

M_ω = (D − ωL)⁻¹ ((1 − ω)D + ωU),

where D, −L and −U are the diagonal, strictly lower triangular and strictly upper triangular parts of A. Essentially, over-relaxation attempts to choose an ω that reduces ρ(M_ω). A similar result holds for block SOR [48]. Kahan [49] proved that ω ∈ (0, 2) is a necessary condition for the convergence of SOR, i.e. for ρ(M_ω) < 1. Ostrowski [50] proved that this condition is also sufficient in the case where A is symmetric and positive-definite. For nonlinear SOR, Ortega and Rheinboldt [46, Theorem 10.3.5] proved the perhaps surprising result that the asymptotic convergence rate is governed by the spectral radius of the SOR iteration matrix associated with the Jacobian of the residual at the solution. Nonlinear Gauss-Seidel methods (ω = 1) can also be extended to minimisation problems with constraints, under the name of block coordinate descent. We are not aware of any analysis of over-relaxation in the presence of constraints, or in the infinite-dimensional setting, as would be necessary here for a proof of convergence; however, the numerical experiments of section 5 demonstrate that convergence was achieved for all problems with all values of ω ∈ (0, 2) attempted, and that over-relaxation can significantly reduce the number of iterations required on difficult problems.
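The feasibility-preserving relaxation step described for Algorithm 2 can be sketched as follows (a minimal sketch assuming simple box bounds; the helper name is ours, not the paper's):

```python
import numpy as np

def relaxed_update(x_old, x_new, omega, lower, upper, max_bisect=60):
    """Over-relaxed Gauss-Seidel update x_old + w (x_new - x_old).
    If the step with w = omega leaves the box [lower, upper], w is set to
    the midpoint of [1, w], recursively, until the update is feasible;
    w = 1 recovers the plain substep solution x_new, which is feasible
    by construction (sketch of the step described for Algorithm 2)."""
    w = omega
    for _ in range(max_bisect):
        x = x_old + w * (x_new - x_old)
        if np.all(x >= lower) and np.all(x <= upper):
            return x
        w = 0.5 * (1.0 + w)                      # midpoint of [1, w]
    return x_new                                 # fallback: unrelaxed update
```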

Choosing the relaxation parameter ω
The number of iterations required depends sensitively on the choice of ω. Extrapolating from the nonlinear SOR theory, we hypothesize that the optimal ω is the one that minimizes the spectral radius of the SOR iteration matrix associated with the unconstrained degrees of freedom at the minimizer. Unfortunately, identifying this ω a priori appears difficult: such an analysis would rely on the spectral properties of the Hessian at the unknown minimizer [51], which are not known in general. In this work we rely on the naïve strategy of numerical experimentation on coarser problems, and defer an automated scheme for choosing ω to future work.
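For intuition, the linear case can be explored numerically. The sketch below (illustrative, not from the paper) computes ρ(M_ω) for a 1D Laplacian, for which the matrix is consistently ordered and Young's classical formula gives the optimal ω in closed form.

```python
import numpy as np

def sor_spectral_radius(A, omega):
    """Spectral radius of M = (D - w L)^{-1} ((1 - w) D + w U), with
    D, -L, -U the diagonal and strictly triangular parts of A."""
    D = np.diag(np.diag(A))
    L = -np.tril(A, -1)
    U = -np.triu(A, 1)
    M = np.linalg.solve(D - omega * L, (1 - omega) * D + omega * U)
    return np.max(np.abs(np.linalg.eigvals(M)))

# 1D Laplacian: consistently ordered, so Young's formula applies
n = 20
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
rho_jacobi = np.max(np.abs(np.linalg.eigvals(np.eye(n) - A / 2.0)))
omega_opt = 2.0 / (1.0 + np.sqrt(1.0 - rho_jacobi**2))   # Young's optimal omega
```

Here ρ(M_ω) at omega_opt is markedly smaller than at ω = 1; for the nonlinear problem the analogous matrix involves the unknown Hessian at the minimizer, which is why we resort to experimentation.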

Composing over-relaxed alternate minimization with Newton
Even with over-relaxation, achieving tight convergence of the optimization problem takes an impractical number of iterations (on the order of hundreds or thousands for difficult problems). Therefore, instead of driving the optimization problem to convergence with over-relaxed alternate minimization (ORAM), we use it to bring the iterate within the basin of convergence of a Newton-type method, Algorithm 3. Two main problems arise in designing such a composite solver: first, deciding when to switch from ORAM to Newton, and second, handling the possible failure of the Newton-type method.

The inner termination criterion for the over-relaxed alternate minimization used in this work is based on the norm of the residual of the optimality conditions (7). As the optimality conditions are a variational inequality, it is not sufficient to merely evaluate a norm of [E u , E α ], because the feasibility condition should enter in the termination criterion. Instead, the residual is defined via a so-called nonlinear complementarity problem (NCP) function Φ: a function which is zero if and only if the variational inequality is satisfied [52]. In this work we use the Fischer-Burmeister NCP-function, which is described in section 2.4. A typical inner termination criterion might be to switch when the norm of the Fischer-Burmeister residual has decreased by two orders of magnitude, although the choice taken should vary with the difficulty of the problem considered. If the inner tolerance is too tight, an excessive number of alternate minimization iterations will be performed before switching to Newton; if the inner tolerance is too loose, then the Newton iteration may not converge and the extra cost of solving Jacobians yields no advantage.
If the Newton-type method diverges (possibly significantly increasing the residual), it can be handled in one of two ways. The first is to check at the end of an outer iteration whether Newton's method reduced the residual: if not, discard the result of Newton and continue with more alternate minimization iterations. The second is to choose a Newton-type method that is guaranteed to monotonically decrease the norm of the residual, or to terminate with failure: this property is achieved by complementing the Newton iteration with a backtracking line search. This latter option was implemented in our experiments. If the Newton method fails to achieve a sufficient reduction, the outer composite solver simply reverts to alternate minimization to bring the solution closer to the basin of convergence. In this way, the robustness and monotonic convergence of alternate minimization is combined with the quadratic asymptotic convergence of Newton's method.
Algorithm 3: Composition of over-relaxed alternate minimization and Newton
Result: A stationary point of (4).
Given (u_{i−1}, α_{i−1}), the state at the previous loading step.
while not converged do
    Set Φ₀ to be the norm of the residual of the optimality conditions evaluated at (u^k, α^k).
    while not converged and the inner termination criterion is not met do
        Apply over-relaxed alternate minimization, Algorithm 2.
    end
    while not converged and maximum iterations not reached do
        Apply an active-set Newton method with backtracking line search, such as Algorithm 4.
    end
end
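The control flow of this composition can be illustrated on a scalar model problem (illustrative only: f(x) = arctan x stands in for the optimality residual, and a damped fixed-point iteration stands in for ORAM; all names are ours):

```python
import math

def f(x):  return math.atan(x)       # model residual; root at x = 0
def df(x): return 1.0 / (1.0 + x * x)

def robust_step(x):
    """Slow but globally convergent iteration (stand-in for ORAM)."""
    return x - 0.5 * f(x)

def newton_with_linesearch(x, maxit=20, tol=1e-12):
    """Newton with backtracking: accept a step only if |f| decreases,
    otherwise report failure."""
    for _ in range(maxit):
        if abs(f(x)) < tol:
            return x, True
        d, mu = -f(x) / df(x), 1.0
        while abs(f(x + mu * d)) >= abs(f(x)):
            mu *= 0.5
            if mu < 1e-10:
                return x, False                  # line search failed
        x = x + mu * d
    return x, abs(f(x)) < tol

def composite_solve(x, theta=0.5, tol=1e-12):
    """Outer loop of the composition: robust steps until the residual has
    dropped by the factor theta, then hand over to Newton; on Newton
    failure, fall back to more robust steps."""
    while abs(f(x)) >= tol:
        phi0 = abs(f(x))
        while abs(f(x)) > theta * phi0:          # inner 'ORAM' phase
            x = robust_step(x)
        x, ok = newton_with_linesearch(x, tol=tol)
        if not ok:
            x = robust_step(x)                   # recover, then try again
    return x
```

Plain Newton on this model diverges from x₀ = 10, while the composite solver converges: the robust phase contracts the iterate into the Newton basin, after which a handful of Newton steps finish the solve.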

Reduced-space active set method
Both the damage subproblem and the subproblem to be solved at each coupled Newton iteration are variational inequalities, which when discretized yield complementarity problems. In this section we briefly review the Newton-type method used to solve these complementarity problems, a reduced-space active set method implemented in PETSc [53]. While semismooth Newton methods have gained significant popularity in recent years, the reduced-space method employed in this work makes devising preconditioners for the linear system to be solved in section 3.3 more straightforward.
A mixed complementarity problem MCP(F, l, u) is defined by a residual F : Rⁿ → Rⁿ, a lower bound vector l ∈ (R ∪ {−∞})ⁿ, and an upper bound vector u ∈ (R ∪ {+∞})ⁿ. A vector x ∈ Rⁿ solves the MCP if, for each index i, precisely one of the following conditions holds:

x_i = l_i and F_i(x) ≥ 0;   l_i < x_i < u_i and F_i(x) = 0;   x_i = u_i and F_i(x) ≤ 0.

A special case of a mixed complementarity problem is the choice l = 0, u = +∞, which is referred to as a nonlinear complementarity problem, NCP(F). For clarity, the algorithm will be described in the context of NCPs; the extension to MCPs is straightforward [52].
An NCP-function φ : R × R → R is a function with the property that

φ(a, b) = 0 ⟺ a ≥ 0, b ≥ 0, ab = 0,

i.e. the roots of φ are exactly the pairs satisfying the complementarity conditions. An example is the Fischer-Burmeister function [54]

φ(a, b) = √(a² + b²) − a − b.

NCP-functions are useful because they allow an NCP to be reformulated as a rootfinding problem.
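A direct transcription of the Fischer-Burmeister function and the associated NCP residual (a sketch; the sign convention φ = √(a² + b²) − a − b is assumed):

```python
import numpy as np

def fischer_burmeister(a, b):
    """phi(a, b) = sqrt(a^2 + b^2) - a - b: zero exactly when
    a >= 0, b >= 0 and a b = 0 (the complementarity conditions)."""
    return np.sqrt(a * a + b * b) - a - b

def ncp_residual(x, F):
    """Componentwise residual Phi_i(x) = phi(x_i, F_i(x)) of NCP(F);
    x solves the NCP iff this vector vanishes."""
    return fischer_burmeister(x, F(x))
```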
Given an NCP-function φ, one can define a residual Φ of NCP(F) componentwise by Φ_i(x) = φ(x_i, F_i(x)): a solution of NCP(F) is then exactly a root of Φ. At each iteration, the reduced-space active set method constructs a search direction d, defined differently for the active and inactive components of the state. Given an iterate x and a fixed zero tolerance ζ > 0, define the active and inactive sets

A(x) = {i : x_i ≤ ζ and F_i(x) > 0},   N(x) = {1, …, n} \ A(x).   (15)

The active components are those judged to lie on their bound at the solution; the Newton step is computed only for the inactive components, by solving the reduced linear system

F′(x)_{N,N} d_N = −F_N(x)   (16)

for d_N. The step is combined with a projection π onto the feasible set and a line search, as detailed in Algorithm 4.

Algorithm 4: Reduced-space active set method.
Result: A solution of NCP(F).
Given x₀, the initial guess.
while not converged do
    Compute the active and inactive sets A and N via (15).
    Solve the reduced Newton step (16) for d_N.
    Choose the step length μ such that ‖Φ‖₂ is minimized, via a line search on π(x^k + μd); if this search direction fails, use the steepest descent direction instead.
end
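The active-set/reduced-step structure can be sketched on a small linear complementarity problem (a sketch under our own simplifying choices: active components are simply moved onto the bound and a plain projection replaces the line search, whereas the PETSc implementation is more sophisticated):

```python
import numpy as np

def reduced_space_iteration(x, F, J, zeta=1e-12, lower=0.0):
    """One iteration of a reduced-space active set sketch for NCP(F):
    active components (at the bound with positive residual) are moved
    onto the bound; a Newton step is solved on the inactive block only."""
    Fx = F(x)
    active = (x <= lower + zeta) & (Fx > 0)
    inactive = ~active
    d = np.zeros_like(x)
    if inactive.any():
        Jnn = J(x)[np.ix_(inactive, inactive)]
        d[inactive] = np.linalg.solve(Jnn, -Fx[inactive])
    d[active] = lower - x[active]
    return np.maximum(x + d, lower)              # projection onto feasibility

# Demo: linear complementarity problem F(x) = M x + q with SPD M
M = np.array([[2.0, 1.0], [1.0, 2.0]])
q = np.array([-1.0, 1.0])
F = lambda x: M @ x + q
J = lambda x: M

x = np.array([1.0, 1.0])
for _ in range(10):
    x = reduced_space_iteration(x, F, J)
```

For this example the iteration settles on x = (0.5, 0): the first component is inactive with zero residual, the second sits on its bound with positive residual.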

LINEAR SOLVERS AND PRECONDITIONERS
With this configuration of nonlinear solvers, there are three linear subproblems to be solved: linear elasticity for the displacement field with fixed damage; a Helmholtz-like operator for the damage field with fixed displacement; and (submatrices of) the coupled Jacobian of the optimality conditions. As it is desirable to solve finely-discretized problems on supercomputers, it is important to choose scalable iterative solvers and preconditioners for each subproblem. These are discussed in turn.

Linear elastic subproblem
The linear elastic problem is symmetric and positive-definite, and hence, the method of conjugate gradients [56] is used. However, the problem is poorly conditioned due to the strong variation in stiffness induced by damage localization, and appropriate preconditioners must be employed. The Krylov solver is preconditioned by the GAMG smoothed aggregation algebraic multigrid preconditioner [57], which is known to be extremely efficient for large-scale elasticity problems.
For algebraic multigrid to be efficient, it is essential to supply the algorithm with the near-nullspace of the operator, i.e. the eigenvectors associated with eigenvalues of small magnitude [58]. The elasticity problem without damage has a near-nullspace consisting of the rigid body modes of the structure; with damage, the localized variation in stiffness induces additional near-nullspace vectors. Calculations of the smallest eigenmodes of the elasticity operator with SLEPc [59] indicate that if the structure is partitioned into two or more undamaged regions separated by damaged regions, the elasticity operator has additional near-nullspace vectors associated with independent rigid body motions of the separate regions. For example, suppose algebraic multigrid alone (with no Krylov accelerator) is used to solve the elasticity problem arising from the converged damage field of the bar traction problem (section 4.2). With no nullspace configured, convergence is achieved in 2004 multigrid V-cycles; if only the rigid body modes of the entire structure are supplied, convergence is achieved in 50 V-cycles; and if the additional near-nullspace vectors corresponding to the partition are also supplied, convergence is achieved in 6 V-cycles.
While these additional near-nullspace vectors assist the convergence of the algebraic multigrid algorithm, they are very difficult to compute, as they depend on the damage field itself. Therefore, in this work we do not supply them, providing only the rigid body modes of the entire structure. When a Krylov method is used to accelerate the convergence of the algebraic multigrid, the ratio of iteration counts between the full and partial near-nullspaces decreases from approximately 10 to approximately 2. Nevertheless, it may be possible to improve the convergence of the linear elasticity solve by approximating the additional near-nullspace vectors arising from damage. This could be of significant benefit, as this phase constitutes a large proportion of the solver time.
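The rigid body modes supplied to the multigrid algorithm can be assembled directly from the nodal coordinates; a minimal 2D sketch follows (the resulting vectors would then be attached to the operator, e.g. via PETSc's MatSetNearNullSpace):

```python
import numpy as np

def rigid_body_modes_2d(coords):
    """Near-nullspace of 2D elasticity: two translations and one
    infinitesimal rotation, evaluated at the nodal coordinates
    (an n-by-2 array), with displacement DOFs interleaved (x0,y0,x1,y1,...)."""
    n = coords.shape[0]
    modes = np.zeros((3, 2 * n))
    modes[0, 0::2] = 1.0                  # translation in x
    modes[1, 1::2] = 1.0                  # translation in y
    modes[2, 0::2] = -coords[:, 1]        # rotation about the origin: (-y, x)
    modes[2, 1::2] = coords[:, 0]
    return modes
```

These three vectors span the displacement fields with vanishing symmetric gradient; the additional damage-induced modes discussed above would be their restrictions to each undamaged region.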

Damage subproblem
The inactive submatrix of the Helmholtz problem for damage is also solved with conjugate gradients and the ML smoothed aggregation multigrid algorithm [60,61], with the near-nullspace specified as the constant vector.

The Newton step
Let the inactive submatrix of the coupled Jacobian be partitioned as

J = [ A  B ; Bᵀ  C ],

where A is the assembled linear elasticity operator (11a), and B and C are the inactive submatrices of the coupling term (11b) and the linearised damage operator (11c). The fast iterative solution of block matrices has been a major topic of research in recent years [62], with most preconditioners relying on the approximation of a Schur complement of the operator. It is straightforward to verify that

J = [ I  0 ; Bᵀ A⁻¹  I ] [ A  0 ; 0  S ] [ I  A⁻¹ B ; 0  I ],

where S = C − Bᵀ A⁻¹ B is the (dense) Schur complement of J with respect to A. In this work, we take the simple approximation S ≈ C, which yields the symmetric block Gauss-Seidel preconditioner

P = [ A  0 ; Bᵀ  C ] [ A  0 ; 0  C ]⁻¹ [ A  B ; 0  C ],

whose application requires one application of C⁻¹ and two applications of A⁻¹. This is implemented in PETSc using the symmetric multiplicative variant of the PCFIELDSPLIT preconditioner [64,65]. Both inverse actions are approximated by two V-cycles of algebraic multigrid. MINRES [66] is employed as the outer Krylov solver, as far from minimizers the Hessian may not be positive definite.
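The action of such a symmetric multiplicative block preconditioner can be sketched with dense blocks (illustrative only; in practice the inner solves are replaced by AMG V-cycles):

```python
import numpy as np

def apply_preconditioner(A, B, C, r_u, r_a, solve=np.linalg.solve):
    """One application of the symmetric multiplicative block preconditioner
    with the Schur complement approximated by C: a forward solve with A,
    a solve with C, and a backward correction with A
    (two A-solves and one C-solve in total)."""
    x_u = solve(A, r_u)                          # forward elasticity solve
    x_a = solve(C, r_a - B.T @ x_u)              # damage solve with updated rhs
    x_u = x_u - solve(A, B @ x_a)                # backward elasticity correction
    return x_u, x_a
```

As a sanity check, this routine reproduces the solve with the explicitly assembled operator P = (D + L) D⁻¹ (D + U), where D = diag(A, C) and L, U hold the off-diagonal coupling blocks.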

TEST CASES
In this section we introduce three test cases used to assess the performance of the proposed solvers in section 5. The first investigates the temporally smooth propagation of a single crack driven by appropriately controlled Dirichlet boundary conditions. The second consists of the uniaxial traction of a bar, testing crack nucleation. The last considers a thermal shock problem involving the nucleation and propagation of a complex crack pattern. All test cases consider isotropic homogeneous materials. In this context, the relevant material parameters are the Poisson ratio ν, the Young's modulus E, the fracture toughness G_c, and the internal length ℓ. One can show that the internal length may be estimated from the limit stress σ_c through the relation

ℓ = 3 G_c E / (8 σ_c²).

Dimensional analysis shows that without loss of generality both G_c and E can be set to 1, with a suitable rescaling of the loading. Hence, in all experiments we fix G_c = E = 1, and in addition we fix ν = 0.3.

Surfing: smooth crack propagation
The main advantage of the variational regularized approach to fracture analyzed in this paper is its ability to compute the propagation of cracks along complex paths, including crack bifurcation, merging, and possible jumps in time and space. However, it is desirable to test the numerical algorithm in a simpler situation, where a single preexisting crack is expected to propagate smoothly along a straight path with an assigned velocity v. To this end, we consider the surfing experiment proposed by Hossain et al. [67]. This consists of a rectangular slab Ω = [0, L] × [−H/2, H/2] of length L and height H, with the Dirichlet boundary condition u = ū imposed on the whole boundary of the domain, where ū is the asymptotic Mode-I crack displacement of linear elastic fracture mechanics
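This Dirichlet datum can be sketched as follows (a sketch assuming the standard crack-tip field, with the Kolosov constant taken as κ = 3 − 4ν for plane strain; parameter values as in the text):

```python
import numpy as np

def surfing_datum(x1, x2, t, K_I=1.0, E=1.0, nu=0.3, v=1.0, Lc=0.05):
    """Asymptotic Mode-I displacement imposed on the whole boundary,
    in polar coordinates centred at the moving crack tip (Lc + v t, 0).
    Assumes plane strain (kappa = 3 - 4 nu)."""
    mu = E / (2.0 * (1.0 + nu))                  # shear modulus
    kappa = 3.0 - 4.0 * nu
    dx, dy = x1 - (Lc + v * t), x2
    r = np.hypot(dx, dy)
    theta = np.arctan2(dy, dx)
    amp = K_I / (2.0 * mu) * np.sqrt(r / (2.0 * np.pi)) * (kappa - np.cos(theta))
    return amp * np.cos(theta / 2.0), amp * np.sin(theta / 2.0)
```

Ahead of the tip on the crack line (θ = 0) the opening component vanishes, while behind the tip the field opens the crack lips, driving the crack forward at speed v.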

ū(x, t) = (K_I / 2μ) √(r / 2π) (κ − cos θ) (cos(θ/2) e₁ + sin(θ/2) e₂),

where (r, θ) are the polar coordinates centred at the crack tip (L_c + vt, 0), (e₁, e₂) are the Cartesian unit vectors, μ is the shear modulus, κ is the Kolosov constant ((3 − ν)/(1 + ν) in plane stress, 3 − 4ν in plane strain), and L_c is the length of the preexisting crack. The intensity of the loading is controlled by the stress intensity factor K_I. From the theory we expect the crack to propagate at the constant speed v along the line x₂ = 0 for K_I = K_I^c = √(G_c E). In the numerical experiments we set K_I/K_I^c = 1.0, v = 1, L = 2, H = 1 and L_c = 0.05. Figure 1 reports the results of the corresponding numerical simulations. This test is particularly useful to verify that the dissipated energy does not depend on ℓ and is equal to the product of the crack length and the fracture toughness G_c. Obviously, in order for this condition to hold, the discretisation must be refined together with the internal length, as ℓ controls the width of the localization band. We typically set the mesh size to h = ℓ/5. In the present test, to speed up the benchmarks, we use a non-uniform mesh respecting this condition only in the band where we expect the crack to propagate, as shown in Figure 1. This a priori mesh refinement is exceptional and not applicable in general. In all other tests, a sufficiently fine uniform mesh is employed.

Traction of a bar: crack nucleation

A basic problem of fracture mechanics is to estimate the ultimate load before fracture of a straight bar in traction. We consider a two-dimensional bar of length L and height H under uniaxial traction with imposed displacement, as shown in figure 2. Analytical studies [30,11] show that, for L sufficiently greater than ℓ, a local minimum of the energy functional (5) is the purely elastic solution α = 0 for t < t_c = √(3 G_c / 8Eℓ), and the solution with one crack represented in figure 2 for t > t_c. The cracked solution has a vanishing elastic energy and a surface energy given by G_c H. The test may be easily extended to a 3D geometry. The critical load t_c is the same in 1D under a uniaxial stress condition, in 2D plane stress, or in 3D.

Thermal shock
The thermal shock problem of a brittle slab [32] is a challenging numerical test for the nucleation and propagation of multiple cracks. In physical experiments [68,69], several ceramic slabs are bound together, uniformly heated to a high temperature, and quenched in a cold bath, so as to submit the boundary of the domain to a thermal shock. The inhomogeneous temperature field induces an inhomogeneous stress field inside the slab, causing the emergence of a complex crack pattern, with an almost periodic array of cracks nucleating at the boundary and propagating inside the slab with a period doubling phenomenon. Following [32,70], we consider a simplified model of this experimental test. The computational domain Ω = [−L/2, L/2] × [0, H] (see Figure 3) is a slab of width L and height H, with a thermal shock applied at the bottom surface x₂ = 0. At each timestep τ_i, we seek the quasi-static evolution of the cracked state of the solid by solving for a stationary point of the following energy functional:

E(u, α) = ∫_Ω ½ a(α) A₀ ε_eff(u; τ_i) · ε_eff(u; τ_i) dx + (G_c / c_w) ∫_Ω ( w(α)/ℓ + ℓ |∇α|² ) dx,

where ε_eff(u; τ) = ε(u) − ε₀(τ) is the effective elastic deformation accounting for the thermally induced inelastic strain ε₀(τ) = β (T(x₂, τ) − T₀) I, where β is the thermal expansion coefficient and I the identity tensor. The imposed temperature field T is the analytical solution of an approximate thermal diffusion problem with a Dirichlet boundary condition on the temperature for a semi-infinite homogeneous slab of thermal diffusivity k_c:

T(x₂, τ) = T₀ − ∆T Erfc(x₂ / (ℓτ)),   (26)

where Erfc denotes the complementary error function and τ = 2√(k_c t)/ℓ is a dimensionless time acting as the loading parameter. In particular, this model neglects the influence of the cracks on the thermal diffusivity. Figure 3. Geometry and boundary conditions for the thermal shock problem (left), where u₁ and u₂ denote the two components of the displacement field. The loading is given by the thermal stress induced by the temperature field T(x₂, τ) of (26), whose dependence on x₂ is sketched on the right for different times τ.
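The imposed temperature history can be evaluated directly from the standard half-space diffusion solution (a sketch, following our reading of (26), with T₀ the initial temperature and T₀ − ∆T the quenched surface temperature):

```python
import math

def temperature(x2, tau, T0=1.0, dT=1.0, ell=1.0):
    """Erfc temperature profile of the quenched half-space, written in the
    dimensionless time tau = 2 sqrt(k_c t) / ell, so that the argument
    x2 / (2 sqrt(k_c t)) becomes x2 / (ell * tau)."""
    return T0 - dT * math.erfc(x2 / (ell * tau))
```

The surface is instantly cooled to T₀ − ∆T and the shock penetrates to a depth of order ℓτ, which is why τ serves as the loading parameter.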
As discussed in Bourdin et al. [32], the system is governed by three characteristic lengths: the size of the domain L, the internal length ℓ, and the Griffith length ℓ₀ = G_c / E(β∆T)². Hence, choosing ℓ as the reference length, the solution depends on two dimensionless parameters: the mildness of the thermal shock ℓ₀/ℓ and the size of the slab L/ℓ. Here we perform numerical simulations for fixed slab dimensions L = 20 and internal length ℓ = 1. We apply the displacement boundary conditions described in Figure 3 and do not impose any Dirichlet boundary condition on the damage field. As can be shown by dimensional analysis, without loss of generality we set E = 1, G_c = 1, β = 1. We consider the performance of the solver for varying ∆T and mesh sizes h.
Analytical and semi-analytical results are available for verification and for the design of the numerical experiments. For ∆T < ∆T_c = √(3 G_c / 8Eℓβ²) the solution is purely elastic with no damage (α = 0 everywhere) [32,70]. For ∆T > ∆T_c the solution evolves qualitatively as in Figure 4, with (i) the immediate creation of an x-homogeneous damage band parallel to the exposed surface, (ii) the bifurcation of this solution toward an x-periodic one, which (iii) further develops into a periodic array of crack bands orthogonal to the exposed surface. These bands then propagate with a period doubling phenomenon (iv). The three columns in Figure 4 show the phases (ii)-(iv) of the evolution for ∆T/∆T_c equal to 2, 4 and 8. The wavelength of the oscillations and the spacing of the cracks increase with ∆T. In particular, [32] shows that for ∆T ≫ ∆T_c the initial crack spacing is proportional to √(ℓ₀ ℓ). Figure 5 reports the evolution of the dissipated energy versus time for the three cases of Figure 4. We note in particular that, while the evolution is smooth for intense thermal shocks (see the curve ∆T = 8∆T_c), for mild shocks there are jumps in the energy dissipation and hence in the crack length (see the curve ∆T = 2∆T_c). These jumps correspond to snap-backs in the evolution problem, where the minimization algorithm is obliged to search for a new solution, potentially far from the one at the previous time step.
This problem constitutes a relevant and difficult test for the solver. First, the presence of a large number of cracks renders the elastic subproblem particularly ill-conditioned, testing the effectiveness of the linear subsolvers and the coupled preconditioning strategy. Second, the presence of bifurcations and snap-backs stresses the convergence and effectiveness of the outer nonlinear solver. Third, the solution of the overall quasi-static evolution problem is strongly influenced by the irreversibility condition, testing the effectiveness of the variational inequality solvers.

Figure 4: …showing the initial solution independent of the $x_1$ variable, the emergence of a periodic crack pattern, and its selective propagation with period doubling. Each column corresponds to the result obtained for a specific shock intensity, increasing from left ($\Delta T/\Delta T_c = 2$) to right ($\Delta T/\Delta T_c = 8$). Here $\ell = 1$ and the slab dimensions are $40 \times 10$ with a mesh size $h = 0.2$.

RESULTS OF NUMERICAL EXPERIMENTS
We present here the results of the numerical experiments that were performed to assess the performance of the proposed solvers. All problems were solved to an absolute $l^2$ residual tolerance of $10^{-7}$. For each test problem, we analyse the dependence of the results on the relevant physical parameter: we vary the internal length $\ell$ in the traction and surfing tests, and the intensity of the loading $\Delta T/\Delta T_c$ in the thermal shock problem. The results for the surfing, traction and thermal shock problems are shown in Tables I, II and III respectively. In all tables, the reduction column reports the decrease in iterations for the optimal $\omega$ compared to standard alternate minimization ($\omega = 1$). In the traction case, standard alternate minimization is extremely efficient: it requires few iterations, the number of iterations does not grow with $\ell$, and any other choice of $\omega$ slows convergence. By contrast, in the surfing and thermal shock cases, standard alternate minimization converges slowly, and the number of iterations required increases as the physical parameters $\ell$ and $\Delta T/\Delta T_c$ are varied. In this sense, the surfing and thermal shock cases are harder than the traction case. In these problems, over-relaxation helps significantly, reducing the number of iterations required by between one half and three quarters. Furthermore, the advantage gained by over-relaxation increases as the problem gets harder.

Composition of alternate minimization with Newton's method
We next consider ORAM-N, the composition of alternate minimization with Newton's method as described in section 2.3. For these experiments, Newton's method was attempted once alternate minimization had reduced the $l^2$ norm of the residual by a factor of $10^{-1}$. All linear solves (both for alternate minimization and for Newton's method) were performed with LU, and all alternate minimizations employed the optimal over-relaxation parameter determined in the previous experiments ($\omega = 1.6$ for the surfing and thermal shock cases, $\omega = 1$ for the traction case). We measured the time to solution in seconds for both approaches, since comparing iteration counts between methods with very different per-iteration costs would not be meaningful. The runs were executed in serial on an otherwise unloaded Intel Xeon E5-4627 3.30 GHz CPU with 512 GB of RAM. The results for the surfing, traction and thermal shock problems are shown in Tables IV, V and VI respectively. Again, the traction case is unusual compared to the other two: while the gains are marginal in the traction case, composition yields a worthwhile and consistent reduction in runtime for the other tests. If a more robust semismooth Newton solver were available, the speedup from composition would increase further.
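The switching logic of the composed strategy can be sketched on a scalar model problem (a stand-in, not ORAM-N itself: a linearly convergent fixed-point iteration plays the role of alternate minimization, and control is handed to Newton once the residual has dropped by the prescribed factor):

```python
# Sketch of the composed solver strategy: run a robust but linearly
# convergent fixed-point iteration until the residual has dropped by a
# prescribed factor, then switch to quadratically convergent Newton.
import math

def residual(x):          # toy nonlinear problem: solve x - cos(x) = 0
    return x - math.cos(x)

def fixed_point_step(x):  # robust, slowly convergent update (the "AM" phase)
    return math.cos(x)

def newton_step(x):       # Newton update: x - f(x)/f'(x), f'(x) = 1 + sin(x)
    return x - residual(x) / (1.0 + math.sin(x))

def composed_solve(x0, switch_factor=0.1, tol=1e-12, maxit=100):
    x, r0 = x0, abs(residual(x0))
    fp_its = newton_its = 0
    while abs(residual(x)) > switch_factor * r0:        # phase 1: robust iteration
        x = fixed_point_step(x)
        fp_its += 1
    while abs(residual(x)) > tol and newton_its < maxit:  # phase 2: Newton
        x = newton_step(x)
        newton_its += 1
    return x, fp_its, newton_its

x, fp_its, newton_its = composed_solve(1.0)
print(x, fp_its, newton_its)  # Newton finishes in a handful of iterations
```

A production implementation must also guard against Newton failing to converge, falling back to the robust iteration; this fallback is omitted in the sketch.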

Preconditioning the full Jacobian
The preconditioner (20) requires inner solvers for the displacement elasticity operator A and the damage Helmholtz operator C. We first consider the performance of (20) with ideal inner solvers (LU), to investigate how the iteration counts scale with the physical parameters and with mesh size h. We then consider the performance with practical inner solvers, two V-cycles of algebraic multigrid for A and C. Each Jacobian solve was terminated when the l 2 norm of the residual was reduced by a factor of 10 −6 , although adaptive tolerance selection should be used in practical calculations to retain quadratic convergence of the inexact Newton method [72]. For each configuration of physical parameters and h, the total number of Krylov iterations required for convergence over all loading steps was divided by the total number of Newton iterations, to compute the average number of Krylov iterations required to solve a Jacobian. In these experiments the alternate minimization was terminated with a relative residual reduction of 10 −3 , or if the absolute residual norm reached 10 −6 . As the gains from employing Newton's method in the traction case were marginal, we consider here only the surfing and thermal shock problems.
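The structure of such a block-diagonal preconditioner with ideal inner solvers can be sketched with generic surrogate operators (a hypothetical 1D stand-in, not the paper's elasticity and damage blocks, and conjugate gradients in place of the Krylov method used there; all matrices and constants are ours):

```python
# Minimal stand-in for block-diagonal preconditioning of a coupled system
# J = [[A, B], [B^T, C]], with exact ("ideal") LU inner solves of the
# diagonal blocks A and C applied inside a Krylov method.
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 200
lap = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csc")
A = (lap + 0.001 * sp.identity(n)).tocsc()  # ill-conditioned surrogate "elasticity" block
C = (0.01 * lap + sp.identity(n)).tocsc()   # well-conditioned surrogate "Helmholtz" block
B = 0.01 * sp.identity(n, format="csc")     # weak surrogate coupling block
J = sp.bmat([[A, B], [B.T, C]], format="csc")

# Ideal inner solvers for the diagonal blocks, as in the LU experiments.
A_lu, C_lu = spla.splu(A), spla.splu(C)
M = spla.LinearOperator(
    J.shape, matvec=lambda r: np.concatenate([A_lu.solve(r[:n]), C_lu.solve(r[n:])])
)

class Counter:
    def __init__(self):
        self.n = 0
    def __call__(self, xk):
        self.n += 1

b = np.ones(2 * n)
counts = {}
for name, prec in (("none", None), ("block", M)):
    cnt = Counter()
    x, info = spla.cg(J, b, M=prec, callback=cnt, maxiter=5000)
    counts[name] = cnt.n
print(counts)  # block preconditioning sharply reduces the Krylov iteration count
```

In practice the exact inner solves would be replaced by multigrid V-cycles, trading a modest increase in outer iterations for a much cheaper application cost, as in the experiments below.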
The results for the surfing case with ideal and practical inner solvers are shown in Tables VII and VIII, and the corresponding results for the thermal shock case are shown in Tables IX and X. In the surfing case, the number of iterations required grows slowly as the mesh is refined, and grows slowly as $\ell$ is reduced. However, even for the smallest $\ell$ on the finest mesh, the number of outer Krylov iterations required is modest, and the results barely differ if the ideal inner solvers are replaced with practical variants. In the thermal shock case, the number of iterations required stays approximately constant as the mesh is refined, and grows slowly with the intensity $\Delta T/\Delta T_c$. Here, replacing the ideal inner solvers with practical variants does have a measurable cost in iteration count; this could be reduced by tuning the parameters of the algebraic multigrid algorithm employed, or by employing stronger inner solvers. These results show that the preconditioner (20) is a practical and efficient solver for the full coupled Jacobian, whose performance degrades slowly as the difficulty of the problem is increased.

CONCLUSION
In this paper we proposed several improvements to the current standard algorithm for solving variational fracture models. Over-relaxation is extremely cheap and simple to implement, but can greatly reduce the number of iterations required for convergence. Composing over-relaxed alternate minimization with Newton-type methods yields a further decrease in runtime, although at a more significant development cost. Together, these improvements to alternate minimization reduce the time to solution by a factor of 5--6$\times$ for the surfing and thermal shock test cases. Lastly, we proposed and tested preconditioners for the linear subproblems in alternate minimization and for the coupled Jacobian arising in the Newton iterations when solving the whole problem with a monolithic active set method. These efforts are complementary to other approaches recently proposed in the literature, such as adaptive remeshing [44], adaptive time-stepping, continuation algorithms [39], or refined line-search techniques [40], which were not considered in this work. Our tests focus only on the simplest settings for variational fracture mechanics, assuming small deformations and a simple rate-independent material behaviour. However, the developed techniques can be readily adapted to more complex contexts, including hyperelasticity, viscoelasticity, and inertial effects.
These results suggest several directions for future research. It would be highly desirable to develop a convergence analysis of block over-relaxed nonlinear Gauss-Seidel for variational inequalities, although we do not anticipate this will yield constructive insight for the choice of the over-relaxation parameter $\omega$. It may be possible to design $\ell$-robust preconditioners for the coupled Jacobian (where the convergence is independent of $\ell$) by choosing appropriate $\ell$-dependent inner products for the displacement and damage function spaces. If the appropriate Babuška constants are independent of $\ell$, the convergence will be also [73].
In this work we have considered only the simplest fracture model, assuming small deformations and symmetric behaviour in traction and compression. Our developments on over-relaxation and composition of alternate minimization with Newton could be applied with minor modifications to more complex cases, including for example the tension-compression splitting of the elastic energy to account for the non-interpenetration condition on the crack lips [22,23,24,25,26]. In this case an