Accelerating crack growth simulations through adaptive model order reduction

Accurate numerical modeling of fracture in solids is a challenging undertaking that often involves the use of computationally demanding modeling frameworks. Model order reduction techniques can be used to alleviate the computational effort associated with these models. However, the traditional offline‐online reduction approach is unsuitable for complex fracture phenomena due to their excessively large parameter spaces. In this work, we present a reduction framework for fracture simulations that leaves out the offline training phase and instead adaptively constructs reduced solutions spaces online. We apply the framework to the thick level set (TLS) method, a novel approach for modeling fracture able to model crack initiation, propagation, branching, and merging. The analysis starts with a fully‐solved load step, after which two consecutive reduction operations—the proper orthogonal decomposition and the empirical cubature method—are performed. Numerical features specific to the TLS method are used to define an adaptive domain decomposition scheme that allows for three levels of model reduction coexisting on the same finite element mesh. Special solutions are proposed that allow the framework to deal with enriched nodes and a dynamic number of integration points. We demonstrate and assess the performance of the framework with a number of numerical examples.


INTRODUCTION
The search for numerical methods capable of accurately capturing the complex mechanisms involved in the fracture of solids is still one of the most active research fields in computational mechanics, despite its long history and vast body of literature. Although fracture mechanics can readily predict how a single existing crack grows, 1 modeling of phenomena such as crack initiation, branching, and merging is still a challenge. At the other end of the spectrum, damage models are well suited for predicting initiation and interaction between cracked regions but tend to suffer from spurious strain localization 2 and mesh bias. Two popular solutions for this mesh size dependency are to either use the crack band method 3 to introduce a length scale in the damage formulation or to use a rate-dependent damage evolution in order to retard strain localization. 4 These, however, do not remove the bias introduced by the orientation of the mesh. A number of more sophisticated damage models such as the nonlocal integral damage model 5 and gradient damage models [6][7][8] offer alternative ways to couple damage evolution with an intrinsic material length scale. More recently, two new approaches to couple damage and fracture mechanics in a single regularized framework based on superposed solution fields have been introduced, namely the phase field method 9 and the thick level set (TLS) method. 10 The latter is the method of choice for modeling fracture in this work.
The TLS method, first proposed by Moës et al, 10 attempts to bridge the gap between damage and fracture mechanics. In this method, cracks are represented by a level set front that can naturally represent interaction phenomena such as branching and merging. Fracture mechanics is incorporated by making the rate of front growth dependent on the energy release rate of the material. Furthermore, damage is incorporated by introducing a thick band of material behind the front where the damage variable gradually increases until a zero-stiffness, completely cracked region emerges. The introduction of a length scale represented by the thickness of this damaging band effectively introduces nonlocality in the damage formulation, eliminating spurious strain localization. Furthermore, the use of a continuous level set field independent from the underlying finite element discretization alleviates the issue of crack orientation bias. Since its inception, the method has been expanded upon by multiple authors in order to deal with asymmetric constitutive behavior, 11 couple the damage formulation with plasticity 12 and cohesive zone laws, 13 improve the representation of traction-free sliding in shear, 14 and treat fatigue loading. 15 Although being a versatile and robust method, TLS is a computationally demanding analysis approach. Accurate modeling of front growth requires element sizes significantly smaller than the width of the damage band. 11 Depending on the geometry of the problem being modeled, this width is constrained to be smaller than the length scale of geometric features in the domain-for example, a crack propagating between two inclusions very close to each other. Moreover, since the level set is usually updated in a staggered fashion, numerical stability of the scheme requires that the front does not advance for more than one element length per time step. 11,14 These constraints render TLS computationally prohibitive for a number of relevant problems. It is worth mentioning that opting for other regularized damage approaches (eg, gradient-enhanced damage 6 or phase field models 9 ) does not alleviate the issue, as those methods also require dense finite element meshes and involve additional degrees of freedom (eg, nonlocal strains) in the complete domain while the TLS solves a damage update problem that involves additional unknowns only within damaged regions.
Model order reduction (MOR) techniques may be used to reduce the computational effort associated with the TLS method. These techniques essentially consist in finding suitable lower-dimensional solution spaces that can be employed as surrogates for full-order models. The goal is to construct surrogate models with significantly lower computational effort while minimizing the associated loss of solution accuracy. For the equilibrium problem that constitutes the most demanding phase of a TLS analysis, 10,11 the number of degrees of freedom can be drastically reduced through the proper orthogonal decomposition (POD) technique 16 while the cost of computing the global internal forces and stiffness matrix can be reduced through hyper-reduction techniques such as the empirical cubature method (ECM). 17 However, using MOR for highly nonlinear fracture problems such as the ones treated with TLS is a complex endeavor for which no definitive solution is currently available. MOR methods rely on an offline training phase during which representative loading cases are executed and the resulting solution snapshots are used to compute the reduced solution spaces. For fracture problems, a difficulty arises when choosing loading cases for training: subtle changes in boundary conditions can lead to considerable differences in crack initiation and propagation behavior. In other words, the parameter space of possible solutions that the surrogate model should be able to approximate is prohibitively large, leading to inefficient training and a lower acceleration level through an increase in the size of the reduced-order solution space.
In recent years, a number of strategies have been proposed in order to alleviate this issue and allow for projection-based MOR techniques to be applied to fracture problems. In Reference 18, the authors employ a mesh morphing technique that effectively reduces the size of the reduced space necessary to accurately approximate the solution, while authors in Reference 19 combine POD with domain decomposition in order to allow for locally-refined bases to be used in zones of strain localization. Although promising, these approaches still rely on the construction of a reduced basis offline. For TLS problems, aside from the difficult task of selecting training cases, the adaptive nodal enrichment and numerical integration schemes necessary for the accuracy and robustness of the method 11 lead to stress and displacement snapshots of variable sizes and orderings, further complicating the task of a posteriori construction of reduced bases. An alternative class of strategies which seems more suitable to the problem at hand consists in adaptively refining the reduced basis during the online analysis either through the use of Krylov subspaces 16,20,21 or by locally solving critical mesh regions in the full-order space. 22 It seems clear that resorting to pre-trained models would be a cumbersome and inefficient approach to reducing the computational complexity of the TLS method. This paper is instead focused on adaptively reduced models that preclude the need for an offline training phase. In a recent work, 23 we propose an adaptive reduction framework that starts with a fully-solved load step and combines a number of state-of-the-art MOR techniques 17,19,22,24 in order to progressively build a hyper-reduced model that allows for different levels of reduction coexisting on different subdomains of a single finite element mesh. Here, the framework is applied and tailored to the specific needs of a TLS analysis, although it can easily be used to accelerate other types of fracture models. We start by looking at the full-order model of a compact tension test 14,25 and reviewing the basic components of the TLS method. MOR ingredients are then gradually added to the model while exploiting the full-field information that is a by-product of the TLS front update-the level set field, front velocities, and parametrized front length-in order to achieve a fully-adaptive domain decomposition strategy to accommodate the different levels (full, reduced, hyper-reduced) of MOR. The performance of the modified framework is assessed and additional numerical examples are shown.

Mathematical notation
Throughout this work, vectors and matrices are written as boldfaced lower-case and upper-case symbols, respectively. Sets of indices are written in upper-case calligraphic script (eg, ). Entities in the full-order solution space are indicated by the subscript h, while reduced-order entities are left unmarked for compactness. A subscript with a set between parenthesis (eg, M (, ) ) indicates a selection operation based on one or two sets of indices that extracts submatrices or subvectors.

THE THICK LEVEL SET METHOD
In this section, we review the main analysis steps that comprise the TLS method. Let Ω be a volume in equilibrium subjected to Dirichlet constraints at boundary Γ u and Neumann boundary conditions at Γ f such as Γ u ∩ Γ f = ∅. The TLS method consists in using a level set function = (x, y) in 2D or = (x, y, z) in 3D to define a damage front Γ that divides Ω into a damaged volume Ω d and an intact volume Ω i , as shown in Figure 1. The level set function is a signed distance function to the front, satisfying the eikonal equation for unit velocity, written as: subjected to = 0 on Γ . The presence of a front introduces in Ω the curvilinear coordinate system − s shown in Figure 1. At distances larger than the length scale parameter l c in the direction normal to the front (along the -axis for every value of s), the material is considered to have lost its complete stiffness (d = 1). With these definitions, the damage variable can be written as: where f ( ) is a damage profile function satisfying f∕ ≥ 0 for 0 ≤ ≤ l c . In this paper, the arctangent profile proposed by Bernard et al 11 is used for all numerical examples: with c 1 = 10, c 3 = 0.5, and the other coefficients given by: With the signed distance function for , the closest distance to the damage front is known at every point in the domain. This information will be useful for the adaptive scheme in Section 3. The TLS implementation considered in this work is based on a staggered scheme: from an existing level set configuration, an equilibrium problem with unit load * is solved for the displacement field u. The displacements are then used to solve a second system of equations on Ω d that computes the configurational forces at the damage front Γ . By comparing the configurational forces with a material resistance parameter, the evolution of the level set front is computed, along with the load factor that is used to scale the unit load equilibrium solution. Finally, the level set field is updated and new damage fronts are allowed to nucleate if the material strength is surpassed at any point on Ω i . In the following sections, further details on each of these analysis phases (global equilibrium, front evolution, and level set update) will be provided.

Equilibrium problem
For the present study, we opt for a simple isotropic damage formulation for which the free energy can be written as: where D is the elastic stiffness tensor. The stresses are obtained by differentiating the energy with respect to strain: It is important to note that the isotropic nature of this simple damage model leads to unrealistic predictions of damage nucleation and propagation under compression. Although asymmetric damage models have already been proposed for use with TLS, 10,11 those are also not applicable to general loading directions, leading to spurious stiffness recovery under shear. In Reference 14, van der Meer and Sluys propose an interphase material that eliminates this issue and allows the TLS to capture traction-free sliding, but it can only be employed on real material interfaces. In order to keep the discussion focused on accelerating TLS using MOR techniques, the simple and numerically robust isotropic formulation is adopted. With the stress tensor, the full-order internal force at integration point i can be written as: where B is a matrix that relates nodal displacements to local strains and f h ∈ R N is the contribution of point i to the full-order (as indicated by the subscript h) global internal force vector, with N being the total number of degrees of freedom in the model. In the TLS method, the region with > l c represents a stress-free crack. Due to the discrete nature of the finite element method, this region must be at least one element wide in order to achieve a stress-free state, which would otherwise introduce mesh dependency to the problem. 14 In order to circumvent this issue, the adaptive enrichment scheme proposed by Bernard et al 11 is employed and extra degrees of freedom are added to nodes inside the cracked region whose element support is cut at least twice by the iso-l c . In practice, this leads to a dynamic number of degrees of freedom N, with *This is only allowed because a linear material model is employed in the staggered solution scheme. For an alternative approach suitable for nonlinear material models, the reader is referred to Reference 12. nodes being enriched or unenriched as the front moves. Furthermore, enriched elements will have a different B matrix than regular elements. These two consequences of the enrichment scheme will be important when building the reduced models in Section 3.
The global internal force vector f Ω h is numerically integrated by combining the contribution of all M integration points in the mesh: where w i is the integration weight of the point located at x i . In order to improve the accuracy of the TLS solution, extra integration points are added to the elements cut by the iso-zero as well as to the elements cut by the iso-l c . This means that M is not constant during the analysis, a fact that must be accounted for when building a hyper-reduced model for f Ω (Section 3). Finally, the displacement vector u h ∈ R N that leads to global equilibrium can be obtained by solving: where r h ∈ R N is a residual vector and f Γ h ∈ R N is the global external force vector.

Front evolution
As damage is directly coupled to the distance to the level set front, a movement of this front leads to changes in damage distribution and, consequently, to energy dissipation. This dissipation is related to the configurational force Y , obtained from Equation (5) as: In the TLS method, the classic local dissipation measure Ξ = Yḋ is substituted by a measure of dissipation over the complete damaged band when it moves by a distance : where g (s) is the band configurational force at coordinate s: with being the curvature of the iso-zero at coordinate s and 0 ≤ l ≤ l c is the width of the damaged band. Since g (s) → 0 when l → 0, an average configurational force Y which is only a function of s is computed in such a way as to enforce: 11 Numerically, this average configurational force along the front is computed by solving Equation (13) on the domain Ω d with Y as nodal values while enforcing the constraint ∇Y ⋅ ∇ = 0 with Lagrange multipliers. 11 The front propagation criterion at a given point i at the front simply becomes Y i = Y ci , with Y c being the homogenized value of a resistance parameter Y c ( , s) computed by solving a system of equations similar to Equation (13) but substituting Y by Y c . Although a constant propagation resistance Y c could be used throughout the analysis, 10,11 we follow the approach proposed in Reference 14 instead and make Y c a function of the size of the damaged zone in order to take into account the difference in the stress levels necessary for damage initiation and propagation. With values for Y c related to the strength (Y f c ) and fracture energy (Y G c ), the following interpolation is adopted: , (14) where is the average value of inside Ω d , l c ∕3 ≤ max ≤ l c ∕2 is the band size after which Y c ceases to increase, and the two parameters related to initiation and propagation are given by: where f ( ) is given in Equation (3). With values of Y and Y c at every node i on  , defined as the set of nodes on elements cut by the front Γ , the load factor necessary to promote front growth can be computed by solving: Finally, the front velocity v in the direction perpendicular to the level set (in the negative direction along the -axis of Figure 1) for every node in  can be obtained through: where the operator ⟨⋅⟩ + returns the positive part of its operand and v max is the maximum growth the front can undergo in a single load step. In order to guarantee the stability of the staggered analysis scheme, a value v max = h min ∕2 is adopted here, with h min being the size of the smallest finite element present in the mesh. The parameter c controls the amount of velocity spread to nodes with lower values for the ratio Y ∕Y c . For c → 1, only the point used to compute in Equation (16) will have nonzero velocity.

Level set update and damage nucleation
With the nodal velocities at the front, the last analysis phase consists in updating the level set field (and consequently the damage distribution). This in turn involves extending the front velocities to every node not belonging to  . In order to guarantee that the updated level set will satisfy Equation (1), the velocities are extended along the -axis, satisfying the constraint: which is solved with a fast marching algorithm. 26 The level set field at every node is then updated as: with i ∈  (the complete set of nodes in the mesh) and the subscript o indicating values from the previous time step. Even though this extension strategy is designed to guarantee the signed distance property of , the discrete nature of the level set update leads to increasingly larger violations of Equation (1) as the analysis progresses. A reinitialization procedure in which Equation (1) is solved with a fast marching algorithm is therefore periodically performed. 26 During this update phase, a nucleation check is performed in order to allow for new damage fronts to emerge. The size of a newly created damage nucleus is considered to be infinitesimal, therefore reducing the interpolated Y c value of Equation (14) to simply Y f c and making the initiation criterion local. The check is performed at every integration point and a new nucleus is created at coordinates: where Y f c is made a function of x to take into account the general case of multiple materials being used in the same mesh. After nucleation, the value of is checked at every node and updated if the distance between the node and the new nucleus is smaller than .
With the updated level set field, the damaged band size at Γ is computed in order to be used in Equation (14) during the following time step. This is done in a similar way as in the computation of Y and Y c , by solving the following weak form: 14 while using Lagrange multipliers to enforce the constraint ∇ ⋅ ∇ = 0. The analysis flow of a single TLS step is shown in Figure 2, including the information exchanged between analysis phases.

Example: Compact tension test
Before attempting to accelerate TLS simulations with MOR techniques, we show a full-order numerical example to illustrate the method. In Section 3, the same example will be used to gradually introduce each reduction technique. The example involves the compact tension test used by Li et al 25 to investigate mode I crack growth in fiber-reinforced polymers and later modeled with TLS by van der Meer and Sluys. 14 The specimen is a square (100mm × 100mm) plate modeled in plane stress with a notch with circular tip (r = 1mm) pulled vertically from the circular load application points seen in Figure 3. The material parameters are E = 7GPa, = 0.3, G c = 40 N∕mm 2 , f t = 79MPa, l c = 2 mm, c = 2, and max = 0.45l c . The plate is discretized with a total of 76 784 constant-strain triangles with one integration point each, resulting in N = 77 234 degrees of freedom (DOFs) and M = 76 784 integration points. A relatively dense discretization is employed along the expected crack path (Figure 4), with a minimum characteristic element size h min = 0.15 mm.
No damage fronts are present during the first time step. The unit load problem is solved and a very high is set in order to guarantee that a first nucleation will occur during the level set update phase. Since the structure behaves elastically if no damage fronts exist, this effectively promotes a jump to the moment at which the first nucleation occurs, at a load of approximately 100 N. From the second step onwards, is computed according to Equation (16). The global behavior starts with a hardening branch during which a fracture process zone of radius l c develops and transitions to a softening branch as the crack grows towards the opposite edge of the plate. The load-displacement curve can be seen in Figure 3, as well as the extended velocity field from a time step during the softening branch. As the damage front propagates, the size of the equilibrium problem (N) changes as nodes from elements cut by the iso-l c are enriched. Furthermore, the number of integration points of elements cut by both the iso-zero and the iso-l c changes from 1 to 9, increasing the total number of integration points M. Figure 4 shows the damage front after the last analysis step, as well as a zoomed-in view of the damage band indicating which nodes and elements are modified.
The analysis runs in 1670 s on a workstation equipped with a Xeon W-2123 processor running Ubuntu 18.04.2. From this total execution time, 1494 s (92%) is dedicated to solving the equilibrium problem, with the remaining 5% and 3% being spent on computing the front advance and updating the level set field, respectively. Since Equations (13) and (21) are only solved inside Ω d , the equilibrium problem is the main complexity bottleneck of the TLS method. 11,14

ADAPTIVE MODEL ORDER REDUCTION
In this section, the mechanical equilibrium problem associated with the TLS method (the first analysis phase in Figure 2) is accelerated with a number of the MOR components presented in Reference 23. Two main reduction avenues can be identified: reducing the number of DOFs N and the number of integration points M necessary for integrating the internal force vector f Ω of Equation (8). As the parameter space-defined here as the space of possible load histories to which the model can be subjected-associated with fracture problems is prohibitively large, the reduced spaces for N and M are constructed online in an adaptive fashion.

Reduction by projection
Instead of solving for u h ∈ R N , the solution to the equilibrium problem of Equation (9) can instead be sought for on a lower-dimensional manifold defined by a set of n orthonormal basis vectors i . After solving for the n ≪ N mode contributions , the full-order displacement field can be recovered by: A popular approach for constructing the reduced solution space spanned by is the POD method: a set of P full-order displacement snapshots X u is decomposed into orthonormal contributions through a singular value decomposition (SVD) operation: where the basis is the left-singular matrix of X u , V is a matrix with right-singular vectors, and S is a diagonal matrix with singular values. When the number of snapshots P is large, the SVD is usually truncated to the first n ≪ P modes, associated with the highest singular values. The reduced version of the equilibrium problem is obtained by imposing the Galerkin projection constraint T r h = 0, which yields: For pre-trained models, obtaining the snapshot matrix X u is the goal of the offline training phase. Here, X u is updated whenever a load step is solved in the full-order space. Since the analysis starts without an initial basis , the very first load step is solved fully and subsequent steps are reduced. As the analysis progresses and changes in damage distribution occur, the reduced space gradually loses the ability to describe the global behavior of the structure being modeled. At this point, even though the reduced problem is in equilibrium (f Ω = f Γ ), the full-order equilibrium of Equation (9) is not exactly satisfied. This deviation from equilibrium can be used to trigger a fully-solved retraining step if a certain tolerance threshold force is crossed: 23 where the load scaling used in Reference 23 is dropped because the mechanical equilibrium problem is based on a unit-load solution. After the retraining step is solved, the field u h and its change from the previous time step Δu h are added to X u and a new basis is computed with Equation (23). In order to limit the computational overhead associated with these SVD operations, X u is only allowed to have a number n f of snapshots, with older solutions being gradually discarded as new snapshots are added. Furthermore, the SVD operation is truncated once singular values drop below a threshold SV in order to guarantee that only the most relevant basis vectors are included in the basis.

Compact tension example
We now return to the compact tension example of Section 2.4 and attempt to use the present adaptive POD approach to solve it in the reduced space. With the current approach, we immediately run into a problem when using the nodal enrichment scheme shown in Figure 4: any new enrichments occurring between two consecutive retraining steps cannot be included in the basis, since can only be updated after a fully-solved step. Delaying these new enrichments would in turn lead to an enlargement of the damage front until the region with > l c is at least one element wide. Since the level set advancement is irreversible, these thicker cracked regions would persist even after a full step is triggered. The enrichment scheme is therefore disabled until a solution for this issue is proposed in Section 3.2.
For the additional parameters related to the POD reduction, we adopt n f = 6 and SV = 10 −6 . This means that the number of reduced DOFs n is at most 6, a significant reduction when compared to the original N = 77 234. The tolerance force is adjusted such that the crack is able to correctly propagate from the notch. For higher values of force , the deficient reduced basis blocks the front advance and causes the cracked band to widen. This is an unphysical effect that does not occur if a low enough tolerance is adopted. Figure 5 shows the comparison between damage front shapes obtained with two different tolerances.
With force = 0.025, the load-displacement curve shown in Figure 6 is obtained. Although both the shape of the damage front and the quasi-static equilibrium path are correctly reproduced by the reduced solution, it produces a curve with The noisy load-displacement response hints at a very high number of retraining steps driven by the highly nonlinear structural behavior caused by the propagating crack. Indeed, plotting the cumulative number of reduced steps versus the total number of load steps shows that approximately half of the steps are fully solved (Figure 7). The reduction scheme is particularly inefficient towards the end of the analysis, when the error control condition of Equation (25) is triggered as soon as is updated and no more reduced steps are allowed. Nevertheless, the reduced mechanical equilibrium model runs approximately 1.4 times faster than the full one. However, it is important to note that even though the quantity of interest in this case (the load shown in Figure 6) is reproduced with reasonable accuracy by the reduced model, the net benefit of reducing the problem might become negative depending on the application: it is difficult to justify even very minor losses in accuracy when the gains in execution time are so limited.
Finally, it is interesting to investigate the spatial distribution of the deviation from equilibrium r h that triggers retraining steps. In Figure 8, this error is plotted for a reduced load step immediately prior to a retraining step. As expected, the reduced space obtained with information from the latest retraining step struggles to represent what happens immediately F I G U R E 8 Deviation from full-order equilibrium from a reduced step immediately prior to retraining. The reduced basis does not adequately represent the behavior of the region immediately ahead of the crack tip ahead of the crack tip, the area of Ω d where most of the changes in damage profile occur-that is, the region with the highest slope d∕ of the arctangent damage expression of Equation (3).

Equilibrium system partitioning
As illustrated in Section 3.1, two issues arise when attempting to accelerate the TLS equilibrium problem with an adaptive POD approach. First, the enrichment scheme proposed in Reference 11 is incompatible with the fact that the basis is only updated during fully-solved steps. Second, the inability of the reduced basis to correctly capture changes in mechanical behavior immediately ahead of the crack tip leads to a high number of retraining steps, negating most of the acceleration associated with solving an equilibrium problem of size n ≪ N. The partitioning strategy first proposed by Kerfriden et al 22 and later adopted as part of the framework in Reference 23 offers a potential solution to both of these issues, as it allows for a group of DOFs to be detached from the reduced solution space and directly solved for in the full-order space.
We divide the complete set of DOFs  into a set  of reduced DOFs and a set  of fully-solved DOFs ( ∩  = ∅). The DOFs in  are solved for in the full-order space while the reduced basis is used to approximate the solution in . With this partitioning, the reconstruction of u h becomes: where u f is a vector with displacements associated to DOFs in  and () is a submatrix obtained by selecting the rows of associated with the DOF indices in . After partitioning, the reduced solution vector becomes: By detaching  from the Galerkin projection constraint T r h = 0, we obtain the following partially-reduced versions of the force vectors and stiffness matrix: 22,23 f Ω = The question now becomes how to define  in order to minimize the number of retraining steps. In Reference 23, we have proposed a number of strategies to populate  when the framework is applied to problems with plasticity, including two criteria that search for regions with energy dissipation concentrations caused by plastic strain development. Here, we devise a similar strategy while taking advantage of full-field information already computed by the TLS method.
The region ahead of the crack tip shown in Figure 8, which concentrates most of the deviation from equilibrium that triggers retraining steps, is an intuitive choice for  . As this deviation is caused by the front movement F I G U R E 9 Definition of the fully-solved DOF set  from the intersection of the level set curve f with a viscous cumulative velocity field. DOF, degree of freedom [Colour figure can be viewed at wileyonlinelibrary.com] as the crack propagates, the extended velocity field computed in order to update the level set field becomes a useful tool to identify regions undergoing changes in damage distribution. A straightforward approach would be to include the DOFs of every node with nonzero velocity in  . However, two additional points must be taken into account: • The velocity field is extended to the whole domain, which means that points far away from the front also have positive velocities (see Figure 3). This would lead to an overly large  set; • Only points directly ahead of the crack tip have nonzero velocities. As the tip moves forward between two consecutive retraining steps, the nodes behind it will immediately drop out of  , but the basis would still be unable to correctly capture their behavior.
We address the first point by also requiring points in  to fulfill the constraint ≥ f -that is,  is the intersection between the region delimited by a level set f and the nonzero velocity field. As for the second point, we substitute the instantaneous extended velocity field v by a decaying cumulative velocity measure v * in order to retard the removal of points behind the crack tip from  . At the beginning of every load step, the velocity field v o computed during the previous step is used to update v * as follows: where is a decay parameter. A node i ∈  is added to  if the following conditions are satisfied: where is a cutoff velocity parameter that controls the shape of the domain formed by nodes in  and v * max is the value of v * at the node with the highest accumulated velocity. Figure 9 illustrates how each of the parameters , , and f influence the shape of  . At this point, it is important to mention that even though the deviation from equilibrium of Figure 8 is limited to the > 0 region, extending f beyond the front may prove beneficial in reducing the number of retraining steps. The region ahead of the front is where a gradual increase in configurational force takes place up to the point where Y c is reached. The movement of the front may therefore be hindered if the reduced basis does not allow for this stress concentration to emerge.
The partitioning strategy also allows for the reintroduction of the enrichment around = l c : any DOFs created by the enrichment scheme ( Figure 4) that are not already present in are added to  . This allows for each snapshot x u on X u to have a different size, depending on the value of N at the time when it is stored. The basis computed from this set of snapshots has, therefore, a number of rows N c = min i (size (x ui )). As older snapshots are overwritten, N c increases and the enriched DOFs now included in the basis are removed from  . With this strategy, the enrichment scheme is allowed to operate between two consecutive retraining instances. It is worth noting in passing that a similar scheme could also be used to combine a POD-reduced model with a remeshing strategy, other methods based on nodal enrichment such as XFEM 27 or overlapping domain techniques such as CutFEM. 28

Compact tension example
We now apply this partitioned POD strategy to the compact tension example of the previous sections. Figure 10 shows snapshots at two load steps of three different model executions with = 0.001 and different combinations of and f , resulting in different topologies for the region formed by the nodes included in  . For the first model, the decay parameter is high enough to guarantee that the size of  increases monotonically and covers the whole damaged band. In contrast, the decay is disabled for the second model by setting to a very low value, which means that only points with a positive instantaneous velocity are included in  . Finally, the third model seeks a balance between including nodes behind the crack tip and those ahead of the damage front while keeping  relatively small throughout the analysis. It is interesting to note that the third model, although having a smaller number of nodes in  , needs significantly less retraining steps and is therefore more efficient in terms of acceleration than the other two. In any case, the fact that  is updated at the beginning of every load step leads to an efficient and fully-adaptive partitioning scheme that naturally follows the crack as it moves through the domain. The influence of the shape and size of  on the efficiency of the reduced solution is further investigated by running the model with different combinations of 0.01 ≤ ≤ 100 and −10 ≤ f ≤ 3, with = 0.001. The resultant number of retraining steps for each combination is plotted in Figure 11, while the maximum size attained by  during the analysis is shown in Figure 12. For f = 3 mm,  is only used to accommodate DOFs from newly enriched nodes (since l c = 2 mm) and the same number of retraining steps (317) is obtained for all values of . As f decreases, an increasing number of nodes ahead of the crack tip are added to  .
At first there is a sharp decrease in the number of retraining steps, as the critical region shown in Figure 8 is included in  , with still significant decreases for f < 0, albeit at a gradually decreasing rate. This confirms that fully solving a finite region immediately ahead of the damage front is beneficial in reducing retraining frequency, depending on the problem being modeled and the choice of l c (see Section 4.1 for an extended discussion on this point). For even lower values of f , the number of retraining steps tends to stabilize. This is consistent with the fact that the stress concentrations that drive crack propagation gradually vanish as we move away from the damage front towards regions where deformation can be correctly reproduced in the reduced solution space.
The decay parameter also has significant influence on the number of retraining steps. In general, increasing leads to a decrease in retraining frequency, although no improvement is observed for 0 ≤ ≤ 0.1 due to the exponential nature of Equation (30). However, increasing to such extent that  covers the whole damaged band (eg, the first model in Figure 10) seems to be unnecessary. Because the basis is constantly being refreshed, regions of Ω d far away from the crack tip can be well approximated by the POD solution.
It is important, therefore, to seek a balance between populating  with enough nodes both ahead and behind of the crack tip while minimizing its size. As can be seen in Figure 12, unfavorable choices for and f can lead to highly inefficient reduced models with sizes of  that quickly approach the total number of DOFs N with only negligible reductions to the number of retraining steps.
Opting for the combination of parameters used on the third model of Figure 10 ( f = −2 mm, = 10, = 0.001), we obtain the load-displacement curve of Figure 13. In contrast to the curve obtained with  = ∅ (cf Figure 6), the reduced With this combination of parameters, the reduced equilibrium problem runs in 406.8s, accounting for 76% of the total execution time if all TLS phases are included. Of this total time, only 0.54s is dedicated to the SVD operations used to update . Although the resulting speed-up of 3.7 times with respect to the full-order equilibrium problem is already more than twice as high as the one obtained with  = ∅, in the following section we attempt to further increase the level of acceleration by reducing the time spent on assembling K h and f Ω h .

Domain-based hyper-reduction
We now seek a way to compute f Ω -and consequently K as consistent linearization of f Ω (u)-in a faster way and with minimum loss of accuracy. Because the basis reduces the size of the equilibrium problem to n ≪ N, the minimum amount of information needed to assemble f Ω is also greatly reduced. However, in the POD model f Ω is obtained by compressing the data gathered in a loop over all M integration points. In the final example of Section 3.2, 42% of the total time spent on the equilibrium problem was dedicated to this assembly of f Ω h and K h . Strategies for accelerating this assembly operation are referred to as hyper-reduction methods.
The problem becomes to choose from the complete set of M integration points  a m ≪ M subset  ⊂  with associated modified integration weights in such a way as to minimize the error between the M-integrated f Ω and its m-integrated counterpart: (32) In order to solve Equation (32), we use the so-called ECM originally proposed by Hernández et al 17 and later adopted in the framework of Reference 23. At the end of every fully-solved step, the reduced internal force contribution f (x i ) ∈ R n of each integration point is used to construct a single force snapshot X f ∈ R M×n given by: where reductions are performed with the updated basis. Since only a single snapshot is used to perform the reduction, a slightly different approach than the one used in Reference 23 is adopted: instead of storing stress snapshots and using SVD to obtain a basis matrix for stresses, here the SVD is used to decompose X f : where ∈ R M×p is a basis matrix truncated to the first p singular vectors. With this basis for internal forces, the minimization problem of Equation (32) is rewritten as the nonnegative least-squares minimization given by: where J and b are: and the modified integration weights are obtained through element-wise multiplication of by the original integration weights w () . For more details on the formulation and implementation of the method, the interested reader is referred to References 17, 23.
The preceding development implies that the integration of the whole volume Ω is going to be reduced (see Equations (33) and (36)). For the current framework, this would not be desirable because: • With ECM, the full-order force vector f Ω h is only defined at a small set of nodes and the modified weights deprive it of its physical meaning. Reducing the whole domain Ω would therefore make computing the deviation from equilibrium used to trigger retraining steps (Equation (25)) impossible; • Solving for DOFs belonging to the full-order set  requires the internal forces at those nodes to be fully integrated.
Reduction of the complete domain would therefore make it impossible to use the equilibrium partitioning scheme of Section 3.2.
It is clear that at least part of Ω should still be fully integrated even after hyper-reduction. We therefore divide Ω into an arbitrary number † of hyper-reduced domains  and one fully-integrated domain , as illustrated in Figure 15. With this domain decomposition strategy, the reduced internal force vector is computed as: This limited set of points might not be enough to accurately represent the whole volume Ω. Defining multiple  domains allows for this integration error to be controlled. and the minimization problem of Equation (35) is solved separately for each domain  i . Upon retraining, the domain configuration is updated in order to take into account changes in crack topology and new sets of points and weights are computed for each  i . From the preceding discussion, the minimum requirement for  is that it should at least contain the elements in  , including those with newly-enriched nodes. For simplicity, we adopt here the same strategy used to define  but using a value p < f for the minimum level set value that defines the contour of the domain. This results in a  domain that is always larger than  in order to account for the movement of the front between two consecutive retraining steps. The motivation behind this choice stems from the fact that  cannot cross over the boundary of  into the region where internal forces are not defined. The difference between p and f should therefore be large enough as to allow the front to grow until the next retraining step is triggered without allowing  to touch the  −  border.

F I G U R E 15
The cracked region (where > l c ) is also included in  in order to correctly represent the stress discontinuity introduced by the enrichment scheme. Since this is a small region composed of a single line of elements, the acceleration loss caused by extending  to cover its whole length is negligible. Since now f Ω h is only correctly defined in , the retraining check of Equation (25) becomes: where the subscript p refers to quantities from nodes in .
The remaining volume Ω ∩  is subdivided into ECM domains . In Reference 23 we have proposed to use a k-means clustering algorithm to divide the domain into clusters with similar strain and use these clusters to define . Here we exploit the introduction of the − s coordinate system of Figure 1 to propose a simpler and more efficient approach:  can either be defined as TLS bands (regions between two specific values of ) or as bands defined by a range of s values after using a fast marching algorithm to extend the nodal s values at the front to the rest of the domain. Alternatively, domains may be defined by combining both strategies, forming regions delimited by iso-and iso-s curves.
It is important to recall that as the front Γ moves and new cracked regions are created, the integration scheme of elements cut by iso-zero and iso-l c changes ( Figure 4). It follows that a scenario might arise in which the integration of an element containing an ECM point changes between two consecutive retraining steps. Although this situation is mostly avoided by including cracked regions and regions with positive velocity in , it might still happen under certain circumstances (eg, if p > 0). As an additional safeguard, we store the coordinates of the points in  at the time of training and use them when computing strains at ECM points instead of relying on the updated ‡ integration point locations.

Compact tension example
We now revisit the compact tension example one last time. The adaptive partitioned POD strategy of Section 3.2 is complemented with the present domain-based hyper-reduction approach. The parameters used are n f = 6, SV = 1 × 10 −6 , f = −2 mm, force = 0.025, p = −4 mm, = 10, and = 0.001. A precision greedy = 1 × 10 −10 is used to truncate the SVD ‡ We still allow the integration scheme in  to be updated because the two remaining analysis phases ( Figure 2) are still fully integrated and rely on a higher integration precision along the damage front.  Figure 16 shows snapshots of three model executions with different partitioning strategies: using only the level set field (n = 4), using only the parametric coordinate s (n s = 4), and combining both and s (n = n s = 4). The adaptive nature of the present domain decomposition approach is evident: as new retraining steps are triggered, the domain topology changes and automatically follows the damage front as it propagates to the right edge of the specimen. The benefit of increasing the number of ECM domains can be seen in the load-displacement curves of Figure 17. For n = n s = 1, the global reduced behavior tends to deviate from the full-order curve in a similar way as observed in Figure 6. Furthermore, a higher number of retraining steps are triggered, with 94 steps for n = n s = 1 and only 48 steps for n = n s = 10. Since the number of POD modes is at most n = 6, the maximum number of ECM points on a single domain is p + 1 = 7. These 7 points are not enough to accurately integrate the whole domain  = Ω ∩ , resulting in a poor approximation for f Ω which in turn leads to a higher retraining frequency. For n = n s = 10, the presence of 100 domains allows for a higher number of integration points to be used (up to 700 points in this case).

F I G U R E 16
Regarding the numerical stability of the hyper-reduced model, ECM guarantees that the reduced-order stiffness matrix retains its positive definiteness and thus its numerical stability. 17 Furthermore, solving Equation (35) leads to the best possible approximation (in a least-squares sense) of the reduced force vector and is found to be robust for  domains of any size: in the lower-bound situation in which a domain  comprises only a single element, the error of the greedy procedure drops to a value close to zero after the first point is added and the procedure is stopped.
With n = n s = 10, the obtained response agrees very well with the full-order one while still achieving a compression ratio of M m ≈ 110. Even with 100 ECM domains, assembly remains a very cheap operation, meaning that the increased Another way to look at differences in efficiency between the models with one and 100 ECM domains is by plotting the cumulative number of reduced steps versus the total number of load steps ( Figure 18). The model with 100 domains behaves similar to the fully-integrated adaptive POD model of Figure 14, while the higher number of retraining steps associated with the model with a single ECM domain makes the curve shift further away from the linear upper bound associated with a nonadaptive pre-trained model. The model is executed for different combinations of 1 ≤ n ≤ 10 and 1 ≤ n s ≤ 10 and the resulting number of retraining steps is shown in Figure 19. Increasing either n or n s in isolation leads to lower retraining frequencies, although increasing the number of domains in the s direction seems to be slightly more beneficial in this case: for n s = 5 and n = 1, the number of retraining steps of the hyper-reduced model is already close to that of the fully-integrated model. Although the retraining frequency tends to converge to the one obtained with only POD, no strict lower bound seems to exist for the number of retraining steps after ECM, being at times lower than the one from the fully-integrated model.
Aside from affecting retraining frequency, the number of ECM domains also dictates the total number of constitutive model computations and matrix assembly operations. Both of these aspects affect the speed-up of the hyper-reduced model. With n s = 1 and 1 ≤ n ≤ 100, the acceleration levels obtained for the equilibrium problem with respect to the full-order solution are shown in Figure 20. For 1 ≤ n ≤ 10, the acceleration sharply increases due to the decrease in retraining frequency. After this point, the frequency stabilizes and further increases in the number of domains do not affect the acceleration level considerably. For n = 10, the total time dedicated to solving the equilibrium problem is 279s, of which 4.1s are spent on updating , , and . Comparing the time spent by the model from Section 3.2 on updating (0.54s), we see that hyper-reduction significantly increases the overhead associated with updating the reduced bases, although it still represents a very small fraction of the total execution time. Interestingly, this overhead does not increase as we add more ECM domains, as the complexity of the minimization problem of Equation (35) scales linearly with the total number of integration points of each domain. Although the acceleration level of approximately 5 times is higher than the one obtained with the fully-integrated partitioned POD strategy (Section 3.2), the benefit of using hyper-reduction in combination with the linear material employed here is limited: the computational effort associated with computing stresses at the integration points is already very low. It follows that the speed-ups obtained here may be seen as the lower bound of what can be expected from the present hyper-reduction approach. Furthermore, acceleration can be expected if nonlinear material models are employed (cf 23,29 ). However, it is worth mentioning that for nonlinear material models with history, an additional history recovery component must be added to the framework since history must be known at every integration point during retraining steps.

ADDITIONAL EXAMPLES
In this section, two additional numerical examples are shown in order to further assess the performance of the reduction framework. The first is a beam loaded in three-point bending with a single straight crack. The second is a doubly-notched plate loaded in tension with two curved cracks that eventually merge. Apart from demonstrating that the framework can be used on models with various loading conditions and crack topologies, we investigate the effect of changing two parameters which have been kept fixed until now, namely the thickness l c of the damaged zone and the force tolerance force .

Three-point bending
The first example concerns the three-point bending test shown in Figure 21. The geometry is taken from Bernard et al 11  Since the introduction of Y c values related to strength and fracture energy decouples l c from G c (see Equation (14) and Reference 14), we are in principle free to choose the most convenient value for l c and still obtain the same crack propagation behavior. As the definitions of  and  are directly linked to the size of the damaged band, it is interesting to investigate how the performance of the framework is affected by l c . We therefore solve the same problem for both l c = 5 mm and l c = 15 mm.
We start by investigating the influence of f on the retraining frequency of the POD-reduced model (fully integrated). The number of fully-solved steps for −40 ≤ f ≤ 15 is plotted in Figure 22 for both values of l c . Although the value of f at which the retraining frequency stabilizes changes with l c ( f = −20 mm for l c = 5 mm and f = −10 mm for l c = 15 mm), the total width of the material band that comprises  (w  = 2l c + 2 f ) at stabilization seems to be the same for both models.
Most interesting is the fact that the model with l c = 5 mm needs almost three times more retraining steps than the model with l c = 15 mm when both are executed with an empty §  domain. This can be more clearly visualized in Figure 23, where we plot the size of  versus the number of retraining steps. Increasing l c leads to a lower stress at the crack tip, which in turn leads to a slower buildup of the norm of the deviation from equilibrium used to trigger retraining § With the exception of DOFs of enriched nodes not currently covered by the basis (see Section 3.2). (Equation (25) and Figure 8). This suggests the possibility of increasing l c with the specific purpose of improving the efficiency of the reduced model. However, the rate of decrease in retraining frequency is steeper for the model with smaller l c , and any acceleration gained by increasing l c is quickly compensated by the sharp reduction in retraining steps as DOFs are added to  .

F I G U R E 24
For this specific example, the f values that lead to the highest speed-ups are f = −10 mm for l c = 5 mm and f = 0 mm for l c = 15 mm. This corresponds to an  domain with w  = 2l c + 2 f = 30 mm for both models and leads to a similar number of retraining steps (Figure 22).
Adopting p = f − 10, the time spent on the hyper-reduced equilibrium problem is 4.9 times shorter and 5.2 times shorter than the full-order one for l c = 5 mm and l c = 15 mm, respectively. Although this measure of speed-up only includes the time spent solving the equilibrium phase of the TLS, it is interesting to note that the model with l c = 5 mm spends only 25% of the total analysis time on the two remaining TLS phases, while the model with l c = 15 mm spends 53% of the total time in those phases due to its larger Ω d domain.
A comparison in terms of load-displacement response between the full and reduced models is shown in Figure 24, while domain topologies at two different load steps are shown in Figure 25.

Doubly-notched plate
The final example is the doubly-notched square plate of Figure 26, inspired by the crack merging example by Moës et al. 10 Material properties are kept the same as the ones from the previous examples and a value l c = 0.6 mm is adopted. The full-order model is discretized with 86 068 constant-strain triangles and has N = 86 696 DOFs and M = 86 068 integration points at the beginning of the analysis. Two damage fronts nucleate at the notches and grow towards the center of the plate. The interaction between the fronts cause their trajectory to become curved and eventually merge, leaving a region of intact material at the center of the plate. We initially adopt the same tolerance force = 0.025 used in the previous examples. By running the POD-reduced model for different sizes of  , we find that the combination f = −1.2 mm, = 7, = 0.001 gives the best balance between the number of DOFs in  and the number of retraining steps. The optimum value for we find here is different from the one used in the previous examples ( = 10), suggesting that optimum values for the viscosity parameters used to define  depend to some extent on the problem being solved. With n = 100, n s = 1, and p = −2.4 mm, the integration domain topology at four different load steps is shown in Figure 27, together with the associated displacement u top of the top edge of the plate.
The adaptive reduction approach is capable of automatically following the two damage fronts simultaneously while taking into account changes in propagation direction. We note that even though some reduction domains are spatially F I G U R E 27 Doubly-notched plate: ECM domain topology and crack shapes at four different load steps (n = 100, n s = 1). ECM, empirical cubature method F I G U R E 28 Doubly-notched plate: ECM domain topology and crack shapes at four different load steps (n = n s = 10). ECM, empirical cubature method F I G U R E 29 Doubly-notched plate: load-displacement curve for the full model and hyper-reduced models with different force tolerances disconnected-for example, the  domain or the domain  100 at the edges of the models-they are treated as single domains for training and numerical integration purposes. As the damage fronts start to merge, the  domain becomes distorted as the velocity ahead of the crack tip and towards the center of the plate start to vanish. Nevertheless, the model is able to maintain an efficient retraining frequency, with 34 retraining steps out of a total of 440 steps. For n = n s = 10, the more irregular domain topology shown in Figure 28 is obtained due to the fact that a single s coordinate system is used for both damage fronts (when switching between fronts, the value of s at the starting point of the second front is made to be the same as the value at the final point along the first front). In any case, the performance in terms of acceleration and retraining frequency is similar for both models.
The load versus the displacement at the top of the plate is plotted in Figure 29. Although the curve with force = 0.025 closely follows the one from the full-order model, the precision is lower close to the onset of global softening. Although the correct crack topology is still obtained and the post-peak behavior is correctly captured, this loss of precision might F I G U R E 30 Doubly-notched plate: speed-ups with respect to the full-order solution for models with different force tolerances. The marked points correspond to the two models shown in Figure 29 be undesirable. Running the same model with force = 0.01 eliminates the oscillations and produces a response almost indistinguishable from the full-order one ( Figure 29).
Naturally, the increased precision obtained with a lower force comes at the cost of additional retraining steps and reduced speed-ups. We execute the model for multiple values of force and plot the obtained acceleration levels in Figure 30. The speed-up appears to decrease linearly as force is decreased up until a value around force = 0.005, after which point a sharp loss in efficiency is observed. In any case, acceleration levels similar to the ones from the previous examples are obtained even though the framework needs to cope with two cracks instead of one.

CONCLUSIONS
MOR techniques have been used in an attempt to accelerate the mechanical equilibrium problem that constitutes the main computational bottleneck associated with the TLS method. The need for an offline training phase was circumvented by resorting to adaptive reduction techniques that refine their reduced solution spaces through online retraining. The framework is based on a domain decomposition strategy that uses the curvilinear coordinate system associated with the level set front to adaptively define a region with fully-solved DOFs, a POD-reduced region with full integration and a user-defined number of hyper-reduced regions. The framework was gradually constructed starting from a POD-reduced model without partitioning. Although the reduced model was able to capture the correct crack propagation behavior, a large number of retraining steps was necessary because the reduced solution space could not represent changes in damage distribution ahead of the crack tip. Furthermore, the unpartitioned POD solution has been found to be incompatible with the enrichment scheme used in the TLS.
The equilibrium system was therefore partitioned in order to allow for part of the mesh to be solved in the full-order space. The level set and velocity fields were used to define a region with full DOFs that includes enriched nodes and regions both behind and ahead of the crack tip. The partitioning strategy allowed for the enrichment scheme to be used and was found to drastically decrease the number of retraining steps necessary to maintain precision, leading to higher levels of acceleration.
Domain-based hyper-reduction based on the ECM method was used to reduce the effort associated with computing the full-order internal force vector and stiffness matrix. In order to still be able to solve DOFs in the full-order space and estimate the error associated with model reduction, full integration was maintained on regions close to the crack tip. Using a single ECM domain was found to yield a high integration error and increase retraining frequency. This error vanishes as the number of domains is increased, while maintaining a reduction of several orders of magnitude in the number of integration points.
The reduction framework was demonstrated using three different numerical examples. The method was found to be suitable for different loading scenarios and crack topologies and the adaptive domain decomposition strategy was able to follow multiple cracks with complex propagation paths simultaneously. However, a number of parameters were found to be problem-dependent and had to be individually tuned for each example. The hyper-reduced equilibrium problems ran approximately 5 times faster than their full-order counterparts, with even higher speed-ups being expected if more complex constitutive models are used. Although the framework has been applied to the specific case of the TLS method, it can also be employed to accelerate other failure models as long as suitable strategies for defining the domains  , , and  are developed.