Lippmann‐Schwinger solvers for the computational homogenization of materials with pores

We show that under suitable hypotheses on the nonporous material law and a geometric regularity condition on the pore space, Moulinec‐Suquet's basic solution scheme converges linearly. We also discuss for which derived solvers a (super)linear convergence behavior may be obtained, and for which such results do not hold, in general. The key technical argument relies on a specific subspace on which the homogenization problem is nondegenerate, and which is preserved by iterations of the basic scheme. Our line of argument is based in the nondiscretized setting, and we draw conclusions on the convergence behavior for discretized solution schemes in FFT‐based computational homogenization. Also, we see how the geometry of the pores' interface enters the convergence estimates. We provide computational experiments underlining our claims.


INTRODUCTION
Numerical algorithms for computational homogenization based on the fast Fourier transform (FFT) have been a subject of intense research in recent years, see Matouš et al 1 for an overview, and a discussion of related computational techniques for homogenizing nonlinear material behavior. Originally introduced by Moulinec-Suquet, 2,3 FFT-based computational homogenization techniques operate on a regular grid, integrating nonlinear, or inelastic material behavior is straightforward and (most often) the ensuing algorithms exhibit mesh-independent convergence behavior. Thus, they are well suited for treating complex microstructures, for instance characterized by image-based techniques, and the integration into concurrent 4,5 or order-reduced multiscale methods. 6,7 The original solution method of Moulinec-Suquet, what they call "the basic scheme," and solvers with improved convergence behavior, may be essentially divided into two classes. The first class is derived from the basic scheme, and operates on the displacement 8 (or, equivalently, on compatible strain fields). These solvers include Newton's method, [8][9][10] the linear and nonlinear conjugate gradient methods, 11,12 fast gradient methods, 13,14 the Barzilai-Borwein scheme, 15 the recursive projection method, 16 the Anderson-accelerated basic scheme, 17 and other Quasi-Newton methods. 18 The second class of solution methods are the polarization schemes, which operate on so-called polarization fields which are neither compatible nor equilibrated. These include the Eyre-Milton method, 19 the augmented Lagrangian scheme, 20 and the Monchiet-Bonnet method(s). 21 It was realized 22 that all of these methods may be interpreted as damped variants of the original Eyre-Milton method, where the damping leads to an increased stability at the cost of computational speed. Polarization methods find application, in particular, for strongly convex problems, where the complexity-reduction trick 23 is available, including linear media and elastoplastic as well as elastoviscoplastic small-strain media. Furthermore, the linear solvers of 25 also belong to the class of polarization schemes (although they operate on a different polarization 26 ).
Besides research on solution methods, research effort was devoted to discretization schemes. In addition to the original scheme of Moulinec-Suquet, which can be interpreted either as a trigonometric collocation method 27 or as a Fourier-Galerkin discretization with numerical integration, 28 fully integrated Fourier-Galerkin discretizations were explored. 29,30 Also, finite difference, 31,32 finite element discretizations 25,33,34 and B-splines 35 were exploited.
The original motivation for studying different discretization schemes on regular grids was to reduce ringing artifacts in the solution fields. Later, it was realized that the original discretization of Moulinec-Suquet is numerically unstable for porous materials, whereas certain finite difference and finite element discretizations are stable. In particular, solvers that were developed for the original Moulinec-Suquet discretization may converge quickly when combined with other discretization schemes, whereas they only converge slowly for the Moulinec-Suquet discretization, as also noticed by Moulinec-Suquet-Milton 36 .
In this work, we show that the basic scheme, applied to a class of materials involving pores and for suitable reference material, converges linearly. The convergence factor depends not only on the material contrast of the nonporous phase, but also on the geometry of the interface. Furthermore, we discuss several basic-scheme based methods whose (super)linear convergence may be concluded as well. We point out that these results, derived for the continuous, nondiscretized iterative schemes, are not preserved by all discretizations. Indeed, Fourier-Galerkin discretizations do not lead to linearly converging iterative schemes, in general (but have other merits for nonporous materials with small contrast, in particular for polycrystalline materials 37 ).
This article is organized as follows. We develop the convergence theory for the basic scheme and porous materials, see Section 2, based on a functional analytic setting 13,30,38 . More precisely, it is shown that there is a "natural" subspace, on which the Lippmann-Schwinger equation is nondegenerate and which is preserved by the basic scheme. Thus, we may draw upon the convergence theory of gradient descent applied to strongly convex functions, see Appendix A1, to conclude linear convergence of the basic scheme for a safe choice of reference material. We then go on to discuss associated schemes that also converge, including fast-gradient methods, the (linear) conjugate gradient scheme and Newton's method.
In Section 3, we discuss how these results relate to finite element and spectral discretizations. We wrap up our discussion by a few well-selected computational examples, see Section 4.

CONVERGENCE OF GRADIENT-BASED LIPPMANN-SCHWINGER SOLVERS
The purpose of the first Section 2.1 is to introduce the class of homogenization problems that we consider in the functional analytic framework in a form similar to Schneider 13 and Vondřejc, 30 deriving the basic scheme of Moulinec-Suquet 2,3 as a gradient-descent method, see Bellis-Suquet 38 for a derivation in the convex setting. However, we pay special attention to minimum assumptions ensuring convergence of the gradient-descent method.
A linear convergence behavior of the gradient-descent scheme may be established under an additional (uniform) strong convexity assumption on the strain part of the free energy. In particular, for materials which lack coercivity, linear convergence may fail. In practical computations, it is frequently observed that for some discretization schemes, the basic scheme still converges linearly. This observation is shown to originate-in the continuous, nondiscretized setting and in the presence of pores-from a specific subspace which is preserved by the basic scheme, and on which the variational problem under consideration is strongly convex, see Section 2.2. From the latter results, also (super)linear convergence of associated numerical methods may be concluded, see Section 2.3.

General setup and logarithmic convergence
For positive L 1 , L 2 , … , L d , we consider a periodic unit cell Suppose two disjoint open sets Y 0 and Y 1 are given whose closures cover Y . Y 0 models the pore space, whereas Y 1 represents the volume where material is present. A schematic is shown in Figure 1.
Suppose a stored energy density 1 w : Y × Sym(d) → [0,∞) is given, which we assume to be measurable in the first variable, and continuously differentiable in the second variable. Suppose, furthermore, that w(x, ) = 0 for all x ∈ Y 0 and any ∈ Sym(d) and that there are positive constants + and − , s.t. and hold. These conditions ensure that, for any x ∈ Y 1 , the function  → w(x, ) on Sym(d) is − -strongly convex with + -Lipschitz gradient. If we assume, in addition, that the stress at rest (associated with zero strain) has finite energy, that is, holds, w gives rise to a well-defined (Nemytskii) operator on L 2 (Y ;Sym(d)) via see section 10.3.4 in Renardy-Rogers, 40 which we call the stress mapping, as it associates to any square-integrable strain field its corresponding Cauchy stress field. For the normalized inner product, see Schneider 13 or remark 3 in Bellis-Suquet, 38 , the stress mapping arises as the L 2 -gradient of the function and is + -Lipschitz continuous and monotone, that is, satisfies The latter monotonicity is equivalent to the function W being convex, see Theorem 2.1.9 in Nesterov. 41 Let H 1 # (Y ; R d ) denote the space of (periodic) H 1 -displacement fields with vanishing mean. We endow this space with the Korn-type inner product where ∇ s denotes the symmetrized gradient operator (∇ s u) ij = 1 2 ( i u j + j u i ) (i,j = 1, … , d). For prescribed macroscopic strain E ∈ Sym(d), we seek to minimize the functional The Euler-Lagrange equation computes as see Schneider,13 where div is the divergence operator (div . From a mechanical perspective, Equation (6) encodes the quasi-static balance of linear momentum at small strains (without volume forces). For the Korn-type inner product (4) on H 1 # (Y ; R d ), the gradient of f , cf Equation (5), computes as where G = (div ∇ s ) −1 is the negative of the Riesz map, also called Green's operator, see Kabel et al. 8 For fixed step size s > 0, the gradient-descent scheme for f reads see Equation (A2) in Appendix A1. The latter scheme is the displacement-based variant of the basic scheme introduced by Moulinec-Suquet 2,3 if the reference material is chosen as a multiple of the identity, C 0 = 1 s Id, that is, as an isotropic linear elastic reference material with Lamé constants 0 = 1 2s and 0 = 0. Indeed, applying the ∇ s -operator to Equation (8) and adding E yields see Vondřejc et al 28 or Schneider. 13 For the analysis, the displacement-based iterative scheme (8) will turn out to be more convenient.
For the general setup we just described, the function f has an + -Lipschitz continuous gradient and is convex. To see the Lipschitz bound, recall that, for any Banach space X and ∈ X ′ we have see any text on elementary functional analysis, for instance Megginson. 42 Then, using the defining property of the Riesz map, we have Thus, integrating by parts, we have cases which are excluded by the geometric assumption where we used the Cauchy-Schwarz inequality in the second line. Thus, the gradient ∇f is + -Lipschitz. The convexity, that is, the monotonicity of the gradient, may be deduced by similar arguments. Indeed, by the monotonicity of (3), we have . These derivations are slightly more general than in Schneider, 13 where only the case of twice differentiable (in ) stored energy function w was discussed.
As we assumed w to be bounded from below by 0, f is also bounded from below by zero. Thus, we may apply the convergence assertion and rate of estimate (A3) in Appendix A1. Choosing s = 1 + , we obtain the logarithmic convergence estimate This result holds for any distribution of the porous phase Y 0 . In particular, heavily degenerated cases are included, for instance pores with irregular interfaces, or freely floating material inside the void space, see Figure 2. In the next section, we will show that a particular regularity condition on the material distribution permits drawing stronger conclusions on the convergence behavior, and on the convergence rate.

A subspace on which the basic scheme converges linearly
In this section, we will show that the convergence rate (10) can be drastically improved provided we make stronger assumptions on the distributions of voids.
Geometric assumptions: 1. The interface I between Y 0 and Y 1 is Lipschitz-continuous.

For any
For the first condition, Lipschitz continuity means that the interface may be locally parameterized by the graph of a Lipschitz-continuous function. Thus, interfaces I = Y 0 = Y 1 , which are too irregular, are excluded. For instance, the interface in Figure 2A includes cusps which are not Lipschitz. However, any continuously differentiable interface is Lipschitz. Also, corners are allowed.
Lipschitz continuity of the interface ensures that there are well-defined restriction operators to the boundary, and, conversely, extension operators from the boundary, all in suitable Sobolev spaces, see Adams 43 or Morrey. 44 With these operators at hand, standard elliptic theory on Y 0 and Y 1 , respectively, may be developed, see Fichera. 45 The second condition rules out that Y 1 may be displaced by an infinitesimal rotation. Also it ensures that Y 1 is connected. Then, a suitable Korn's inequality 46 may be derived, which permits establishing elliptic theory for linear elasticity. 45 We introduce the space where H 1 0 (Y 0 ; R d ) denotes the completion of all smooth vector fields on Y 0 that vanish, together with all their partial derivatives, on the interface I = Y 0 , in the H 1 -norm. As it is defined as the intersection of a collection of kernels of continuous functionals, V is a closed subspace of To gain some intuition into the space V, we notice that the definition of V is just the weak form of the partial differential equation Put differently, vector fields in V may take arbitrary values in Y 1 , and their boundary values on the interface I are extended elastically (12) into the pore space Y 0 .
Instead of the full homogenization problem (5), we restrict the objective function to the subspace V, that is, we set Under the assumptions on the stored-energy density w of Section 2.1, f V is + -Lipschitz-continuous. Indeed, f V is the restriction of f onto V, and f is + -Lipschitz-continuous. However, f is merely convex. By contrast, f V is strongly convex. This can be seen as follows.
The fundamental functional-analytic ingredient is the estimate involving a positive constant c I , which depends only on the interface I. The estimate (14) asserts that the elastic energy stored in all of Y is determined, up to a universal constant, by the elastic energy stored only in the Y 1 -part. This appears convincing, as for elements u ∈ V, the behavior on Y 0 is fully determined by the behavior on Y 1 . For reasons of readability, the derivation of the estimate (14) under our assumptions is outsourced to Appendix B1.
With the estimate (14) at hand, the strong convexity of f V (13) is immediate. Indeed, for u 1 ,u 2 ∈ V, we have In the first line, we use that the stress vanishes on Y 0 . In the second line, we insert the − -strong convexity of w (2). The fourth line is a consequence of the estimate (14), applied to the third line.
Thus, we see that f V is strongly convex, and the strong convexity constant c I − does not only depend on the materials involved, but is strongly influenced by the shape of the interface I.
As f V is strongly convex and bounded from below, the minimization problem is well-posed, and solvable, for instance, by gradient descent, with a linear convergence rate, see Appendix A1. At this point, a subtlety enters. For a general closed subspace V, the gradient-descent scheme for f V , may differ from the gradient-descent scheme on the entire space (8), because the gradients differ. Indeed, the standard gradient ∇f (u) where P V denotes the orthogonal projector onto V. However, we shall show that, for the subspace V (11) and the inner product To show this equality, in view of the characterization of ∇ V (15), it suffices to show that Indeed, then we have We will even derive the inclusion which is even stronger than the statement (17). Indeed, setting = (E + ∇ s u) for u ∈ V implies (17) in view of the gradient formula (7).
Then, as G is the negative of the Riesz map which reads, more explicitly upon integration by parts. However, showing (18). As a consequence of this section, we state the following. Let a decomposition Y = Y 0 ∪ Y 1 be given, which satisfies the boxed assumption with constant c I , cf estimate (14). Then, for any stored-energy function w whose gradient on Y 1 is − -monotone (2) and + -Lipschitz (1) for positive constants + ≥ − > 0, for u 0 ∈ V, the associated gradient-descent scheme (8) for s ∈ for the contrast = + c I − . At the end of this section, several remarks are in order.
1. We introduce the space V in an ad hoc manner (11). However, the space arises naturally from considerations based on linear functional analysis. Indeed, for linear elasticity, w(x, ) = 1 2 ∶ C ∶ in terms of a heterogeneous stiffness tensor C ∶ Y → Sym(d), the Euler-Lagrange equation (6) is equivalent to for the bounded linear operator The solvability of Equation (21) may be established under the geometric assumptions based on elliptic theory. However, of greater interest to us is that the Equation (21) is even uniquely solvable for in terms of the kernel A straightforward calculation shows that (ker A) ⟂ is identical to the space V (11). In particular, it is independent of the stiffness tensor field C.
To keep the technical level to a minimum, we do not go into detail about the linear case. Instead, we directly start with the space V (11). In particular, we chose the variational setting, as discussed by Kabel et al 8 and Bellis-Suquet, 38 with simplicity of exposition in mind. The results are readily extended to the setting of Lipschitz-continuous and strongly monotone stress operators, as discussed in section 14.11 in Milton or Lemma 6.8 in Schneider, 27 as long as primal formulations are used. 2. In the same way that porous problems may be solved by (super)linearly convergent solvers in the primal formulation, problems with rigid inclusions may be solved by (super)linearly convergent solvers in the dual, that is, stress-based, formulation. 47 The line of argument proceeds by interchanging quantities which are dual, that is, interchanging compatible strains by equilibrated stresses, and interchanging displacement boundary conditions by traction boundary conditions, and by defining a suitable subspace on which the problem is nondegenerate. However, upon discretization, some care has to be taken. Indeed, the compatibility of the strain field is often preserved by the discretization scheme (for instance by discretizing the displacement, in the first place). However, the divergence constraint on the stress is incompatible to low-order finite elements, see Ciarlet. 48 As a remedy, nonconforming finite-element spaces might be used, and some care has to be taken. An extension to mixed formulations, that is, those involving both primal and dual variables, see Bellis-Suquet, 38 would require a closer look. In particular, it might be possible to treat materials having pores and rigid inclusions at the same time by effective numerical methods. 3. The interested reader might wonder about the form of the elastic extension operator (12). The results of this section may be extended to the more general inner product in terms of a reference stiffness C 0 , as discussed in section 2 of Bellis-Suquet, 38 and the subspace which leads to the C 0 -elastic extension However, then the constant in the estimate (14) depends on the interface I and the reference stiffness C 0 . 4. For linear elasticity, the initial u 0 ∈ V may be chosen as zero, that is, 0 = E. Indeed, the zero vector is contained in any subspace. For (pseudo)time-dependent problems, u 0 may be chosen as an extrapolation of converged iterates from previous steps, as long as they are elements of U, see the appendix in Moulinec-Suquet. 3 For the strain-based formulation, any convex combination of previously converged strain iterates is admissible. 5. In contrast to the nonporous case, 3,19 the number of required iterations depends not only on the material contrast + − , but also on the geometry (and regularity) of the interface. Unfortunately, we are not aware of a simple or cheap way of computing the constant c I . Thus, we use the conservative choice s = 1 + for the estimate (20).
6. Adapting these arguments to mixed boundary conditions 49 is straightforward. 7. In practical computations, * is unknown, in general. However, the estimate (A8), valid for strongly convex functions with Lipschitz gradient, permits us to translate L 2 -convergence rates of k toward * into H −1 # -convergence rates of div ( k ) and vice versa, that is, are identical ways to measure the residual, see section 4.1.2 in Bellis-Suquet. 38

Other at least linearly convergent schemes
In this section, we discuss a variety of solution schemes where linear convergence may be established with similar arguments as for the basic scheme. They all have in common that they are gradient-based optimization schemes.

The Barzilai-Borwein method
The Barzilai-Borwein method 15,50 may be interpreted as a variant of the basic scheme (8) with variable step-size s k determined via s 0 = 1 respectively, and C k = 1 s k Id as well as Γ k = s k ∇ s Gdiv. In the context of FFT-based methods, 15 the formula for the step-size becomes For the same class of admissible initial values, and linear elastic material behavior, the Barzilai-Borwein method (24) converges r-linearly for a constant ∈ (0, 1) depending on the contrast = + c I − , see  If the stored energy density is twice differentiable, local r-linear convergence may be established. 51 There are counterexamples to global convergence of the Barzilai-Borwein method even for strongly convex functions. 52 However, so far, the Barzilai-Borwein scheme (24) without additional globalization technique appears to work well in the context of FFT-based methods for computational homogenization.

Fast gradient methods
Fast gradient methods come in essentially two flavors. The first prototype is the heavy-ball method 53 with u −1 = u 0 , involving, in addition to a step-size, an inertial parameter ∈ [0, 1), see also the strain-based variant 14 Notice that the heavy-ball method (26) may profit from the strong convexity of the functional, because u 0 ∈ V and u −1 ∈ V imply u k ∈ V for all k = 1,2, … Indeed, the iterative scheme (26) differs from the classical gradient step (8) only by a linear combination of elements in V.
In case of linear elasticity, choosing , there is a constant C , s.t.
|| k+1 − * || L 2 ≤ C l || k − * || L 2 for k = 0, 1, · · · , and l = 0, 1, · · · , where C need not be bounded as → . 54 Thus, for the linear elastic case, the heavy-ball method could be extremely useful for problems involving pores. Unfortunately, both the step size and the inertial parameter critically depend on knowing the strong convexity constant, limiting the practical usefulness of the heavy-ball method. Also, for minimizing general strongly convex twice differentiable functions, the heavy-ball method converges only locally with a fast rate (27). In general and in terms of the global convergence behavior, the heavy-ball method is not faster than pure gradient descent. 55,56 The second prototype of a fast-gradient scheme is Nesterov's method 41,57 v k = u k + (u k − u k−1 ), with u −1 = u 0 , see Schneider 13 for the implementation in the context of FFT-based methods. For Nesterov's fast gradient method, the extrapolation step precedes computing the gradient, whereas these roles are reversed for the heavy-ball scheme. Clearly, also Nesterov's method may profit from the strong convexity of f V (13), as u k ∈ V and v k ∈ V provided u 0 ∈ V and u −1 ∈ V.
Nesterov's method leads to accelerated convergence behavior for a more general class of objective functions. For the stored energy function w, with properties as in Section 2.1, choosing see remark B.2 in Chambolle-Pock 58 (and theorem 2.1.8 in Nesterov 41 ). For linear elastic parameters, a more aggressive parameter choice is also possible see Lessard et al. 54 The optimal parameters (29) for Nesterov's method (28) also involve the unknown constant c I . However, in contrast to the heavy-ball scheme, methods for estimating the parameter are available, typically in the form of restarting schemes. 59,60

Conjugate gradient methods
Nonlinear conjugate gradient methods 61 for minimizing f iteratively update with d −1 = 0, k is determined by a line-search procedure and for k , a variety of formulas is in use. [62][63][64][65] Provided u 0 ∈ V, nonlinear conjugate gradient methods also profit from the strong convexity of f on V. Indeed, the iterative procedure (30) may be equivalently rewritten in the form 66 Thus, nonlinear conjugate gradient methods may be interpreted as heavy-ball methods (26) with variable step size and momentum parameter.
For linear elastic material behavior, all nonlinear conjugate gradient methods reduce to the standard linear conjugate gradient method of Hestenes-Stiefel, 62 where the line-search may be avoided as the objective function is quadratic. In the FFT-based context, using the conjugate gradient method has been pioneered by Zeman et al 11 and Brisard-Dormieux. 25 For the linear conjugate gradient method, we obtain the convergence estimate see Hackbusch's book. 67 In contrast to other solvers, the linear conjugate gradient method does not need access to the constants c I , − and + . Unfortunately, the general nonlinear conjugate gradient method (30) does not match the performance of the linear special case. Still, there is recent progress in the FFT-based setting. 12

The Newton-Kantorovich method
The Newton-Kantorovich method 68 cannot be applied to the problem ∇f (u) = 0 directly, because the Hessian ∇ 2 f is degenerate, in general, and, thus, not invertible. However, provided w is sufficiently often differentiable, we may consider the Newton-Kantorovich method for finding a critical point on the subspace Indeed, we may update iteratively, where s k ∈ (0,1] is a backtracking factor. In the inexact setting, we may rewrite the update (31) in the form where k ∈ V is an inexact solution of the linear equation For the exact Newton update (31), local quadratic and global linear convergence may be established. 69 For the inexact version (32) and (33), superlinear convergence may be established, based on appropriate forcing-term choices, see Wicht et al. 18 For solving the linear equation (33), the linear conjugate gradient method may be used.
Convergence theory may be established for Quasi-Newton methods operating on V, as well. We refer to Wicht et al 18 for a discussion of Quasi-Newton methods for FFT-based computational micromechanics.

Finite-difference and finite element discretizations
To understand the consequences of Section 2.2 for discretized versions of the gradient-descent scheme (8) and the related solution methods, we consider a triangulation  h of the unit cell Y , and an associated finite-element space  h associated with the triangulation, parameterized by a mesh parameter h, see Ciarlet. 48 We assume the finite-element discretization to be conforming, that is,  h should be a finite-dimensional subspace of H 1 # (Y ; R d ). Clearly, for FFT-based computational micromechanics, we are mostly interested in regular discretizations, that is, hexahedron-based finite elements. However, for our line of argument, this is not necessary. As in Section 2.1, we suppose that a decomposition of Y into Y h 0 and Y h 1 to be given. We assume that Y 0 and Y 1 are, up to measure zero, covered by finite elements, that is, we exclude elements that cross the interface I h = Y h 0 ≡ Y h 1 . An example of such a decomposition is shown in Figure 3A in two spatial dimensions. We chose as (regular square) elements of Y h 0 those with nontrivial intersection with Y 0 , although this is not mandatory. Furthermore, a discretized variant of the stored energy density w h ∶ Y × Sym(d) → R should be given, which vanishes on Y 0 and satisfies the bounds (1) and (2) for the same constants − and + , respectively. This discretized variant may be necessary if a finite element T ∈  h crosses the (nondiscretized) interface I. Then, the strong convexity estimate (2) no longer holds on this element, as w vanishes on a part of it. As a remedy, we assume, in that case, that we may extend w in a reasonable way to the entire element. Of course, as h → 0, the fraction of such elements crossing the interface becomes smaller and smaller, rendering the influence of the choice of extension smaller, as well.
For fixed macroscropic strain E ∈ Sym(d), we wish to minimize the function whose critical points satisfy div h h (E + ∇ s u h ) = 0 (34) in terms of the stress operator As in the infinite-dimensional case, compare Vondřejc et al 28 for some step size s > 0. Here, G h = −R h is the discrete Green's operator, defined in terms of the Riesz mapping In a completely analogous fashion as in Section 2.1, one may establish that f h is a convex function with + -Lipschitz gradient, and that for a suitable choice of the step-size, the basic scheme (35) converges logarithmically, see estimate (10), in general.
To obtain a linear convergence rate, stronger assumptions are necessary. Thus, we suppose that the geometric assumptions of Section 2.2 hold, with Y 0 and Y 1 replaced by Y h 0 and Y h 1 , respectively. Let us remark that we consider Y h i to be approximations of Y i for i = 0,1, that is, the geometric assumptions for Y h i should be a consequence of the geometric assumptions for Y i for sufficiently small mesh size h.
In analogy to the continuous case (11), we define the subspace This choice is illustrated in Figure 3B, where the nodal degrees of freedom of Q 1 -finite elements vanishing on Y h 1 are shown. Vice versa, the nodal degrees of freedom of elements of V h are unconstrained except for those nodes marked in Figure 3B.
As for the continuous case, see Appendix B1, we find a constant c h I , s.t.
Here, the key technical argument is based on the (trivial) fact that finite elements admit well-defined boundary values on the interface I h , that is, nodal values, and that the definition of V h (36) involves a discretized variant of the elastic extension (12).
It is straightforward to show that the minimization problem for f h is strongly convex on V h , and that the basic scheme (35) preserves V h . In particular, similar conclusions as in Sections 2.2 and 2.3 may be drawn, with c I replaced by c h I . We close this section with a few remarks.

It is certainly of interest under which conditions the constant c h
I is related to its continuous cousin c I . However, this goes beyond the scope of this article. 2. This section was devoted to fully integrated and conforming finite-element discretizations. However, also underintegrated variants 31,33 or those involving hourglass stabilization 34 may be covered. Attention has to be paid to the geometric assumptions. Indeed, for Willot's discretization, hourglassing instabilities may occur on Y 1 h that are not present on Y , adversely affecting the conditioning, see Section 4. 3. Finite difference 31,32 and finite-volume discretizations 70 may be treated similarly. On a regular grid, these may be considered as finite-element methods with numerical integration.

Discretizations based on trigonometric polynomials
We investigate the discretization by trigonometric polynomials introduced by Moulinec-Suquet, 2,3 and subsequently analyzed from different perspectives. 25,27,28 For simplicity of exposition, we restrict to an equal number N of subdivisions for each coordinate axes, and suppose that N is odd, see Vondřejc et al 30 for the rationale behind this choice.
We build upon the setting of Section 2.1 and define where i denotes the complex unit and the abbreviation is used for any ∈ Z d .
 N consist of vector-valued trigonometric polynomials of degree N, as all Fourier coefficients outside of the box [− N/2,N/2] d vanish. 28 For fixed macroscopic strain, we wish to minimize the function where that is, f N evaluates f (5) by the trapezoidal rule, 28 restricted to  N . The Euler-Lagrange equations for minimizing f N (38) compute as where Q N denotes the trigonometric interpolation operator of degree N w.r.t. Y N , see Schneider. 27 It is not difficult to see that f N defines a convex function with + -Lipschitz continuous gradient, see Schneider,27 where the more general case of a Lipschitz-continuous and monotone stress operator is treated. In particular, the gradient-descent scheme associated with f N converges logarithmically for the step size s = 1 + . Up to this point, the developments are very similar to Section 3.1. However, differences arise if we wish to emulate the approach taken in the continuous case of Section 2.2 and the closely related finite-element case, even if the geometric regularity assumptions are satisfied.
The invariant subspace reads where Y N 1 = Y N ∩ Y 1 , that is, those integration points which lie inside Y 1 . The space V N arises naturally in the linear elastic case, see Equation (22) in the first remark of Section 2.2. However, in contrast to the continuous case (and for finite elements), we may no longer conclude that u(x) = 0 for all x ∈ Y N 1 . Indeed, a trigonometric polynomial may oscillate wildly although the derivative is horizontal at some collocation points.
Thus, we may no longer conclude that elements of V N satisfy an elastic-extension property. For instance, it is not clear how to define suitable interface values for vector fields u ∈ V N . Another problem to take into consideration is that, in contrast to finite-element basis functions, any nontrivial element of  N has global support. This global support implies that numerical errors propagate with infinite speed during iterations. This contrasts with finite-element problems, where a numerical error introduced for a single node only affects the neighboring nodes.
At the end of this section, several remarks are in order.

In the linear elastic setting, the operator A N
defines an isomorphism. Indeed, it is a finite-dimensional linear operator with trivial kernel. However, based on the numerical experiments, the coercivity constant is supposedly very close to zero. In particular, it may not be related to the constant c I of the fundamental estimate (14) and does not lead to a linearly converging basic scheme, in general. Indeed, mathematical analysis is not limited by machine precision, whereas practical computing is. 2. In this section, we only considered the Moulinec-Suquet discretization. For the linear elastic setting, the Fourier-Galerkin discretization (with full integration) 29,30 may be considered, and is also possibly susceptible to similar problems as its underintegrated cousin. Also, due to its nonlocal support, the B-splines-based discretization 35 is expected to suffer from convergence difficulties in the porous setting, although we did not investigate this numerically.

Setup and materials
The Moulinec-Suquet discretization, 2,3 Willot's finite-difference discretization, 31 and the staggered-grid discretization 32 were integrated into an in-house FFT-based computational micromechanics code (written in Python with Cython extensions). The staggered-grid discretization 32 is a finite-difference discretization on a regular voxel grid. In contrast to Willot's finite-difference discretization, 31 it is devoid of checkerboarding (or, rather, hourglassing) artifacts, because its Green's operator may be computed by solving four Poisson equations involving the standard six-point finite-difference Laplacian. By contrast, Willot's discretization uses a completely different finite-difference star.
As the convergence criterion, we use for prescribed tolerance tol, see Schneider-Wicht-Böhlke, 23  In addition to being computationally consistent-see section 4.1.2 in Bellis-Suquet 38 for an in-depth discussion-using such a convergence criterion permits us to draw conclusions about the distance of the current iterate to the solution of the discretized problem, as in Equation (23). Indeed, if the iterative scheme under consideration converges (super)linearly to the solution, the convergence criterion (40) will also decrease linearly in a log-plot. In this way we obtain a robust indicator for the convergence rate also for problems without exact solutions.
The convergence criterion (40) quantifies how accurate the balance of linear momentum (34) is satisfied and depends on the discretization used. This discretization-scheme dependence is encoded by the proper Γ h -operator. Thus, some care has to be taken when comparing the different discretization schemes. Still, these differences should be small for h ≪ 1.
The computations were carried out on a Laptop with 16 GB RAM and a dual core processor. For the nonporous phase, we use the material parameters of quartz sand, 71,72 E = 66.9GPa and = 0.25.

A two-dimensional porous structure
We start our investigation with a simple two-dimensional structure, including 16 = 4 2 circular pores of equal radius. The microstructure with a pore area fraction of 30% was generated by the sequential linear programming based Torquato-Jiao algorithm, 73 and discretized by 256 2 pixels. The geometry is shown in Figure 5A. Subjected to 5% uniaxial extension 49 in y-direction, the residual (40) vs the iteration count for the basic scheme (8) in the strain-based formulation (9) and step size s = 1 + is shown in Figure 4A.
For all three discretization schemes considered, the residual (40) decreases monotonically. For Moulinec-Suquet's discretization, after an initially strong decrease, the convergence behavior flattens, and does not decrease below 10 −3 up to 10 000 iterations. For Willot's discretization, the residual converges rather quickly to a level of about 10 −4 and subsequently enters a region of slow, but steady decrease. A residual below 10 −5 is reached after 8237 iterations. For the staggered grid discretization, the basic scheme reaches the tolerance 10 −5 after 352 iterations.
All discretizations feature an initially strong decrease of the residual. For the staggered grid discretization, this initial decrease already suffices to pass the 10 −5 -mark. For the two other discretizations under investigation, the initial phase is followed by a more modest convergence behavior. However, it appears difficult to distinguish a truly logarithmic convergence behavior, as predicted in the general case (10), and a linear convergence rate involving a very small strong convexity constant (20). Indeed, the finite precision of the computer arithmetic may interfere, as the potential progress predicted by a linear convergence behavior may be nullified by accompanying round-off errors.
To gain deeper insight, we also investigate the (linear) conjugate gradient (CG) method, as proposed by Zeman et al 11 and Brisard-Dormieux, 24,25 applied to the identical problem. The results are shown in Figure 4B. As in case of the basic scheme, the Moulinec-Suquet discretization fails to reduce the residual below 10 −3 . In fact, it appears that the conjugate gradient method is not faster than the basic scheme. Notice that-in contrast to the basic scheme-the residual for the conjugate gradient method does not decrease monotonically, in agreement with general theory. However, the F I G U R E 4 Residual vs iteration for different discretization schemes and the two-dimensional porous structure, cf Figure 5A (A) (B) convergence rate of the conjugate gradient method is improved for strongly convex quadratics compared with vanilla gradient descent. We observe such an improvement both for Willot's discretization, converging after 152 iterations, and for the staggered grid, requiring 67 iterations. The fact that no such improvement by CG is observed for the Moulinec-Suquet discretization indicates that the constant c I (14) for the latter discretization is exceedingly small for the considered example.
We take a look at the local solution fields, more precisely the Frobenius norm of the strain field, corresponding to the three discretization schemes under investigation. Within the material, the Moulinec-Suquet discretization, cf Figure 5B, exhibits the well-known ringing artifacts, characteristic for discretizations based on trigonometric polynomials in the presence of discontinuities. For Willot's discretization, cf Figure 5C, the checkerboarding typical for underintegrated Q 1 finite elements may be observed. The solution fields for the staggered grid discretization, cf Figure 5D, is smooth away from the interface, where spikes occur. This is a consequence of using the "inconsistent" staggered grid 32 for performance reasons.
Of particular interest to us is the strain field within the pores, where it is devoid of any physical significance. Rather, as predicted by the theory of Section 2.2, it corresponds to the elastic extension (12). For all three discretizations, the strain fields within the pores appear similar if we ignore the artifacts for the moment. For the staggered grid discretization, the strain fields inside the pores are as smooth as outside of the pores. By contrast, for Moulinec-Suquet's discretization heavy oscillations are visible within the pores, reflecting the numerical instability discussed in Section 3.2. For Willot's discretization, also (less pronounced) artifacts within the pores are visible.
We wish to stress, however, that the latter observations are mostly of cosmetic nature, that is, the solution field is only investigated for completeness, and visual artifacts will not necessarily have an adverse influence on the convergence behavior of solution schemes, in general.

A three-dimensional porous structure
As our next example, we consider a three-dimensional porous structures with 30% pore-volume fraction. The geometry, cf Figure 6A, is composed of 64 = 4 3 spherical pores of identical shape and was generated by the mechanical contraction algorithm of Williams-Philippse. 74 Subsequently, the geometry was discretized by 32 3 , 64 3 , and 128 3 voxels. The porous structure was subjected to 5% uniaxial strain in x-direction. The von Mises equivalent strain is shown in Figure 6B for 128 3 voxels and the staggered grid discretization, for 5% uniaxial extension in x-direction.
First, we carried out a numerical experiment as in Section 4.2, but for the three-dimensional structure and 32 3 voxels, see Figure 7. The results are qualitatively similar. However, due to the increased complexity of the three-dimensional microstructure, the computational methods also require a higher iteration count. Indeed, the constant c I enters the linear convergence estimate (20), and we see its influence at this point. Taking a look at the residual history for the basic scheme, cf Figure 7A, we observe that, although the staggered grid discretization leads to a linearly convergent basic scheme, the required iterations (659) for convergence are much higher for the three-dimensional structure. Both the Moulinec-Suquet discretization and Willot's discretization did not converge to the desired tolerance within the prescribed 10 000 iterations. Furthermore, the residual vs iteration plot for the conjugate gradient method is shown in Figure 7B. Not surprisingly, for the staggered grid discretization, CG converges also in a linear fashion, requiring 43 iterations to reach a residual of 10 −5 . Willot's discretization leads to a linearly converging conjugate gradient scheme. For reaching a residual of 10 −5 , 1687 iterations are required. For Moulinec-Suquet's discretization, the conjugate gradient method actually improves upon the basic scheme in terms of the residual reduction. However, after 1000 CG iterations, the residual is still at 0.00099.  To sum up, we see that the used discretization method has a profound influence on the convergence behavior of the solution method. In general, the finite-difference discretizations lead to linearly converging schemes. However, for practical purposes, using the basic scheme appears to be not competitive. For the linear elastic problems considered, the conjugate gradient method profits from a lack of tedious parameter tuning, leading to a quickly converging method which closely reflects the conditioning of the problem under consideration. Thus, it may serves as an indicator for the conditioning of the linear systems to be solved.
For the staggered grid, a resolution study is shown in Figure 8A. For the basic scheme that is shown with dashed lines, the required iterations (659-627-579) decrease slightly for increasing resolution, but the convergence rates, that is, the slopes of the final linear decrease, appear to be almost identical. Thus, we see that the convergence rate is affected primarily by the geometric structure of the interface. For comparison, also the results for the conjugate gradient method are included in Figure 8A. For the conjugate gradient method, the required iteration counts also decrease (43-42-41), but the difference is minimal.
Due to its limited convergence speed, the basic scheme is not always the solution method of choice. Thus, we included a comparison of different solvers in Figure 8B for the three-dimensional porous structure, 128 3 voxels and the staggered grid discretization. All considered solvers lead to a linear rate of convergence, confirming the results of Section 2.3. The (linear) conjugate gradient (CG) method 11 is fastest, and exhibits a monotone convergence behavior. CG is only slightly trailed by the Barzilai-Borwein (BB) scheme 15 in terms of iterations to convergence. However, the convergence behavior of the Barzilai-Borwein method is nonmonotonic, which is reflected in the r-linear convergence estimate (25). Last but not least, the speed-restarted Nesterov's method 13 is included in the comparison. Based on previous experience, see section 3.2 in Schneider, 15 for porous materials, it is the fastest method among the fast gradient methods of Section 2.3.2. It requires slightly more than twice as many iterations as the CG method.

CONCLUSION
In this work, we investigated FFT-based computational micromechanics, as introduced by Moulinec-Suquet, 2,3 applied to materials with pores. We showed that, in the continuous setting and under an assumption on the distribution of the void phase, the basic solution scheme of Moulinec-Suquet, and several of its variants, converge with (super)linear rate.
The key insight was to identify a specific subspace of displacement-fluctuation fields which is preserved by the iterative schemes and on which the optimization problem is nondegenerate. However, a constant depending on the geometry of the interface enters the convergence estimate. This linear convergence behavior of the basic scheme and its derivatives contrasts with the convergence behavior of the "accelerated schemes" [19][20][21] for media involving pores. Computational experiments, see section 3.2 in Schneider, 15 suggest that the logarithmic convergence rate predicted by the Krasnoselski-Mann theorem (theorem 5.13 in Bauschke-Combettes 75 ) for the polarization schemes with nonvanishing damping 23 cannot be improved, in general, even for microstructures satisfying the assumption of Section 2.2.
As the continuous problem is well posed on this subspace and the iterative schemes of basic type preserve this subspace, the impact of the discretization scheme on the convergence behavior of the solvers becomes strikingly clear. On the other hand, for finitely contrasted media, the convergence behavior of the solvers is primarily dependent on the material contrast (and rather independent of the discretization), for porous materials, the discretization scheme plays an important role. We saw that, for finite-element discretizations, the line of argument that worked in the continuous case may be transferred in a straightforward way. In contrast, spectral discretizations, due to their ansatz functions with global support, do not preserve the argument involving the elastic extension 12. Due to their associated computational burden, fully integrated finite-element discretizations are seldomly used in FFT-based micromechanics, at least in the strain-based formulation. 33 The different underintegrated variants 31,32,34 work admirably for porous materials, but exhibit differences in performance from case to case.
In this conclusion, we wish to draw attention to the convergence criterion (40) used. In FFT-based micromechanics, different convergence criteria are in use, see Schneider et al 23 for a discussion. Different convergence criteria may create confusion, as it could happen that a fixed solution scheme converges for one criterion quickly, whereas it exhibits slow convergence for another. The convergence criterion (40) is "the" consistent one from the viewpoint of functional analysis, that is, it makes sense in the Hilbert space setting, cf see section 4.1.2 in Bellis-Suquet, 38 and, for the different discretizations and in the strongly convex setting, permits estimating the distance to the true solution of the discretized problem. Of course, other normalizations within the criterion (40) may be used.
Last but not least, let us point to questions of further research interest. It would be interesting to prove that the constants c h I in estimate (37) are bounded away from zero independently of h. Furthermore, attention may be devoted to constructing a greater variety of discretization methods compatible with the reduction to the invariant nondegenerate subspace, for instance in terms of nonconforming finite elements.

SUPPORTING INFORMATION
Additional supporting information may be found online in the Supporting Information section at the end of this article.

APPENDIX A. CONVERGENCE RESULTS FOR GRADIENT DESCENT AND CONVEX PROBLEMS
In this section, we collect results from convex analysis in a form that serves our purposes. On a Hilbert space X, let f ∶ X → R be a continuously (Fréchet-)differentiable function which we assume convex and bounded from below. Provided the gradient is L-Lipschitz continuous, that is, ||∇f (x) − ∇f (y)|| ≤ L ||x − y|| holds for all x, y ∈ X, for any starting value x 0 ∈ X, the gradient method see Corollary 2.1.2 in Nesterov. 41 Here, f * is the infimum of f on X (which exists as f is bounded from below), and x * is a minimizer of f on X, that is, f * = f (x * ). In particular, the function values at x k converge to the minimum value only logarithmically. Suppose that f is even -strongly convex for some > 0, that is, ⟨∇f (x) − ∇f (y), x − y⟩ ≥ ||x − y|| 2 holds for all x, y ∈ X.
Then, there is a unique minimizer x * of f on X and the gradient method (A2) with step-size s ∈ ( 0, 2 +L ] satisfies the estimate see Theorem 2.1.15 in Nesterov. 41 In contrast to the general case (A3), we obtain a convergence estimate for the iterates and not only for the objective values. Furthermore, we obtain a linear rate of convergence. Notice that the interval of admissible step sizes, that is, those step sizes for which the scheme is guaranteed to be stable, is strictly larger than for the general convex case. The best convergence rate is reached for s = 2 +L , leading to However, for this rate, we need to know both and L explicitly. Suppose, for the time being, that we know L, but do not have access to (except that we know that there must be a positive ). Then, the best theoretical bet is to use s = 1 L , which leads, in view of the convergence estimate (A5) to the estimate Last but not least, the second geometric assumption directly implies Korn's inequality on Y 1 , see Fichera, 45 that is, there is a constant C 2 , s.t. for any w ∈ H 1 (Y 1 ; R d ) with mean zero we have ||w|| H 1 (Y 1 ;R d ) ≤ C 2 ||∇ s w|| L 2 (Y 1 ;Sym(d)) . (B6) With these preliminaries at hand, we may deduce the estimate (B1). First, note that any element u ∈ V, cf (B2), satisfies the Dirichlet problem (B3) on Y 0 . Modify u to have zero mean on Y 1 , and denote it byũ. Clearly,ũ also satisfies the Dirichlet problem (B3) on Y 0 . Thus, we have, by the elliptic estimate (B4), .
Applying the trace estimate (B5) further yields By Korn's inequality (B6), we further have where we used ||M + M T || ≤ 2||M|| for any M ∈ R d×d in the Frobenius norm and that u andũ differ only by a constant. Thus, we may conclude that is, the estimate (B1) holds for c I = 1/(1 + (C 0 C 1 C 2 ) 2 ).