Towards robust, fast solutions of elliptic equations on complex domains through HHO discretizations and non-nested multigrid methods

The use of modern discretization technologies such as Hybrid High-Order (HHO) methods, coupled with appropriate linear solvers, allows for the robust and fast solution of Partial Differential Equations (PDEs). Although efficient linear solvers have recently been made available for simpler cases, complex geometries remain a challenge for large-scale problems. To address this problem, we propose in this work a geometric multigrid algorithm for unstructured non-nested meshes. The non-nestedness is handled in the prolongation operator through the use of the L2-orthogonal projection from the coarse elements onto the fine ones. However, as the exact evaluation of this projection can be computationally expensive, we develop a cheaper approximate implementation that globally preserves the approximation properties of the L2-orthogonal projection. Additionally, as the multigrid method requires not only the coarsening of the elements, but also that of the faces, we leverage the geometric flexibility of polytopal elements to define an abstract non-nested coarsening strategy based on element agglomeration and face collapsing. Finally, the multigrid method is tested on homogeneous and heterogeneous diffusion problems in two and three space dimensions. The solver exhibits near-perfect asymptotic optimality for moderate degrees of approximation.


INTRODUCTION
We address in this work the numerical solution of elliptic equations on complex domains. Such equations are encountered, e.g., in the modelling of Darcean flows in porous media, and constitute a simplified model of elasticity in the small deformation regime and of viscous terms in fluid mechanics. Hybrid discretization methods have gained growing interest for this kind of problem in recent years thanks to a number of favorable features, including the conservation of numerical fluxes across interfaces and the possibility to reduce the number of unknowns by hybridization. Modern hybrid methods include, in particular, Hybridizable Discontinuous Galerkin 1,2, Hybrid High-Order 3,4,5 (HHO) methods, and Nonconforming Virtual Elements 6. These methods hinge on degrees of freedom (DoFs) located in elements and on faces, which can be globally viewed as broken polynomials on the mesh and on its skeleton, respectively. The element-defined DoFs being only locally coupled, they can be expressed, element by element, as functions of the DoFs on the faces, and subsequently eliminated from the global linear system. This gives rise to a Schur complement of smaller size where only face unknowns remain. This process is known as static condensation in the mechanical literature, and the resulting system as a statically condensed system, or trace system, in reference to the mesh skeleton as the support of the set of globally coupled unknowns. The solution of the trace system, yielding the face unknowns, remains the costliest operation, after which the values of the element unknowns can be inexpensively recovered by solving small, independent linear systems.
As a consequence, the practical usefulness of hybrid discretizations in an industrial context depends on the existence of efficient linear solvers for the condensed system. Multigrid methods constitute an appealing option, which has been widely explored in the context of fully nonconforming (discontinuous Galerkin) discretization methods; see, e.g., 7 and references therein. Domain decomposition methods also deserve to be mentioned 8 in this context. The main difficulty in the design of a geometric multigrid algorithm for a trace system resides in the location of the DoFs, which makes intergrid transfer operators designed for element-defined functions unsuitable. A literature review of trace system solvers shows the variety of paths one can follow to tackle this particular setting. The first such multigrid algorithm 9 converts the face-defined polynomials to element-defined ones via the adjoint of the trace operator in order to make use of a known standard geometric multigrid. A variant 10 using an algebraic multigrid instead of a geometric one has also been experimented with. Later, a different approach 11 was considered through the design of an h-multigrid algorithm for unstructured meshes based, for the first time, on trace functions at every level, where the intergrid operators depend on Dirichlet-to-Neumann maps to preserve energy between levels. Finally, also conserving face-defined polynomials at every level, the latest multigrid algorithm, 12 developed by the present authors and which this work follows up on, is based on a prolongation operator that internally reverses the static condensation to recover coarse element-defined polynomials, before computing their traces on the fine faces. Prior to detailing this solver, we also point out the efforts made to design multigrid preconditioners for non-elliptic equations, 13,14,15 as well as other techniques such as domain decomposition 16,17 and nested dissection. 18
In this paper, we focus on HHO discretizations, which support general polytopal meshes and arbitrary degrees of approximation. The defining feature of HHO methods is the use of a higher-order potential reconstruction, which allows the gain of up to one additional order of approximation. 19 The starting point of the present work is the nested multigrid solver developed in our previous work, 12 which takes advantage of this higher-order potential reconstruction operator to improve the convergence rate. Nonetheless, the general methodology developed in this work applies, in principle, to other hybrid methods, such as the ones mentioned in the first paragraph of this introduction.
When directly used as a solver, that multigrid method 12 exhibits h-independent convergence rates on structured 2D and 3D meshes, as well as robustness with respect to the polynomial degree. Working with a hierarchy of nested meshes obtained from successive refinements of an initial coarse mesh has, however, some drawbacks. The first one is the loss of the flexibility offered by unstructured meshes to accurately approximate complex regions, as the initial coarse mesh must already be a good approximation of the geometry. Starting from an initial coarse grid can still offer possibilities of very efficient multigrid implementations, such as Hierarchical Hybrid Grids, 20,21 but this requires that the coarsest grid be fine enough to capture the geometry. Recent techniques 22 based on polynomial mappings of the elements make it possible to approximate curved boundaries upon each step of refinement. The second drawback of the nested approach is linked to the fact that, in 3D, most methods of tetrahedral subdivision generate elements of degraded shape quality, 23,24,14,25 which can affect the accuracy of the approximate solution as well as the performance of the linear solvers. Note that Bey's refinement method 26 mitigates the problem of asymptotic mesh deterioration. As the nested multigrid method 12 indeed shows high sensitivity to the mesh regularity, poor convergence may occur on unstructured meshes that are obtained by tetrahedral refinement. Exploring non-nested grids as an alternative to hierarchies produced by mesh refinement is the purpose of the present work. Non-nested multigrid methods have also been investigated, 27,28,29,30,31 although none of them seems applicable to trace systems issued from hybrid discretizations.
Compliance with non-nested settings is achieved through the orthogonal projection, in L2-norm, of a coarse broken polynomial onto the non-nested fine mesh (as already performed in the context of DG discretizations 7). The numerical evaluation of this operator hinges on the prerequisite of computing the geometric intersections between coarse and fine elements. This preliminary step can prove computationally prohibitive. So, instead of this exact computation, we propose the implementation of an approximate operator that does not require intersections. We also refer to a previous work 32 that tackles the practical computation of L2-orthogonal projections as well.
Our non-nested multigrid method is validated by numerical tests using independent retriangulations of the domain at every level. However, in practice, building a high-quality mesh for a real, industrial case study can already be an arduous task, which may occupy a meshing engineer for several months. Requiring multiple high-quality meshes of the same geometry at different granularities in order to feed a multigrid solver is then not always conceivable. From the user's standpoint, providing the solver with the sole fine mesh is a preferable option. To this end, we also want to pave the way for the construction of suitable coarsening strategies for this type of multigrid method. We will provide guidelines by designing a full example of such a strategy implementing the relevant features. In particular, in a face-defined multigrid method, as the smoother operates on the mesh skeleton, the efficient reduction of the low-frequency components of the error relies on accessing coarse representations of the face-defined polynomials. This implies that faces must be coarsened between levels, 12, §4.4.3 which is a new constraint imposed on any suited coarsening strategy. Unfortunately, usual agglomeration methods, 33 typical candidates for the coarsening of unstructured meshes, do not meet this requirement. More generally, methods that conserve embedded meshes do so by conserving fine faces on the coarse mesh, which goes against this new constraint. This observation points to non-nested methods as suitable alternatives. As such, strategies which build coarse tetrahedra from the fine tetrahedra's vertices 34 seem well-adapted to our problem, as long as the shape quality of the coarse elements is sufficient. This condition drives us towards polytopal coarse elements in order to make use of their shape flexibility to construct high-quality coarse meshes. Methods based on Voronoi diagrams 35,36 fulfill that purpose, as well as our face coarsening constraint. Nonetheless, we choose to propose another option based on element agglomeration. Indeed, in addition to being easy to understand and implement, these methods also benefit from simple extensions to anisotropic grids. 37 Thus, we suggest adding to any chosen agglomeration process a step of face collapsing, already used in different fashions. 38,39 Moreover, we will show that agglomeration-based methods ease the construction of our approximate L2-orthogonal projection.
Having established the reasons driving us to non-nested settings, this work is organized around the adaptation of the existing nested solver 12 and its efficient implementation for the purpose of its practical use for the solution of condensed linear systems arising from the HHO discretization of scalar elliptic equations on complex geometries. Thus, we begin by describing in Section 2 the HHO discretization of the diffusion equation as model problem. In Section 3, we recall the ingredients in the construction of the nested multigrid method, 12 and extend its range of application to non-nested meshes through the use of the L2-orthogonal projection. Insofar as the newly used L2-orthogonal projection depends, in its exact evaluation, on the expensive computation of the intersections between coarse and fine elements, we devote Section 3.3 to the definition of a cheaper approximate operator. We also propose in Section 4 a suited non-nested and polytopal coarsening strategy based on element agglomeration and face collapsing. Finally, numerical tests are presented in Section 5, which demonstrate the optimality of our multigrid method, used as a solver, on 2D and 3D diffusion problems.

HHO Discretization
This section briefly recalls the standard HHO discretization 5, Section 3.1 of problem (2.2).

Mesh definition
We consider a mesh (T_h, F_h) of Ω in the sense of 5, Definition 1.4, where T_h denotes the set of polyhedral elements, F_h denotes the set of faces, and h := max_{T ∈ T_h} h_T is the meshsize, h_T denoting the diameter of T. Meshes of practical relevance included in this definition correspond to decompositions of the domain into polyhedra not necessarily convex or star-shaped, and possibly including hanging nodes. In the context of problem (2.2), we further assume that the diffusion tensor K is constant over every element T ∈ T_h, and we denote K_T := K|_T for all T ∈ T_h. The faces that lie on the boundary of Ω are collected in F_h^B and are referred to as boundary faces. Then, F_h may be partitioned as F_h = F_h^I ∪ F_h^B, where F_h^I denotes the set of internal faces. For T ∈ T_h, F_T denotes the set of faces that lie on the boundary of T. Reciprocally, for any face F ∈ F_h, T_F := {T ∈ T_h | F ∈ F_T} denotes the set of elements that have F as one of their faces.

Discrete setting
With l ≥ 0 a given integer, we denote by P^l(X) the space spanned by the restriction to X ⊂ Ω of d-variate polynomials of total degree ≤ l (in short, of degree l). If X is a mesh face F ∈ F_h, then P^l(F) is isomorphic to the space of (d−1)-variate polynomials of total degree ≤ l. Denote by k ≥ 0 the polynomial degree of the skeletal unknowns in the HHO scheme. We introduce the following broken polynomial spaces, respectively supported by the mesh and its skeleton:

P^k(T_h) := {v ∈ L2(Ω) : v|_T ∈ P^k(T) for all T ∈ T_h},  P^k(F_h) := {v : v|_F ∈ P^k(F) for all F ∈ F_h},

from which we build the global space of hybrid variables

U_h^k := P^k(T_h) × P^k(F_h).

For all T ∈ T_h, its local counterpart is

U_T^k := P^k(T) × (×_{F ∈ F_T} P^k(F)). (2.3)

The homogeneous Dirichlet boundary condition is strongly accounted for in the following subspaces:

U_{h,0}^k := {(v_{T_h}, v_{F_h}) ∈ U_h^k : v_F = 0 for all F ∈ F_h^B},  P_0^k(F_h) := {v ∈ P^k(F_h) : v|_F = 0 for all F ∈ F_h^B}.

Given u the exact solution of (2.2), the element and face unknowns of the HHO method (2.6) can be interpreted as the local L2-orthogonal projections of u onto the respective mesh elements and faces (refer to 5, Eq. (2.34)). For all T ∈ T_h, the local potential reconstruction p_T^{k+1} : U_T^k → P^{k+1}(T) is defined such that, for all v_T = (v_T, (v_F)_{F ∈ F_T}) ∈ U_T^k, p_T^{k+1} v_T satisfies

(K_T ∇p_T^{k+1} v_T, ∇w)_T = −(v_T, ∇⋅(K_T ∇w))_T + Σ_{F ∈ F_T} (v_F, K_T ∇w ⋅ n_TF)_F for all w ∈ P^{k+1}(T), (2.4a)
∫_T p_T^{k+1} v_T = ∫_T v_T, (2.4b)

where n_TF denotes the unit vector normal to F pointing out of T. Given the local interpolate I_T^k v ∈ U_T^k of a function v ∈ H1(Ω), p_T^{k+1} I_T^k v reconstructs a higher-order approximation of v. In fact, it is shown 5, Section 3.1.2 that p_T^{k+1} I_T^k is the local elliptic projection onto P^{k+1}(T).
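Since the HHO unknowns are local L2-orthogonal projections, the basic building block is, in a basis, a small mass-matrix solve per element or face. A minimal 1D sketch (the function name, the monomial basis, and the quadrature order are our illustrative choices, not the paper's):

```python
import numpy as np

def l2_project(f, k, a=0.0, b=1.0, nq=20):
    """Coefficients of the L2-orthogonal projection of f onto P^k([a, b]),
    in the monomial basis: solve the normal equations M c = m, where
    M_ij = (x^i, x^j) is the mass matrix and m_i = (f, x^i) the moments."""
    x, w = np.polynomial.legendre.leggauss(nq)   # Gauss nodes on [-1, 1]
    x = 0.5 * (b - a) * x + 0.5 * (b + a)        # map nodes to [a, b]
    w = 0.5 * (b - a) * w
    V = np.vander(x, k + 1, increasing=True)     # basis values at the nodes
    M = V.T @ (w[:, None] * V)                   # mass matrix
    m = V.T @ (w * f(x))                         # moment vector
    return np.linalg.solve(M, m)

# Projecting a member of P^2 onto P^2 reproduces it exactly.
c = l2_project(lambda x: 1 + 2 * x + 3 * x**2, k=2)
```

On a mesh, the same solve is repeated independently on every element and face, with a polynomial basis of the appropriate dimension.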

Discretization of the model problem
The global bilinear form a_h : U_h^k × U_h^k → ℝ is assembled from elementary contributions as follows:

a_h(u_h, v_h) := Σ_{T ∈ T_h} a_T(u_T, v_T),

where, for all T ∈ T_h, the local bilinear form a_T : U_T^k × U_T^k → ℝ is defined as

a_T(u_T, v_T) := (K_T ∇p_T^{k+1} u_T, ∇p_T^{k+1} v_T)_T + s_T(u_T, v_T). (2.5)

In this expression, the first term is responsible for consistency while the second, involving the stabilization bilinear form s_T : U_T^k × U_T^k → ℝ, is required to ensure stability of the scheme. The stabilization bilinear form must follow specific design conditions 5, Assumption 3.9 implying, in particular, that it must depend on its arguments only through the difference operators δ_T^k : U_T^k → P^k(T) and δ_TF^k : U_T^k → P^k(F) for all F ∈ F_T, respectively defined such that, for all v_T ∈ U_T^k,

δ_T^k v_T := π_T^k (p_T^{k+1} v_T − v_T) and δ_TF^k v_T := π_F^k (p_T^{k+1} v_T − v_F) for all F ∈ F_T.

These operators capture the higher-order correction that the reconstruction p_T^{k+1} adds to the element and face unknowns, respectively. A classical expression for s_T is the following:

s_T(u_T, v_T) := Σ_{F ∈ F_T} (K_TF / h_F) (δ_TF^k u_T − δ_T^k u_T, δ_TF^k v_T − δ_T^k v_T)_F,

where K_TF := K_T n_TF ⋅ n_TF for all F ∈ F_T. The global discrete problem then reads: find u_h ∈ U_{h,0}^k such that

a_h(u_h, v_h) = (f, v_{T_h}) for all v_h ∈ U_{h,0}^k. (2.6)

Remark 1. (Links with Hybridizable Discontinuous Galerkin methods) In the present formulation, element and face unknowns represent local polynomials of equal degree k. A variant consists in taking element unknowns of degree k + 1 instead of k, in which case the method is linked to a special formulation of Hybridizable Discontinuous Galerkin methods. 40,41 As shown in 19, this formulation admits a reduced stabilization enabling improved convergence properties comparable to those of HHO methods. For a broad discussion on the links and differences between HHO and Hybridizable Discontinuous Galerkin methods, see 5, §5.1.6.

Assembly and static condensation
After fixing bases for the spaces P^k(T), P^{k+1}(T) and P^k(F) for all (T, F) ∈ T_h × F_h, the local algebraic contributions of the bilinear form a_T (cf. (2.5)) and of the linear form U_T^k ∋ v_T ↦ (f, v_T)_T ∈ ℝ are, respectively, the matrix A_T and the vector B_T such that

A_T = [ A_TT  A_TF ; A_FT  A_FF ],   B_T = [ B_T ; 0 ],

in which the unknowns have been numbered so that element unknowns come first and face unknowns come last; see 5, Appendix B for further details. After assembling the local contributions and eliminating the boundary unknowns by a strong enforcement of the Dirichlet boundary condition, we end up with a global linear system of the form

[ A_{T_h T_h}  A_{T_h F_h} ; A_{F_h T_h}  A_{F_h F_h} ] [ U_{T_h} ; U_{F_h} ] = [ B_{T_h} ; 0 ]. (2.8)

Since element DoFs are coupled with each other only through face DoFs, A_{T_h T_h} is block-diagonal, therefore inexpensive to invert. The static condensation process takes advantage of this property to locally eliminate the element DoFs: it goes by expressing U_{T_h} in the first equation of (2.8):

U_{T_h} = A_{T_h T_h}^{-1} (B_{T_h} − A_{T_h F_h} U_{F_h}), (2.9)

and then replacing U_{T_h} with its expression (2.9) in the second equation:

(A_{F_h F_h} − A_{F_h T_h} A_{T_h T_h}^{-1} A_{T_h F_h}) U_{F_h} = −A_{F_h T_h} A_{T_h T_h}^{-1} B_{T_h}, (2.10)

thus yielding the so-called statically condensed system. After solving (2.10) for the face unknowns U_{F_h}, the element unknowns are locally recovered by the local counterpart of (2.9):

U_T = A_TT^{-1} (B_T − A_TF U_{F_T}), (2.11)

where U_T denotes the restriction to T of U_{T_h}, and U_{F_T} the restriction of U_{F_h} to the faces of T interior to the domain, completed by zeros for boundary faces.
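The elimination above is a plain Schur complement. A dense numpy sketch of (2.9)-(2.11), ignoring the block-diagonal structure that makes the real inversion cheap (function names are ours):

```python
import numpy as np

def condense(A_TT, A_TF, A_FT, A_FF, b_T):
    """Statically condense the element unknowns out of the 2x2 block system
    [[A_TT, A_TF], [A_FT, A_FF]] [u_T; u_F] = [b_T; 0].
    Returns the Schur complement S and right-hand side g of the trace
    system S u_F = g, as in (2.10)."""
    X = np.linalg.solve(A_TT, A_TF)   # A_TT^{-1} A_TF
    y = np.linalg.solve(A_TT, b_T)    # A_TT^{-1} b_T
    S = A_FF - A_FT @ X               # Schur complement
    g = -A_FT @ y
    return S, g

def recover_elements(A_TT, A_TF, b_T, u_F):
    """Recover the element unknowns from the face unknowns, as in (2.9)."""
    return np.linalg.solve(A_TT, b_T - A_TF @ u_F)
```

Solving the condensed system and then calling `recover_elements` yields exactly the solution of the full block system; in the HHO setting, the recovery is performed element by element with the small local blocks of (2.11).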

MULTIGRID METHOD ON NON-NESTED MESHES
The multigrid method presented in this section aims at solving the statically condensed system (2.10). It extends the nested algorithm of our previous work 12 to non-nested mesh hierarchies. The core of the algorithm lies in the definition of the prolongation operator, which transfers broken polynomials lying on the coarse skeleton to broken polynomials on the fine one. Compliance with non-nested settings is achieved by inserting an additional step in the definition of the prolongation operator: starting with polynomials lying on the coarse faces, the nested version 12 begins with the reconstruction of a broken element-defined polynomial on the coarse mesh. This step is unchanged. We then propose to orthogonally project in L2-norm this coarse broken polynomial onto the non-nested fine mesh. Finally, the end of the process also follows the nested version: the trace of the result is computed on the fine faces.
Consistently with the multigrid literature, we use the subscript ℓ ∈ {1, …, L} to index the levels in the mesh hierarchy, and we denote by h_ℓ the meshsize at level ℓ. We sort the levels so that L refers to the finest mesh, whereas 1 corresponds to the coarsest one, i.e. h_L < h_{L−1} < ⋯ < h_1. Importantly, we further assume that, from a level ℓ > 1 to the immediately coarser one (ℓ − 1), not only are the elements coarsened, but so are the faces. Although this assumption seems natural, it discards mesh hierarchies obtained from a fine mesh by agglomerating elements and keeping fine faces at all levels. To ease the notation, the subscripts h_ℓ may be replaced with ℓ, so that the hierarchy of non-nested polyhedral meshes may be denoted by (T_ℓ, F_ℓ)_{ℓ=L,…,1}. Now, considering two successive levels ℓ > 1 (fine) and ℓ − 1 (coarse), we define the prolongation operator P_ℓ : U_{ℓ−1,0} → U_{ℓ,0} as the composition of three operators: the coarse level potential reconstruction operator Θ_{ℓ−1} reconstructs, from the face unknowns, a broken polynomial of degree k + 1 on the coarse elements; the second operator is now taken equal to the L2-orthogonal projection onto the fine mesh to include the non-nested case; finally, the trace operator Π_ℓ : P^{k+1}(T_ℓ) → U_{ℓ,0} maps the polynomials of degree k + 1 defined on the fine elements to a broken polynomial of degree k on the fine skeleton.

The potential reconstruction operator Θ
Given a trace error function ε_ℓ ∈ U_{ℓ,0} as the operand of Θ_ℓ, we recover an element-defined error function in P^{k+1}(T_ℓ) by locally reversing the static condensation process. For all T ∈ T_ℓ, let us denote by ε_{F_T} := (ε_F)_{F ∈ F_T} the restriction of ε_ℓ to the faces of T, and by E_{F_T} its algebraic representation as vectors of coefficients in the selected polynomial bases. With these notations, we define the algebraic volumic error E_T via the linearized version of (2.11):

E_T := −A_TT^{-1} A_TF E_{F_T}. (3.2)

Note that the removal of the affine term of (2.11) is justified by ε_ℓ representing error functions. 12, §3.2.1 Once the element-defined error e_T ∈ P^k(T) is retrieved via its algebraic representation from (3.2), its accuracy can be improved by one order of approximation through the use of the higher-order reconstruction operator p_T^{k+1} defined by (2.4). The local contribution to Θ_ℓ is finally given by

Θ_ℓ(ε_ℓ)|_T := p_T^{k+1}(e_T, ε_{F_T}).

Note that, although this step of higher-order reconstruction is optional for the convergence and algorithmic scalability of the multigrid method, it improves the convergence rate. 12, §4.4.1

The trace operator Π
The trace operator is also locally defined. Let w ∈ P^{k+1}(T_ℓ) be an element-defined broken polynomial, and F ∈ F_ℓ. If F is a boundary face, then (Π_ℓ w)|_F is set to zero; else, (Π_ℓ w)|_F is built as the weighted average of the traces of w on both sides of F, which is subsequently orthogonally projected onto P^k(F) in order to lower the degree.

Approximation of the L2-orthogonal projection
In the second step of the prolongation, the global broken polynomial, reconstructed on the coarse mesh from the DoFs on the coarse faces, is projected onto the fine mesh via the L2-orthogonal projection operator mapping P^{k+1}(T_{ℓ−1}) onto P^{k+1}(T_ℓ). The construction of this projection requires the computation of the L2-inner products of all pairs of coarse/fine basis functions over the fine elements. Locally, one such integral can only be non-zero if the supports of the fine and coarse basis functions intersect, i.e. if the respective fine and coarse elements overlap. In this case, the non-zero part of the integral is restricted to the intersection of the fine and coarse elements.
For any element K, we denote by B_K the selected basis for P^{k+1}(K). For all T_c ∈ T_{ℓ−1} and T_f ∈ T_ℓ, we then have, for all φ_c ∈ B_{T_c} and φ_f ∈ B_{T_f},

(φ_c, φ_f)_{T_f} = ∫_{T_c ∩ T_f} φ_c φ_f.

Consequently, the exact construction of the projection operator requires, as a preliminary step, the computation of the geometric intersection of every pair of overlapping fine/coarse elements. We see two drawbacks to this method: firstly, the computation of the intersections adds a costly load to the computational burden. Although the actual overhead depends on the efficiency of the algorithm and its implementation, our numerical tests based on the state-of-the-art geometric library CGAL 42 find it too heavy for practical use. Secondly, the intersection of usual element shapes, even plain simplices, generally yields a polytope. In practice, if only simplicial meshes are used, one might want to avoid introducing other shapes, over which integral computation may be more expensive.
Given these practical limitations, we introduce a mechanism to compute the projection operator in a cheaper and simpler way, though approximately, by avoiding computing element intersections altogether. It is based on the following approximation (exact when the meshes are nested): for any pair of coarse and fine elements (T_c, T_f),

T_c ∩ T_f ≈ T_f if T_c is the coarse element which "embeds T_f the most", and T_c ∩ T_f ≈ ∅ otherwise. (3.7)

According to the element shapes, determining which coarse element embeds the largest part of a given fine one may have multiple interpretations, which may also be implemented in different fashions. For simplicial elements, it suffices to choose the coarse element containing the barycenter of the fine one. Following this model, each integral over an intersection is either discarded or computed over the whole fine element, in which case the support of the coarse basis function is implicitly extended to include the entire fine element. The number of integrals to compute is then reduced to the number of fine elements.
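For simplicial meshes, the barycenter rule just described can be sketched as follows (a naive search over all coarse elements; the helper names are ours, and a practical implementation would use a spatial search structure instead of the double loop):

```python
import numpy as np

def point_in_triangle(p, a, b, c, eps=1e-12):
    """Sign-based inclusion test for a 2D point in triangle (a, b, c)."""
    def cross(u, v, w):
        return (v[0] - u[0]) * (w[1] - u[1]) - (v[1] - u[1]) * (w[0] - u[0])
    d1, d2, d3 = cross(a, b, p), cross(b, c, p), cross(c, a, p)
    has_neg = min(d1, d2, d3) < -eps
    has_pos = max(d1, d2, d3) > eps
    return not (has_neg and has_pos)   # inside if all signs agree

def assign_fine_to_coarse(fine_tris, fine_pts, coarse_tris, coarse_pts):
    """Approximate-intersection rule (3.7) for simplicial meshes: each fine
    triangle is attributed wholesale to the coarse triangle containing its
    barycenter. Returns one coarse index per fine triangle."""
    owner = np.full(len(fine_tris), -1)
    for i, g in enumerate(fine_pts[fine_tris].mean(axis=1)):
        for j, t in enumerate(coarse_tris):
            a, b, c = coarse_pts[t]
            if point_in_triangle(g, a, b, c):
                owner[i] = j
                break
    return owner
```

Each integral of a coarse/fine basis-function pair is then either skipped (`owner` differs) or computed over the whole fine triangle.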
The validity of this method hinges on the assumption that, if the solution is smooth enough and if the fine elements do not stick too much out of the coarse element they are associated to, the global approximation properties are preserved. Figure 1a shows on a triangulated 2D example how the fine elements are distributed among the coarse ones and how much they can stick out. Graphically, the more the colored clusters of fine elements resemble their respective coarse elements, the less the coarse basis functions must be extended outside of their supports, and therefore the better the approximation of the integrals. In particular, the higher the polynomial degree, the greater the negative impact of the support extension: outside the element, a high-degree polynomial may be an especially poor approximation of the solution. For the same reason, local smoothness of the solution is required, which emphasizes the restriction not to have elements crossing the boundaries of physical subdomains, i.e., where coefficient discontinuities can occur.
Our numerical tests show that in 2D, on the triangulated square, the approximate operator constructed by this method exhibits good enough accuracy to reproduce the results obtained with the exact computation only for k ≤ 1, and fails to be a good approximation for higher orders. In 3D, the method is unsuccessful for all orders. The method is improved in the following way: instead of approximating intersections by an all-or-nothing result (either the whole fine element or the empty set) as in (3.7), we can increase the granularity by subdividing the fine elements, and distribute the subshapes among the coarse elements in the same way as before. Assuming, for any fine element T ∈ T_ℓ, a given subdivision Sub(T), and denoting by closest(⋅) the function associating to any element or subelement its "closest" coarse element, we then define the new approximate intersection as

T_c ∩ T ≈ ⋃ {S ∈ Sub(T) : closest(S) = T_c}. (3.8)

Figure 1b shows how sub-triangles obtained by connecting the edge midpoints of the fine elements now approach the coarse ones.
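For triangles, the single edge-midpoint subdivision used in rule (3.8) can be sketched as (an illustrative helper of our own):

```python
import numpy as np

def subdivide_triangle(a, b, c):
    """Split a triangle into four congruent sub-triangles by connecting the
    edge midpoints; each sub-triangle can then be attributed to its own
    closest coarse element, refining rule (3.7) into rule (3.8)."""
    a, b, c = map(np.asarray, (a, b, c))
    mab, mbc, mca = (a + b) / 2, (b + c) / 2, (c + a) / 2
    return [(a, mab, mca), (mab, b, mbc), (mca, mbc, c), (mab, mbc, mca)]
```

Applying the function recursively to its output yields the finer subdivisions mentioned below, at the cost of four times as many integrals per extra level.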
It is clear that the quality of the approximation depends on the granularity of the subdivision. A fortiori, the higher the polynomial degree, the finer the subdivision must be. For that purpose, the fine elements can be refined multiple times, though at the cost of an increasing number of integrals to compute. However, the approximate operator derived from subdividing the fine elements only once is found to be sufficient to achieve good multigrid results in 3D and for low polynomial orders. We refer to Section 5.2 for the details of the numerical tests.
Remark 2. (Heterogeneous case with curved region boundary) If the geometry is partitioned into multiple physical regions with curved boundaries, the different levels of grid may approximate these boundaries in a non-nested way. Then, imposing that jumps in the coefficient do not occur inside elements inevitably leads to coarse elements overlapping fine ones of different regions. Figure 2 illustrates a two-region domain separated by a circular interface. The approximation of the circle, and therefore the discrete delimitation of the regions, depends on the granularity of the mesh. Figures 2a and 2b, respectively, show a fine and a coarse mesh obtained by Delaunay triangulation, while Figure 2c shows how coarse elements belonging to the red region overlap fine triangles of the blue region. These cases must be handled with great care. In such a configuration, it is crucial that values of DoFs belonging to one region never be transferred to another one. The L2-orthogonal projection must then preserve this separation. Indeed, the solution cannot be locally represented with accuracy unless it is kept smooth inside elements. In other words, the intersection between a coarse element and a fine one that belong to two different regions must never be computed, and must be assimilated to the empty set, even though, in fact, they geometrically overlap. Infringing this guideline results in a quick deterioration of the multigrid convergence, and divergence can even occur for small-size problems in the presence of a large jump in the coefficient.
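Operationally, the separation rule of Remark 2 amounts to filtering the candidate coarse/fine pairs by region label before any intersection is approximated. A minimal sketch (the data layout, with one region label per element, is our own assumption):

```python
def same_region_only(pairs, region_fine, region_coarse):
    """Discard coarse/fine element pairs straddling a physical region
    boundary: their approximate intersection is taken to be empty, so no
    DoF value is transferred across the coefficient jump.
    `pairs` holds (fine, coarse) index tuples; the region arguments map
    element indices to region labels."""
    return [(f, c) for (f, c) in pairs
            if region_fine[f] == region_coarse[c]]
```

The surviving pairs are then fed to the barycenter/subdivision rules (3.7)-(3.8) unchanged.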

Multigrid components
Having defined the prolongation operator, the restriction is set to its adjoint in the usual fashion, i.e. with respect to the coarse and fine face-defined global inner products. Algebraically, the matrix of one transfer operator in the chosen polynomial basis is then the transpose of the other: given P the matrix representation of the prolongation operator, the matrix representation of the restriction operator is given by R := P⊤. Coarse grid operators come from the rediscretization of the problem on the coarse meshes. In order to relax together all the unknowns related to the same local polynomial, block versions of standard fixed-point iterations (like damped Jacobi or SOR) should be used, with a block size corresponding to the number of DoFs per face. For instance, for a 3D problem with the polynomial degree k = 1 chosen for the face-defined polynomials, the blocks will be of size 3 × 3 (whatever the element types used). For our numerical tests, the relaxation method is set to the block Gauss-Seidel iteration. In order to keep the computational cost as low as possible, we use the post-smoothing-only cycles that were found to be the most efficient 12, §4.2.2 in terms of total theoretical work or CPU time. In particular, we use V(0,3) in 2D and V(0,6) in 3D. Finally, on the coarsest level, the system is exactly solved by a direct method.
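A dense numpy sketch of one such block smoother, assuming contiguous per-face blocks and forward sweeps (the paper does not specify its implementation at this level of detail):

```python
import numpy as np

def block_gauss_seidel(A, b, x, block, sweeps=1):
    """Forward block Gauss-Seidel sweeps on A x = b, relaxing together the
    `block` unknowns attached to each face (e.g. block=3 for k=1 in 3D).
    Updates x in place and returns it."""
    n = A.shape[0]
    for _ in range(sweeps):
        for i0 in range(0, n, block):
            sl = slice(i0, i0 + block)
            # residual of block i with its own contribution removed:
            # r = b_i - sum_{j != i} A_ij x_j
            r = b[sl] - A[sl, :] @ x + A[sl, sl] @ x[sl]
            x[sl] = np.linalg.solve(A[sl, sl], r)
    return x
```

In a real trace system the matrix is sparse and the small diagonal blocks are factorized once per level; the update pattern is otherwise identical.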

AGGLOMERATION-BASED COARSENING STRATEGY WITH FACE COLLAPSING
In this section, we lay the foundations of an abstract coarsening strategy for polytopal meshes that also coarsens faces. The steps are described by Algorithm 1. We recall that in this abstract setting, the generic term 'face' refers to the interface between elements (it being an actual face in 3D or an edge in 2D), and the term 'neighbours' refers to a pair of elements sharing a face.
Step 1 is standard and defines any agglomeration method. The face coarsening comes from Step 2, where multiple fine faces are collapsed into a single coarse one. Figure 3 illustrates the result of successive coarsenings in 2D. Recall that the created polygons are not necessarily convex. Also, note that this algorithm does not ensure that every fine face is either removed (being interior to an agglomerate) or coarsened (by face collapsing): some fine faces may find themselves unaltered by the process. However, numerical experiments show that the number of unaltered faces between two successive levels decreases with the number of times the coarsening strategy is applied, i.e. the number of levels built.

Algorithm 1 Abstract coarsening strategy with face collapsing
Step 1. Agglomerate each fine element with its non-already agglomerated neighbours to form one polytopal coarse element.
Step 2. Collapse into one single coarse face the interfaces between two neighbouring coarse elements that are composed of multiple fine faces.
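Step 1 is deliberately abstract; one of the simplest concrete choices is a greedy sweep over the element adjacency graph. A sketch (this greedy variant is our own illustration, not a strategy prescribed by the paper):

```python
def agglomerate(neighbours):
    """Step 1 sketch: traverse the elements in index order and agglomerate
    each not-yet-assigned element with its not-yet-assigned neighbours to
    form one polytopal coarse element. `neighbours[i]` lists the elements
    sharing a face with element i; returns a coarse label per element."""
    label = {}
    next_label = 0
    for e in sorted(neighbours):
        if e in label:
            continue
        label[e] = next_label
        for n in neighbours[e]:
            if n not in label:
                label[n] = next_label
        next_label += 1
    return label
```

Step 2 then scans each pair of neighbouring coarse labels and replaces the chain of fine faces on their interface by a single coarse face.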

Remark 3. (Preventing domain erosion)
The face collapsing step removes vertices from the mesh. In order to keep the domain from "eroding" at its corners (whether at domain boundaries or at interfaces between inner regions), one must make sure that the vertices that describe the domain geometry do not vanish in a face collapsing operation. To do so, it suffices to prevent the faces sharing such vertices from collapsing into one coarse face. Applied to a square domain or inner region, this means that at each of the four corners, the two orthogonal corner edges should never collapse together, so that the global square shape is preserved.

Coarsening in 2D
In 2D, the abstract Step 2 can be detailed as follows: for each interface composed of multiple fine edges, remove all vertices interior to that interface and connect the boundary vertices to form a new coarse edge. This way, we ensure node-nestedness, i.e. that the set of vertices of the coarse mesh is a subset of the fine vertices. Although having a node-nested hierarchy is not mandatory to achieve good multigrid performance, it may help: assuming that the fine nodes adequately capture the geometry, using subsets of those nodes at coarse levels should present some desirable properties with respect to the geometric approximation.
Regarding the computation of the L2-orthogonal projection developed in the preceding section, numerical experiments show that the rough approximation given by (3.7) is not viable; see Figure 4a for a graphical illustration. The polygons must be subdivided to improve precision and, clearly, the method of subdivision influences the ultimate accuracy of the approximation. While a barycentric triangulation (Figures 4b and 4e) only offers practical usability for k ≤ 1, an optimal triangulation (i.e. one yielding an exact approximation; see Figures 4c and 4f) can be derived in the context of a coarsening strategy, inasmuch as the intergrid relationships are explicitly formed and can therefore be exploited. In the present coarsening strategy, the agglomeration step does not cause non-nestedness, which can only occur during the edge collapsing phase. Firstly, only fine elements possessing an edge that has been collapsed and that crosses the coarse collapsed edge must be subject to a careful subdivision; the other fine elements are fully embedded in a coarse one, and therefore do not need to be subdivided at all. Then, for the relevant fine elements, it suffices to generate triangles that keep on one side of the coarse edge. Algorithm 2 presents the details of the optimal subdivision we have used. Note that it applies to convex polygons; non-convex ones require a preliminary step of convex partitioning before Algorithm 2 can be applied. The algorithm starts by triangulating the polygon independently of the coarse edges. Each triangle is then subtriangulated in order not to cross the coarse edges, following Algorithm 3. Note that this algorithm relies on evaluations of intersections between coarse and fine edges, which node-nestedness can certainly ease, insofar as a large part of the crossings between coarse and fine edges will then occur at mesh vertices.

Algorithm 2 SubdivideConvexPolygon( , )
Input: set of vertices sorted in direct order, representing a convex polygon; set of coarse edges not to cross.
Output: set of triangles partitioning the polygon and not crossing the coarse edges.
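A minimal sketch of such a subdivision, with hypothetical names, is given below. It makes one simplifying assumption not in Algorithm 2: each coarse edge is extended to its supporting line before clipping, which guarantees that no subtriangle crosses the coarse edge, at the price of a few extra (harmless) cuts:

```python
def side(p, a, b):
    """Signed area test: > 0 if p lies left of the oriented line (a, b)."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def split_convex(poly, a, b):
    """Split a convex polygon by the supporting line of segment (a, b)."""
    left, right = [], []
    n = len(poly)
    for i in range(n):
        p, q = poly[i], poly[(i + 1) % n]
        sp, sq = side(p, a, b), side(q, a, b)
        if sp >= 0:
            left.append(p)
        if sp <= 0:
            right.append(p)
        if sp * sq < 0:  # the edge (p, q) crosses the line: insert the cut point
            t = sp / (sp - sq)
            cut = (p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1]))
            left.append(cut)
            right.append(cut)
    return left, right

def fan(poly):
    """Fan triangulation of a convex polygon from its first vertex."""
    return [(poly[0], poly[i], poly[i + 1]) for i in range(1, len(poly) - 1)]

def subdivide_convex_polygon(vertices, coarse_edges):
    """Triangulate a convex polygon so that no triangle crosses a coarse edge."""
    pieces = [vertices]
    for a, b in coarse_edges:
        pieces = [part
                  for poly in pieces
                  for part in split_convex(poly, a, b)
                  if len(part) >= 3]  # drop degenerate slivers
    return [tri for poly in pieces for tri in fan(poly)]
```

Since every clip keeps each piece convex, the final fan triangulation is valid, and all subtriangles lie entirely on one side of each coarse edge.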

Coarsening in 3D
We have not generalized the preceding coarsening strategy to 3D. Instead, we briefly state the issue and propose some directions.
In the 3D setting, it is generally not possible to collapse neighbouring faces into a single one as straightforwardly as in 2D, because the edges framing the fine faces do not usually describe a planar region that could define a new face. However, if the addition of new vertices is allowed, one can derive a variety of face collapsing methods. A simple one consists in choosing three noncollinear vertices amongst those of the fine faces, thus defining a 2D plane, and then orthogonally projecting the frame's vertices onto that plane to obtain the coarse face. However, in an attempt to preserve, on the coarse mesh, the approximation of the geometry given by the fine one, a more suitable choice for the plane defining the collapsed face would be one minimizing the distance to the vertices lying on the edge frame of the fine faces.

(f) Adapted triangulation preventing subtriangles from overlapping two coarse elements.

FIGURE 4
The top figures show how the fine polygons (in (a)) or their subtriangulations (in (b), (c)) are clustered in the process of approximating the coarse/fine intersections involved in the computation of the 𝐿²-orthogonal projection. The coarse edges are represented by thick black segments. In (b), the fine elements are triangulated by a barycentric method. In (c), the triangulation is adapted to prevent subtriangles from overlapping multiple coarse elements. The bottom figures zoom in on a fine element overlapping two coarse ones. In (d), the whole fine element is affected to one of them for the computation of the approximate 𝐿²-orthogonal projection. In (e) and (f), the subtriangles are dispatched on one or the other coarse element according to the location of their barycenters.
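The simple 3D face collapsing variant, where three chosen vertices define the plane, can be sketched as follows. The names are hypothetical, and a least-squares plane fit could be substituted for the plane choice to better preserve the geometry, as suggested above:

```python
def sub(u, v):
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]

def collapse_faces(frame_vertices, tol=1e-12):
    """Collapse the (generally non-planar) edge frame of a group of fine faces
    into a planar coarse face: pick three noncollinear frame vertices to define
    a plane, then orthogonally project every frame vertex onto that plane."""
    a = frame_vertices[0]
    normal = None
    for i in range(1, len(frame_vertices)):
        for j in range(i + 1, len(frame_vertices)):
            c = cross(sub(frame_vertices[i], a), sub(frame_vertices[j], a))
            if dot(c, c) > tol:  # noncollinear triple found
                normal = c
                break
        if normal is not None:
            break
    if normal is None:
        raise ValueError("all frame vertices are collinear")
    nn = dot(normal, normal)
    coarse_face = []
    for p in frame_vertices:
        d = dot(sub(p, a), normal) / nn  # signed offset along the normal
        coarse_face.append((p[0] - d * normal[0],
                            p[1] - d * normal[1],
                            p[2] - d * normal[2]))
    return coarse_face
```

Frame vertices already lying on the chosen plane are left in place, so if the frame happens to be planar the face is unchanged.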

Experimental setup
The numerical tests presented in this section have been performed on the diffusion problem (2.1), on 2D and 3D domains, where the source term in 𝐿²(Ω) is a discontinuous piecewise constant function. The unit square and cube are used as preliminary tests before moving on to more complicated domains. The problems are discretized by the HHO method described in Section 2, in which the polynomial degree 𝑘 of the element- and face-defined polynomials is taken between 0 and 3. We recall that the discrete solution ultimately provided by the method after higher-order reconstruction is a broken polynomial of degree 𝑘 + 1. The local polynomial bases in elements and on faces are 𝐿²-orthogonal Legendre bases.
The multigrid method defined in Section 3.4 is used as a solver for the statically condensed system (2.10), and our main goal is to study its asymptotic behaviour so that it can be used for large scale problems. The fine mesh is obtained by Delaunay triangulation of the domain. All triangulations are generated by the mesher GMSH 43 with its default configuration. Coarse meshes are built until the coarsest system contains at most 1000 unknowns or the mesh cannot be coarsened any further. Two different strategies are employed to build the coarse meshes: (i) independent remeshing: letting ℎ be the meshsize of the fine simplicial mesh, the coarse mesh is obtained independently by retriangulation of the domain, enforcing a meshsize ≈ 2ℎ; (ii) for 2D problems, the agglomeration-based coarsening strategy with face collapsing, described in Section 4. Table 1 summarizes, for each strategy, the various methods evaluated in the numerical tests to compute the 𝐿²-orthogonal projection. The stopping criterion is based on the backward error ‖𝑟‖₂∕‖𝑏‖₂, where 𝑟 denotes the residual of the algebraic system, 𝑏 the right-hand side, and ‖ ⋅ ‖₂ the standard Euclidean norm on the vector space of coordinates. In all tests, we say that convergence is achieved when the criterion ‖𝑟‖₂∕‖𝑏‖₂ < 10⁻⁸ is reached.
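This stopping test amounts to an outer iteration around the V-cycle, sketched below; `apply_A` and `vcycle` are hypothetical callbacks standing for the condensed operator and one multigrid cycle:

```python
import math

def solve(apply_A, vcycle, b, tol=1e-8, max_iter=200):
    """Iterate multigrid cycles until the backward error ||r||_2 / ||b||_2 < tol."""
    x = [0.0] * len(b)
    norm_b = math.sqrt(sum(v * v for v in b))
    for it in range(1, max_iter + 1):
        x = vcycle(x, b)                                # one V-cycle
        r = [bi - ai for bi, ai in zip(b, apply_A(x))]  # residual r = b - Ax
        if math.sqrt(sum(v * v for v in r)) < tol * norm_b:
            return x, it
    raise RuntimeError("no convergence within max_iter cycles")
```

The iteration counts reported in the figures below correspond to the value of `it` at convergence.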

Assessment of the approximate 2 -projection
We want to assess how the loss of accuracy implied by the approximate 𝐿²-orthogonal projection (see Section 3.3) affects the convergence of the multigrid solver. In order not to add other difficulties that would interfere with the results, we use the unit square as domain of study. The coarse meshes are built independently from each other by retriangulation of the domain, ensuring good quality at every level. As a reference for the best achievable result, a first test is run with the 𝐿²-orthogonal projection operator computed exactly, meaning that the intersections of fine and coarse elements are actually computed, over which the required inner products are evaluated (as per (3.6)). Our implementation uses the CGAL library 42 to compute intersections. In this first setting, Figure 6a shows, for all polynomial degrees 𝑘, the scalable behaviour of the solver, whose convergence rate appears to be independent of the number of unknowns. This first experiment shows the validity of our non-nested approach to multigrid.

TABLE 1
Summary of the testing combinations. In particular, Exact = Approximate w/ optimal subtriangulation (cf. (3.8) and Figure 4f).

FIGURE 6
Number of V(0,3)-cycle iterations to achieve convergence according to the number of unknowns in the system. The geometry is the unit square, and the hierarchy of meshes is obtained by independent remeshing. Each caption states the method of computation of the 𝐿²-orthogonal projection.
It is important to mention that, in our implementation, the computation of the intersections consumes, for 𝑘 = 1, over 80% of the CPU time used during the setup phase. The approximate 𝐿²-orthogonal projection we propose reduces the setup cost by a factor of 10 to 40, depending on the granularity of the fine element subdivisions. As an example, Figure 7 compares, for a test problem with 𝑘 = 1, the CPU time consumed to compute the 𝐿²-orthogonal projections during the setup phase, according to the method used: exact evaluation, approximate without subdivision (given by (3.7)), and approximate with subdivision (given by (3.8)). One can clearly see that (i) the time spent on the construction of the prolongation operators is essentially consumed by the 𝐿²-orthogonal projections (color bars vs. white bar); and (ii) the largest share is spent in the exact evaluation of the intersections, which makes the approximate methods remarkably more effective. Note that these measurements depend on the implementation and hardware; therefore, any comparison of CPU times should be interpreted with caution. With this caveat, we use the timings to illustrate the efficiency improvement brought by the approximations.

FIGURE 7
Comparison in CPU time of different methods for the computation of the 𝐿²-orthogonal projection during the setup phase of the multigrid. Each bar is divided into sections corresponding to subtasks. Colored bars correspond to subtasks of the 𝐿²-orthogonal projection, to which is added a white bar corresponding to the rest of the operations performed to construct the prolongation operators, namely, the assembly of Θ⁻¹ and Π, as well as the construction of the prolongation operators by (3.1). The test problem is the unit square meshed by 10⁵ triangles, with 𝑘 = 1. Six levels are built. The time displayed for each subtask corresponds to the sum of the times consumed at every level for that subtask.
In order to assess the usability of our approximate methods, we now compare the performance delivered by our approximations to the first scalability results. The approximate 𝐿²-orthogonal projection without subdivision (i.e. (3.7)) shows, in 2D, a lack of robustness w.r.t. the polynomial degree. Although the approximation does not seem to degrade the convergence rate for 𝑘 ≤ 1, it worsens for 𝑘 = 2, and the solver finally diverges for 𝑘 = 3. Moreover, the same test performed in 3D on the unit cube causes the solver to diverge for all values of 𝑘, and so does the use of the 2D polygonal coarsening strategy defined in Section 4. These limitations lead us to discard this simple method.
We next focus on the finer approximation (3.8), where the fine elements are subdivided into subshapes to increase the granularity of the fine-coarse associations. In our tests, we use standard refinement methods to subdivide simplicial elements: connection of the edge midpoints in 2D, and Bey's tetrahedral refinement 26 in 3D. Only one step of refinement is performed. Figure 6b presents the performance of the multigrid solver on the unit square using this refined 𝐿²-orthogonal projection. We can see that the results given by the exact method are now reproduced. On the unit cube (Figure 8), the solver exhibits good performance and scalable behaviour for 𝑘 ≤ 2, but diverges for 𝑘 = 3. Note that these 3D results cannot be compared to those of the exact method, as the latter has not been implemented. Referring to the discussion in Section 3.3, we stress that higher orders can be managed with additional steps of refinement in order to improve the accuracy of the approximate 𝐿²-orthogonal projection. Given the CPU times of Figure 7, multiple refinements would still be beneficial compared to an exact computation of the intersections.
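The 2D subdivision rule, connecting the edge midpoints of each fine triangle, can be sketched as follows (hypothetical names; one refinement step yields four similar subtriangles, whose barycenters drive the fine-coarse association):

```python
def midpoint(p, q):
    return ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)

def refine(tri):
    """One step of uniform refinement by connecting the edge midpoints,
    producing four similar subtriangles."""
    a, b, c = tri
    ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
    return [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]

def barycenter(tri):
    """The barycenter decides which coarse element a subtriangle is assigned to."""
    (ax, ay), (bx, by), (cx, cy) = tri
    return ((ax + bx + cx) / 3, (ay + by + cy) / 3)
```

Additional refinement steps simply apply `refine` recursively to each subtriangle.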

Assessment of the agglomeration coarsening strategy with face collapsing
We now evaluate how the multigrid method responds to the coarsening strategy described in Section 4, especially when coupled with the approximate 𝐿²-orthogonal projection. Figure 9 presents the same scalability tests as in the previous section, this time using the agglomeration coarsening with face collapsing to successively build the coarse meshes from an initial fine triangulation. As a reference, Figure 9a shows the results when the 𝐿²-orthogonal projection is computed exactly. We can already remark that the convergence rate is slightly worse and the trend slightly less flat than with independent remeshing (cf. Figure 6a). Various reasons can explain these differences, the main one probably being the simplicity of our implementation of the coarsening strategy, in which we do not control and subsequently improve the element shapes and sizes.
Algorithmic scalability tests with the 𝐿²-orthogonal projection computed approximately without subdivision of the fine elements are not presented here: except for 𝑘 = 0, the solver quickly diverges. In Figure 9b, we use the barycentric triangulation to subdivide the fine elements (cf. Figure 4e) and implement the approximation (3.8). We can see that the good performance of 𝑘 = 1 is now recovered, while for 𝑘 = 2, the solver still diverges at moderate problem sizes. Finally, we stress that in the context of the coarsening strategy, the determination of an optimal fine subtriangulation (i.e. one whose subelements do not overlap the limits of the coarse elements) is facilitated, which yields an exact implementation of the 𝐿²-orthogonal projection, and therefore leads to results equivalent to those of Figure 9a.

FIGURE 9
Algorithmic scalability plots: number of V(0,3)-cycle iterations to achieve convergence according to the problem size. The geometry is the unit square. The coarse meshes are built using the agglomeration coarsening strategy with face collapsing. The captions state the method of computation of the 𝐿²-orthogonal projection. The absence of the curve 𝑘 = 3 in (b) indicates a very large number of iterations or divergence of the solver.

Complex geometry test cases
We now extend the experiments to complex geometries requiring unstructured meshes, in order to validate the method on a wider class of problems. Figures 10a and 11a respectively show similar geometries in 2D and 3D containing circular (resp. cylindrical) holes, thus requiring unstructured meshes in their discrete settings. Still focusing on the capability of the multigrid solver to handle large scale problems, Figures 10b and 11b present the respective algorithmic scalability plots of the solver. Starting from a fine simplicial mesh obtained by Delaunay triangulation, the coarse meshes are built using the coarsening strategy with face collapsing for the 2D tests, and by independent remeshing for the 3D tests. Note that in 2D, the displayed mesh then corresponds to a coarse triangulation that is not actually used (the fine mesh is triangular); it shows, however, how the holes are approximated by linear edges. In particular, whichever method is used, the holes are approximated, at each level, with respect to the mesh granularity, i.e. by polygons/polyhedra with fewer and fewer edges/faces as the mesh grows coarser. Notably, no curved faces are used. The 𝐿²-orthogonal projection is computed exactly in 2D through the construction of an optimal subtriangulation of the elements. In 3D, the 𝐿²-orthogonal projection is computed approximately with one tetrahedral subdivision by Bey's method. The 2D results presented in Figure 10b show that our multigrid method still exhibits the desired scalable behaviour for all 𝑘 ≤ 3. In 3D, the results of Figure 11b consistently show the same scalable behaviour for the expected polynomial degrees, namely 𝑘 = 1 and 2.
The case 𝑘 = 0 is a special case 12 which is not necessarily expected to work on unstructured problems, while the approximation of the 𝐿²-orthogonal projection explains the divergence of the solver for 𝑘 = 3 (cf. Figure 8 for the test on the unit cube). Although the displayed data points have been obtained with Dirichlet boundary conditions, the results with Neumann conditions on the holes, not reported here for the sake of brevity, indicate the same behaviour.

Heterogeneous test case
The algorithm is finally tested on a heterogeneous domain, plotted in Figure 12a. This test case, provided by Electricité de France (EDF), is a proxy application for an industrial setting, characterized by jumps of several orders of magnitude in the diffusion coefficients within an otherwise relatively straightforward geometry. The proxy differs from the industrial setting in its geometry and its diffusion coefficients; however, the most salient feature, the ratio between the largest and the smallest diffusion coefficients, is close to the relevant regime. The domain is composed of four homogeneous and isotropic subdomains.
Having assigned a color to each of those regions, their respective diffusion tensors, each a constant multiple of the identity matrix of dimension 2, define a global discontinuous tensor. The values of the per-color coefficients are given in the caption of Figure 12a and lead to a maximum jump of 10⁸, located at the interface between the gray and blue regions. All meshes in the multigrid hierarchy align with the jumps, i.e. jumps only occur at interfaces and not inside elements. When the coarse meshes come from independent remeshing of the domain, the scalability plot of Figure 12b demonstrates a convergence rate that is near-independent of the mesh size for all degrees except the lowest order. With the coarsening strategy, on the other hand (cf. Figure 12c), algorithmic scalability is achieved for all degrees. Moreover, consistently with the results on nested meshes 12, §4.3 , the convergence rate of the multigrid method is found to be independent of the size of the discontinuities in the coefficients.

FIGURE 12
The heterogeneous domain described in (a) is composed of homogeneous, isotropic regions, whose respective diffusion coefficients are given in the caption. In (b), the coarse meshes are built by retriangulation and the 𝐿²-orthogonal projection is computed approximately with subtriangulation of the fine elements. In (c), the coarsening strategy is used.
Remark 4. (Corner singularities) Heterogeneity in such a domain creates corner singularities which, in the general case, make the discretization lose its optimal convergence rate. Consequently, although the linear solver converges fast, it converges towards a solution that is not necessarily accurate. It is therefore not imperative to require the linear solver to reach such a low algebraic error. In order to recover the optimal convergence rate of the discretization, techniques of local mesh refinement or energy correction 44 must be used.

Computational insight
Having focused our numerical tests on the asymptotic convergence of the solver, we now also comment on the computational cost of the iterations. We emphasize that the prolongation operation remains local; therefore, the corresponding matrix is sparse. Indeed, given a coarse face interfacing two coarse neighbours, its prolongation stencil is limited to the faces linked to the fine elements that overlap those two coarse neighbours. During the setup phase, the prolongation operators are computed at every level and stored in memory. For the smoothers (namely, block Gauss-Seidel methods), the factorization of the diagonal blocks is also computed once and stored in memory to be reused at every iteration. With that setup, and using the cycles described in the numerical tests, intergrid transfers account for about 30% of the total computational work (in flops) of the multigrid iteration, regardless of the space dimension and the polynomial degree. In our implementation, that corresponds to less than 10% of the CPU time. Assuming negligible cost for the coarse solve, the rest of the work is distributed among smoothing and residual computation.
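To illustrate why locality matters, here is a minimal sketch of applying a sparse prolongation stored in CSR format (hypothetical, stdlib-only; a real code would use an optimized sparse library): the cost is proportional to the number of nonzeros, i.e. to the sum of the small stencil sizes.

```python
def csr_matvec(indptr, indices, data, x):
    """Compute y = P x for a sparse matrix P in CSR format; row i touches
    only its stencil entries indices[indptr[i]:indptr[i+1]]."""
    y = [0.0] * (len(indptr) - 1)
    for row in range(len(y)):
        acc = 0.0
        for k in range(indptr[row], indptr[row + 1]):
            acc += data[k] * x[indices[k]]
        y[row] = acc
    return y
```

With bounded stencil sizes, one prolongation costs O(number of fine faces) flops, independently of the total problem size.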

CONCLUSION
In this work, we have successfully extended an efficient nested multigrid solver to non-nested mesh hierarchies. Indeed, without requiring stronger smoothing, our adaptation preserves the optimality of the convergence rate on a wider class of problems. The extra cost implied by the numerical evaluation of the 𝐿²-orthogonal projection is also kept to a minimum thanks to an efficient approximation that enables us to discard the expensive computation of intersections between elements. The agglomeration coarsening strategy with face collapsing that we have developed, although naively implemented here, gives promising results, thus offering leads towards more advanced methods fulfilling two purposes: coarsening the faces, and preparing the subtriangulation of the fine elements for the construction of an accurate approximation of the 𝐿²-orthogonal projection operator. The management of high orders also leaves room for additional research. Plugging a 𝑝-multigrid algorithm on top of the present one at order 𝑘 = 1 is one such approach, which will be investigated in future work.

(a) Distribution of the fine elements to their "closest" coarse element. (b) Distribution of the fine elements' sub-triangles to their "closest" coarse element.

FIGURE 1
FIGURE 1 (a) shows the superposition of coarse and fine triangular meshes. The coarse edges are represented by thick black segments. The fine triangles are colored according to the coarse triangle they are "closest" to, i.e. the one containing their barycenters. (b) shows the same colored partitioning, this time for the fine elements' sub-triangles.

FIGURE 2
FIGURE 2 (a) (resp. (b)) presents a fine (resp. coarse) mesh obtained by Delaunay triangulation of a square domain embedding a round physical region. In (c), the red coarse elements overlapping fine blue ones are plotted.

FIGURE 3
FIGURE 3 Successive coarsenings of a triangulated square.

(a) Distribution of the fine elements to their "closest" coarse element. (b) The fine elements are subtriangulated by a barycentric method. (c) The fine elements are subtriangulated such that they do not cross any coarse element's edge. (d) Zoom-in on a fine element overlapping two coarse ones. (e) Barycentric triangulation.
(a) First three steps of the triangulation of a convex polygon. (b) Division of a triangle according to the intersection between the red coarse edge and the triangle edge [ , ].

FIGURE 5
FIGURE 5 Respective illustrations for Algorithm 2 and Algorithm 3.

FIGURE 8
FIGURE 8 Number of V(0,6)-cycle iterations to achieve convergence according to the number of unknowns in the system. The geometry is the unit cube, which is independently retriangulated at each level. The 𝐿²-orthogonal projection is computed approximately with subtriangulation of the fine elements via one step of Bey's refinement method.

FIGURE 10
FIGURE 10 2D complex geometry and associated algorithmic scalability plot for the multigrid solver. The coarse meshes are built using the agglomeration coarsening strategy with face collapsing. The 𝐿²-orthogonal projection is computed exactly.
Algorithmic scalability plot with the V(0,6)-cycle. Details on the last datapoint for 𝑘 = 1.

FIGURE 11
FIGURE 11 3D test case using the geometry (a). The algorithmic scalability plot (b) of the solver is obtained with the coarse meshes independently retriangulated at each level, and the 𝐿²-orthogonal projection computed approximately using Bey's subdivision.
Algorithmic scalability plot using the coarsening strategy with face collapsing. Details on the last datapoint of (c) for 𝑘 = 1.