On the relevance of descriptor fidelity in microstructure reconstruction

A common strategy for reducing the computational effort of descriptor-based microstructure reconstruction in the Yeong–Torquato algorithm lies in restricting the choice of descriptors to an efficiently computable subset. As an alternative, the number of iterations can be reduced by gradient-based optimization as in differentiable microstructure characterization and reconstruction (DMCR). This allows for, but does not require, the use of a set of informative, high-dimensional and computationally expensive descriptors that would be infeasible for a high number of iterations. For this reason, the present work investigates the role of descriptor fidelity in microstructure reconstruction results. More precisely, spatial two- and three-point correlations as well as the lineal path function are computed on 2D planes as well as on 1D lines. These descriptors are used for reconstruction with the Yeong–Torquato and DMCR algorithms, and the results are compared across various microstructures.

A modern and simple answer to the microstructure reconstruction problem naturally lies in image-based machine learning methods [5]. The central idea is to train a generative machine learning model on a data set of microstructures and then to sample new realizations of the same structure from it. Initial attempts like local neighborhood search [6] have been inspired by the field of texture synthesis [7]. Recently, the advances in modern machine learning have been transferred to microstructure reconstruction using various deep architectures like generative adversarial networks (GANs) [8][9][10][11] as well as hybrid methods with autoencoders [12,13], transformers [14], and diffusion models [15,16]. While these methods achieve excellent results throughout a variety of material classes, the disadvantage of data-based methods is the necessity of a potentially large data set.
In contrast to these purely data-based techniques, classical approaches often quantify the information on the microstructure morphology in terms of statistical descriptors and reconstruct microstructures from that. Depending on how generic this morphology and the corresponding descriptors can be, the associated reconstruction process can either be formulated as a highly performant, but material-specific algorithm or a more computationally expensive, generic algorithm. Examples of the former are the sequential addition and migration algorithm for fiber reinforced composites [17,18], the CMG algorithm for concrete [19], DREAM.3D, Kanapy, and Neper for metallic materials [20][21][22], and various other methods reviewed in Bargmann et al. [23]. The latter category contains the well-known Yeong-Torquato algorithm [24] and derived forms of it such as differentiable MCR (DMCR) [25,26] and the numerous approaches reviewed in Bostanabad et al. [27].
The Yeong-Torquato algorithm directly solves an optimization problem in the space of possible microstructures, which are represented directly in terms of pixel or voxel values. For this purpose, the loss function of a microstructure is given by means of a descriptor that quantifies the morphology of the structure. The target descriptor can be computed from a reference image or prescribed directly. In an iterative procedure, random mutations are applied to the microstructure until a convergence criterion is reached. Due to the combinatorial complexity of the microstructure representation, the search space is restricted to structures with the correct volume fraction. Furthermore, a stochastic optimization algorithm is used in order to avoid local minima. This gradient-free, stochastic optimization in high-dimensional spaces becomes computationally challenging at high resolutions and in 3D, where billions of iterations are required to reach convergence [28]. For this reason, the simulated annealing procedure is often adapted [27]. As an example, a different-neighbor sampling rule increases the probability that pixels at phase boundaries are selected. Furthermore, multigrid schemes or two-stage methods have been introduced [29][30][31][32][33]. Finally, differentiable descriptors can be used in order to solve the optimization problem using gradient-based optimizers. This is known as DMCR [25,26], which is implemented in the MCRpy software [34] and is validated in Seibert et al. [33]. Some recent approaches in the literature can be identified as special cases of DMCR [35][36][37][38].
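The iterative procedure described above can be illustrated by a minimal sketch. It is not the MCRpy implementation: the function names, the squared-error loss, the geometric cooling schedule, and all parameter values are assumptions chosen for illustration, and neither the different-neighbor sampling rule nor a multigrid scheme is included. Pixels of different phases are swapped so that the volume fraction is preserved, the two-point correlation is recomputed from scratch in every iteration, and a Metropolis rule decides whether a swap is accepted.

```python
import numpy as np

def s2_fft(ms):
    """Periodic two-point autocorrelation of a binary indicator array."""
    ft = np.fft.fft2(ms)
    return np.fft.ifft2(ft * np.conj(ft)).real / ms.size

def yeong_torquato(target_s2, shape, n_phase1, n_iter=1000,
                   t0=1e-4, cooling=0.999, seed=0):
    """Simulated annealing with volume-fraction-preserving pixel swaps."""
    rng = np.random.default_rng(seed)
    # Random initial structure with the prescribed number of phase-1 pixels
    flat = np.zeros(shape[0] * shape[1])
    flat[:n_phase1] = 1.0
    ms = rng.permutation(flat).reshape(shape)
    loss = np.sum((s2_fft(ms) - target_s2) ** 2)
    best, best_loss, temp = ms.copy(), loss, t0
    for _ in range(n_iter):
        # Mutation: swap one randomly chosen pixel of each phase
        i = rng.choice(np.flatnonzero(ms.ravel() == 1.0))
        j = rng.choice(np.flatnonzero(ms.ravel() == 0.0))
        trial = ms.copy()
        trial.flat[i], trial.flat[j] = 0.0, 1.0
        trial_loss = np.sum((s2_fft(trial) - target_s2) ** 2)
        # Metropolis rule: always accept improvements, sometimes accept worse states
        if trial_loss < loss or rng.random() < np.exp((loss - trial_loss) / temp):
            ms, loss = trial, trial_loss
            if loss < best_loss:
                best, best_loss = ms.copy(), loss
        temp *= cooling  # geometric cooling (assumed schedule)
    return best, best_loss

# Hypothetical reference: a rectangular inclusion on a 16 x 16 grid
reference = np.zeros((16, 16))
reference[4:10, 5:12] = 1.0
target_s2 = s2_fft(reference)
best, best_loss = yeong_torquato(target_s2, reference.shape, int(reference.sum()))
print(f"best loss after annealing: {best_loss:.3e}")
```

Because the descriptor is recomputed in every iteration, the cost per iteration in this sketch is dominated by the Fourier transforms, which is the overhead that update-based implementations avoid.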
Finally, a major factor enabling the practical application of the Yeong-Torquato algorithm lies in limiting the choice of descriptors to an efficiently computable subset. For example, the spatial correlations are commonly computed only along certain predefined directions, for instance, in radial direction or on 1D lines along the coordinate axes, but not for all possible 2D vectors [28,39]. Furthermore, in order to make high iteration numbers possible, some state-of-the-art algorithms minimize the computational cost per iteration by computing descriptor updates based on pixel swaps instead of recomputing descriptors from scratch in every iteration [28,40]. To the authors' knowledge, this has been done for the two-point correlations, but not for the three-point correlations or the lineal path function, and therefore sacrifices some of the descriptor flexibility of the Yeong-Torquato algorithm. This motivates the presented investigation, where the choice of descriptor as well as its fidelity (1D or 2D) is compared by means of three chosen 2D microstructure examples.

Microstructure descriptors
Microstructure descriptors provide a stochastic and translation-invariant quantification of the microstructure's morphology. Similar structures, even from the same specimen, might be very different in terms of the pixel space or any other deterministic parametrization. However, statistical descriptors make the similarity and discrepancy of microstructures quantifiable.
The present work is based on the spatial two- and three-point correlations as derived in Seibert et al. [25] and the differentiable approximation to the lineal path function [41] as introduced in Seibert et al. [34]. Intuitively, the $n$-point autocorrelation $S_n(\vec{r}_1, \dots, \vec{r}_{n-1})$ quantifies the probability of the vectors $\vec{r}_1, \dots, \vec{r}_{n-1}$ starting and ending in a given phase if they are all placed at a randomly chosen position. While the full $S_2$ and $S_3$ have been used for characterization and reconstruction in the literature, higher-order correlations for $n > 3$ are usually reduced to a selected subset in view of computational resources [44]. An array containing $S_2(\vec{r})$ for all $\vec{r}$ on a pixel or voxel grid microstructure $M$ can be efficiently computed as

$$ S_2 = \mathcal{F}^{-1}\left( \mathcal{F}(M) \odot \mathcal{F}(M)^{*} \right) \qquad (1) $$

where $\mathcal{F}$ and $\mathcal{F}^{-1}$ denote the forward and inverse Fourier transform, $\odot$ is the Hadamard product, and $(\cdot)^{*}$ denotes the complex conjugate. As an alternative, the convolutions with a mask representing $\vec{r}$ can be carried out explicitly in order to obtain $S_2$ [25]. Although this scales more unfavorably with the number of vectors and the microstructure resolution than the Fourier approach, it can be beneficial to compute the correlations only along certain 1D lines. Furthermore, it yields the three-point correlation $S_3$ with very little computational overhead as outlined in Seibert et al. [25]. Both approaches are implemented in MCRpy and the latter is used in the present work.
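The Fourier-based computation of the two-point correlation can be sketched in a few lines of NumPy. This is an illustrative sketch and not the MCRpy code: the function name is hypothetical, periodic boundary conditions are assumed, and the result is divided by the number of pixels because NumPy's forward transform is unnormalized, so that the entries can be read as probabilities.

```python
import numpy as np

def two_point_correlation_fft(ms):
    """Periodic two-point autocorrelation S2 of a binary indicator array.

    Entry [i, j] is the probability that a pixel and the pixel shifted by
    (i, j), with periodic wrap-around, both belong to the phase marked 1.
    """
    ms = np.asarray(ms, dtype=float)
    ft = np.fft.fftn(ms)
    # Hadamard product with the complex conjugate, then inverse transform
    corr = np.fft.ifftn(ft * np.conj(ft)).real
    return corr / ms.size  # normalize pair counts to probabilities

# Hypothetical random two-phase structure for demonstration
rng = np.random.default_rng(0)
ms = (rng.random((32, 32)) < 0.3).astype(float)
s2 = two_point_correlation_fft(ms)

# The zero-shift entry equals the volume fraction of the phase
print(np.isclose(s2[0, 0], ms.mean()))
# Any other entry agrees with an explicit shifted product
print(np.isclose(s2[3, 5], np.mean(ms * np.roll(ms, (3, 5), axis=(0, 1)))))
```

The explicit shifted product in the last line corresponds to evaluating a single correlation vector directly, which is the kind of computation that remains attractive when only a few vectors, for instance along 1D lines, are needed.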
Similarly, the lineal path function $L(\vec{r})$ quantifies the probability that $\vec{r}$ lies entirely within a given phase if placed at a random position. A differentiable approximation to this function is given in Seibert et al. [34] and is used in the present work.
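For intuition, the exact lineal path function along a coordinate axis can be evaluated by brute force. The following sketch is deliberately not the differentiable approximation of Seibert et al. [34]; it is a plain, non-differentiable evaluation under assumed periodic boundary conditions, with a hypothetical function name, where a segment of $k$ pixel steps covers $k + 1$ consecutive pixels.

```python
import numpy as np

def lineal_path_1d(ms, max_len, axis=1):
    """Exact lineal path function along one axis of a binary array.

    result[k] is the probability that k + 1 consecutive pixels along
    `axis` all belong to phase 1 (periodic wrap-around).
    """
    phase = np.asarray(ms) > 0.5
    inside = phase.copy()
    result = np.empty(max_len + 1)
    for k in range(max_len + 1):
        result[k] = inside.mean()
        # Extend every segment by one pixel; it stays inside the phase
        # only if the newly covered pixel also belongs to phase 1
        inside &= np.roll(phase, -(k + 1), axis=axis)
    return result

# Hypothetical example: a vertical stripe that is three pixels wide
stripe = np.zeros((4, 8))
stripe[:, 2:5] = 1.0
print(lineal_path_1d(stripe, max_len=3))  # [0.375 0.25  0.125 0.   ]
```

In the stripe example, three of the eight starting positions per row admit a single pixel, two admit two consecutive pixels, one admits three, and none admits four, which yields the printed values.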
Many further descriptors have been developed in the last decades. In analogy to $S_2(\vec{r})$ and $L(\vec{r})$, the cluster correlation function $C(\vec{r})$ quantifies the probability of $\vec{r}$ starting and ending in a topologically connected region of the microstructure [45]. It has been shown to improve the performance of the Yeong-Torquato algorithm in some cases [45], but it is not clear how to define a differentiable formulation or approximation to this descriptor, so it is not applied in the present study. Furthermore, general but low-dimensional descriptors such as Minkowski functionals [46,47] and entropic descriptors [48] have been developed and applied successfully, but cannot trivially be differentiated. In contrast, the total variation has been applied to DMCR especially in the context of 2D-to-3D reconstruction, where it regulates noise by quantifying the amount of phase boundary per unit volume. As a very recent differentiable descriptor, Gram matrices of the feature maps of pretrained convolutional neural networks have shown a wide range of applicability and high robustness [34,35,49,50]. However, they are hard to interpret and the choice of unrelated training data sets like ImageNet seems arbitrary. Therefore, a recent contribution is dedicated to developing a wavelet-based descriptor using the scattering transform. Similarly to Gram matrices, this descriptor quantifies the structure in terms of the occurrence and relations between local features detected by wavelets. However, the features, that is, wavelets, are not learned from data but constructed in a principled manner by rotation and dilatation of a mother wavelet. Very good results and further details are given in Reck et al. [51].
This incomplete overview shows that the development of suitable descriptors is still a relevant and active field of research. It is commonly accepted that there is currently no single best descriptor for all different kinds of materials, but that different materials require different descriptors or combinations thereof [26,34]. This motivates flexible algorithms and software where descriptors can be exchanged and combined easily. However, the combinatorial complexity of these possibilities exceeds the scope of the present work. Therefore, this work is restricted to (i) the two-point correlation $S_2$, (ii) a combination of $S_2$ and the differentiable approximation to the lineal path function $L$, and (iii) the three-point correlation $S_3$, which are all evaluated on lines and areas, respectively.
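The difference between evaluating a descriptor on lines and on areas can be made concrete with a small experiment. The sketch below uses hypothetical function names and an assumed test geometry: it compares an inclined elliptical inclusion with its mirror image. Since the two-point correlation is an even function of the correlation vector, mirroring leaves its values along the horizontal and vertical lines unchanged, whereas the full 2D field changes.

```python
import numpy as np

def s2_full(ms):
    """Full periodic two-point correlation for all 2D vectors."""
    ft = np.fft.fft2(ms)
    return np.fft.ifft2(ft * np.conj(ft)).real / ms.size

def s2_lines(ms):
    """Two-point correlation restricted to the two coordinate axes."""
    full = s2_full(ms)
    return full[0, :], full[:, 0]  # horizontal line, vertical line

# Inclined elliptical inclusion (assumed geometry) and its mirror image
n = 64
y, x = np.mgrid[:n, :n] - n / 2
u, v = (x + y) / np.sqrt(2), (x - y) / np.sqrt(2)  # axes rotated by 45 degrees
ellipse = ((u / 20) ** 2 + (v / 8) ** 2 <= 1).astype(float)
mirrored = ellipse[:, ::-1]

h1, v1 = s2_lines(ellipse)
h2, v2 = s2_lines(mirrored)
same_on_lines = np.allclose(h1, h2) and np.allclose(v1, v2)
same_in_2d = np.allclose(s2_full(ellipse), s2_full(mirrored))
print("identical on 1D lines:", same_on_lines)
print("identical in 2D:", same_in_2d)
```

The line descriptors of the two structures coincide although the inclusions are oriented differently, so a reconstruction driven only by these lines cannot distinguish between the two orientations; the full 2D correlation resolves this ambiguity.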

Descriptor-based reconstruction in MCRpy
Descriptor-based microstructure reconstruction is the inverse of microstructure characterization. Given a desired descriptor $D^{\mathrm{des}}$ or a set of $n_\mathrm{d}$ different descriptors $\{D_i^{\mathrm{des}}\}_{i=1}^{n_\mathrm{d}}$, the goal is to find a microstructure $M$ such that the corresponding descriptors $D_i(M)$ are as close as possible to the given values. Often, this is naturally formulated as an optimization problem where the descriptor difference is quantified by a loss function $\mathcal{L}$. As an example, if only the two-point correlation $S_2$ is considered as a descriptor, the loss might be chosen as

$$ \mathcal{L}(M) = \lVert S_2(M) - S_2^{\mathrm{des}} \rVert \qquad (2) $$

If further descriptors such as the lineal path function $L$ are added, the loss might be formulated as a weighted sum

$$ \mathcal{L}(M) = \lambda_{S_2} \lVert S_2(M) - S_2^{\mathrm{des}} \rVert + \lambda_{L} \lVert L(M) - L^{\mathrm{des}} \rVert \qquad (3) $$

where $\lambda_{S_2}$ and $\lambda_{L}$ are scalar weights. With its flexible plugin architecture, MCRpy [34] allows combining any descriptors in any loss function and solving Equation (2) using any optimization algorithm. This work, however, is restricted to two approaches from the literature, namely the well-known Yeong-Torquato algorithm [24] and the recently developed DMCR method [25,26]. Both methods are implemented in MCRpy for arbitrary combinations of descriptors for 2D and 3D reconstruction. For reasons of computational efficiency, the present study is conducted in 2D and a multigrid scheme is used. Furthermore, the implementation of the Yeong-Torquato algorithm features a different-neighbor sampling rule. In contrast to some specialized algorithms in the literature [28,40], it does not allow the computation of descriptor updates, but recomputes the descriptor from scratch in every iteration. This has to be considered when evaluating the wallclock time; however, it does not affect the number of required iterations or the quality of the solution after a given number of iterations.
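A weighted descriptor loss of this kind can be sketched as follows. The code does not use the MCRpy API; all names are hypothetical, the mean squared error is an assumed choice for measuring the descriptor difference, and the lineal path function is evaluated exactly along rows instead of through the differentiable approximation.

```python
import numpy as np

def s2(ms):
    """Full periodic two-point correlation via FFT."""
    ft = np.fft.fft2(ms)
    return np.fft.ifft2(ft * np.conj(ft)).real / ms.size

def lineal_path_rows(ms, max_len=8):
    """Exact lineal path function along rows (periodic wrap-around)."""
    phase = ms > 0.5
    inside = phase.copy()
    values = []
    for k in range(max_len + 1):
        values.append(inside.mean())
        inside = inside & np.roll(phase, -(k + 1), axis=1)
    return np.array(values)

def weighted_loss(ms, targets, weights, descriptors):
    """Weighted sum of mean squared descriptor differences (assumed norm)."""
    return sum(
        weights[name] * np.mean((descriptors[name](ms) - targets[name]) ** 2)
        for name in descriptors
    )

descriptors = {"S2": s2, "L": lineal_path_rows}
weights = {"S2": 1.0, "L": 0.5}  # illustrative values for the scalar weights

rng = np.random.default_rng(1)
reference = (rng.random((32, 32)) < 0.4).astype(float)
candidate = (rng.random((32, 32)) < 0.4).astype(float)

# Desired descriptors are computed from the reference structure
targets = {name: fn(reference) for name, fn in descriptors.items()}
loss_ref = weighted_loss(reference, targets, weights, descriptors)
loss_cand = weighted_loss(candidate, targets, weights, descriptors)
print(loss_ref, loss_cand)
```

Keeping descriptors, targets, and weights in dictionaries with matching keys mirrors the idea that descriptors can be exchanged and combined without changing the optimizer; either reconstruction algorithm only needs the scalar loss, and DMCR additionally needs its gradient.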

NUMERICAL EXPERIMENTS
Three exemplary microstructures with a resolution of 128 × 128 pixels are considered as shown in Figure 1: a single elliptical inclusion, a synthetic, periodic structure from Brough et al. [52] and a real carbonate sample from Li et al. [35], which is released under a Creative Commons license. These structures are reconstructed with the DMCR and Yeong-Torquato algorithms in MCRpy version 0.2.0 with a maximum-fidelity resolution limit of 16 pixels and multigrid reconstruction enabled. Three different sets of descriptors are chosen, namely (i) $S_2$, (ii) $S_2$ and $L$, and (iii) $S_3$, each of which is computed on 1D lines as well as in all 2D directions. This leads to a total of 36 results, which are summarized for the DMCR and Yeong-Torquato algorithm in Tables 1 and 2, respectively. Exemplary convergence plots are given in Figure 2. As expected, DMCR converges at a faster rate and it can be seen that the optimization on the coarse multigrid levels could have been aborted earlier in order to save computational effort. This is, however, not significant compared to the high-resolution problem. It should be noted that the loss function increases as the structure is interpolated to a finer resolution because the increased descriptor dimensionality adds terms to the loss function. Exemplary intermediate results are shown in Figure 3.
Table 3 provides insight into the required time per iteration for each descriptor (combination) and algorithm. Naturally, computing descriptors on lines is significantly more efficient than a full-field computation. This is why the former option is often employed in practice in the Yeong-Torquato algorithm. However, it can be seen that regardless of which reconstruction algorithm is chosen, the full descriptor enables strictly better reconstruction results, see Tables 1 and 2. Furthermore, it can be seen that the Yeong-Torquato algorithm iterates faster than DMCR because it does not require the computation of gradients. Because of the pixel selection overhead, this difference becomes less pronounced for line descriptors. It should be noted, however, that the pixel selection in MCRpy is optimized for reducing the number of iterations, not for performance as in Adam et al. [28].

FIGURE 2
Exemplary convergence curves for the DMCR (A) and Yeong-Torquato (B) algorithm using $S_3$ in 2D. Both methods converge; however, DMCR converges faster and to a better solution. The rapid jumps in the convergence curves correspond to an increase in resolution in the multigrid algorithm, see Figure 3. At these locations, the loss increases because the newly created, interpolated pixels increase the dimensionality of the descriptor.

FIGURE 3
Intermediate results from the multigrid reconstruction of the inclusion in Figure 1(A) using DMCR and $S_3$ (see Table 1). The coarse-grid initialization on 32 × 32 pixels (A) converges to a solution (B) that is linearly interpolated to 64 × 64 pixels (C) to serve as an initialization. The result (D) serves as an initialization to the 128 × 128 pixel scale (E), where the final result is obtained (F). The corresponding convergence curve is given in Figure 2A.

CONCLUSION
This work presents an investigation of the influence of the descriptor fidelity on the result of microstructure reconstruction algorithms. Specifically, different descriptors are computed on 1D lines along the horizontal and vertical direction and are compared to the full 2D descriptors in terms of performance and microstructure result. Therein, the reconstruction is carried out with differentiable MCR (DMCR) as well as a Yeong-Torquato algorithm with a different-phase neighbor sampling rule. A multigrid scheme is used for both algorithms.
As expected, a richer descriptor leads to strictly better reconstruction results, regardless of the reconstruction algorithm. Especially the benchmark structure with the single rotated inclusion makes it clear that with $S_2$ and $L$ computed merely on 1D lines, the missing diagonal information makes accurate reconstructions of anisotropic structures impossible. An interesting exception is $S_3$, where diagonal information is reintroduced by the usage of two correlation vectors. With only 8 ms per iteration on 128 × 128 pixels and potential to tune the performance further, it might be a feasible choice for performance-critical applications. In this case, however, the reconstruction is only acceptable for the Yeong-Torquato algorithm, not for DMCR. This is likely rooted in the different-phase neighbor sampling rule. Its influence on the microstructure topology requires further attention.
As a final observation, it is noted that the easily observable relation between 1D and 2D descriptors might translate to the transition from 2D to 3D. The implications of these observations for the slice-based approach to 2D-to-3D reconstruction, which is often employed in descriptor-based reconstruction as well as in GAN-based approaches, are to be investigated in future research.

ACKNOWLEDGMENTS
The groups of M. Kästner and D. Peterseim would like to thank the German Research Foundation (DFG), which funded this work as part of a collaborative project (Grant Numbers KA 3309/18-1 and PE 2143/7-1). The group of Benjamin Klusemann gratefully acknowledges funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation program (Grant Agreement No. 101001567). The authors are grateful to the Centre for Information Services and High Performance Computing [Zentrum für Informationsdienste und Hochleistungsrechnen (ZIH)] TU Dresden for providing its facilities for high throughput calculations. Open access funding enabled and organized by Projekt DEAL.

FIGURE 1
Original microstructures with 128 × 128 pixels that are reconstructed in this work: A single elliptical inclusion (A), a synthetic, periodic structure generated in pyMKS [52] (B), and a real carbonate sample taken from Li et al. [53] (C).

TABLE 1
Reconstruction results for the DMCR algorithm.

TABLE 2
Reconstruction results for the Yeong-Torquato algorithm.

TABLE 3
Approximate time per iteration on the finest multigrid level for the DMCR and Yeong-Torquato algorithm. The results may vary on different computers due to hardware and caching. An NVIDIA A100 with 40 GB of memory is used in the present work.