FE‐NN: Efficient‐scale transition for heterogeneous microstructures using neural networks

Numerical modeling and optimization of advanced composite materials can require huge computational effort when their heterogeneous mesostructure and the interactions between different material phases are considered within the framework of multiscale modeling. Employing machine learning methods for computational homogenization reduces the computational effort for evaluating the mesostructural behavior while retaining high accuracy. Classically, one unit cell with representative characteristics of the material is chosen to describe the heterogeneous structure, which is a simplification of the actual composite. This contribution presents a neural network‐based approach for computational homogenization of composite materials that can consider arbitrary compositions of the mesostructure. To this end, various statistical volume elements and their respective constitutive responses are evaluated, so that the naturally occurring fluctuation within the composition of the phases can be taken into account. Different approaches using distinct metrics to represent the arbitrary mesostructures are investigated in terms of required computational effort and accuracy.


INTRODUCTION
Despite its ecological impact, concrete is still one of the most used construction materials in civil engineering due to its low cost, good processability, and high compressive strength. However, due to the ongoing significant contribution of civil engineering to the production of greenhouse gases, optimizing structures with respect to material use, and therefore environmental impact, has become mandatory. Such an optimization process is standard in engineering design. However, the heterogeneous nature of many engineering materials complicates this task. The combination of different constituents yields advantageous properties compared to the homogeneous counterparts. On the other hand, the properties and interaction of the constituents on a mesoscopic level significantly influence the structural behavior of such a material. Knowledge of the load-bearing behavior on the mesoscopic level is essential to fully utilize the composite's behavior in the design process.
The most commonly applied analysis method for designing structures is the finite element method (FEM). However, solving the multiscale task directly with the FEM is impractical due to the infeasible computational costs. Utilizing representative volume elements (RVEs) as a surrogate for the mesoscale analysis, followed by computational homogenization of the mesoscale constitutive response to obtain the effective macroscopic properties, enables a more efficient approach. When considering a heterogeneous mesostructure of sufficient complexity, this method can still require huge computational effort due to the necessity of evaluating FE simulations on both scales.
To overcome the above-introduced challenges, many approaches applying machine-learning-based algorithms for computational homogenization have been developed recently. Those include model-free data-driven methods that operate directly on the underlying data set, as presented in [1]. On the other hand, model-based approaches that rely on training artificial neural networks (ANNs) on the data set have been applied with great success for constitutive modeling [2][3][4]. However, all those descriptions rely on a constitutive law or a single mesostructural representation with characteristic properties. This signifies a simplification of the actual physical composition of any heterogeneous material. An ANN could also provide the possibility to generalize this approach to arbitrary structures: ANNs can homogenize mechanical and general physical properties of arbitrary mesostructures [5,6]. This work aims at developing a framework for an ANN-based homogenization approach for arbitrary mesostructures, represented by statistical volume elements (SVEs), which can be applied in numerical multiscale analysis with high computational efficiency. The main advantage of this approach compared to a conventional RVE is the smaller required size of the SVEs. The length scale of an RVE must be defined such that the effective properties of the material can be obtained independently of the applied boundary conditions. Alternatively, the effective macroscopic properties can be represented by averaging the effective properties of an ensemble of smaller SVEs.

FUNDAMENTALS OF COMPUTATIONAL HOMOGENIZATION
Consider a material point x as part of a homogeneous macroscopic continuum body. The effective constitutive behavior of the actual heterogeneous mesoscale domain at that point can be considered through computational homogenization. This approach requires the identification of a periodic, statistically representative volume element of the heterogeneous mesostructure, commonly referred to as RVE, which has to fulfill the concept of scale separation: the characteristic length of the mesoscale has to be significantly smaller than the characteristic length of the macroscale. Governed by the macroscopic state of deformation, a mesoscopic boundary value problem (BVP) can be defined on this RVE. As both the governing equations and the constitutive behavior on this scale are known, the constitutive behavior on the macro level can be obtained by solving the BVP. For a strain-based homogenization scheme, the primary variable field is the displacement field $\boldsymbol{u}$, from which the strain tensor $\boldsymbol{\varepsilon}$ can be obtained as

\[ \boldsymbol{\varepsilon} = \tfrac{1}{2} \left( \nabla \boldsymbol{u} + (\nabla \boldsymbol{u})^{\mathsf{T}} \right) \]

within the small strain regime. Through the chosen boundary conditions, the macroscopic strain $\bar{\boldsymbol{\varepsilon}}$ is imposed onto the boundaries of the RVE. Utilizing a first-order homogenization approach, the effective macroscopic stresses $\bar{\boldsymbol{\sigma}}$ can be obtained from their mesoscopic counterparts $\boldsymbol{\sigma}$ as

\[ \bar{\boldsymbol{\sigma}} = \frac{1}{|\Omega|} \int_{\Omega} \boldsymbol{\sigma} \, \mathrm{d}\Omega , \]

where $|\Omega|$ denotes the volume of the RVE $\Omega$. The appropriate boundary conditions need to fulfill the Hill-Mandel condition, stating the equivalence of the macroscale stress power with the volume average of its mesoscopic counterpart,

\[ \bar{\boldsymbol{\sigma}} : \dot{\bar{\boldsymbol{\varepsilon}}} = \frac{1}{|\Omega|} \int_{\Omega} \boldsymbol{\sigma} : \dot{\boldsymbol{\varepsilon}} \, \mathrm{d}\Omega , \]

with $\dot{(\cdot)}$ as the time derivative of a field quantity.
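The volume average that yields the effective macroscopic stress can be illustrated with a short numerical sketch. It assumes that the mesoscopic stresses are available at a set of integration points with known associated volumes; the function name `volume_average` and the sample values are hypothetical and serve only as an illustration, not as the implementation used in this work.

```python
import numpy as np

def volume_average(field, volumes=None):
    """Volume average of a tensor field sampled at integration points.

    field:   array of shape (n_points, n_components), e.g. stresses in Voigt notation
    volumes: volume associated with each point; equal volumes are assumed if omitted
    """
    field = np.asarray(field, dtype=float)
    if volumes is None:
        volumes = np.ones(field.shape[0])
    volumes = np.asarray(volumes, dtype=float)
    # Discrete counterpart of (1 / |Omega|) * integral of the field over Omega
    return (volumes[:, None] * field).sum(axis=0) / volumes.sum()

# Hypothetical mesoscopic stresses [sigma_xx, sigma_yy, sigma_xy] at four points
sigma_meso = np.array([[10.0, 2.0, 0.5],
                       [14.0, 1.0, 0.0],
                       [ 8.0, 3.0, 1.0],
                       [12.0, 2.0, 0.5]])
volumes = np.array([0.25, 0.25, 0.25, 0.25])

sigma_macro = volume_average(sigma_meso, volumes)
```

In an FE setting, the volumes would typically be the integration weights multiplied by the Jacobian determinants of the elements.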

ARTIFICIAL NEURAL NETWORK-BASED CONSTITUTIVE MODELING
In the past years, ANNs have been successfully applied to many constitutive modeling tasks. Those include elasticity [6], elastoplasticity [3], and softening behavior [7]. The following section aims at providing the fundamental knowledge for the ANN type that is required within the framework of homogenizing the constitutive behavior of arbitrary heterogeneous mesostructures. It also provides an overview of the generation of the data set and the ANN architecture that are utilized in Section 4.

FIGURE 1 General schematic picture of a feedforward neural network (FFNN) [8].

Feedforward neural networks
Feedforward neural networks (FFNNs) consist of computational nodes, also called neurons, which are arranged in layers. The number of layers, the number of neurons per layer, and the connections of the neurons between the layers determine the architecture of the FFNN. In the simplest case, each neuron of a layer is connected to every neuron of the subsequent layer, which is defined as a dense ANN. Three types of layers exist. The input layer processes the input feature vector $\boldsymbol{x}$ and possesses one neuron per input feature. The output layer computes the predictions of the ANN $\hat{\boldsymbol{y}}$, where the number of neurons is equal to the number of components of $\hat{\boldsymbol{y}}$. The layers between input and output layer are denoted as hidden layers, as the numerical values within those layers are generally not accessed by the user. The neurons contained in the layers apply a nonlinear transformation to their respective input data $\boldsymbol{x}^{(l)}$ following

\[ \boldsymbol{a}^{(l)} = f\left( \boldsymbol{W}^{(l)} \boldsymbol{x}^{(l)} + \boldsymbol{b}^{(l)} \right) , \]

where $\boldsymbol{a}^{(l)}$ is the output of the layer $l$, $\boldsymbol{W}^{(l)}$ is the matrix of linear weight factors, $\boldsymbol{b}^{(l)}$ the vector of constant bias terms, and $f$ is a nonlinear function that is at least $C^0$-continuous. For a dense ANN, it is defined that

\[ \boldsymbol{x}^{(l)} = \boldsymbol{a}^{(l-1)} . \]

A general schematic of a dense FFNN with a three-dimensional input and output and two hidden layers is depicted in Figure 1.
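The layer-wise transformation of a dense FFNN can be sketched in a few lines of NumPy. The hidden layer width of 8, the tanh activation, the linear output layer, and the random weights are assumptions chosen for illustration; only the three-dimensional input and output and the two hidden layers follow the schematic described above.

```python
import numpy as np

def dense_forward(x, weights, biases, activation=np.tanh):
    """Evaluate a dense FFNN: each hidden layer computes f(W a + b).

    The output layer is kept linear here, which is a common choice for
    regression but an assumption of this sketch.
    """
    a = np.asarray(x, dtype=float)
    for W, b in zip(weights[:-1], biases[:-1]):
        a = activation(W @ a + b)          # output of one hidden layer
    return weights[-1] @ a + biases[-1]    # linear output layer

rng = np.random.default_rng(0)
layer_sizes = [3, 8, 8, 3]  # three inputs, two hidden layers, three outputs
weights = [rng.normal(size=(n_out, n_in))
           for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n_out) for n_out in layer_sizes[1:]]

y_hat = dense_forward(np.array([0.01, -0.02, 0.005]), weights, biases)
```

Because each layer only consumes the output of the previous one, the whole network evaluation reduces to a short loop over matrix-vector products.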

Training
The objective of an ANN is to discover a function that accurately maps values from the input space to their corresponding values in the output space. To this end, an objective function, the loss function $\mathcal{L}$, is defined, and the ANN's variables are optimized to minimize it. In the simplest case, this function measures the error between the model's prediction and the ground truth, which is part of the data set. For the regression task at hand, the mean squared error (MSE) is selected as the loss function

\[ \mathcal{L}(\boldsymbol{W}, \boldsymbol{b}) = \frac{1}{N} \sum_{i=1}^{N} \left\lVert \boldsymbol{y}_i - \hat{\boldsymbol{y}}_i \right\rVert^2 , \]

where $N$ denotes the number of training samples, $\boldsymbol{y}_i$ the ground truth, and $\hat{\boldsymbol{y}}_i$ the prediction for sample $i$. Hereby, $\boldsymbol{W}$ and $\boldsymbol{b}$ denote the full sets of weights and biases of the ANN. The most common optimization algorithm for ANNs is the gradient descent approach, where the model parameters $\boldsymbol{\theta} = \{\boldsymbol{W}, \boldsymbol{b}\}$ are updated according to their respective influence on the loss function. To this end, the gradient $\nabla_{\boldsymbol{\theta}} \mathcal{L}(\boldsymbol{\theta})$ is employed, with which the parameter update can be expressed as

\[ \boldsymbol{\theta} \leftarrow \boldsymbol{\theta} - \eta \, \nabla_{\boldsymbol{\theta}} \mathcal{L}(\boldsymbol{\theta}) , \]

where $\eta$ is called the learning rate and determines the step size toward the minimum of the loss function. Several improvements for this gradient descent algorithm have been developed, leading to increased computational efficiency and better convergence behavior toward the global minimum of the loss function, especially in the presence of local minima. Some examples are Adagrad, RMSprop, and Adam.

TABLE 1 Material parameters.
Materials          Young's modulus [GPa]   Poisson's ratio [-]
Matrix material    20                      0.3
Inclusions         70                      0.3
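The gradient descent update with an MSE loss described in this section can be illustrated on a deliberately small problem. The sketch below fits a single linear layer to synthetic data; the data, the learning rate of 0.1, and the 500 iterations are assumptions for illustration and are unrelated to the models trained in this work, which use the Adam optimizer.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean over all samples of the squared error norm."""
    return np.mean(np.sum((y_true - y_pred) ** 2, axis=1))

# Synthetic regression data generated from a known linear map (assumption)
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 2))
W_true = np.array([[2.0, -1.0]])
b_true = np.array([0.5])
Y = X @ W_true.T + b_true

# Parameters theta = {W, b}, initialised with zeros
W = np.zeros((1, 2))
b = np.zeros(1)
eta = 0.1  # learning rate

loss_start = mse(Y, X @ W.T + b)
for _ in range(500):
    residual = X @ W.T + b - Y                      # shape (N, 1)
    grad_W = (2.0 / len(X)) * (residual.T @ X)      # dL/dW
    grad_b = (2.0 / len(X)) * residual.sum(axis=0)  # dL/db
    W -= eta * grad_W                               # theta <- theta - eta * grad
    b -= eta * grad_b
loss_end = mse(Y, X @ W.T + b)
```

For a nonlinear FFNN the gradients are obtained by backpropagation instead of the closed-form expressions used here, but the parameter update itself has the same form.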

Data set generation
Training an ANN-based constitutive model generally requires information on the constitutive behavior of the structure that should be represented. In this case, that information is represented by pairs of the prescribed macroscopic strains $\bar{\boldsymbol{\varepsilon}}$ and the corresponding effective macroscopic stresses $\bar{\boldsymbol{\sigma}}$, cf. Section 2. However, to consider arbitrary mesostructures from the SVEs, the ANN requires more information. For the model presented in Section 3.4, this is the inclusion area or volume fraction of the RVE instance, depending on the RVE's dimensionality. The fraction of inclusions in the RVE is subsequently referred to as the inclusion fraction regardless of the dimensionality. The SVEs are created as binary images, where a pixel value of 1 denotes an inclusion and 0 the matrix material. For the investigation at hand, only small strain linear elasticity is considered.

Two-dimensional case
Random values for the center coordinates and the radius of the circular inclusions are sampled within the domain defined by the edges of the RVE. The geometrical information is then used to construct the image on a 64 × 64 pixel grid, representing an RVE with 1 mm side length. Between 4 and 11 inclusions are sampled per RVE, with radii varying between 0.05 and 0.1 mm. Thereby, inclusion fractions ranging from 4% up to 23.5% are generated.
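A minimal sketch of such an image generation is given below. The grid size, the number of inclusions, and the radius range follow the values stated above. Keeping each circle fully inside the domain and allowing inclusions to overlap are assumptions of this sketch, as is the function name `generate_sve`; the actual sampling rules of this study may differ.

```python
import numpy as np

def generate_sve(rng, n_pixels=64, side=1.0, n_range=(4, 11), r_range=(0.05, 0.1)):
    """Binary SVE image: 1 = inclusion, 0 = matrix.

    Assumptions of this sketch: circles lie fully inside the domain and
    overlaps between inclusions are not prevented.
    """
    n_inclusions = rng.integers(n_range[0], n_range[1] + 1)
    # Coordinates of the pixel centres in mm
    coords = (np.arange(n_pixels) + 0.5) * side / n_pixels
    xx, yy = np.meshgrid(coords, coords, indexing="xy")
    image = np.zeros((n_pixels, n_pixels), dtype=np.uint8)
    for _ in range(n_inclusions):
        radius = rng.uniform(*r_range)
        cx, cy = rng.uniform(radius, side - radius, size=2)
        image[(xx - cx) ** 2 + (yy - cy) ** 2 <= radius ** 2] = 1
    return image

rng = np.random.default_rng(42)
image = generate_sve(rng)
inclusion_fraction = image.mean()  # share of inclusion pixels
```

The mean of the binary image directly gives the inclusion area fraction, which is the additional input feature of the ANN in the two-dimensional case.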
It should be noted that the approach presented in this contribution does not require the existence of actual image data. As shown in [6], however, a data set based on images allows for more complex ANN architectures to be applied, for example, CNN-based approaches for the mesostructural representation. This is beyond the scope of this contribution.
Based on the geometrical information, FE meshes are generated for the SVEs. The imposed strains are limited to the range from −0.1 to 0.1. The material parameters for this study, which represent general elastic properties of a cementitious matrix and stone aggregates as found in concrete, are given in Table 1. Because the investigation is limited to linear elastic materials, the macroscopic stresses for an arbitrary strain state can be computed by superposition after evaluating the uniaxial loading directions through an FE simulation of each RVE. Subsequently, the data set can be generated from these data. The resulting RVEs and load states are divided into two subsets, where the first contains 200 RVEs with 91 200 load states and the second the remaining eight RVEs and their corresponding 960 load cases. The former data set is split load case-wise with a ratio of (70% ∶ 15% ∶ 15%) into training, validation, and test data sets. Thereby, the generalization ability of the model with respect to arbitrary loading can be evaluated. The latter data set is used to additionally evaluate the generalization with respect to unseen mesostructures.
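How linear elasticity allows arbitrary load states to be derived from a few uniaxial evaluations, and how the resulting samples can be split, is sketched below. The function `placeholder_fe_response` is a hypothetical stand-in for the FE simulation of one SVE (here a homogeneous plane-strain material with the matrix parameters of Table 1), and the inclusion fraction of 0.15 and the 120 sampled load states are illustrative assumptions.

```python
import numpy as np

def placeholder_fe_response(strain):
    """Hypothetical stand-in for the FE homogenization of one SVE.

    Returns stresses [sigma_xx, sigma_yy, sigma_xy] in GPa for strains
    [eps_xx, eps_yy, gamma_xy] of a homogeneous plane-strain material.
    """
    E, nu = 20.0, 0.3
    lam = E * nu / ((1.0 + nu) * (1.0 - 2.0 * nu))
    mu = E / (2.0 * (1.0 + nu))
    C = np.array([[lam + 2.0 * mu, lam, 0.0],
                  [lam, lam + 2.0 * mu, 0.0],
                  [0.0, 0.0, mu]])
    return C @ strain

def effective_stiffness(response, n_components=3):
    """Assemble the effective stiffness column by column from unit strain states."""
    unit_strains = np.eye(n_components)
    return np.column_stack([response(e) for e in unit_strains])

C_eff = effective_stiffness(placeholder_fe_response)

# Superposition: any strain state maps to stress through the same stiffness
rng = np.random.default_rng(0)
strains = rng.uniform(-0.1, 0.1, size=(120, 3))
stresses = strains @ C_eff.T

# Input features: strain components plus the inclusion fraction of the SVE
fraction = 0.15  # hypothetical value
inputs = np.column_stack([strains, np.full(len(strains), fraction)])

# Load case-wise split into training, validation, and test data (70:15:15)
perm = rng.permutation(len(inputs))
n_train = round(0.70 * len(inputs))
n_val = round(0.15 * len(inputs))
train_idx, val_idx, test_idx = np.split(perm, [n_train, n_train + n_val])
```

With this construction, each SVE requires only as many FE evaluations as there are independent strain components, while the number of generated load states can be chosen freely.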
Normalization is applied to enhance stability and convergence performance of the ANN, with the output features being linearly scaled in the range from 0 to 1.
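A linear scaling to the range from 0 to 1 can be sketched as follows. The helper names and the sample values are hypothetical. Determining the scaling limits per output component is an assumption of this sketch.

```python
import numpy as np

def fit_min_max(data):
    """Component-wise minimum and maximum used as scaling limits."""
    data = np.asarray(data, dtype=float)
    return data.min(axis=0), data.max(axis=0)

def scale(data, lower, upper):
    """Map data linearly so that lower -> 0 and upper -> 1."""
    return (np.asarray(data, dtype=float) - lower) / (upper - lower)

def unscale(scaled, lower, upper):
    """Inverse mapping, needed to recover physical stresses from predictions."""
    return np.asarray(scaled, dtype=float) * (upper - lower) + lower

# Hypothetical stress samples [sigma_xx, sigma_yy, sigma_xy]
stresses = np.array([[-2.0, 0.5, 0.1],
                     [ 1.0, -0.5, 0.3],
                     [ 4.0, 1.5, -0.2]])

lower, upper = fit_min_max(stresses)
scaled = scale(stresses, lower, upper)
restored = unscale(scaled, lower, upper)
```

The limits should be stored together with the trained model, since predictions have to be transformed back with the same values.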

Three-dimensional case
The three-dimensional RVEs are generated with the approach presented in [10]. The resulting data set contains 123 cubic RVEs with a side length of 1 mm and an inclusion volume fraction ranging from 2.1% to 29.1%. For the generation of the data set, the same material parameters, range of the applied strain, and overall homogenization procedure as presented in Section 3.3.1 are utilized. Due to the increase in spatial dimensions, the number of components in the input and output pairs is also increased to 7 and 6, respectively. From the evaluation of the six uniaxial load states and subsequent superposition, 150 000 load states are generated and distributed uniformly over the 123 structures. Similarly to Section 3.3.1, 11 RVEs are excluded from the main data set to later evaluate the generalization ability on unseen RVEs. The data from the remaining 112 RVEs are split load case-wise with the same ratio as before into training, validation, and test sets. Furthermore, the same normalization as in the two-dimensional case is applied.

Neural network architecture for homogenization of arbitrary mesostructures
The model architecture chosen for the task at hand is an FFNN as described in Section 3.1. In accordance with the data set, the input layer contains four and seven neurons for the two- and three-dimensional case, respectively, whereas the output layer has three and six neurons. For the optimization of the model's hyperparameters, the KerasTuner framework is employed. The search space is given in Table 2. Six hundred models are generated randomly within the given hyperparameter space for both the two- and three-dimensional cases and initially trained for 10 epochs using the Adam optimizer. Thereafter, an evaluation against the validation data set is performed, from which the 10 best performing models are selected and trained until overfitting is encountered. For the subsequent numerical examples, the best performing of those 10 models is selected.

NUMERICAL EXAMPLES
The following section discusses the approximation performance of the best performing ANNs obtained by training on the data sets described in Section 3.3 for both the two- and three-dimensional data set. For reasons of brevity, only the prediction performance on the unseen RVEs is evaluated, as it is most representative of the generalization ability of the models. As performance metrics, both the mean absolute error (MAE) and the coefficient of determination ($R^2$) are given output component-wise.
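Both metrics can be evaluated per output component as sketched below. The sample values are hypothetical. Whether the MAE is reported on normalized or physical stresses is not restated here, so the sketch simply returns it in the units of its inputs.

```python
import numpy as np

def mae_per_component(y_true, y_pred):
    """Mean absolute error, evaluated separately for each output component."""
    return np.mean(np.abs(y_true - y_pred), axis=0)

def r2_per_component(y_true, y_pred):
    """Coefficient of determination R^2 = 1 - SS_res / SS_tot per component."""
    ss_res = np.sum((y_true - y_pred) ** 2, axis=0)
    ss_tot = np.sum((y_true - y_true.mean(axis=0)) ** 2, axis=0)
    return 1.0 - ss_res / ss_tot

# Hypothetical targets and predictions with two output components
y_true = np.array([[1.0, 0.0],
                   [2.0, 1.0],
                   [3.0, 2.0],
                   [4.0, 3.0]])
y_pred = y_true + np.array([[ 0.1, 0.0],
                            [-0.1, 0.0],
                            [ 0.1, 0.0],
                            [-0.1, 0.0]])

mae = mae_per_component(y_true, y_pred)  # approximately [0.1, 0.0]
r2 = r2_per_component(y_true, y_pred)    # approximately [0.992, 1.0]
```

An $R^2$ value of 1 corresponds to a perfect prediction, while the MAE quantifies the average deviation in the units of the evaluated quantity.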

Two-dimensional mesostructures
Predicting the effective stress for the unseen RVEs and arbitrary load states in the two-dimensional case yields $R^2$-values close to 1 and MAEs below 1% for every stress component, see Table 3. This signifies a very high accuracy, and therefore generalization ability, of the ANN, showing that the approach presented in this contribution is valid at least for simple structures and elastic material behavior. The model achieved a minimum validation and corresponding training loss of $3.5 \times 10^{-8}$ and $3.1 \times 10^{-8}$, respectively, after around 1000 epochs. For a visual representation of the prediction performance, the $R^2$-plot for all stress components, normalized component-wise to the range from −1 to 1 to allow for the representation in one figure, is given in Figure 2.

Three-dimensional mesostructures
The loss of the model for the three-dimensional representation converged already after around 180 epochs to much higher values of validation and training loss of $9.4 \times 10^{-4}$ and $8.4 \times 10^{-4}$, respectively. Therefore, it cannot be expected to perform as well as the model presented in Section 4.1. The worse performance can be attributed to the lower number of RVEs under investigation, while the spatial dimension of the RVE, the complexity of the inclusion shapes, and the range of the inclusion fraction are all increased. It is expected that by increasing the number of RVEs as well as the number of load states in the training data set, a significant improvement in the performance would be possible. However, this investigation is not part of the study at hand. Table 4 displays the previously introduced performance metrics. The $R^2$-values for all components still show a relatively high accuracy, being close to or above 0.9. Also, the maximum MAE of 6% signifies an adequate generalization ability of the model, despite the aforementioned drawbacks. The scatter plots given in Figure 3 show a scattering of the values around the regression line with an increase in deviation when the values approach the boundary of their domain.

CONCLUSION
This work introduces an ANN model to predict the homogenized stresses in the case of elastic small-strain deformations for arbitrary heterogeneous mesostructures. The investigations are performed in both two- and three-dimensional space. The material under consideration is a two-phase composite representing a concrete mesostructure consisting of stone aggregates and a cement-paste matrix. The inclusions' positions and sizes vary randomly in each RVE, resulting in mesostructures with different fractions of inclusions. The proposed ANN model uses the inclusions' volume or surface area fraction to represent the mesostructure composition. In the two-dimensional case, the model shows high accuracy in predicting the homogenized stresses and can possibly achieve a significant reduction in computation time compared to a multiscale FE² approach. However, the extension of the model to the three-dimensional case did not perform as well. Nevertheless, adequate performance can be achieved if the underlying data set's shortcomings are considered.
Currently, investigations on the application of different ANN model architectures, increased data set size, and the consideration of physics-related aspects in training are ongoing to improve the performance of the approach in three dimensions. Furthermore, the work will be extended to nonlinear material behavior, such as viscoelasticity or plasticity, in the future. In these cases, different deep learning architectures, such as recurrent neural networks (RNNs), must be considered to incorporate the history information accurately.

ACKNOWLEDGMENTS
This work is supported by the German Research Foundation (DFG) within Research Training Group GRK 2250/2, Project B3, which is gratefully acknowledged. Furthermore, we thank the Center for Information Services and High Performance Computing (ZIH) at TU Dresden for generous allocations of computing time.
Open access funding enabled and organized by Projekt DEAL.

TABLE 2 Hyperparameter search space.
TABLE 3 Model performance in two-dimensional case.
TABLE 4 Model performance in three-dimensional case.
FIGURE 2 $R^2$-plot of two-dimensional predictions.
FIGURE 3 $R^2$-plot of three-dimensional predictions for normal and shear stresses.