Convoluted filtering for process cycle modeling

Principles of materials science and engineering, physics, mathematics, and information science are used to extract knowledge and insights from the process-structure-property-performance relationships hidden in materials data. Process-structure modeling can be accelerated, without loss of interpretability, with artificial intelligence tools that mimic the salient features of the process and process-structure relations. In this work, a novel convoluted model-filtering technique was exploited to build and successfully train the Convoluted Filter (CoFi) artifacts for Fe-based alloy heat treatment cycles. The artifacts were pre-trained to filter out deep models that change the surrogate microstructure state after the heat treatment at ambient conditions. Direct representation of the thermal cycle features within the knowledge Graph facilitated development of meaningful data models for microstructure evolution, which reduce overfitting to limited datasets.


F I G U R E 1 Illustration for adding trainable layers on top of DeepFreG 3 for modeling microphase volumes (cementite, pearlite, γ and δ Fe-phases)

Adding new trainable Graph structures and logical training tasks, for example, Task IV (microphase volumes modeling) on top of DeepFreG (Figure 1), was demonstrated to be robust. 3 The biggest challenge, however, comes from the need to design the materials processes that control the material's structure and may account for about 80% of the variance in the materials properties and performance. 7 High-fidelity forward modeling and other high-performance computing for engineering simulations 8,9 are computationally very expensive, often do not provide a sufficient approximation of realistic processing conditions, and thus are not well suited to supporting inverse modeling and process design optimization. On the other hand, popular ML algorithms do not handle well a combination of static and dynamic (time-sequence) data. In fact, so far AI and ML have not been directly applied to materials processing, which determines the ultimate product utilization potential. 10 Treating the process parameters, such as the temperatures in a sequence of tempering cycles, as additive (i.e., causally unrelated) contributors to an ML model makes such a model unusable for complex process design and/or optimization.
The Materials Genome Initiative (MGI), launched by the U.S. Government, has accelerated materials discovery but has not considered materials processing. 10 Advanced energy technologies require increasingly complex, advanced materials processes, and there is a need for innovative approaches to designing processes (e.g., for process-dependent microstructure control) to increase efficiency and reduce design and development costs for next-generation energy applications; such approaches would also be useful across broad areas, to investigators not specialized in particular techniques. 10 This challenge was addressed in this work by developing a novel AI architecture and deep learning techniques (including joint training on data batches of different sizes, hyperparameter variability during the session run, and feature-weight variability in the objective function during multi-objective optimization) to facilitate simulation of causally related steps in materials processing.
The rest of the paper is organized as follows:
• Methodology: the foundational framework for the novel AI architecture built to model and ultimately design new materials processing cycles.
• Case study: data-driven modeling of thermo-mechanical performance of Fe-based alloys, with the goal of demonstrating functionality of the novel point-free programming module in composable architectures.
• Results and discussion: the results obtained in the case study presented, along with a discussion on what affects the model performance.
• Summary: the main achievements of this work reiterated, along with opportunities to advance research in this area.

METHODOLOGY
The key refinement to the original DeepFreG architecture, introduced here, was a point-free programming module for interpretable representation of materials processing cycles and process-structure relations (the process module). The goal was to demonstrate its functionality in composable architectures, specifically on examples of tempering and test cycles for Fe-based alloys. Like the Task IV layer (Figure 1) in the original architecture, this process module can be mounted on top of pre-trained DeepFreG layers for seamless transfer learning (Figure 2). It exploits Convoluted Filtering (CoFi) and agent-activated cycles, as detailed below. The agents are tempering temperatures (T1, T2, and T3) as well as the test temperature (TT). With such a modular configuration, pre-trained deep-layer artifacts may remain fixed during the subsequent training of the mounted modules, and the corresponding LL tasks are passive:
I. Reusing DeepFreG LL Task I outputs as data inputs for Task II.
II. Reusing DeepFreG LL Task II outputs as data inputs for Task III. A pre-trained linear transform of these outputs can be directly used to estimate AGS.
Similarly, the layers built for the original Task III may remain frozen as well (as shown in Figure 2) with their outputs available for mounting of the top modules.
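The frozen-layer composability described above can be sketched in a minimal, framework-agnostic way. In this hedged illustration, `frozen_linear` stands in for a pre-trained, fixed transform (e.g., the linear estimate of AGS from Task II outputs) and `MountedModule` for a trainable top module; all names and weight values are illustrative assumptions, not the released artifacts:

```python
# Minimal sketch of composable mounting: pre-trained tasks act as fixed
# feature extractors; only the mounted process module is trainable.

def frozen_linear(weights, bias):
    """Return a fixed (non-trainable) affine transform standing in for
    pre-trained deep-layer artifacts (illustrative values only)."""
    def apply(x):
        return [sum(w * xi for w, xi in zip(row, x)) + b
                for row, b in zip(weights, bias)]
    return apply

# Hypothetical pre-trained artifacts (values chosen for illustration).
task_ii = frozen_linear([[0.5, -0.2], [0.1, 0.3]], [0.0, 1.0])

class MountedModule:
    """Trainable module mounted on top of frozen outputs; during
    training only this module's parameters would be updated."""
    def __init__(self, scale):
        self.scale = scale  # the only trainable parameter in this sketch

    def __call__(self, frozen_out):
        return [self.scale * v for v in frozen_out]

module = MountedModule(scale=2.0)
deep_out = task_ii([1.0, 2.0])   # frozen forward pass (stays fixed)
top_out = module(deep_out)       # trainable forward pass
```

In a TensorFlow implementation the same effect is obtained by excluding the pre-trained variables from the optimizer's trainable set.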
To imitate the heat treatment cycles in alloy processing, the Convoluted Filtering (CoFi) process module was designed as a directed graph with an updatable base layer (generating output corresponding to the cooled-state virtual, or surrogate, microstructure), an agent-activated layer (generating output corresponding to the hot-state microstructure), and a feedback loop for process cycle modeling (Figure 3). This approach exploits a set of rules that take advantage of the process knowledge to filter out random data-fit models unless they comply with the filter rules. For example, it can filter out deep models (i.e., graphs with trained artifact values) that change the surrogate microstructure state (outside of the linear manifold) after the full-cycle heat treatment at ambient conditions (Appendix A).
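The filter rule just described can be expressed as a simple predicate. The following is a hedged sketch, not the paper's implementation: the tolerance value, the toy cycle function, and the scalar state are illustrative assumptions; the idea is only that a candidate deep model is rejected if a full cycle with zero agent input (ambient conditions) drifts the surrogate state:

```python
def passes_cofi_filter(cycle_fn, state, tol=1e-3):
    """Filter-rule sketch: accept a trained model only if a full
    heat-treatment cycle at ambient conditions (zero agent input)
    leaves the surrogate microstructure state essentially unchanged.

    cycle_fn(state, agent) -> state after one cooled->hot->cooled cycle.
    """
    new_state = cycle_fn(state, agent=0.0)
    drift = max(abs(a - b) for a, b in zip(new_state, state))
    return drift <= tol

# Illustrative cycle: zero agent input preserves the state; a non-zero
# agent (e.g., a tempering temperature) shifts it.
def toy_cycle(state, agent):
    return [s + 0.1 * agent for s in state]

state = [0.2, 0.5, 0.3]
ok = passes_cofi_filter(toy_cycle, state)                          # accepted
bad = passes_cofi_filter(lambda s, agent: [x + 1 for x in s], state)  # rejected
```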
TensorFlow (TF) implementation of the combined (agent-process) CoFi unit supported the corresponding tasks:
III. Reusing DeepFreG LL Task III outputs (with pre-trained and fixed artifacts) as data inputs for initializing the process module.
IV. Reusing the DeepFreG Task IV layer (initially, with LL pre-trained artifacts) as a base in the process module, with its outputs corresponding to the cooled state.
V. Learning to imitate heat treatment cycles (from cooled to hot state and back) and the threshold rule (no changes in microstructure without the requisite agent input).
VI. Learning non-linear materials property-performance features to test the pre-trained models (CoFi ⊗ DeepFreG) and the fine-tuning (layers IV-V).
The agent inputs were linearly combined with inputs from the fully connected ReLU layers in the same task (recurrent Task IV). Weights and biases (IV) were fixed during Task V training.
LL of the baseline rules for simulating heat treatment cycles was performed by training Task V artifacts for modeling microphase volumes with inputs from the prior virtual "hot state" (Task III outputs) to task layer IV (without agent input), cycling through task layer V and back to task layer IV (with zero agent input), and then again through task layer V to the terminal outputs trained on targets with a modular structure similar to the original Task IV (Figure 1). Weights and biases (V) were initialized (with zero or no agent action) by using the microphase volumes as targets (as in Figure 1) and minimizing the weighted objective function in the stepwise fashion employed in DeepFreG training. 3 After Task IV, all microphase volume training errors were within 5%, except a 6%-7% error for δ-Fe. After Task V pre-training for 602 epochs, the errors for three microphases (γ-Fe, cementite, and pearlite) were within 5%. The conditional training 3 reduced the δ-Fe error to the Task IV level as well, thus indicating that CoFi can learn the baseline rules of heat treatment cycles by keeping the "hot state" layer output virtually unchanged (within a linear transformation) with zero or no agent input.
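The cycling schedule above can be sketched as an unrolled loop. This is a minimal single-scalar illustration under stated assumptions: `layer_iv` and `layer_v`, their weights, and the agent coupling constant are hypothetical stand-ins for the trained Task IV/V artifacts, intended only to show the agent's linear pre-activation coupling and the zero-agent invariance:

```python
def relu(x):
    return x if x > 0.0 else 0.0

def layer_iv(state, agent_temp, w=1.0, w_agent=0.01, b=0.0):
    """Task IV sketch: the agent input is linearly combined with the
    regular input before the ReLU non-linearity."""
    return relu(w * state + w_agent * agent_temp + b)

def layer_v(state, w=1.0, b=0.0):
    """Task V sketch: a transform learned such that a zero-agent
    cycle preserves the cooled-state output."""
    return relu(w * state + b)

def one_cycle(state, agent_temp):
    """IV (with agent) -> V -> IV (zero agent) -> V."""
    s = layer_iv(state, agent_temp)
    s = layer_v(s)
    s = layer_iv(s, 0.0)
    return layer_v(s)

cooled = one_cycle(0.5, agent_temp=0.0)    # zero agent: state preserved
heated = one_cycle(0.5, agent_temp=600.0)  # tempering agent shifts state
```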

CASE STUDY
The research on data-driven modeling of thermo-mechanical processing of materials was conducted as part of the larger National Laboratory enterprise, 11 with an overarching goal of accelerating the pace of materials design, such as cost-efficient and heat-resistant structural materials (e.g., Fe-based steel) for more efficient and economical power generation.
Fe-9%Cr steel mechanical properties (such as strength and deformation features) and performance are controlled, at least in the early phases of deformation, by the movement of dislocations (linear crystallographic defects). Such movement is generally activated by a combination of heat and mechanical stress: according to classical theory, the external stress field modifies the spatial profile of potential energy and effectively lowers the activation energy of deformation. 12 With increasing test temperature, there is a sharp changeover in the underlying natural phenomena, from athermal movement of dislocations to thermally-activated diffusion. 6,13-16 These and other microstructural features are controlled by process parameters like normalization and tempering temperatures. The tempering cycle(s) follow the austenitization (Figure 2) and are used to alleviate excessive strain and brittleness resulting from the prior austenite-to-martensite transformation during rapid quenching.
Non-linear features of interest in this case study were associated with the inflection points in the temperature dependence of mechanical properties 6 and with the edge of precipitate formation that indirectly controls the changes in mechanical properties (Appendix B). The ultimate tensile strength (UTS) and the yield strength (YS) were the main features of interest. The observed split of UTS-and-YS paired data into two distinct c-IG clusters (by YS/UTS ratio) 2 was of interest as well. Finally, it was instructive to see if the CoFi ⊗ DeepFreG filter would sufficiently preserve the information on potential heterogeneities responsible for the non-homogeneous elongation, EL, of steel samples and the resulting necking during the test. To that end, a hypothetical (uniform, effectively corresponding to RA) elongation, EL_RA, was derived from the reduction in area, RA (as a ratio defined on the 0-to-1 interval), on the assumption of perfectly homogeneous (constant-volume) elongation, computed as EL_RA = RA/(1 - RA). The EL/EL_RA ratio was used here as a measure of heterogeneity reflected in the sample shape. YS/UTS was used for classification-type training (pattern recognition).
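Under the stated constant-volume, perfectly homogeneous assumption, a cross-section shrinking by a fraction RA implies a length increase by 1/(1 − RA), so EL_RA = RA/(1 − RA). A short numerical sketch (the example RA and EL values are illustrative, not data from the study):

```python
def el_ra(ra):
    """Hypothetical uniform elongation implied by reduction in area RA
    (on the 0-to-1 interval), assuming constant-volume, perfectly
    homogeneous deformation: length scales as 1/(1 - RA)."""
    return ra / (1.0 - ra)

def heterogeneity(el, ra):
    """EL/EL_RA ratio: near 1 for homogeneous elongation, below 1 when
    necking concentrates deformation and limits total elongation."""
    return el / el_ra(ra)

# Illustrative example: RA = 0.5 implies a hypothetical uniform
# elongation of 100%; a measured EL of 25% then gives a ratio of 0.25.
ratio = heterogeneity(el=0.25, ra=0.5)
```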
As described in Methodology, Tasks I-III reproduced prior work 3 ; Tasks IV and V were re-used for each tempering cycle in training the CoFi modules, where each fully-connected ReLU layer had 100 nodes. The Task VI objective was to capture the non-linearities by adding another modular structure, like those used in Task IV and Task V model training, with a ReLU layer (50 nodes) on top of the pre-trained CoFi ⊗ DeepFreG model (Figure 4). The ratio models, for YS/UTS and EL/EL_RA, were trained on one and the same dataset. However, the data selection for the other two sub-tasks was conditional on proximity to the edges of highly nonlinear patterns. Testing of the YS/UTS model employed the objective function with weight-averaging across seven compositional clusters (using the "clustering for information gain" algorithm 1,2,17 for segmentation); and the test dataset comprised only the cluster-representative alloys.
There were 1041 data records with complete information for all features of interest, including 863 training and 96 cross-validation records. The conditional training 3 near the edges varied depending on the definition of proximity used for soft and hard training targets. For example, TT ranges were defined as "high end" (above 400°C) and "interval" (between 300 and 600°C).
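The proximity-based selection can be sketched as simple range filters. The cutoffs below follow the text ("high end" above 400°C, "interval" between 300 and 600°C), while the record structure and function name are illustrative assumptions:

```python
def select_tt_records(records, mode):
    """Conditional data selection by test temperature TT (in deg C).

    mode = "high_end": TT above 400 C
    mode = "interval": TT between 300 and 600 C
    """
    if mode == "high_end":
        return [r for r in records if r["TT"] > 400.0]
    if mode == "interval":
        return [r for r in records if 300.0 <= r["TT"] <= 600.0]
    raise ValueError(f"unknown mode: {mode}")

# Illustrative records spanning the relevant temperature range.
records = [{"TT": t} for t in (25.0, 350.0, 450.0, 550.0, 650.0)]
high = select_tt_records(records, "high_end")   # 450, 550, 650
inner = select_tt_records(records, "interval")  # 350, 450, 550
```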
The 82-alloys dataset for the demonstration of CoFi functionality was compiled as part of the U.S. Department of Energy's Advanced Alloy Development program 18 and was later augmented by information about an additional 17 alloys. The augmented dataset was most recently used for developing a variety of ML approaches for explaining compositional segmentation, quantifying uncertainty, and others. 19,20 The initial 82 alloy compositions along with processing parameters were fused with the latent variable datasets (such as the upper-critical-point and start-of-martensite-formation temperatures, Cr and Ni equivalent concentrations, and the microphase volumes) for pre-training DeepFreG to learn empirical domain knowledge. 3 Most of the input data were ingested, preprocessed, batched, and re-shaped in the TensorFlow Python environment similar to prior work. 3 Tempering and test parameter values were fed directly to the process module.
As temperature is the top contributor to alloy properties, 2,7 its weight in the overall penalty function was linearly ramped up over 10,000 epochs to ensure stable convergence of the other three models (CN map, YS/UTS, and EL/EL_RA), which were pre-trained over 50,000 epochs prior to joint training with TT range data for an additional 50,000 epochs including the ramp. The standard root-mean-square deviation (RMSD) was used as the default penalty function.
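The ramp schedule can be sketched as a simple piecewise-linear function of the epoch. The 10,000-epoch ramp length follows the text; the ramp start (at the beginning of joint training, after 50,000 pre-training epochs) and the final weight of 1.0 are assumptions for illustration:

```python
def tt_weight(epoch, ramp_start=50_000, ramp_len=10_000, target=1.0):
    """Linearly ramp the TT-range term's weight in the penalty function.

    Zero during pre-training, then ramped linearly over `ramp_len`
    epochs at the start of joint training (start epoch and target
    value are illustrative assumptions, not from the paper).
    """
    if epoch < ramp_start:
        return 0.0
    if epoch >= ramp_start + ramp_len:
        return target
    return target * (epoch - ramp_start) / ramp_len
```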

RESULTS AND DISCUSSION
The frozen layers of the Graph used here were only pre-trained to ensure that key domain knowledge features make primary contributions to the property and performance models, not to converge to the best possible AI model. Some additional training may improve the task models, but it may also short-circuit (by saturating activation functions) artificial neurons carrying auxiliary information of potential interest to the subsequent tasks. This may lead to vanishing gradients and to backpropagation ignoring the synapses that carry useful auxiliary data. To avoid killing the artificial neurons, additional incremental training of the deep-frozen layers can be carried out iteratively along with the ultimate property/performance model training. It is important to monitor the deep layers so that they continue to accurately predict the latent features, and thus maintain the transparency and trustworthiness of the overall model. Notably, monitoring of the deep layers revealed that the Task II model of the average grain size, AGS (Figure 5), appeared to be skewed to more accurate predictions of the larger grain sizes (∼100 nm), while overestimating the smallest ones (∼10 nm). This is a direct consequence of using a linear penalty function across multiple scales (from 8 to >300 nm). Perhaps the AGS model would yield more accurate predictions for smaller AGS values if that were a sole or ultimate purpose of the project and if there were enough data available for training in the range of AGS <14 nm. 3 On the bright side, the trained model is not overfitting. As shown in Figure 5, experimental grain-size data (AGS) were available only for six composition clusters: C111, C112, C121, C21, C221, and C222 (using the c-IG composition-based classification as introduced in References 1,2). The main clusters were also labeled according to industry convention: CPJ (NETL project), COST (European project) and modified COST (without niobium), P91/92, modified P91 (without vanadium and niobium), and 12% chromium compositions.
Another observation is that microstructure and its evolution can vary spatially unless the material was homogenized. Incidentally, the grain size in the homogenized (C111) cluster was predicted very accurately, while most of the comparable values (10-48 nm) in non-homogenized clusters were predicted with significant relative errors (Figure 5). However, this single observation is insufficient, and the overall population (of limited size) statistics are not conclusive enough to support it. Homogenization input was therefore ignored in the following analysis.
CoFi functionality appears to be adequate, as low-temperature heat treatment does not seem to affect the virtual microphase volumes upon pre-training. All four virtual microphase volumes stayed within 5 wt% of the targets on average, before and after the virtual heat-treatment cycle. It is not clear, however, whether this approach is flexible enough to maintain its functionality with multiple cycles over the entire range of relevant temperatures.
The training on mechanical property features was completed with three tempering cycles, in two steps. After the pre-training on three features (YS/UTS, EL/EL_RA, and CN map), the results exceeded expectations: the YS/UTS ratio error was 9%, the EL/EL_RA ratio error was 6.1%, and the CN map errors were 40.5 MPa near the edge and 120 MPa at the edge itself; 16 compositions were used to capture the edge of precipitation (steep steps in UTS, of up to 200 MPa) and 40 to capture the non-linearity near the edge (denoted as the edge* dataset). After the joint training of all four features, the TT range error was 167 MPa, while the YS/UTS ratio error was 5%, the EL/EL_RA ratio error was 14%, and the CN map error was 21 MPa. Additional training helps to bring the TT-range error down but eventually begins to increase the other errors (Table 1). Notably, UTS and YS predictions can be as good as or even better than the Random Forest regression for the original 82 alloy compositions. 7 Last iteration-1 YS errors within the TT range were sufficiently low relative to the baseline linear model errors (R 2 ∼ 0.9) 7 even if those were computed on the entire 82-alloys dataset, not just on the most non-linear slice. Decoupling of the training objectives could obviously decrease all errors, but this was not the purpose of this demonstration. What was accomplished here is a validation of the "virtual microstructure state" concept. The hidden-layer output corresponding to such a state may carry critical information for all relevant non-linearities observed in key properties and performance of the material with such microstructure under various mechanical and thermal stresses. The EL/EL_RA model converged poorly with the 99-alloys dataset (R 2 ∼ 0.8 using CoFi and R 2 ∼ 0.3 using the linear model) due to persistent outliers, primarily in the recently added 17-alloys data. The experimental alloys exhibited a distinctly multi-modal performance pattern, that is, using the same composition and heat treatment, one can observe up to three distinct datapoint
clusters in the mechanical properties data. Such a data pattern suggests inherent heterogeneity that cannot be explained by the evolution of the uniform (and deterministic) virtual microstructure state assumed in this case study. Some of these issues (i.e., multi-modal stochastic response to stressors) can be addressed using ensemble modeling. 21 Remarkably, while successfully reinforcing the threshold rule over five convolutional cycles (initializing, three tempering cycles, and testing), CoFi accurately captured the threshold temperature for the onset of microstructural evolution during heat treatment (Appendix A). The thermal cycling model maintained compatibility with the threshold rule, preserving the virtual microstructure state at low temperatures, and converged to the threshold temperature range of 74-113°C, which is in line with the metallurgical practice of using ∼125°C as the lowest heat-treatment temperature for this type of metal alloys.

APPENDIX C. MIRROR-IMAGE GRAPHS FOR JOINT MODEL TRAINING ON DISPARATE BATCHES
"Mirror-Image Graph" (MIG) has identical to default dataflow Graph structure, shares all globally defined classes, functions, and so forth, as well as the names of trainable variables, with the exception of the placeholder-traceable operations.During TensorFlow session run, several MIGs can be used to jointly execute computing workflows by minimizing a combined penalty function on the disparate batches of data.To avoid runtime errors, the batched values should be fed to MIG placeholders via feed_dict argument in prior session runs.The variables maintain state across multiple calls to run and the once computed and stored values can be reused in the subsequent session runs.
In Task VI of the overall training schema (Figure 4), the YS/UTS and EL/EL_RA ratios were computed for a standard training data slice, while the CN map was learned with a subset near the carbonitride precipitation edge, and the TT range was defined as a subset with the test temperatures confined between the low and high limits around the break-point in the tensile strength pattern.
The minimized penalty function was defined as the root mean square (RMS) of the four single-target (YS/UTS, EL/EL_RA elongation shape factor, UTS vs. CN map, and YS vs. TT range) loss functions computed on the corresponding batches. Optimization was run on the training data slice for YS/UTS and EL/EL_RA, while the CN map and TT range batch placeholders had their values specified in prior session runs.
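The combined penalty above can be written compactly; a minimal sketch (the four per-batch loss values below are placeholders, not results from the study):

```python
import math

def combined_penalty(losses):
    """RMS of the single-target loss values (YS/UTS, EL/EL_RA,
    UTS vs. CN map, and YS vs. TT range), each loss assumed to have
    been computed on its own batch."""
    return math.sqrt(sum(l * l for l in losses) / len(losses))

# Placeholder per-batch losses for the four sub-tasks.
penalty = combined_penalty([0.3, 0.4, 0.0, 0.0])
```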

F I G U R E 2 Composable mounting of CoFi modules on top of the pre-trained DeepFreG model

F I G U R E 3 Imitation of the heat treatment: left box, simplified iron phase transformation and re-crystallization diagram for the heating-and-cooling cycle; right box, virtual state transformation schema activated by a one-cycle agent (event-driven)

For joint training on three disparate batches, an unconventional AI architecture ("mirror-image Graph") and TensorFlow algorithms were developed for simultaneously passing the combined data to the optimizer during the session run (Appendix C). The map of the CN precipitation edge was learned by training UTS vs. composition near the carbonitride precipitation edge; and the TT inflection points were learned by training YS vs. TT within the target range of temperatures near the inflection point.

F I G U R E 4 Composable mounting of additional modules on top of the pre-trained CoFi ⊗ DeepFreG model

F I G U R E 5 AGS (not to scale) by c-IG 1,2 cluster: measured vs. predicted (the testing dataset). Conventional classification matches to c-IG clusters are labeled as applicable: CPJ (NETL), P91/P92 (USA), and COST/COST E (Europe); mod = modification

TA B L E 1 Multi-objective optimization error tradeoff (pre-training errors not shown): asterisk (*) denotes the range expansion

F I G U R E B2 Carbon-nitrogen map showing the edge of carbonitride precipitation strengthening