Process description and data preparation
Injection molding,[38, 39] a key process in polymer processing, transforms polymer materials into products of various shapes and types. A typical injection molding cycle consists of three operation phases: injection of molten plastic into the mold, packing-holding of the material under pressure, and cooling of the plastic in the mold until the part is rigid enough for ejection. In addition, plastication takes place in the barrel during the early part of the cooling phase: the polymer is melted and conveyed to the front of the barrel by screw rotation in preparation for the next cycle. All key process conditions, such as temperatures, pressures, displacement, and velocity, can be measured online by their corresponding transducers, which provides abundant process information.
The material used in this work is high-density polyethylene. Twelve process variables and four quality variables are selected for modeling. The process variables are collected online with a set of sensors. Product quality is evaluated by two dimension indices, product length (mm) and weight (g), whose values can be measured directly by instruments, and by two surface defects, jetting and record grooves, whose values are quantified by an expert process operator before modeling. In total, 32 normal batch runs are conducted under various operating conditions chosen by the design of experiments (DOE) method.[21] Using injection stroke as the indicator variable, the reference batches are unified to an even duration (1300 samples in this experiment) by data interpolation, which results in a regular descriptor array X (32 × 12 × 1300). The measurement data are filtered to remove obvious noise such as spikes. The qualities are measured only at the end of the process, which gives the dependent matrix Y (32 × 4). The first 25 batches are used for modeling, and the remaining seven are used for model validation.
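The alignment step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes linear interpolation and an indicator variable that is strictly increasing over the segment being aligned, and the function name `resample_on_indicator` is hypothetical.

```python
from bisect import bisect_right


def resample_on_indicator(indicator, values, n_samples):
    """Linearly interpolate `values` onto `n_samples` evenly spaced indicator levels.

    Assumption (not stated in the paper): `indicator`, e.g. the injection
    stroke, is strictly increasing over the segment being aligned.
    """
    lo, hi = indicator[0], indicator[-1]
    out = []
    for k in range(n_samples):
        target = lo + (hi - lo) * k / (n_samples - 1)
        # Index of the right-hand neighbour, clamped so that j-1 and j are valid.
        j = min(max(bisect_right(indicator, target), 1), len(indicator) - 1)
        x0, x1 = indicator[j - 1], indicator[j]
        w = (target - x0) / (x1 - x0)
        out.append(values[j - 1] + w * (values[j] - values[j - 1]))
    return out


# Toy batch with unevenly spaced indicator readings, resampled to 5 points.
stroke = [0.0, 1.0, 3.0, 4.0]
pressure = [0.0, 10.0, 30.0, 40.0]
aligned = resample_on_indicator(stroke, pressure, 5)
```

Applying the same resampling to every variable of every batch, with 1300 samples, would give trajectories of equal length that can be stacked into the three-way array X.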
Illustration results and analyses
First, 1300 normalized time-slices Xk (25 × 12) are obtained from X (25 × 12 × 1300), together with the normalized quality variables Y (25 × 4). Then, 1300 time-slice weight matrices are obtained from the data pairs {Xk, Y}. They are weighted using the time-varying variances of the PLS latent variables (LVs) and then fed to the clustering algorithm.[21] The quality-relevant operation process is partitioned into five main clusters; operation time information is included in the clustering so that the process samples within each cluster are consecutive. The phase clustering result is shown in Figure 2 along with the time-varying process operation trajectory. Clusters II–V are deemed consistent with the four real physical operation phases: injection, packing-holding, plastication, and cooling. Moreover, a short period before cluster II is separated as an individual phase, which has different characteristics and effects on quality, as illustrated later. Thus, five predictor data blocks are prepared by batch-wise unfolding, Xi (25 × JNi) (i = 1, 2, …, 5), where Ni denotes the phase duration; all of them are associated with the same quality data set Y (25 × 4).
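The time-slice normalization and the batch-wise unfolding of one phase can be sketched as below. Two assumptions are made that the text does not confirm: "normalized" is taken to mean autoscaling (zero mean and unit sample standard deviation per variable across batches), and J is taken to be the number of process variables. Both function names are hypothetical.

```python
from statistics import mean, stdev


def normalize_time_slice(slice_k):
    """Autoscale one time-slice (batches x variables) column by column.

    Assumes no column is constant across batches (otherwise stdev is zero).
    """
    cols = list(zip(*slice_k))
    mus = [mean(c) for c in cols]
    sds = [stdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in slice_k]


def unfold_phase(X, start, stop):
    """Batch-wise unfold X[batch][variable][time] over times start..stop-1.

    Returns one row per batch of length J * N_i, with the J variables of
    each time point kept together.
    """
    return [[x_i[j][k] for k in range(start, stop) for j in range(len(x_i))]
            for x_i in X]


# Toy data: 3 batches x 2 variables for one time-slice.
slice_k = [[1.0, 10.0], [2.0, 20.0], [3.0, 40.0]]
scaled = normalize_time_slice(slice_k)

# Toy array: 2 batches x 2 variables x 3 time points; unfold times 1..2.
X_toy = [[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]]
block = unfold_phase(X_toy, 1, 3)
```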
First, the between-phase subspace separation is performed. With five phases, there are four pairs of neighboring phases. The first four between-phase common scores are illustrated in Figure 3 for each between-phase pair over the process. In general, the first two common scores are almost the same between two neighboring phases. The latter two differ to some extent but still show a similar trend. The first common score vector for Phases 1 and 2 is quite similar to that for Phases 2 and 3. Its general trend is also similar to that of the second common score for Phases 3 and 4, but it is not similar to any of the common scores for Phases 4 and 5. This indicates that, as the process evolves, phases that are farther apart differ more, and so do their common scores. Based on the extracted common scores, the loadings for each phase are calculated and the phase centers are obtained. By Eq. (5), the similarity between the time-slice loadings and the phase center is computed, as shown in each subplot of Figure 4 for each between-phase pair. Moreover, the predictor variation explained by the common scores is evaluated by the R2 index, and the values are used to help understand how the common predictor variation changes between two neighboring phases, as illustrated in each subplot of Figure 4. Note that both the similarity and R2 profiles are processed by a Kalman filter so that the major trend can be captured more clearly by suppressing normal oscillations to a certain extent. The similarity profile with respect to the starting phase is used to identify the beginning of the transition region, which is the time when the steady similarity values begin to decrease. The similarity values with respect to the target phase are used to identify the end of the transition region, which is the time when the increasing similarity begins to level off. Combining the similarity evaluation results and R2, steady phases and transition regions are separated as shown in the sketch map in Figure 5. Between different neighboring steady phases, the transition regions may have different durations and different characteristics. At the very start of the process, the transition region between steady Phases 1 and 2 is longer than steady Phase 1. In fact, steady Phase 1 could itself be regarded as a transition region, because the beginning of operation may be oscillatory in each cycle. It should be noted that, because of the uncertainty of transition patterns, the transition regions cannot be determined definitively; their analysis is always influenced to some degree by subjective factors.
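The smoothing and onset-detection idea can be sketched as follows. This is only an illustration under stated assumptions: the scalar random-walk Kalman filter, its noise variances `q` and `r`, and the threshold rule in `transition_start` are choices made here, not the settings or the criterion used in the paper, and the similarity measure of Eq. (5) is not reproduced.

```python
def kalman_filter_1d(z, q=1e-4, r=1e-2):
    """Scalar Kalman filter with a random-walk state model.

    q: assumed process-noise variance; r: assumed measurement-noise variance.
    """
    x, p = z[0], 1.0
    out = [x]
    for zk in z[1:]:
        p = p + q              # predict: state unchanged, uncertainty grows
        g = p / (p + r)        # Kalman gain
        x = x + g * (zk - x)   # update with the new measurement
        p = (1.0 - g) * p
        out.append(x)
    return out


def transition_start(sim, n_steady, drop):
    """First index after the steady window where the similarity has fallen
    more than `drop` below the steady level (mean of the first n_steady values).

    Hypothetical rule used here to flag the onset of a decreasing trend.
    """
    base = sum(sim[:n_steady]) / n_steady
    for k in range(n_steady, len(sim)):
        if base - sim[k] > drop:
            return k
    return None


# Toy similarity profile: steady at 1.0 for 20 samples, then a linear decrease.
raw = [1.0] * 20 + [1.0 - 0.05 * i for i in range(1, 11)]
smooth = kalman_filter_1d(raw)
start_raw = transition_start(raw, n_steady=10, drop=0.08)
start_smooth = transition_start(smooth, n_steady=10, drop=0.08)
```

Because the filtered profile lags the raw one, the detected onset on the smoothed curve is never earlier than on the raw curve; the trade-off between lag and noise suppression is governed by the ratio of `q` to `r`.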
Based on the updated phase division, a phase-representative model for each steady phase and a transition model for each transition time are developed using the PLS-CCA algorithm and used for online quality prediction. This is called the between-phase subPLS-CCA (Bp-subPLS-CCA) model. To assess the influence of between-phase transitions on model development, phase models are also developed from the original phase division results with the PLS-CCA algorithm. That is, transition patterns are not distinguished from the steady phases; instead, they are included in the phases for model development. This is called the subPLS-CCA model. The phase-representative regression models of the two methods are compared in Figures 6a–d for the four quality indices, respectively. In general, for all quality variables, the model coefficients differ most in Phases 2 and 3. Clearly, excluding transition patterns from the steady phases changes the phase models to some extent. To evaluate whether this exclusion improves prediction performance, the R2Y values for all quality indices are compared between the two methods in Figure 7. They are plotted along the time direction through the steady phases and between-phase transition regions over the whole process, for both training and testing batches. In general, the quality predictions show small normal oscillations in the steady phases and are more oscillatory in the transition regions. When transition patterns are removed from the steady phases, the representativeness of the phase models improves, giving more stable and better quality prediction, especially for steady Phases 2 and 3. In transition regions, the subPLS-CCA method applies one of the phase models, and poor predictions are observed. By comparison, the prediction performance of Bp-subPLS-CCA is significantly improved. At some transition times, the prediction performance is even better than that of the neighboring steady phase models.
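For readers who want to reproduce a profile like the one plotted along the time direction, the sketch below computes R2Y for one quality variable at each time point. It assumes the common definition R2Y = 1 − SSE/SST expressed in percent; the paper's exact definition is not given in this section, and the function name `r2y_percent` is hypothetical.

```python
def r2y_percent(y_true, y_pred):
    """Percentage of quality variation explained: 100 * (1 - SSE / SST)."""
    y_bar = sum(y_true) / len(y_true)
    sse = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    sst = sum((a - y_bar) ** 2 for a in y_true)
    return 100.0 * (1.0 - sse / sst)


# Toy example: measured quality for 3 batches and predictions at 3 time points.
y_measured = [1.0, 2.0, 3.0]
preds_over_time = [
    [2.0, 2.0, 2.0],   # predicts the mean only
    [1.5, 2.0, 2.5],   # partially correct
    [1.0, 2.0, 3.0],   # exact
]
r2y_profile = [r2y_percent(y_measured, p) for p in preds_over_time]
```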
During online application, the end-of-phase quality values are calculated by averaging all real-time predicted values within the steady phase. Moreover, to reveal the variability of the quality predictions in steady phases and transition regions, the mean and standard deviation (STD) are plotted in Figure 8, taking the product length (i.e., the second quality variable) as an example. The STD indicates how much the predictions vary around the average prediction, that is, the end-of-phase value. A low STD indicates that all predictions within the same steady phase are very close to the mean, whereas a high STD indicates that the predictions are spread over a large range of values. For each test batch, the variability differs to some extent between steady phases. In general, steady Phases 3 and 5 give the smallest STD values for all batches, indicating the least variability along the time direction.
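The end-of-phase calculation amounts to a mean and a standard deviation over the real-time predictions inside one steady phase. A minimal sketch follows; the sample standard deviation is used here as an assumption, since the text does not say whether the sample or population form was applied, and the function name is hypothetical.

```python
from statistics import mean, stdev


def end_of_phase_summary(preds, start, stop):
    """Mean (end-of-phase prediction) and sample STD of the real-time
    predictions at times start..stop-1 of one steady phase."""
    window = preds[start:stop]
    return mean(window), stdev(window)


# Toy real-time predictions for one batch; the steady phase covers times 0..3.
online_preds = [1.0, 2.0, 3.0, 4.0, 10.0]
phase_mean, phase_std = end_of_phase_summary(online_preds, 0, 4)
```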
By performing between-phase modeling, the underlying predictor information is decomposed into two parts. In fact, the underlying predictor information in each phase can be forcibly and completely extracted as “common” scores one by one, ordered by descending quality-related between-phase relationship. In this way, no between-phase specific model is developed. However, the later “common” scores may be quite different from each other and are actually pseudo “common” scores. To illustrate this, each phase of the original phase division is modeled only by “common” scores. This results in a regression model (Bpc-MPLS-CCA) in each phase that differs from the conventional calibration methods. It is compared with the P-MPLS-CCA model, in which each single phase is modeled separately by MPLS-CCA. Here, no transition regions are considered. The prediction results are compared in Table 1 for each quality index. In general, they are similar, which shows that the predictor information extracted only by between-phase “common” scores is comparable to that extracted by conventional single-phase algorithms, so that they make similar contributions to quality. Note that when one phase is associated with different neighboring phases, different between-phase models are identified, which nevertheless give similar prediction results for the same phase. Moreover, different phases are significant for predicting different quality indices. For example, for the fourth quality index, record grooves, the first three phases (including the injection and packing-holding physical operation phases) are more important, which agrees well with the real case.
Table 1. Offline Prediction Comparison [R2Y (%)] for Bpc-PLS-CCA Model and P-MPLS-CCA Model

| Quality index | Method | Phase 1 (2)a | Phase 2 (1) | Phase 2 (3) | Phase 3 (2) | Phase 3 (4) | Phase 4 (3) | Phase 4 (5) | Phase 5 (4) |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Bpc-PLS-CCA | 35.9 | 71.2 | 72.5 | 89.8 | 87.8 | 88.9 | 89.4 | 58.9 |
| 1 | P-MPLS-CCA | 36.1 | 70.4 | | 88.3 | | 88.9 | | 59.8 |
| 2 | Bpc-PLS-CCA | 30.6 | 78.8 | 72.8 | 89.9 | 89.9 | 93.8 | 93.7 | 54.4 |
| 2 | P-MPLS-CCA | 30.1 | 76.8 | | 90.4 | | 93.8 | | 53.0 |
| 3 | Bpc-PLS-CCA | 77.8 | 56.0 | 54.8 | 31.6 | 28.3 | 31.9 | 30.7 | 32.3 |
| 3 | P-MPLS-CCA | 68.6 | 52.7 | | 22.6 | | 30.7 | | 26.2 |
| 4 | Bpc-PLS-CCA | 99.9 | 98.8 | 99.6 | 98.1 | 96.9 | 48.6 | 48.1 | 38.5 |
| 4 | P-MPLS-CCA | 99.9 | 99.1 | | 97.1 | | 49.0 | | 31.7 |

P-MPLS-CCA gives a single value per phase, listed in the first column of that phase.
Then, based on the updated phase information, offline quality prediction models are developed for the steady phases and transition regions with respect to the between-phase subspace separation. To reveal the influence of cumulative effects in each time region (steady phase or transition region) on quality prediction, the online predictions and the offline predictions based on between-phase calibration modeling are compared in Table 2. Unlike the Bpc-MPLS-CCA in Table 1, here the predictor and quality are interpreted jointly by common and specific scores for offline quality analysis. The quality prediction for each steady phase is the sum of the prediction values in the common and specific subspaces; because the common scores and specific scores are orthogonal, the two predictions are complementary. Moreover, when a steady phase is associated with different neighboring steady phases, different common scores may be extracted, resulting in different prediction models and performance. The better quality prediction results are chosen and shown in Table 2. Compared with the end-of-phase quality prediction for real-time application, the offline results are better based on a paired t test (α = 0.5),[37] which indicates that considering cumulative effects can improve the prediction performance. The offline prediction results in the transition regions are compared with the real-time quality predictions in the same regions. When the changing prediction performance in a transition region is explained cumulatively by offline analysis, the improvement over the online quality predictions is also statistically significant based on a paired t test (α = 0.5).
Table 2. Online and Offline Quality Prediction Results [R2Y (%)] in Steady Phases (SP) and Transition Regions

| Application | SP 1 | Transition | SP 2 | Transition | SP 3 | Transition | SP 4 | Transition | SP 5 |
|---|---|---|---|---|---|---|---|---|---|
| Offline | 74.9 | 68.6 | 81.8 | 82.7 | 79.9 | 74.8 | 79.6 | 54.9 | 54.6 |
| Online | 51.6 | 61.7 | 65.8 | 79.0 | 65.1 | 73.9 | 62.0 | 50.6 | 36.0 |
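The mechanics of a paired t test can be illustrated with the nine region-level values of Table 2. This is only a sketch of the calculation: the quantities actually paired in the paper's test (for example, per-batch or per-quality-index results) are not stated here and may differ, and the function name `paired_t` is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

# R2Y (%) values copied from Table 2, in region order SP 1 ... SP 5.
offline = [74.9, 68.6, 81.8, 82.7, 79.9, 74.8, 79.6, 54.9, 54.6]
online = [51.6, 61.7, 65.8, 79.0, 65.1, 73.9, 62.0, 50.6, 36.0]


def paired_t(a, b):
    """Paired t statistic and degrees of freedom for matched samples a, b."""
    d = [x - y for x, y in zip(a, b)]
    n = len(d)
    t_stat = mean(d) / (stdev(d) / sqrt(n))
    return t_stat, n - 1


t_stat, dof = paired_t(offline, online)
```

The statistic is then compared with the critical value of the t distribution for `dof` degrees of freedom at the chosen significance level.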