Shifting artificial data to detect system failures



Multivariate statistical process control (MSPC) monitors several process variables simultaneously. Small departures from normal operating conditions may not seriously affect product quality, but a system failure should be declared when an observation deviates significantly from the in-control region, before defective units are mass-produced. Although a number of studies have integrated data-mining algorithms with MSPC to manage large amounts of data effectively, this combination may not work for detecting system failures because the data are extremely imbalanced. This research proposes a new approach that employs a classification technique, random forest, to overcome the class imbalance problem. The proposed method systematically shifts artificial data toward the failure region so that the classifier correctly detects system failures. Numerical experiments show that our method outperforms existing methods in terms of failure detection counts.
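The idea of shifting artificial data toward the failure region can be illustrated with a minimal sketch. It is not the procedure used in this research, whose shifting rule is not specified here. The sketch assumes that the in-control region can be approximated by a per-variable scaled distance from the in-control mean, that artificial points are drawn from independent normal distributions, and that each point is moved radially outward in fixed steps until it crosses a chosen limit. All function names, the `limit` and `step` parameters, and the labeling convention (0 for in-control, 1 for failure) are hypothetical.

```python
import math
import random


def fit_center_scale(data):
    """Per-variable mean and sample standard deviation of in-control data."""
    n, dims = len(data), len(data[0])
    mean = [sum(row[j] for row in data) / n for j in range(dims)]
    std = [
        math.sqrt(sum((row[j] - mean[j]) ** 2 for row in data) / (n - 1))
        for j in range(dims)
    ]
    return mean, std


def scaled_distance(x, mean, std):
    """Distance from the mean in units of standard deviation (assumed
    stand-in for the in-control region; ignores correlations)."""
    return math.sqrt(sum(((xj - mj) / sj) ** 2 for xj, mj, sj in zip(x, mean, std)))


def shift_to_failure_region(point, mean, std, limit, step=0.25, max_iter=1000):
    """Move a point radially away from the mean until its scaled distance
    exceeds `limit`. Each step adds `step` to the scaled distance."""
    d = scaled_distance(point, mean, std)
    if d == 0.0:
        # Point sits exactly on the mean: pick the first axis arbitrarily.
        unit = [1.0] + [0.0] * (len(point) - 1)
    else:
        unit = [((xj - mj) / sj) / d for xj, mj, sj in zip(point, mean, std)]
    shifted = list(point)
    for _ in range(max_iter):
        if scaled_distance(shifted, mean, std) > limit:
            break
        shifted = [xj + step * uj * sj for xj, uj, sj in zip(shifted, unit, std)]
    return shifted


def build_training_set(in_control, n_artificial, limit, seed=0):
    """Combine in-control data (label 0) with shifted artificial data (label 1)."""
    rng = random.Random(seed)
    mean, std = fit_center_scale(in_control)
    artificial = [
        [rng.gauss(m, s) for m, s in zip(mean, std)] for _ in range(n_artificial)
    ]
    failures = [shift_to_failure_region(p, mean, std, limit) for p in artificial]
    features = list(in_control) + failures
    labels = [0] * len(in_control) + [1] * len(failures)
    return features, labels
```

Generating as many artificial failure points as there are in-control observations gives a balanced two-class training set, which could then be passed to a random forest implementation such as scikit-learn's `RandomForestClassifier`.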