Guidance for performing multivariate data analysis of bioprocessing data: Pitfalls and recommendations



Biotech unit operations are often characterized by a large number of inputs (operational parameters) and outputs (performance parameters), along with complex correlations among them. A typical biotech process starts with a vial from the cell bank, ends with the final product, and comprises anywhere from 15 to 30 such unit operations in series. Besides the operational parameters, raw material attributes can also affect process performance and product quality, and can interact with one another. Multivariate data analysis (MVDA) offers an effective approach to gaining process understanding from such complex datasets. A review of the literature suggests that the use of MVDA is increasing rapidly, fuelled by the gradual acceptance of quality by design (QbD) and process analytical technology (PAT) among regulators and the biotech industry, both of which require enhanced process and product understanding. In this article, we first discuss the most critical issues that a practitioner needs to be aware of when performing MVDA of bioprocessing data. Next, we present a step-by-step procedure for performing such analysis. Industrial case studies are used to illustrate the underlying concepts. With the increasing use of MVDA, we hope that this article will be a useful resource for present and future practitioners. © 2014 American Institute of Chemical Engineers Biotechnol. Prog., 30:967–973, 2014