Calibration transfer for bioprocess Raman monitoring using Kennard Stone piecewise direct standardization and multivariate algorithms

In the biopharmaceutical industry, Raman spectroscopy is now a proven PAT tool that enables in-line, simultaneous, real-time monitoring of several CPPs and CQAs. However, because Raman monitoring relies on multivariate modeling, sources of variability unknown to the models can degrade prediction accuracy. With the widespread use of Raman PAT tools, it is necessary to correct the impact of instrumental variability, encountered for instance when a device is replaced. In this work, we investigated the impact of instrumental variability between probes inside a multi-channel analyzer and between two analyzers on model prediction errors in cell cultures, and explored solutions to correct it. The Kennard Stone piecewise direct standardization (KS PDS) method is shown to lower model prediction errors between probes of a multi-channel analyzer from 20% to 10% on the cell densities (TCD/VCD). When a new device is integrated or a previous one is replaced, a first cell culture can be monitored directly with the new analyzer calibrated by the KS PDS method from the dataset of the previous analyzer, with an accuracy better than 10% on the main components of the culture such as glucose, lactate, and the cell densities. The data obtained with the new analyzer can then be added to a global calibration dataset to integrate instrumental variability into the chemometric model: a single batch from the new device in a consistent and balanced calibration dataset was sufficient to correct the prediction gap induced by instrumental variability, which allows the data from previous analyzers to be exploited with optimized methods. This methodology maintains good multivariate calibration model prediction errors throughout instrumental changes, which is a requirement for model maintenance.

KEYWORDS
cell culture monitoring, model transferability, multivariate model maintenance, process analytical technology, Raman spectroscopy

INTRODUCTION
The monitoring of mammalian cell cultures with Raman spectroscopy and chemometric tools has been well demonstrated and documented. 1,2 The main metabolites and nutrients of a cell culture can be predicted by partial least square (PLS) models with root mean square error of prediction (RMSEP) quite close to off-line reference measurement accuracy. 3 This real-time monitoring of critical process parameters (CPPs) has been used to implement a glucose feeding control loop, leading to improved productivity. 4 Several critical quality attributes (CQAs) of a cell culture have also been successfully predicted with Raman spectroscopy: protein titer, 5 glycosylation, 6 and aggregation. 7 These results demonstrate that Raman spectroscopy can be efficiently used to monitor cell cultures in real-time and in situ, automate processes and even open the door to the use of Raman spectroscopy for real-time release (RTR) of batches. 8 Another approach in the field of Raman monitoring of bioprocesses has consisted in the attempt of developing generic models. 9 This experiment shows that models based on wide process variability provide poor accuracy with regards to models built with and for a limited design space, as defined by the Quality-by-Design (QbD) rules. An interesting way to proceed may be to find a methodology to select a dataset consistent with a given process inside a large dataset, as presented by Rowland-Jones et al. 10 Recently, Tulsyan et al 11 have proposed a novel machine-learning procedure based on Just-In-Time Learning (JITL) to calibrate Raman models. However, being able to use an existing multivariate model on different hardware configurations is probably a first priority, because generic models and generic datasets may have no use if they cannot be exploited on a variety of hardware units with at least the same design.
This study aims to fill this gap and provides the missing piece of the puzzle to achieve such a goal.
Most of the works mentioned above build models and run predictions on the same Raman analyzer hardware. They therefore leave aside a key issue when using Raman spectroscopy in biopharmaceutical environments: instrumental variability, including replacements or changes of hardware. Instrumental variability over time on the same Raman analyzer is an issue which can be overcome through regular instrumental re-calibration or multivariate model updates. But the case of instrumental changes needs further attention, since it involves more radical variability, sometimes with unforeseen operational constraints, and since it has not been much documented in the engineering literature on Raman monitoring. The same question applies to other transfers: the transfer of models from one device to another and the transfer of models from one hardware configuration to another, where hardware configuration refers to changes in the channel used or replacement of hardware parts (probe's tube or analyzer for instance). This question is crucial since generating a dataset suitable for multivariate models represents a significant effort, typically corresponding to three to more than 10 cell culture runs. For example, the calibration dataset may be generated on several channels in parallel and the models built to be used on each of the channels, or models may be built on an R&D analyzer and then transferred to a pilot plant analyzer. Also, the excitation laser or the tube of a probe may need to be changed because their lifetime has expired, for example after a given number of autoclave procedures for the probe tube (Figure 1). This topic is fully part of the multivariate model maintenance issue. From a chemometric perspective, this question is equivalent to taking into account the principal components related to instrumental variability when managing the model space, which can be enlarged or translated for this purpose.
Wise and Roginski 12 have proposed a quite comprehensive roadmap to maintain multivariate models. In this paper, we propose similar tools from the perspective of instrumental variability, which covers not only unexpected problems but also anticipated steps in the use of Raman monitoring in a biopharmaceutical environment. The idea is to apply model maintenance over the long term and to generalize it to whole factories and cross-platform facilities. The ambition of the present study is limited to calibration transfers between different hardware units with the same design.
It is worth noticing that the choice has been made to limit the exploration to PLS models, as a basic multivariate tool. Indeed, PLS is the most common one and it generally provides very good results in most of the bioprocessing applications. 13 On top of that, well documented mathematical tools allowing improvement of the multivariate calibration through instrument standardization, 14 such as piecewise direct standardization (PDS), 15 have been used to perform calibration transfers in pharmaceutical analysis. 16 The comparison has been made with another conventional chemometric method consisting in the enrichment of the calibration dataset with data from the new Raman system configuration performed on the same process. This work is specific to the monitoring of cell cultures. Indeed, the goal of this study is to assess a methodology to lower the impact of instrumental variability on monitoring accuracy and robustness, so that it is negligible with regards to process variability itself. In the case of cell cultures, the biological and setup variability is for example much more significant than in a pure chemical process.

Figure 1 (caption, partially recovered): [...] composed of the analyzer (with the laser and spectrometer inside) and only one probe (body and tube immersed in the bioreactor); in multi-channel, the device is composed of the same analyzer (above) and an associated multi-channel unit (below) to sequentially operate up to four probes, each being connected to one channel and one bioreactor.
Finally, this paper explores two cases which have been considered crucial: the first one is related to the use of a four-channel Raman system to efficiently establish first multivariate models to be used with these same channels (inter-probe transferability study). The studied methodology aims to accelerate the implementation of Raman monitoring in a process development context, or the transfer of a process into new bioreactors. The second case deals with multivariate calibration transfer from one analyzer, where models show acceptable performances, to a second analyzer (inter-analyzer transferability study). This is typically the case when an analyzer needs to be replaced for maintenance or when a process is transferred from process development to a pilot plant in production (taking into account that the algorithms for calibration transfer should be trained and then applied on similar or same processes to avoid being impacted by very specific signatures).

Cell culture methods, sample collection and offline analysis
The study is divided in two parts and reflects two different processes in terms of culture medium, feeding strategy and cell line (even if very similar), among others. However, in both cases, the cell lines studied are widely used in the biopharmaceutical industry and the processes are very typical of the field. For the inter-probe transferability study: the CHO-K1 cells expressing monoclonal antibody IgG1 (ECACC 85051005) were inoculated in four DASGIP parallel bioreactor systems (Eppendorf) in 1 L glass vessels at 0.5 × 10^6 cells/mL in Acti-CHO P medium (GE Healthcare) supplemented with 8 mM of glutamine, 1.8 g/L NaHCO3 and approximately 30 mM of NaOH. The cultivation conditions were set at 37 °C, pH 7.0 and pO2 30%, regulated by sparging CO2 for pH and a mix of air, O2, CO2 and N2 for pO2. The foam level was regulated by 0.3% of Dow Corning Antifoam C (DuPont) based on visual inspection. The cells were agitated with a pitched blade impeller at 90 rpm. The aeration of the culture was performed with a sparger with a flow rate up to 0.4 vvm. Starting on day 3 or 4, cultures were fed daily with ActiCHO Feed A and Feed B medium (GE Healthcare). Feeds were added to a calculated glucose concentration of 4.5 to 10 g/L for Feed A and 0.28% vol/vol for Feed B. Alternatively, spikes of glucose, glutamine and glutamic acid were performed on one batch. In addition, for some batches, temperature shifts down to 33 °C were performed. Samples were collected once a day when no feed was performed. For each feed, a sample was taken before and after the feed instead of the daily samples. Samples were analyzed with High-Performance Liquid Chromatography (HPLC, Agilent 1200 Series) for quantitation of glucose and lactate, with the BioProfile 100 Plus (Nova Biomedical) for quantitation of ammonium, with the BioProfile CDV (Nova Biomedical) for viable cell density (VCD) and with the Multisizer 4 Coulter Counter (Beckman Coulter) for quantitation of total cell density (TCD).
The analyzers for the reference measurement present analytical errors of 0.26% and 0.31% for glucose and lactate respectively on HPLC, Agilent 1200 Series (internal SD), 5% for ammonium on BioProfile 100 Plus (as reported by the manufacturer), 10% for VCD on BioProfile CDV (internal reference measurements), and 10% for TCD on Multisizer 4 Coulter Counter (as reported by the manufacturer).
For the inter-analyzer transferability study: the FreeStyle CHO-S (Gibco) cells were inoculated in a 3 L glass vessel with the BioFLO320 (Eppendorf) at 0.4 × 10^6 cells/mL in CD-CHO medium (Gibco) supplemented with 8 mM of glutamine, 1‰ of Anti-Clumping Agent (Gibco) and 0.5% of Penicillin/Streptomycin. The cultivation conditions were set at 37 °C, pH 7.0 and pO2 40%, regulated by sparging CO2 and 0.5 N NaOH for pH and a mix of air and O2 for pO2. The cells were agitated with a pitched blade impeller at 80 rpm. The aeration of the culture was performed with a ring sparger with a flow rate up to 0.1 vvm. The culture was fed with 15% vol/vol on day 0 and 10% vol/vol every 3 days with EfficientFeed B (Gibco). Glutamine was added when under 4 mM and, in addition, a constant glutamine feed was started on day 3. In addition to the EfficientFeed B, glucose was added when under 4 g/L. Samples were collected twice a day when no feed was performed. For each feed, a sample was taken before and after the feed in addition to the daily sample. All samples were taken in triplicate and each triplicate was analyzed with FLEX2 (Nova Biomedical) for quantitation of VCD, TCD, glucose, glutamic acid, ammonium and lactate. The analyzer for the reference measurement presents analytical errors between 1% and 2% for all the chemical parameters and 3% for cell densities on FLEX2 (internal SD).

Raman spectral data collection
To perform these experiments, all spectra in cell cultures were acquired in situ with the ProCellics in-line analyzer. This solution is a Raman analyzer dedicated to bioprocess monitoring using a 785 nm excitation stabilized laser source with an optical power of around 350 mW at the probe tip; it is associated with a high-sensitivity spectrometer with about 14 cm⁻¹ average spectral resolution and 3 cm⁻¹ sampling step over the +150 to +4000 cm⁻¹ Raman shift range (Stokes signal), using a back-thinned charge coupled device (CCD) detector operating at −10 °C. This large spectral bandwidth, up to 4000 cm⁻¹ in Raman shift, gives full access to the major O-H contributions from the aqueous media. The Raman spectra were acquired in situ using 316 L stainless-steel optical ProCellics probes with sapphire windows. The probes were directly immersed into the bioreactors using PG13.5 cable gland adaptors before autoclave sterilization. To avoid disturbing the optical measurement, the medium was isolated from external stray light (daylight and artificial light) using either an opaque double layer of thick aluminum foil around the vessel or a light-proof fabric (Thorlabs). The ProCellics analyzer can also be coupled with a Multi-Channel Unit (MCU), an add-on to the main base unit which allows up to four probes to be monitored with the same analyzer (Figure 1). The laser excitation and the spectral data collection were controlled by ProCellics Software (RESOLUTION Spectra Systems) using an Ethernet connection between the instrument and the computer, which allows remote and network control. For the study of the inter-probe transferability using the Multi-Channel Unit, an exposure time of 47 seconds and averaging of 25 spectra were used for each 20-minute acquisition.
For the study of the inter-analyzer transferability based on the single-channel configuration, an exposure time of 45 seconds and averaging of 20 spectra were used for each 15-minute acquisition. The sampling interval for the Raman spectra was set every 30 minutes in a single-channel configuration and continuously for the 4-probe Multi-Channel Unit sequential acquisitions. The inter-probe transferability study and the inter-analyzer transferability study are based on two different culture processes, and these studies are independent. The acquisition parameters for each of these studies result from choices made by the teams for the process under investigation to obtain the best signal-to-noise ratio and to have acquisition times that are compliant with cell culture constraints.
Cosmic ray removal and automatic dark subtraction were applied directly during the spectral data acquisition. Deviant measurements were automatically sorted out before the spectra coaddition using a 3-sigma rejection algorithm, in order to avoid including these distorted data in the calibration dataset or validation dataset: these "spectral outliers" are rare Raman spectra that are abnormal in comparison with the global spectral trend and that could be caused by instrument malfunction (variation or shutdown of the laser emission) or experimental variabilities (microbubbles on the optical probe, external stray light, strong inhomogeneity of the biological process). Before chemometric model building, several preprocessing steps (normalization, Savitzky-Golay filter, spectral selection) are performed with ProCellics Software and off-line references are automatically associated with the spectra.
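As an illustration of the rejection step described above, the following is a minimal sketch only. The source does not state which statistic the acquisition software thresholds, so the distance to the median spectrum, the robust spread estimate and the function name `coadd_with_sigma_rejection` are all assumptions made for this example.

```python
import numpy as np

def coadd_with_sigma_rejection(spectra, n_sigma=3.0):
    """Average repeated exposures after dropping deviant ones.

    spectra: array of shape (n_exposures, n_wavenumbers).
    Assumption: each exposure is scored by its Euclidean distance to the
    median spectrum, and a robust spread (1.4826 * MAD) stands in for sigma.
    Returns the co-added spectrum and the boolean mask of kept exposures.
    """
    spectra = np.asarray(spectra, dtype=float)
    median_spectrum = np.median(spectra, axis=0)
    dist = np.linalg.norm(spectra - median_spectrum, axis=1)
    center = np.median(dist)
    sigma = 1.4826 * np.median(np.abs(dist - center))
    if sigma > 0:
        keep = dist <= center + n_sigma * sigma
    else:
        # All exposures are equally distant: nothing to reject.
        keep = np.ones(len(dist), dtype=bool)
    return spectra[keep].mean(axis=0), keep
```

With 25 exposures per acquisition, as in the multi-channel study, a single exposure disturbed by a microbubble would be excluded from the average while the remaining exposures are co-added.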
Instrumental variability may come from different hardware elements of the analyzers, from their lasers through their probes to their spectrometers, typically depending on each laser power, each probe spectral transmission (with filters, collimating optics, and tube lengths) or each spectrometer sensitivity and resolution. Most of these variabilities must be taken into account by factory calibration. For each analyzer used in this study, all raw data are thus calibrated in Raman intensities and Raman shift scales, using the same references and protocols compliant with ASTM standards. 17 This factory calibration procedure must enable the inter-comparison of Raman spectra acquired from several instruments, and from the same instrument associated with a Multi-Channel Unit and with different probes.

Instrument standardization calibration transfer
To enhance the calibration and inter-comparison of Raman spectra acquired from different probes or analyzers, the piecewise direct standardization (PDS) method was implemented in an add-on package of ProCellics Software as described by Bouveresse and Massart. 15 The PDS method allows transferring a set of data from an instrument (called the "master" instrument) to another (called the "slave" instrument) by correcting the differences induced by the sensors. First of all, a collection of spectra from the same experiment is acquired with the master instrument ($T^{m}$) and with the slave instrument ($T^{s}$). From the $T^{m}$ data, a consistent subset of spectra ($T^{m}_{\mathrm{sub}}$) can be selected from the data collection. The selection is performed by applying a Principal Component Analysis (PCA) over the data collection followed by the Kennard Stone (KS) algorithm 18 in the PCA space to uniformly cover the whole dataset. The KS algorithm implemented is based on the Euclidean distance in the PCA space of spectra, and is efficient in selecting the most relevant samples and reducing the presence of artifacts in the dataset. As described by Guenard et al, 19 this step is essential in determining a successful multivariate calibration model transfer since the PDS alone does not retain the ability to use statistical outlier diagnostics and, therefore, would be less useful for determining whether the predictions on real unknown samples were truly following the transfer of calibration model. The same data subset acquired with the slave instrument ($T^{s}_{\mathrm{sub}}$) is then retrieved from $T^{s}$. The next step is to apply the PDS algorithm, which consists in a multivariate Principal Component Regression (PCR) on the subsets $T^{m}_{\mathrm{sub}}$ and $T^{s}_{\mathrm{sub}}$ for a given range of wavelengths. For each wavelength index ($i = 0, \ldots, n$) of $T^{m}_{\mathrm{sub}}$, a corresponding window ($w_i$) in $T^{s}_{\mathrm{sub}}$ is used to compute the PCR:

$$w_i = \left[\, T^{s}_{\mathrm{sub},\,i-k},\ \ldots,\ T^{s}_{\mathrm{sub},\,i},\ \ldots,\ T^{s}_{\mathrm{sub},\,i+k} \,\right]$$

where $w_i$ has a size of $2k + 1$ wavelengths and the PCR is solved by:

$$T^{m}_{\mathrm{sub},\,i} = w_i\, b(i)$$

The resulting regression coefficients $b(i)$ are inserted in a banded diagonal transfer matrix ($T^{f}$). Finally, $T^{f}$ is used to transfer all the spectra from the slave instrument to the master instrument:

$$T^{s \to m} = T^{s}\, T^{f} + bk$$

where $T^{s \to m}$ is the spectra data collection of the slave instrument transferred to the master instrument; the additive background vector $bk$ can be used to correct major baseline shifts (typically determined as the average of the differences between the master dataset and the slave dataset previously processed with the PDS transfer matrix). To be efficient, it is better to train the PDS on a dataset composed of elements from processes similar or identical (same CHO cell line for example) to those which will be studied later with the transferred calibration dataset. In this study, the KS PDS algorithm was applied to the calibrated spectra before the preprocessing, with a view to real-time use of the instrumental transfer. It was applied to the 300-3900 cm⁻¹ Raman shift range to focus the transfer on the spectral region of interest (since the signal below 300 cm⁻¹ is filtered by the probe in order to reject Rayleigh scattering of the excitation source, and the signal above 3900 cm⁻¹ does not have a useful Raman signature), with a window size equal to 1 (k = 0), since the instrumental factory calibration achieved an accuracy better than ±2 cm⁻¹ in Raman shift, which was lower than the spectral sampling step. For the size of the KS PDS subset, the inter-instrument transferability presented below shows that considering around 10% of the global number of samples per batch is effective; for the inter-probe transferability, a 10-point subset is thus used.
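The subset selection and the windowed regression described above can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the ProCellics implementation: the function names (`pca_scores`, `kennard_stone`, `pds_transfer_matrix`, `background_offset`), the absence of mean-centering inside each window and the default of one principal component per window are choices made for the example.

```python
import numpy as np

def pca_scores(X, n_components):
    """PCA scores of the rows of X, computed by SVD of the centered data."""
    Xc = X - X.mean(axis=0)
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * S[:n_components]

def kennard_stone(scores, n_select):
    """Indices of n_select samples spread uniformly over `scores`
    (Euclidean distance), following the Kennard-Stone max-min rule."""
    scores = np.asarray(scores, dtype=float)
    d = np.linalg.norm(scores[:, None, :] - scores[None, :, :], axis=2)
    # Start with the two mutually most distant samples.
    i, j = np.unravel_index(np.argmax(d), d.shape)
    selected = [int(i), int(j)]
    while len(selected) < n_select:
        # Distance of every sample to its nearest already-selected sample.
        nearest = d[:, selected].min(axis=1)
        nearest[selected] = -1.0  # never re-select a sample
        selected.append(int(np.argmax(nearest)))
    return selected

def pds_transfer_matrix(master_sub, slave_sub, k=0, n_pc=1):
    """Banded matrix F such that slave @ F approximates master.
    Column i holds the PCR of master[:, i] on the slave window i-k..i+k."""
    n = master_sub.shape[1]
    F = np.zeros((n, n))
    for i in range(n):
        lo, hi = max(0, i - k), min(n, i + k + 1)
        W = slave_sub[:, lo:hi]
        U, S, Vt = np.linalg.svd(W, full_matrices=False)
        # Keep at most n_pc components, dropping numerically null ones.
        r = min(n_pc, int(np.sum(S > 1e-12 * S[0])))
        # PCR solution restricted to r components: b = V_r S_r^-1 U_r^T y
        b = Vt[:r].T @ ((U[:, :r].T @ master_sub[:, i]) / S[:r])
        F[lo:hi, i] = b
    return F

def background_offset(master_sub, slave_sub, F):
    """Additive vector: mean difference between master and transferred slave."""
    return (master_sub - slave_sub @ F).mean(axis=0)
```

A typical call sequence would select the subset with `kennard_stone(pca_scores(master, 3), 10)`, fit `F` on the matching master and slave rows, and then apply `slave @ F + background_offset(...)` to every slave spectrum. With `k=0`, each column of `F` reduces to a single per-wavenumber gain, which matches the window size of 1 used in this study.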

Spectral preprocessing and multivariate modeling techniques
Preprocessing steps were directly performed with ProCellics Software. Bands appearing on the spectra can be linked to chemical structures ("fingerprints"). The spectral regions can thus be selected according to their interest in the creation of the model. In the present typical cell culture cases, the spectral regions of interest were between 350 and 1775 cm⁻¹ and between 2800 and 3000 cm⁻¹ and are used for every component. In addition to the selection of the spectral values, two treatments were chosen as the most efficient combination for the quality of the models built in this study: a customized Standard Normal Variate (SNV) on the spectral region of interest of water followed by the first derivative (derivative order 1, step 15 cm⁻¹, polynomial order 2) according to the Savitzky-Golay (SG) algorithm. 20 The SNV preprocessing removes a constant offset term in the spectra since it normalizes each value of the spectra by the SD of pooled variables (wavenumbers of the considered spectral range), thus bringing all spectra to the same scale with a unit SD. Mathematically, it is identical to an autoscaling of the rows instead of the columns of the spectral dataset matrix. The classical SNV is a normalization applied to the entire spectrum, calculating the SD of all the wavenumbers for the given spectrum without consideration of a specific spectral bandwidth, thus bringing all spectra to the same scale with a unit variance and a zero average, and then rescaling them (Figure S1, see supplementary material file).
The customized SNV is a normalization applied using the Raman signature of water (between 3100 and 3700 cm⁻¹) as a reference, since water can be considered as an invariant solvent in the process between different batches and conditions; that is to say, each element of the whole spectrum is divided by a common factor equal to the area of this specific range, and then rescaled (Figure S1). Depending on the case studied, the customized SNV may be judged to be more effective (especially for aqueous media) than the classical algorithm, which cannot take into account large concentration deviations as no absolute reference is used (local model validity). The goal of the derivative preprocessing is to remove backgrounds and reduce the noise content of the spectra by smoothing, and to numerically increase the apparent resolution of the spectra by a sharpening of the zero-order spectra peaks (typically by a factor of 2 for Gaussian bands with a second-order derivative).
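The two normalizations and the derivative step can be compared on a small example. The sketch below is a hedged illustration, not the ProCellics code: the function names are hypothetical, the final rescaling step mentioned in the text is omitted, and reading the "step 15 cm⁻¹" as a 5-point window at the 3 cm⁻¹ sampling step is an assumption.

```python
import numpy as np

def snv(spectra):
    """Classical SNV: center and scale each spectrum (row) by its own mean and SD."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    sd = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / sd

def water_band_normalize(spectra, shifts, band=(3100.0, 3700.0)):
    """Divide each spectrum by the area of its water band (trapezoidal rule)."""
    spectra = np.asarray(spectra, dtype=float)
    shifts = np.asarray(shifts, dtype=float)
    mask = (shifts >= band[0]) & (shifts <= band[1])
    y, x = spectra[:, mask], shifts[mask]
    area = 0.5 * ((y[:, 1:] + y[:, :-1]) * np.diff(x)).sum(axis=1)
    return spectra / area[:, None]

def savgol_first_derivative(spectra, step, window=5, polyorder=2):
    """Savitzky-Golay first derivative on a uniform grid.
    Edge points without a full window are dropped ('valid' mode)."""
    half = window // 2
    x = np.arange(-half, half + 1)
    A = np.vander(x, polyorder + 1, increasing=True)  # columns 1, x, x^2, ...
    # Row 1 of the pseudo-inverse gives the fitted slope at the window center.
    coeffs = np.linalg.pinv(A)[1] / step
    spectra = np.asarray(spectra, dtype=float)
    # Reversing the kernel turns np.convolve into a sliding dot product.
    return np.array([np.convolve(row, coeffs[::-1], mode="valid") for row in spectra])
```

The design difference is visible in the code: `snv` uses statistics of the whole spectrum, so a large change in analyte concentration also changes the scaling factor, whereas `water_band_normalize` relies only on the solvent band, which is assumed to be invariant between batches.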
The data processing carried out decreased the impact of fluorescence and increased the spectral differences. PCA is an unsupervised data transformation procedure for complex datasets. 21 PCA score plots have been used as a first analysis of the data in order to see any trends, dispersion of the data or clusters. In addition to the spectrometric calibration, a PLS regression was applied to monitor and control industrial processes. 22 Multivariate data analyses were performed on SIMCA 15 Users Software (Sartorius Stedim Biotech) and PLS modeling was used to build linear models that specify the relationship between the result variables (Y) and some input and process variables (X) for each observation. The (X) matrix included the spectral range under study, the (Y) matrix was composed of the molecule concentrations measured off-line (glucose, lactate, etc.) and an observation is defined as one averaged spectrum associated with the corresponding off-line measurements. To assess the robustness of the models prior to any prediction or monitoring, the PLS modeling was based on a K-fold cross validation (default SIMCA parameters). After developing the model, the omitted data were used as a test set, and the differences between actual and predicted Y-values were calculated for these data points. In all the experiments detailed below, the test sets are individual batches independent from the ones used for model building. A popular parameter to interpret the performances of the models is the RMSEP. It is computed as

$$\mathrm{RMSEP} = \sqrt{\frac{\sum \left( Y_{\mathrm{obs}} - Y_{\mathrm{pred}} \right)^{2}}{N}}$$

where $Y_{\mathrm{obs}} - Y_{\mathrm{pred}}$ refers to the predicted residuals for the observations in the prediction set and $N$ is the number of these observations. To assess bias and precision separately, it can be useful to split the RMSEP into a bias and a bias-adjusted RMSEP, noting that mean squared error can be described as the sum of model variance, model bias, and irreducible uncertainty.
Global RMSEP remains the tool generally used to measure the predictive power of the model. For each model, the R²Y index (correlations between components and input variables) and the Q²(Cum) index (overall measurement of the fit and predictive quality of the model) are given by SIMCA software and used as statistical tools to assess the quality of the models; these indicators vary between 0 and 1, and the closer to 1, the better the model. The number of latent variables (A) chosen for the PLS models is selected to balance the R²Y and Q²(Cum) indicators and the cross-validation results, and to avoid overfitting. As cell cultures are based on complex solutions and conducted with variable processes, the number of latent variables can vary between 3 and 10 depending on the monitored component. 1,3
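The split of the RMSEP into a bias and a bias-adjusted term mentioned above can be made concrete with a short worked example. This is a minimal sketch: the function name `rmsep_report` is hypothetical, the bias is taken as the mean of observed minus predicted values, and expressing the RMSEP as a percentage of the maximum reference value is an assumed reading of the percentage errors reported in this study.

```python
import math

def rmsep_report(y_obs, y_pred):
    """RMSEP, bias, bias-adjusted RMSEP (SEP) and RMSEP as % of max reference."""
    n = len(y_obs)
    residuals = [o - p for o, p in zip(y_obs, y_pred)]
    bias = sum(residuals) / n
    rmsep = math.sqrt(sum(r * r for r in residuals) / n)
    # Bias-adjusted error: spread of the residuals around their mean.
    sep = math.sqrt(sum((r - bias) ** 2 for r in residuals) / n)
    return {
        "rmsep": rmsep,
        "bias": bias,
        "sep": sep,
        # Assumption: percentage error relative to the largest reference value.
        "rmsep_pct_of_max": 100.0 * rmsep / max(y_obs),
    }

report = rmsep_report([1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 2.5, 4.5])
print(report)
```

With the 1/N convention used here, the three terms satisfy RMSEP² = bias² + SEP², so a systematic overestimation, such as the one discussed later for TCD and VCD, appears in the bias term rather than in the SEP.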

Inter-probe transferability inside a Raman multi-channel analyzer
Instrumental variability between probes inside a multi-channel analyzer (MCU) has been studied. The study consisted in comparing two calibration model strategies, given that data were collected by all probes: "global chemometric calibration models" integrating the inter-probe variability in the models and "chemometric calibration models per channel" to decorrelate each probe. Then, model prediction errors obtained were compared to determine the best calibration model strategy.
For the chemometric calibration "global models," off-line data from eight cell culture runs, performed during two process sessions of four batches over the four channels, were combined with their respective Raman spectra. These were used to produce calibration models for glucose, TCD, VCD, lactate and ammonium. As shown in Table S1, the "global models" have high explained (R²Y > 0.90) and predicted (Q²(Cum) > 0.90) performances for all parameters, except for ammonium, probably because the principal molecular bonds of NH4 are found in several elements and are consequently difficult to decorrelate. For the chemometric calibration "models per channel," built for only three of the four channels, off-line data from two cell culture runs performed during two process sessions, with one batch from each session per channel, were combined with their respective Raman spectra. These were used to produce calibration models for glucose, VCD, TCD, lactate and ammonium. As shown in Table S1, the "models per channel" have high explained (R²Y > 0.90) and good predicted (Q²(Cum) > 0.80) performances for all parameters, except for ammonium, probably for the same reason as in global models.
To compare the effect of both strategies on model prediction errors, the RMSEPs were expressed as percentage errors of the maximum value (Table 1). Models were validated using new data, not included in the calibration models, from a third session of three batches on three channels, performed under similar process conditions as the ones used for the calibration batches. The RMSEPs are determined as an average error of each element for the three validation batches of the third session, from the first three probes for the global model and from each of the three independent probes for the models per channel, so that the averaging of errors is comparable. Validation results show that models for glucose, TCD, VCD and lactate built with the global dataset give higher prediction accuracy than the models built with the "per channel" datasets (Table 1). These results may be mainly explained by the large difference in dataset size (×4 for the global model datasets), but this can be considered as more in line with a real use case, in which a Multi-Channel Unit generates more data than a single-probe system in the same acquisition time. It can be seen as a demonstration that a multi-channel solution can accelerate the implementation of Raman monitoring in a process development context, while guaranteeing the possibility to use the same chemometric models. Indeed, global calibration models present low model prediction errors, less than or equal to 10% for glucose, TCD, VCD and lactate, and less than 12% for ammonium (Table 1). As shown in Figure 2, using the global calibration models judged as the most efficient, the predicted kinetics are highly reliable for each component among the three validation batches.
After studying instrumental variability between probes with data collected by all probes, instrumental transferability was investigated by predicting data collected with one probe unknown to the models built with the three other probes of a multi-channel analyzer. In other words, data used in the model building step were collected with three of the four probes and predicted data were collected with the fourth probe. Instrumental variability between probes, highlighted by the PCA score plot on their preprocessed spectra (Figure 3A), could not be properly included in the model for the fourth probe: the main differences on the PCA score plot of the four channels could be identified as a mix of instrumental variability (difference of probe collection), setup variability (probe installation in each bioreactor) and biological variability (process inhomogeneity), which affect the Raman spectra between probes mainly through small intensity variations (Figure S2).
To correct it and maintain the model prediction accuracy, the Kennard-Stone piecewise direct standardization (KS PDS) method was tested on this dataset. Two different instrumental calibrations were compared: the classical instrumental calibration (without KS PDS) and the particular instrumental calibration with a 10-point subset KS PDS transfer, where channel 1 was considered as the master and channels 2, 3 and 4 were considered as the slaves to be transferred. As shown on the PCA score plot (Figure 3B), the KS PDS instrumental calibration drastically reduced instrumental variability between the different channels (compare with Figure 3A): applying the KS PDS transfer calibration to the spectra decreased instrumental variability by minimizing the intensity differences between probes (correction of mean …).
Abbreviations: PLS, partial least square; RMSEP, root mean square error of prediction; TCD, total cell density; VCD, viable cell density.
Off-line data from nine cell culture runs were combined with their respective Raman spectra obtained with channels 1, 2 and 3 of the multi-channel analyzer (MCU) and used to produce global calibration models for glucose, TCD, VCD, lactate and ammonium (Table 2). The data of channel 4 of the MCU were used as the validation dataset. For the KS PDS application, calibration datasets of channels 2 and 3 were considered as "slave" channels of the "master" channel 1; the validation dataset of channel 4 used for prediction was then also considered as a "slave" of the "master" channel 1.
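The standardization step described above can be sketched as follows. This is a minimal illustration of piecewise direct standardization under stated assumptions, not the implementation used in the study: each master channel is regressed on a small window of slave channels, plain least squares stands in for the local PCR or PLS regression often used in PDS, the window half-width is arbitrary, and the spectra are synthetic (the slave differs from the master only by a gain and an offset).

```python
import numpy as np

def pds_fit(master, slave, half_window=2):
    """Fit a banded transfer so that slave @ F + offset approximates master.

    master, slave: (n_transfer_samples, n_channels) spectra of the same samples.
    Master channel j is regressed on slave channels j-half_window..j+half_window.
    """
    n_channels = master.shape[1]
    m_mean, s_mean = master.mean(axis=0), slave.mean(axis=0)
    mc, sc = master - m_mean, slave - s_mean
    F = np.zeros((n_channels, n_channels))
    for j in range(n_channels):
        lo, hi = max(0, j - half_window), min(n_channels, j + half_window + 1)
        # Minimum-norm least squares: a stand-in for the local PCR/PLS step.
        coef, *_ = np.linalg.lstsq(sc[:, lo:hi], mc[:, j], rcond=None)
        F[lo:hi, j] = coef
    offset = m_mean - s_mean @ F
    return F, offset

def pds_apply(spectra, F, offset):
    """Map slave spectra into the master instrument space."""
    return spectra @ F + offset

# Synthetic spectra: four Gaussian bands mixed by random concentrations.
rng = np.random.default_rng(0)
axis = np.arange(50)
peaks = np.array([np.exp(-0.5 * ((axis - c) / 5.0) ** 2) for c in (10, 20, 30, 40)])

def measure(concentrations, gain=1.0, offset=0.0):
    return gain * (concentrations @ peaks) + offset

transfer_conc = rng.uniform(0.5, 2.0, size=(10, 4))  # 10 transfer samples
new_conc = rng.uniform(0.5, 2.0, size=(5, 4))        # samples not used for the fit

F, b = pds_fit(measure(transfer_conc), measure(transfer_conc, gain=0.9, offset=0.1))
slave_new = measure(new_conc, gain=0.9, offset=0.1)
master_new = measure(new_conc)
error_before = np.abs(slave_new - master_new).max()
error_after = np.abs(pds_apply(slave_new, F, b) - master_new).max()
print(f"max deviation before: {error_before:.3f}, after: {error_after:.2e}")
```

In a setup like the one described here, the same fit would be repeated for each slave channel against master channel 1, using the transfer samples selected by the Kennard-Stone step.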
For both instrumental calibrations, the models have high explained (R²Y > 0.90) and predicted (Q²(Cum) > 0.90) performances for all parameters (except for ammonium), as seen in Table S2. Then, models were validated using new data from three batches with Raman spectra acquired with channel 4 of the MCU and performed under similar process conditions (Table 2). Validation results showed that the model for glucose was already little impacted by instrumental variability, probably because glucose is the most defined and characterized compound in the medium, so the KS PDS method did not improve its prediction accuracy. The increase in the RMSEP for glucose might be linked to the high robustness of its measurement with Raman spectra: ∼6% is already a good result, and any additional PDS correction may only increase this error by introducing mathematical noise. Nevertheless, in both cases, with and without KS PDS, glucose models presented less than 10% of prediction errors (Table 2 and Figure 4A). In contrast, models for TCD, VCD and lactate built with the KS PDS calibrated dataset showed a slightly higher prediction accuracy (around 10% of prediction errors) than those built with the classical calibrated dataset (from 13% to 20% of prediction errors) (Figure 4B-D). It is worth noting that the models with the KS PDS application seem to slightly overestimate TCD and VCD, particularly between 150 and 300 hours of culture time. On other batches of the same process with the same model (data not shown), this bias is not reproduced: one batch shows no bias and the other shows a slightly negative bias. The average RMSEPs in Table 2 better summarize the results observed over all the batches and are more representative of the impact and quality of the KS PDS correction. Finally, the ammonium models were not improved by the KS PDS, with prediction errors still around 10% (Table 2 and Figure 4E).

Inter-instrument transferability between two analyzers (device to device)
In the second part, the impact of instrumental variability between two different analyzers, each with a single probe, is investigated. First, two CHO cell culture runs were performed with two analyzers at the same time to test this variability: for each cell culture, the probe from each analyzer was simultaneously placed in the same bioreactor. Because both probes monitored the same culture in the same bioreactor, biological variability was absent and the variability observed on the PCA score plots (Figure 5A) reflected only the instrumental variability between the two instruments (analyzers n°1 and n°2). This variability may already be slightly observable on their spectra (Figure S3), because of differences in probe collection and detection inhomogeneities, which can be seen as residues from the factory calibration. Based on use cases, two solutions to reduce the impact of instrumental variability on model prediction errors are tested and evaluated: a calibration transfer solution and a chemometric solution.
The calibration transfer solution with the KS PDS algorithm was tested in order to meet the need for direct monitoring with a new analyzer unknown to the models. It addresses the classical situation of a direct instrument substitution without the possibility to perform a new calibration batch. Given that the new analyzer (analyzer n°2) was considered as the reference, the strategy was to transform the data obtained with the old analyzer so that they appeared to have been acquired with the new one. The resulting data were used to produce calibration models as if they had been obtained with the new analyzer. The calibration transfer solution was based on the KS PDS method, where the new analyzer was considered as the master and the old one as the slave. As in the previous study on probes, the KS PDS instrumental calibration drastically reduced instrumental variability between the analyzers, which is directly observable on their transferred spectra (Figure S3) and confirmed by the PCA score plot (Figure 5B).
To evaluate the impact of the subset size in the KS PDS transfer algorithm, calibration data from two batches were used to produce eight datasets to build calibration models for glucose, TCD, VCD, glutamic acid, lactate and ammonium (Table 3). Thus, off-line data from two cell culture runs were combined with their respective Raman spectra obtained with analyzer n°1, which was considered as the old one and consequently as the slave. To make these data appear as if they had been obtained with a new analyzer (named analyzer n°2), seven different instrumental calibrations were set up for analyzer n°1, with PDS transfer matrices based on subsets of 5 points (6% of the training dataset), 10 points (12%), 20 points (25%), 30 points (38%), 40 points (50%), 60 points (75%) and 80 points (100%) determined by the KS algorithm over 80 references. At this stage of the study, the KS PDS matrices were built from three culture runs analyzed with both analyzers simultaneously. Models were built for each of the seven transferred datasets as well as for the original dataset without KS PDS. As shown in Table S3A, the models have high explained (R²Y > 0.90) and predicted (Q²(Cum) > 0.80) performances for all parameters.
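The subset selection step can be sketched as follows. This is a minimal version of the Kennard-Stone algorithm under stated assumptions: Euclidean distances are computed on the raw spectra (the text does not state in which space the distances were computed), the 80 reference spectra are random placeholders, and the function name is illustrative.

```python
import numpy as np

def kennard_stone(X, n_select):
    """Return the indices of n_select rows of X chosen by the Kennard-Stone algorithm."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and the number of samples")
    # Pairwise Euclidean distances between all samples.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Start with the two most distant samples.
    i, j = np.unravel_index(np.argmax(dist), dist.shape)
    selected = [int(i), int(j)]
    remaining = [k for k in range(n) if k not in selected]
    while len(selected) < n_select:
        # For each candidate, distance to its nearest already-selected sample.
        nearest = dist[np.ix_(remaining, selected)].min(axis=1)
        # Pick the candidate that is farthest from the current selection.
        pick = remaining[int(np.argmax(nearest))]
        selected.append(pick)
        remaining.remove(pick)
    return selected

# Placeholder for 80 reference spectra measured on both analyzers.
rng = np.random.default_rng(1)
references = rng.normal(size=(80, 30))

subsets = {}
for size in (5, 10, 20, 30, 40, 60, 80):
    subsets[size] = kennard_stone(references, size)
    print(f"{size} points = {100 * size / len(references):.0f}% of the references")
```

Because the selection is greedy, each smaller subset is a prefix of the larger ones, so the seven transfer matrices are built from nested sets of reference points.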
To test the effect of the KS PDS method on model prediction errors, a third batch measured by analyzer n°2 (the new one) was used as a validation batch; it was performed under process conditions similar to those of the calibration batches (measured by the previous analyzer n°1). The maximum value percentage errors of RMSEPs were calculated for each calibration method. Validation results show that the models for glucose, TCD, VCD, ammonium and lactate built with the KS PDS calibrated datasets predict more accurately than the models built with the classical calibrated dataset without KS PDS (Table 3). For instance, the KS PDS calibration divides the model prediction errors by four for glucose and by two for TCD. As shown in Figure 6, predicted kinetics were highly reliable for each component when the KS PDS data transfer was applied before preprocessing, mainly because it reduced the instrumental bias between analyzers (Table S3B). Moreover, the comparison between the different KS PDS transfer subset sizes shows higher prediction accuracy for 10 points (Table 3 and Figure 6). Consequently, it can be assumed that selecting only about 10% of the points (chosen by the KS algorithm) in a shared training dataset representative of the global process could be sufficient to build the PDS matrix. More broadly, this kind of transfer method could be used in a regular instrument calibration process for a new analyzer, with reference samples measured before the first use of the analyzer, allowing very fast integration of the new device. With such an instrument calibration method, the old analyzer could instead be considered as the master and the new analyzer as the slave, depending on the model maintenance strategy.
The second solution tested to reduce the impact of instrumental variability between two analyzers on model prediction errors was a chemometric solution, based on the integration of new calibration data collected with the unknown analyzer. This solution addresses a situation in which the analyzer substitution can be anticipated and new calibration batches can be performed. To correct instrumental variability and maintain the model prediction accuracy, a comparison study between models that integrate, or not, new calibration batches (one or two) was performed. Calibration data from three batches and two analyzers were used to produce three different datasets to build calibration models for glucose, TCD, VCD, ammonium, glutamic acid and lactate. Off-line data from the three cell culture runs were combined with their respective Raman spectra obtained by two different analyzers to generate the three calibration datasets for model building in an unweighted way: either with batches from the previous analyzer (analyzer n°1) only, or with two batches from the previous analyzer (analyzer n°1) and the third batch from the new one (analyzer n°2), or with one batch from the previous analyzer (analyzer n°1) and two batches from the new one (analyzer n°2). As shown in Table S4, for the three dataset balances tested, all models have high explained (R²Y > 0.90) and predicted (Q²(Cum) > 0.90) performances for all parameters. Then, models were validated using one new batch measured by the new analyzer (analyzer n°2). This fourth validation batch was not included in the calibration models and was performed under process conditions similar to those used for the calibration batches. To evaluate the chemometric capability to integrate instrument variability, the maximum value percentage errors of RMSEPs were calculated in each balance case (Table 4). As soon as the new instrument variability was integrated in the models, the validation results showed much lower model prediction errors: less than 10% for glucose, TCD, VCD, glutamic acid and lactate, and less than 30% for ammonium (the bias on this complex component is still significant and requires the insertion of a second batch to fall below 20%).
Note: Models with 0/3, 1/3 or 2/3 batches from the new analyzer n°2 in the calibration dataset (others from the previous analyzer n°1), predicting a validation batch monitored by analyzer n°2.
Abbreviations: PLS, partial least square; RMSEP, root mean square error of prediction; TCD, total cell density; VCD, viable cell density.
As shown in Figure 7, the predicted kinetics were highly reliable for each component as soon as a batch from the new analyzer was included in the calibration. These results demonstrated that an existing and consistent dataset from a previous analyzer remains relevant when a new analyzer is integrated or substituted: inserting only one batch measured with the new analyzer in the calibration dataset, to predict a batch monitored by this same analyzer, was sufficient to eliminate the prediction gap induced by instrumental variability, without a prior calibration transfer between instruments.
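The effect of pooling batches from both analyzers can be illustrated with a toy simulation. The sketch below is built on strong assumptions and does not reproduce the study: the spectra are noise-free and contain a single analyte band, the new analyzer differs only by a fixed additive response, and a minimum-norm least squares regression stands in for the PLS models. All names and values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
axis = np.arange(40)
analyte_band = np.exp(-0.5 * ((axis - 15) / 3.0) ** 2)
# Assumed additive response difference of the new analyzer.
instrument_shift = 0.2 * np.exp(-0.5 * ((axis - 18) / 6.0) ** 2)

def simulate_batch(n_samples, new_analyzer):
    """Return (spectra, concentrations) for one hypothetical batch."""
    conc = rng.uniform(1.0, 10.0, n_samples)
    spectra = np.outer(conc, analyte_band)
    if new_analyzer:
        spectra = spectra + instrument_shift
    return spectra, conc

def fit_linear(X, y):
    """Minimum-norm least squares regression vector (stand-in for PLS)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def rmsep(coef, X, y):
    return float(np.sqrt(np.mean((X @ coef - y) ** 2)))

old_1, old_2, old_3 = (simulate_batch(20, False) for _ in range(3))
new_1, new_2 = (simulate_batch(20, True) for _ in range(2))
validation = simulate_batch(20, True)  # validation batch on the new analyzer

# Calibration sets with 0/3, 1/3 or 2/3 batches from the new analyzer.
designs = {
    "0/3": [old_1, old_2, old_3],
    "1/3": [old_1, old_2, new_1],
    "2/3": [old_1, new_1, new_2],
}
errors = {}
for label, batches in designs.items():
    X = np.vstack([spectra for spectra, _ in batches])
    y = np.concatenate([conc for _, conc in batches])
    errors[label] = rmsep(fit_linear(X, y), *validation)
    print(f"{label} new-analyzer batches: RMSEP = {errors[label]:.3g}")
```

In this idealized setting, the model calibrated on the old analyzer alone carries a constant bias on the new analyzer, and a single pooled batch lets the regression learn a direction orthogonal to the instrument difference. Real spectra include noise and process variability, so the improvement would not be this clean.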

DISCUSSION
These transferability studies validate some methods to manage calibration transfer between different devices and different hardware configurations, in the context of an extended use of Raman monitoring inside bioprocessing sites. It is worth noting that the presented study is based on real Raman and off-line data as well as on predictions performed with multivariate models. This a posteriori prediction was necessary in order to compare different methods on the same datasets. However, the objective was to develop solutions that can be integrated into real-time monitoring data processing. The good results presented in these studies are nonetheless drawn from a limited number of hardware devices (four probes for the inter-probe transferability study and two analyzers for the inter-instrument transferability study). The performance of these methods should be studied on a larger scale, with a greater number of instruments and multiplexed probes, typically corresponding to production lines in the biopharmaceutical industry. Mathematical instrumental standardization methods such as PDS can be added to the standard factory calibration step of a single instrument, in order to facilitate the transfer of PAT models between devices before and/or in addition to the chemometric transfer. Strategies for creating and optimizing the parameters of these methods must be carefully considered. The PDS method works well when the instruments involved in the transfer are similar, sharing the same wavelength range and sampling frequency. However, PDS cannot be used directly to transfer spectra of different wavelengths, or with different preprocessing, or recorded by instruments with different spectrometer configurations in resolution and sampling. Additionally, for models that already have excellent prediction capabilities (such as glucose), the use of PDS should be carefully studied as it might introduce some mathematical noise.
The combined use of other calibration transfer algorithms is an attractive way to further maintain the predictive abilities of multivariate calibration models across large-scale instrument installations. These algorithms are based either on dataset matrix orthogonalization, such as Dynamic Orthogonal Projection [23], External Parameter Orthogonalization [24] and Orthogonal Signal Correction [25,26], or directly on Raman spectra processing, such as the Shenk-Westerhaus algorithm [27,28] or Spectral Space Transformation [29]. A challenging extension of this work could be to develop a similar methodology for calibration transfer between Raman analyzers from different suppliers. The construction of chemometric models on a large number of instruments will require the optimized selection of reference samples for different processes, potentially via the building of local models based on locally weighted partial least squares (LW-PLS) used as a Just-In-Time (JIT) learning method [30]. Finally, the long-term maintenance of models is a major question to be taken into account from the beginning of the instrumentation implementation, to ensure the continuity of measurements without increasing the chemometric prediction errors, and even to improve models over time and across instrument changes with the least amount of time and cost [12]. The support of real-time machine learning methods is thus interesting in this perspective, even if the authorization of automatic updating methods within a regulated environment still seems complex.

CONCLUSION
In this paper, based on cell culture use cases, two types of Raman instrumental variability and their impacts on PLS model prediction errors were studied. Firstly, instrumental variability between probes inside a Raman multi-channel analyzer was explored. It is worth noting that, in this study, the predicted data were collected with one probe unknown to the model. The results showed that the KS PDS calibration method corrected instrumental variability and gave good model prediction accuracy. Secondly, instrumental variability between two analyzers was investigated, in the case of an analyzer replacement. To perform monitoring directly with the new analyzer, it has been shown that the KS PDS method lowered the impact on model prediction errors. However, when it was possible to perform new calibration batches, it has been demonstrated that only one batch from the unknown analyzer in the calibration dataset was sufficient to eliminate the prediction gap induced by instrumental variability. Consequently, to meet the need for an analyzer integration or substitution, a first monitoring can be directly performed with the KS PDS strategy. Then, the data obtained in this new shared batch should be inserted in the calibration dataset to integrate instrumental variability in the chemometric model and thereby correct it. For every transferability application tested in this study, the requirements for the method performance are fulfilled for all the process parameters (except for ammonium): less than 10% for the compounds and less than 15% for the cell densities. These good results should be further confirmed with more data and instrument configurations, potentially also on other processes. To conclude, it has been demonstrated that the impact of instrumental variability across Raman probes and analyzers on multivariate calibration model prediction errors can be corrected to maintain accurate cell culture monitoring.
In doing so, this study demonstrates that datasets can be transferred from one device or configuration to another, which is a first step toward a fully automated plant.