Applying Near-Infrared Spectroscopy in Downstream Processing: One Calibration for Multiple Clarification Processes of Fermentation Media


  • Licínia O. Rodrigues,

    Corresponding author
    1. Centre for Biological and Chemical Engineering, IST, Technical University of Lisbon, Av. Rovisco Pais, P-1049–011 Lisbon, Portugal
    • Centre for Biological and Chemical Engineering, IST, Technical University of Lisbon, Av. Rovisco Pais, P-1049–011 Lisbon, Portugal. Phone: (+351) 218419838. Fax: (+351) 218419197
  • Joaquim P. Cardoso,

    1. CIPAN S.A. Vala do Carregado, P-2601–906 Alenquer, Portugal
  • José C. Menezes

    1. Centre for Biological and Chemical Engineering, IST, Technical University of Lisbon, Av. Rovisco Pais, P-1049–011 Lisbon, Portugal


The use of near-infrared spectroscopy (NIRS) is demonstrated in the first downstream processing (DSP) steps of an active pharmaceutical ingredient (API) manufacturing process. The first method developed was designed to assess the API content in the aqueous filtrate stream of a rotary drum vacuum filter. The PLS method, built after spectral preprocessing and variable selection, had an accuracy of 0.01% (w/w) for an API operational range between 0.20 and 0.45% (w/w). The robustness and extrapolation ability of the calibration were demonstrated when samples from ultrafiltration and nanofiltration processes, ranging from 0 to 2% (w/w), were linearly predicted (R2 = 0.99). The development of a robust calibration model is generally a very time-consuming task, and once established the model must remain useful for a long period of time. This work demonstrates that NIR procedures, when carefully developed, can be used in different process conditions and even in different process steps of similar unit operations.

1. Introduction

The recovery of an antibiotic from culture media takes several processing steps. The whole set of unit operations where the product is concentrated, isolated, and purified is known as the downstream process (DSP). The fermentation process is carried out in the presence of metabolizable sources of carbon, nitrogen, and many other substances. The final fermentation product received for DSP is a very complex mixture, containing both inorganic and organic substances, with an expected (but not desirable) batch-to-batch variability. The first step in DSP is the clarification of the fermented cultivation media by filtration. In this step the soluble product is separated from cells, colloids, and macromolecules from the culture media (1).

The concentration of the active pharmaceutical ingredient (API) in filtered media is assayed for yield calculations and, more importantly, for planning the next process step, whose operating conditions depend on this parameter. The analysis is usually made in a production support laboratory. It is a time- and reagent-consuming procedure that requires trained personnel. For this reason, a fast method that could be operated by shift personnel would make it possible to speed up the transitions between process steps and to perform them safely during night shifts. Near-infrared spectroscopy is such a technique. It is quick, nondestructive, and can be operated by plant personnel (2). The calibration development, however, can be time-consuming and expensive. The time needed to build a robust calibration is not predictable and depends to a large extent on the availability of laboratory data. It can take several months, since it involves the analysis of a large number of samples from different concentration ranges, process locations, and batches, in order to cover a large enough process variability.

The importance of the present work lies in the demonstration that NIR procedures (a) are able to measure very small concentrations of API in aqueous media despite matrix variability and (b), when robustly developed, yield models that can be used in different processes.

The work was carried out at an industrial production process of clavulanic acid, a beta-lactamase inhibitor used to overcome resistance in bacteria that secrete beta-lactamase enzymes. This API is produced by cultivation of Streptomyces clavuligerus in submerged fermentations. The process and product have been thoroughly described in the literature (3, 4). However, to the best of our knowledge, no reports on the use of NIR in DSP clarification processes have been published.

2. Experimental Section

2.1. Equipment and Software. NIR absorbance spectra were measured with a BOMEM MB-160 spectrometer equipped with an InAs detector, an 8 mm vial holder, and a temperature controller. GramsAI-7 was used for spectra acquisition, and the software packages PLS-IQ and Matlab (Mathworks Inc., U.S.A.) with PLS toolbox v.3 (Eigenvector Inc., U.S.A.) were used for spectra pretreatment and calibration model development.

2.2. Spectra Acquisition. The spectral data were acquired at-line from 4000 to 1100 cm-1, with a 16 cm-1 resolution. Each spectrum was obtained as an average of 32 scans. Before starting the measurements, a spectrum of distilled water was always obtained and employed for background correction. Both the sample and the reference were measured at a controlled temperature of 20 °C (±0.1 °C).

2.3. Data Pretreatment. The spectra were processed by the Savitzky-Golay second-derivative method, with a 35-point window and a second-order polynomial. Partial least-squares (PLS) (5) models were built after selecting the number of latent variables by cross-validation (leaving 10 samples out).
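The pretreatment described above can be sketched in a few lines. This is an illustration only, not the code used in the study (which relied on PLS-IQ and the Matlab PLS toolbox): the function name `savgol_second_derivative` is hypothetical, unit spacing between data points is assumed, and edge points are simply left undefined. It uses the fact that, on a symmetric window, the quadratic coefficient of a least-squares fit can be obtained from the centered squared offsets alone.

```python
def savgol_second_derivative(y, window=35):
    """Second derivative from a local quadratic least-squares fit.

    Only interior points are computed; the first and last window // 2
    entries are returned as None (edge handling is omitted in this sketch).
    Unit spacing between consecutive points is assumed.
    """
    if window % 2 == 0 or window < 3:
        raise ValueError("window must be an odd integer >= 3")
    m = window // 2
    offsets = range(-m, m + 1)
    mean_sq = sum(k * k for k in offsets) / window
    # Centered quadratic term: orthogonal to the constant and linear terms
    # on a symmetric window, so it alone fixes the x^2 coefficient a2.
    q = [k * k - mean_sq for k in offsets]
    norm = sum(v * v for v in q)
    weights = [2.0 * v / norm for v in q]  # second derivative of a2*x^2 is 2*a2

    out = [None] * len(y)
    for i in range(m, len(y) - m):
        segment = y[i - m:i + m + 1]
        out[i] = sum(w * s for w, s in zip(weights, segment))
    return out
```

Because constant and linear terms are orthogonal to the weights, an additive or sloping baseline does not change the result, which is why this pretreatment suppresses baseline shifts. For interior points, `scipy.signal.savgol_filter(y, window_length=35, polyorder=2, deriv=2)` should give the same values.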

2.4. Calibration and Validation Samples. Production samples have concentration values in a very narrow range, not adequate for calibration. The generation of batches out of the nominal concentration range is not practicable. Therefore, in order to achieve a wider range, samples from routine lab preproduction tests were used. Validation was, however, made only with samples from the production plant.

Production Samples. After a first filtration step with a rotary drum vacuum filter, the outlet is filtered a second time by a press filter and acidified. Samples of this filtered cultivation media were collected.

Lab Samples. Raw culture media is filtered with a static precoated vacuum drum filter. With variations imposed in the precoat composition and in the amount of washing water, samples were collected within a concentration range twice the production values.

The calibration set was made up of 150 acidified samples collected over a period of 6 months from 25 different fermentation batches in order to capture the typical variability of the fermentation culture media. Each sample was analyzed by the reference method, and the NIR spectra were collected at the same time. The synchronized measurement is, in this case, very important because the samples degrade rapidly under low pH.

2.5. Reference Data. The API content was assayed by a colorimetric method based on the absorption at 312 nm of the product of the reaction with imidazole (6). In order to add analytical variability to the calibration set, two to three different analysts performed both procedures (analytical method and NIR spectra collection). The reference analyses were performed at the same time as the NIR scanning. The reference method's precision was evaluated in terms of its repeatability. Three concentrations were analyzed with three replicates each, and the procedure was repeated by two analysts. The confidence interval for the reference method (95%) is x ± 0.02 (% w/w).

3. Results and Discussion

3.1. Calibration Model Development. The spectra obtained showed baseline shifts (Figure 1). Suspended organic material, such as proteins, scatters the light.

Figure 1. Raw spectra from filtered cultivation media. Baseline shifts are a consequence of light scattering by precipitated proteins.

These shifts were corrected by taking the second derivative of the spectra with a Savitzky-Golay second-order polynomial procedure. The resulting spectra, employed for multivariate calibration, are shown in Figure 2. After the spectra preprocessing, variable selection was carried out. For each wavenumber, the squared correlation coefficient, R2, of the linear regression between the measured response and the API concentration (7) was computed. No high correlations were found, as can be seen in Figure 3.

Figure 2. Spectra after preprocessing with the Savitzky-Golay second-derivative method, 35-point window.

Figure 3. Univariate correlation plot: squared correlation coefficients, R2, versus wavenumber. Gray areas indicate the selected wavenumbers.

Nevertheless, the intervals with the highest correlation with the analyte were selected (total of 105 variables). The exact variables are described in Table 1.

Table 1. Wavenumbers (cm-1) Selected by Univariate Correlation, Used for the PLS Model for API Content in the Rotary Drum Vacuum Filter Stream
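The univariate screening step can be illustrated as follows. This is a sketch under stated assumptions: the function names are hypothetical, and keeping the `n_keep` individual variables with the highest R2 is a simplification, since the text says that intervals with the highest correlation were selected but does not give the exact rule.

```python
def squared_correlation(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no information
    return sxy * sxy / (sxx * syy)


def r2_per_variable(spectra, conc):
    """spectra: one row per sample, one column per wavenumber."""
    n_vars = len(spectra[0])
    return [
        squared_correlation([row[j] for row in spectra], conc)
        for j in range(n_vars)
    ]


def select_top(r2, n_keep):
    """Indices of the n_keep variables with the highest R2 (assumed rule)."""
    ranked = sorted(range(len(r2)), key=lambda j: r2[j], reverse=True)
    return sorted(ranked[:n_keep])
```

With preprocessed spectra as rows, `select_top(r2_per_variable(spectra, conc), 105)` would return 105 column indices, matching the number of variables reported above, though not necessarily the same ones.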

The PLS model, built with 7 latent variables, showed improved linearity (R2 = 0.953), with a root-mean-square error of cross-validation, RMSECV (eq 1), of 0.014%. The validation data set was predicted by this model with an error, RMSEP (eq 2), of 0.012%, indicating that there was no overfitting. If too many latent variables were used, the model would be overfitted: it would depend too heavily on the calibration data and would give poor prediction results on new samples, which is not the case. Figure 4 shows the prediction results for both calibration and validation data sets.

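\[ \mathrm{RMSECV} = \sqrt{\frac{\sum_{i=1}^{n} \left( Y_{i(\mathrm{cv})} - Y_{i(\mathrm{ref})} \right)^{2}}{n}} \tag{1} \]
\[ \mathrm{RMSEP} = \sqrt{\frac{\sum_{i=1}^{n} \left( Y_{i(\mathrm{pred})} - Y_{i(\mathrm{ref})} \right)^{2}}{n}} \tag{2} \]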
Figure 4. Calibration building results. (circle marks) - calibration set; (triangular marks) - external validation set; RMSECV = 0.014%, RMSEP = 0.012%.

In the equations above, n stands for the number of prediction samples, Yi(cv) and Yi(pred) stand for the predicted values for cross-validation and external validation, respectively, and Yi(ref) stands for the reference value of sample i.
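Both error measures are root-mean-square differences between predicted and reference values; they differ only in where the predictions come from (left-out cross-validation samples for RMSECV, an external validation set for RMSEP). A minimal sketch, with a hypothetical function name:

```python
import math


def rmse(predicted, reference):
    """Root-mean-square error between predicted and reference values."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(predicted)
    squared = sum((p - r) ** 2 for p, r in zip(predicted, reference))
    return math.sqrt(squared / n)
```

Calling `rmse` on cross-validation predictions gives RMSECV, and calling it on external validation predictions gives RMSEP, both in the units of the reference values (here % w/w).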

The random splitting of the full data set into training (calibration) and test (validation) sets, commonly used in multivariate modeling, influences the final results. The idea of the bootstrap is to assess the variance of the estimates by resampling from the set of samples. The resampling procedure is repeated a large number of times. The variation observed among these resampled sets is assumed to be representative of how the samples may vary when drawn from the population [e.g., see ref 7]. In order to assess the effect of sampling on the reliability of the prediction error estimates, a bootstrap (8) resampling procedure was performed on the whole data set with 1000 iterations (Figure 5). The 95% confidence interval for RMSEP was found to be from 0.011 to 0.015%, which corroborates the first RMSEP result of 0.012%.

Figure 5. Validation results from the bootstrap resampling method, 1000 runs. Calibration/validation data splitting = 1/1.
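The idea of a percentile bootstrap interval for a prediction error can be sketched as below. Note that this is a deliberate simplification and not the procedure used in the study: it resamples, with replacement, the validation residuals of one fixed model, whereas the procedure described above re-splits the whole data set 1/1 and would rebuild the model at each of the 1000 iterations. The function names and the fixed seed are assumptions made for illustration.

```python
import math
import random


def rmse_of_residuals(residuals):
    """Root-mean-square of (predicted - reference) residuals."""
    return math.sqrt(sum(e * e for e in residuals) / len(residuals))


def bootstrap_rmse_interval(residuals, n_iter=1000, level=0.95, seed=0):
    """Percentile bootstrap interval for the RMSE of a set of residuals."""
    if not residuals:
        raise ValueError("residuals must not be empty")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(residuals)
    stats = []
    for _ in range(n_iter):
        # Draw n residuals with replacement and recompute the statistic.
        resample = [residuals[rng.randrange(n)] for _ in range(n)]
        stats.append(rmse_of_residuals(resample))
    stats.sort()
    tail = (1.0 - level) / 2.0
    lo = stats[int(tail * n_iter)]
    hi = stats[min(n_iter - 1, int((1.0 - tail) * n_iter))]
    return lo, hi
```

A narrow interval around the single-split estimate, as reported above, suggests that the error estimate does not hinge on one particular choice of validation samples.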

3.2. In-Routine Method Validation. Accuracy. The accuracy is expressed as the root-mean-square error of prediction (RMSEP) of an independent data set. The method was implemented as a routine analysis for analysts and plant personnel. NIR predictions for 75 production samples were compared with the results obtained on the same samples by the reference method, resulting in an RMSEP of 0.01% (w/w). This value supports the accuracy of the model, since it agrees with the RMSECV and is less than 1.4 times the reference method error (0.02%). Figure 6 shows the validation results. Because samples are only from the production plant, the concentration range in Figure 6 (from 0.25 to 0.36%) is substantially narrower than that used for calibration (see Figure 5).

Figure 6. Routine validation results. NIR predicted 75 samples from the production plant with a prediction error of 0.01% (w/w).

Precision. The method's precision, i.e., the closeness of agreement between a series of measurements obtained from multiple sampling of the same sample, was assessed by its repeatability (9). Three samples with nominal concentrations were measured by NIRS, seven replicates each. The average standard deviation calculated was 0.007, leading to a confidence interval (95%) for the analytical result of x ± 0.02%. The precision of the NIR method was found to be identical to that of the reference method.
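As a plausibility check on the stated interval, one can assume (this is not stated in the text) that it is a two-sided Student's t interval based on the seven replicates, i.e., 6 degrees of freedom, for which the 97.5% quantile is 2.447:

```latex
\[
x \pm t_{0.975,\,6}\, s \;=\; x \pm 2.447 \times 0.007 \;\approx\; x \pm 0.017
\]
```

Under that assumption the half-width rounds to 0.02%, consistent with the reported value.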

3.3. Adapting the NIR Method to Different Unit Operations. When membrane filtration is used, the soluble product is recovered in the ultrafiltration permeate, which is then subjected to a final concentration in a nanofiltration unit (see Figure 7). A major process modification such as upgrading from a rotary vacuum drum filter (RVDF) to a membrane system was expected to render the former calibration obsolete. However, ultrafiltration permeates, nanofiltration retentates, and RVDF samples differ mainly in API and impurity concentrations. As such, the first approach was to predict ultra- and nanofiltration samples using the existing model. The results are presented in Figure 8, where the original calibration range is highlighted.

Figure 7. Schematic description of the membrane processes.

Figure 8. Ultra- and nanofiltration samples predicted by the rotary drum vacuum filter model. Reference values (black squares); NIR predicted values (white squares); model's calibration range (gray area).

Although the accuracy at higher concentrations was not very good, the predictive ability of the PLS model for samples far outside the calibration range was very promising. The linearity of the prediction results was evaluated by plotting the predicted values versus the reference values (Figure 9). A correlation coefficient of 0.99 was achieved with a slope of 0.81, thus indicating that the variable selection and the spectral preprocessing assured the model's selectivity toward the desired analyte.

Figure 9. Linearity check of the extrapolated values. The dotted line shows the y = x line.
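A linearity check of this kind amounts to an ordinary least-squares fit of predicted against reference values. The sketch below uses a hypothetical function name and is not the authors' code. It makes explicit why a correlation coefficient close to 1 can coexist with poor accuracy at high concentrations: the correlation measures how well the points follow a straight line, while a slope below 1 means the predictions fall increasingly short of the reference as concentration rises.

```python
import math


def linear_fit(reference, predicted):
    """Least-squares slope, intercept and Pearson r of predicted vs reference."""
    n = len(reference)
    mx = sum(reference) / n
    my = sum(predicted) / n
    sxx = sum((x - mx) ** 2 for x in reference)
    syy = sum((y - my) ** 2 for y in predicted)
    sxy = sum((x - mx) * (y - my) for x, y in zip(reference, predicted))
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r
```

For example, predictions that are exactly 0.81 times the reference values give r = 1 and slope = 0.81, so a sample at 2% (w/w) would be predicted as 1.62% despite the perfect correlation.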

Considering these results, the data were added to the calibration set, and the former PLS model was updated using the previously established building conditions (preprocessing, selected variables, and number of latent variables). The RMSECV of this model (Figure 10) was 0.02%, which is not significantly higher than in the previous case, considering that this value is directly affected by the concentration range of the analyte.

Figure 10. Model extension to the wider operational range of ultra- and nanofiltration. API values predicted by NIR (expressed as percentage) plotted against the reference method values (intercept = 0.006, slope = 0.981, R2 = 0.993, RMSECV = 0.02%).

4. Conclusion

An NIR method is proposed as an alternative to the labor-intensive spectrophotometric method for the assay of an active pharmaceutical ingredient in filtered fermentation culture media. The prediction results of the NIR method, developed for the rotary drum filter permeate stream, proved to be consistent with those provided by the reference method.

The same model can also be used for samples from nanofiltration and ultrafiltration processes. The prediction ability of the original model was enhanced by the simple addition of new data. If a proper selection of calibration samples is performed, the same calibration can be used in different unit operations of the production process. The minor changes to the original calibration model require an effort that is more than offset by the shorter calibration development time and by savings in analytical costs.

The use of an NIR method has a positive impact on production scheduling, since the downtime between process steps is shortened and night shifts can operate without laboratory personnel.


Licínia Rodrigues gratefully acknowledges the financial support from the Portuguese Foundation for Science and Technologies (grant BDE/15514/2004). The authors also thank Companhia Industrial Produtora de Antibióticos S.A. (CIPAN) in Portugal for providing the best conditions to carry out this work in the downstream plant and for authorizing its publication.