• iron-based inks;
• tannins;
• multivariate analysis;
• cultural heritage conservation;
• Raman spectra

In this work, multivariate data analysis methods were applied to the analysis and interpretation of micro-Raman spectra collected from a broad set of historical iron-based ink samples, previously characterised for their content of organic acids (gallic acid, ellagic acid and protocatechuic acid). The proposed method relies on principal component analysis of the noisy spectra typically obtained from original, degraded, organic samples, where fluorescence can affect the Raman signal. The signal components could be distinguished from the noise components and then used to build a linear discriminant analysis (LDA) model, which separated the spectra into three classes. Selecting pure signal factors also improved the effectiveness and performance of partial least squares (PLS) regression algorithms, allowing quantification of condensed tannic acid residuals. Applying multivariate methods to discriminate signal from noise removes the need for spectral data manipulation (filtering, smoothing and differentiation). The resulting classification method for discriminating historic inks and the regression method for determining condensed tannic acid residuals support the use of Raman analysis for fluorescing organic materials, and may provide scholars with information on ink composition and potentially on ink provenance. Copyright © 2013 John Wiley & Sons, Ltd.
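The workflow summarised above (principal component analysis to separate signal from noise components, followed by LDA on the retained signal scores) can be illustrated with a minimal sketch. This is not the authors' implementation: the spectra are synthetic, the band positions, noise level and number of retained components (`n_signal`) are assumptions chosen for illustration, and the helper names `lda_fit` and `lda_predict` are hypothetical. It uses NumPy only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for noisy spectra: 3 classes, 30 spectra each, 200 channels.
# Band positions, width and noise level are illustrative assumptions.
channels = np.arange(200)
centres = [50, 100, 150]
y = np.repeat(np.arange(3), 30)
bands = np.array([np.exp(-(channels - c) ** 2 / (2 * 8.0 ** 2)) for c in centres])
X = bands[y] + rng.normal(scale=0.3, size=(y.size, channels.size))

# PCA via SVD of the mean-centred data; rows of Vt are the loadings.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = s ** 2 / np.sum(s ** 2)   # fraction of variance per component

# Keep only the leading components judged to carry signal (assumed here: 2).
n_signal = 2
scores = Xc @ Vt[:n_signal].T


def lda_fit(T, labels):
    """Fit LDA with a pooled within-class covariance (hypothetical helper)."""
    classes = np.unique(labels)
    means = np.array([T[labels == c].mean(axis=0) for c in classes])
    resid = T - means[np.searchsorted(classes, labels)]
    pooled = resid.T @ resid / (len(labels) - len(classes))
    priors = np.array([np.mean(labels == c) for c in classes])
    return classes, means, np.linalg.inv(pooled), priors


def lda_predict(model, T):
    """Assign each row of T to the class with the largest linear discriminant."""
    classes, means, pooled_inv, priors = model
    linear = T @ pooled_inv @ means.T
    const = -0.5 * np.sum((means @ pooled_inv) * means, axis=1) + np.log(priors)
    return classes[np.argmax(linear + const, axis=1)]


# Fit on even-indexed spectra, evaluate on odd-indexed ones.
train = np.arange(y.size) % 2 == 0
model = lda_fit(scores[train], y[train])
pred = lda_predict(model, scores[~train])
accuracy = np.mean(pred == y[~train])
print("variance explained by first 4 PCs:", np.round(explained[:4], 3))
print("held-out accuracy:", accuracy)
```

In this toy setting the drop in explained variance after the retained components is what marks the boundary between signal and noise factors. With real spectra that boundary would have to be judged from the data, and a fluorescence background may occupy components of its own.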