Classifying Degraded Modern Polymeric Museum Artefacts by Their Smell

Abstract The use of VOC analysis to diagnose degradation in modern polymeric museum artefacts is reported. Volatile organic compound (VOC) analysis is a successful method for diagnosing medical conditions but to date has found little application in museums. Modern polymers are increasingly found in museum collections but pose serious conservation difficulties owing to unstable and widely varying formulations. Solid‐phase microextraction gas chromatography/mass spectrometry and linear discriminant analysis were used to classify samples according to the length of time they had been artificially degraded. Accuracies in classification of 50–83 % were obtained after validation with separate test sets. The method was applied to three artefacts from collections at Tate to detect evidence of degradation. This approach could be used for any material in heritage collections and more widely in the field of polymer degradation.


Introduction
The Supplementary Information provides details of the objects analysed, the SPME-GC/MS method used, the data processing methods and the statistical analysis.

SPME-GC/MS analysis of modern polymeric samples
The modern polymeric objects analysed for this research and their polymer types are shown in Table 1. The objects were chosen to represent the range of material types found in museum collections. Table 1 also indicates the objects from which pieces were taken and artificially degraded for 0, 2, 4, 6, 8 and 10 weeks at 80°C and 65% relative humidity (RH). The degraded pieces were sampled as described below. The total number of samples analysed was 211.
[a] Objects from which pieces were taken and artificially degraded for 0, 2, 4, 6, 8 and 10 weeks at 80°C and 65% RH. The degraded pieces were also sampled and analysed using SPME-GC/MS.
The development of the solid-phase microextraction gas chromatography/mass spectrometry (SPME-GC/MS) method has been described elsewhere. [1] Objects were sampled by grinding using an Everise Rotary Tool (Code: N60GR) to collect 50 mg samples. It had previously been found that grinding homogenised the sample surface area and led to more repeatable analysis. Samples were then placed in 20 mL Chromacol headspace sample vials (20-HSV T229) and sealed with Chromacol 18 mm Magnetic Screw Caps with a 1 mm Silicone/PTFE liner, not prefitted (18-MSC-ST101), for 6-10 days at room temperature to allow volatile organic compounds (VOCs) to accumulate in the headspace. The laboratory temperature was monitored and the average value was found to be 25°C over a 6-month period. HS-SPME-GC/MS analysis was performed at room temperature using a DVB/CAR/PDMS SPME fibre (50/30 μm) (Supelco, 57298-U). Analysis time for the samples was 1 h and for the standard solutions was 20 s.
For the analysis of objects at Tate, a DVB/CAR/PDMS SPME fibre (50/30 μm) (Supelco, 57298-U) was placed close to the object for one week at room temperature (see Figure 1 in the main paper).
Analytes were recovered from the fibre by heating in the injection port of a Perkin Elmer Clarus 500 gas chromatograph equipped with a Combipal PAL System (CTC Analytics) autosampler and coupled to a Perkin Elmer Clarus 560D mass spectrometer. A VOCOL column (Supelco, 20% phenyl-80% methylpolysiloxane), 60 m in length and 0.25 mm in diameter, was used to separate the VOCs with the following oven programme: initial temperature of 50°C (held for 5 min), ramp of 10°C/min to 100°C, then 5°C/min to 200°C, then 2°C/min to 220°C, which was held for 20 min. The carrier gas was helium with a constant flow of 1 cm³/min. The injector temperature was 250°C and the injector was used in splitless mode with a 1 min injection. The interface and source temperatures were 200°C and 180°C respectively. Mass spectra were collected in electron ionisation (EI) mode at 70 eV and recorded from m/z 45-300 with a scan time of 0.4 s and an interscan delay of 0.05 s. VOC peak identification was performed using the NIST 2005 Mass Spectra Library V2.1.
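As a consistency check on the oven programme described above, the total run time follows directly from the stated ramps and holds. The following is a minimal sketch; the function and variable names are illustrative, and the scan count assumes (hypothetically) that spectra are acquired over the whole run with no solvent delay.

```python
# Oven programme as stated in the text.
INITIAL_TEMP_C = 50.0
INITIAL_HOLD_MIN = 5.0
RAMPS = [  # (target temperature in °C, ramp rate in °C/min)
    (100.0, 10.0),
    (200.0, 5.0),
    (220.0, 2.0),
]
FINAL_HOLD_MIN = 20.0

# Scan timing as stated in the text.
SCAN_TIME_S = 0.4
INTERSCAN_DELAY_S = 0.05


def total_run_time_min(initial_temp, initial_hold, ramps, final_hold):
    """Sum the hold times and the time spent on each temperature ramp."""
    total = initial_hold
    temp = initial_temp
    for target, rate in ramps:
        total += (target - temp) / rate  # minutes needed for this ramp
        temp = target
    return total + final_hold


run_time_min = total_run_time_min(
    INITIAL_TEMP_C, INITIAL_HOLD_MIN, RAMPS, FINAL_HOLD_MIN
)
# Assumption: scanning runs for the full programme (no solvent delay).
scans_per_run = run_time_min * 60 / (SCAN_TIME_S + INTERSCAN_DELAY_S)

print(f"run time: {run_time_min:.0f} min")
print(f"approximate scans per run: {scans_per_run:.0f}")
```

Under these assumptions the programme lasts 60 min, which corresponds to about 8000 scans per run.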
Representative chromatograms showing detected VOC emissions from samples of two different modern polymeric objects are shown below in Figure 1.

Data processing using XCMS Online
XCMS Online is an online version of the XCMS metabolomics software from the Scripps Center for Metabolomics, which allows users to upload and process chromatographic data. [2] For this work, the chromatographic files generated by the GC/MS TurboMass version 6.1.0 chromatographic software were converted to .mzXML files. These were then uploaded to XCMS Online and processed. Separate jobs were run for each polymer type. Pairwise jobs were run for the samples, while Single jobs were run for the analysis of objects at Tate. The parameters used are shown in Table 2. XCMS Online produced Extracted Ion Chromatograms (EICs), each of which corresponded to the detected signal across bins of m/z values of a set width. [3] Peak detection using m/z values is preferred by the developers of XCMS over detection using retention time, as it is subject to less experimental variation. Figure 2 shows an example of an EIC produced using XCMS Online, which corresponds to the detection of dimethyl phthalate from cellulose acetate and cellulose propionate samples. Multiple EICs were produced for each analyte and those that were most consistent across the samples were chosen for further analysis. The EICs were filtered using a second-derivative Gaussian function and integrated. The resulting peaks were matched across different samples and retention time correction was also performed. A table of integrated peak areas and their corresponding retention times (RTs) and m/z values was then produced for each polymer type.
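To illustrate what an EIC is, the sketch below groups the signal of each scan into m/z bins of fixed width and follows each bin over retention time. It is not the XCMS implementation: the function name, the bin width of 1 m/z unit and the scan data are assumptions made for illustration only (the parameters actually used are those in Table 2).

```python
from collections import defaultdict


def extract_ion_chromatograms(scans, bin_width=1.0):
    """Build one chromatogram per m/z bin.

    scans: list of (retention_time_s, [(mz, intensity), ...]).
    Returns {bin_index: [(retention_time_s, summed_intensity), ...]},
    where bin_index * bin_width is the lower m/z edge of the bin.
    """
    eics = defaultdict(list)
    for rt, peaks in scans:
        per_bin = defaultdict(float)
        for mz, intensity in peaks:
            per_bin[int(mz // bin_width)] += intensity
        for bin_index, total in per_bin.items():
            eics[bin_index].append((rt, total))
    return dict(eics)


# Made-up scans spaced 0.45 s apart (0.4 s scan + 0.05 s interscan delay).
scans = [
    (600.0, [(163.0, 10.0), (163.4, 5.0), (77.1, 2.0)]),
    (600.45, [(163.1, 40.0), (77.0, 3.0)]),
    (600.9, [(163.0, 12.0)]),
]
eics = extract_ion_chromatograms(scans)
for bin_index in sorted(eics):
    print(bin_index, eics[bin_index])
```

Each resulting trace could then be smoothed and integrated to give one peak area per bin and retention time.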
The chromatographic data obtained from analysis of objects at Tate was processed in the same way.

Microsoft Excel 2016 data processing workflow
The datatables produced by XCMS Online were transferred into Microsoft Excel 2016 for further processing. They were first sorted in order of increasing RT and then filtered to remove peaks corresponding to compounds that are not of interest to this work, e.g. siloxanes generated through degradation of the GC column. At each RT, multiple peaks were identified, corresponding to different m/z values found in the mass spectra. For each RT, the three most intense peaks were retained and the others were discarded.
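The pruning step described above was carried out in Excel; the sketch below expresses the same logic in code under stated assumptions. The function name, the dictionary layout and the data are hypothetical, "most intense" is taken to mean largest integrated area, and unwanted peaks are assumed to be flagged by retention time. Real data would need an RT tolerance rather than exact matching.

```python
from collections import defaultdict


def prune_peak_table(peaks, excluded_rts=(), n_keep=3):
    """Sort by RT, drop flagged RTs, keep the n_keep largest peaks per RT."""
    by_rt = defaultdict(list)
    for peak in peaks:
        if peak["rt"] in excluded_rts:
            continue  # e.g. an RT attributed to column bleed
        by_rt[peak["rt"]].append(peak)
    pruned = []
    for rt in sorted(by_rt):
        ranked = sorted(by_rt[rt], key=lambda p: p["area"], reverse=True)
        pruned.extend(ranked[:n_keep])
    return pruned


# Made-up peak table: RT in minutes, m/z, integrated area.
peaks = [
    {"rt": 21.5, "mz": 163, "area": 9.0e5},
    {"rt": 21.5, "mz": 77, "area": 3.0e5},
    {"rt": 21.5, "mz": 194, "area": 1.5e5},
    {"rt": 21.5, "mz": 133, "area": 0.8e5},
    {"rt": 8.2, "mz": 74, "area": 4.0e5},
    {"rt": 15.0, "mz": 207, "area": 6.0e5},  # hypothetically flagged as bleed
]
pruned = prune_peak_table(peaks, excluded_rts={15.0})
for peak in pruned:
    print(peak)
```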
The NIST 2005 Mass Spectra library was then used along with the TurboMass software version 6.1.0 to identify which analytes corresponded to the RTs and m/z values detected. In some cases, peaks with the same RTs and m/z values occurring in different samples were found to correspond to more than one analyte. These were discarded and only peaks which could be unambiguously identified with a single analyte were used for further analysis.
Peak areas were then weighted using external standards, according to a published method. [1] This was done by using SPME-GC/MS to analyse standard solutions on each day on which the analysis of modern polymeric samples took place, with one standard solution analysed after every six samples. This allowed interday variation in instrument performance or environmental conditions to be taken into account. The solution used was the MISA Group 17 Non-Halogen Organic Mix certified reference material, with 2000 μg/mL of each component in methanol, diluted 1/50 in methanol. Peak areas from analysis of the standard solution were used to weight the peak areas of the analytes from the modern polymeric samples run on the same day.
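The arithmetic behind this step can be sketched as follows. The working concentration follows directly from the stated dilution. The weighting function is only an assumption: it divides each sample peak area by the mean peak area of the standards run on the same day, whereas the scheme actually used is the one in the published method. [1] All names and peak areas are illustrative.

```python
STOCK_CONC_UG_PER_ML = 2000.0  # each component, as stated in the text
DILUTION_FACTOR = 50           # diluted 1/50 in methanol

working_conc_ug_per_ml = STOCK_CONC_UG_PER_ML / DILUTION_FACTOR


def weight_by_daily_standard(sample_areas, standard_areas):
    """Assumed weighting: sample area / mean standard area for that day."""
    mean_standard = sum(standard_areas) / len(standard_areas)
    return [area / mean_standard for area in sample_areas]


# Made-up areas: on day B the instrument response is twice that of day A.
day_a = weight_by_daily_standard([2200.0, 550.0], [1000.0, 1200.0])
day_b = weight_by_daily_standard([4400.0], [2000.0, 2400.0])

print(working_conc_ug_per_ml)  # 40.0
print(day_a, day_b)
```

In this example the first sample on day A and the sample on day B receive the same weighted value, although their raw areas differ by a factor of two, which is the kind of interday drift the standards are meant to absorb.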
The natural log of each peak area was then calculated so that significant differences in low intensity peaks could be more clearly identified. Peak areas were then normalised to have a mean of 0 and a standard deviation of 1, so that different analytes could be more easily compared. This leads to some peak areas having negative values.
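A minimal sketch of this transformation is given below. The peak areas are made up, the function name is illustrative, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the text does not state which form was used.

```python
import math
import statistics


def log_and_standardise(areas):
    """Natural-log transform, then scale to mean 0 and standard deviation 1."""
    logs = [math.log(area) for area in areas]
    mean = statistics.mean(logs)
    sd = statistics.stdev(logs)  # assumption: sample standard deviation
    return [(value - mean) / sd for value in logs]


# Made-up weighted peak areas for one analyte across four samples.
areas = [1.2e4, 3.5e5, 8.0e5, 2.1e6]
z_scores = log_and_standardise(areas)
print([round(z, 2) for z in z_scores])
```

Samples whose log peak area lies below the analyte mean receive negative values, which is why negative peak areas appear in the processed datatables.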
The data processing workflow that was performed in Microsoft Excel 2016 is summarised in Figure 3.
The data produced from the analysis of artefacts at Tate was processed in the same way.

Linear Discriminant Analysis in IBM SPSS Statistics 22
The datatables for each polymer type generated using Microsoft Excel 2016 were transferred into IBM SPSS Statistics software version 22 for Linear Discriminant Analysis (LDA). An additional column was added so that samples could be classified as described in the main paper:
Class 1: samples artificially degraded for 0-4 weeks
Class 2: samples artificially degraded for 6-10 weeks
The class number was used as the Grouping Variable. A further column was added so that a test set of samples could be identified for model validation. This was used as the Selection Variable. A value of 1 was assigned to the training set of samples, while a value of 2 was assigned to the test set of samples. The parameters used for Linear Discriminant Analysis are shown in Table 3.

[Figure 3: Data processing workflow in Microsoft Excel 2016. Steps shown: datatable of peaks pruned to remove duplicates and ambiguous peaks; analytes identified using the NIST library; peak areas weighted using standards to account for interday variation; peak areas log-transformed and normalised to a mean of 0 and a standard deviation of 1.]
In an initial analysis, all detected VOCs were used as Independent Variables. Using the Tests of Equality of Group Means, the VOCs with the most significant differences in their peak areas between Class 1 and Class 2 were identified. Different combinations of these VOCs were then used as independent variables for further LDA. The combination of VOCs that produced the most accurate classifications of test sets of samples (validation) into the two classes described above are reported in the main paper.
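The analysis itself was run in SPSS. As a rough illustration of the underlying idea, the sketch below fits a two-class Fisher linear discriminant on two variables, using a selection value of 1 for training samples and 2 for test samples as described above. It is a stand-in and not the SPSS procedure: equal prior probabilities are assumed, all function names are hypothetical, and the standardised peak areas are made up.

```python
def mean_vec(rows):
    """Mean of each of the two variables."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def pooled_cov(groups):
    """Pooled within-group 2x2 covariance matrix."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    dof = 0
    for rows in groups:
        m = mean_vec(rows)
        for r in rows:
            d = [r[0] - m[0], r[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
        dof += len(rows) - 1
    return [[s[i][j] / dof for j in range(2)] for i in range(2)]


def fit_lda(class1, class2):
    """Return discriminant weights and the midpoint threshold."""
    m1, m2 = mean_vec(class1), mean_vec(class2)
    (a, b), (c, d) = pooled_cov([class1, class2])
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]
    diff = [m2[0] - m1[0], m2[1] - m1[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    mid = [(m1[0] + m2[0]) / 2, (m1[1] + m2[1]) / 2]
    return w, w[0] * mid[0] + w[1] * mid[1]  # equal priors assumed


def predict(model, x):
    w, threshold = model
    return 2 if w[0] * x[0] + w[1] * x[1] > threshold else 1


# Made-up rows: (class, selection, variable 1, variable 2).
data = [
    (1, 1, -1.0, 1.0), (1, 1, -0.8, 0.6), (1, 1, -1.2, 0.9), (1, 1, -0.6, 1.1),
    (2, 1, 1.0, -1.0), (2, 1, 0.8, -0.7), (2, 1, 1.2, -0.9), (2, 1, 0.7, -1.2),
    (1, 2, -0.9, 0.8), (2, 2, 0.9, -0.8),
]
train = {k: [(x, y) for c, s, x, y in data if s == 1 and c == k] for k in (1, 2)}
test = [(c, (x, y)) for c, s, x, y in data if s == 2]

model = fit_lda(train[1], train[2])
predictions = [predict(model, x) for _, x in test]
n_correct = sum(p == c for p, (c, _) in zip(predictions, test))
print(f"validation accuracy: {100 * n_correct / len(test):.0f}%")
```

Repeating the fit with different subsets of variables and comparing the resulting validation accuracies mirrors the variable selection described above.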
An example of the outputs from LDA using IBM SPSS Statistics software version 22 is shown in Table 4. These results were obtained from LDA of cellulose propionate (CP) samples using dimethyl phthalate and propanoic acid as the independent variables. As can be seen, the training set contained 13 samples, of which 7 belonged to group 1 (Class 1, the less degraded samples) and 6 belonged to group 2 (Class 2, the more degraded samples). Of the samples in Class 1, 5 out of 7 were accurately classified, while 5 out of 6 samples from Class 2 were classified accurately. The classification accuracy is thus 77% (10 of 13 samples), as shown in footnote (a) in Table 4.
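The 77% figure can be reproduced from the counts quoted above. The short sketch below does this; the function name is illustrative.

```python
def classification_accuracy(correct_per_class, total_per_class):
    """Percentage of samples assigned to their true class."""
    return 100.0 * sum(correct_per_class) / sum(total_per_class)


# Counts from the text: 5 of 7 in Class 1 and 5 of 6 in Class 2.
accuracy = classification_accuracy([5, 5], [7, 6])
print(f"{accuracy:.0f}%")  # 77%
```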
Validation accuracy is shown in the section marked "Cases Not Selected", indicating that the samples classified in this step were separate from those used to develop the model. The test set contained 6 samples, of which 3 belonged to Class 1 and 3 to Class 2. Accurate classifications were obtained for 2 out of 3 samples from Class 1, while all samples in Class 2 were accurately classified. The validation accuracy is thus 83% (5 of 6 samples), as shown in footnote (b) in Table 4. Cross-validation was also performed: a single sample is removed from the training set, the model is built from the remaining samples and then used to classify the removed sample. This is repeated for all samples. The cross-validation accuracy of the analysis shown below is 77% (footnote (d) in Table 4). The cross-validation accuracies are not reported in the main paper, as validation using a separate test set is more robust. A table such as Table 4 was produced for each validation using a different test set. Between 3 and 6 test sets were used for each polymer type and the validations were tested for multiple combinations of VOCs. The average of the validation accuracies achieved using different test sets for a particular combination of VOCs was used to assess which combination of VOCs gave the most accurate overall classifications.
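The leave-one-out procedure described above can be sketched as follows. For brevity the classifier is a nearest-class-mean rule on a single variable, used here purely as a stand-in for LDA; the function names and the data are hypothetical.

```python
def nearest_mean_class(train, x):
    """Assign x to the class whose training mean is closest (stand-in for LDA)."""
    best_label, best_dist = None, None
    for label in sorted({lab for _, lab in train}):
        values = [v for v, lab in train if lab == label]
        dist = abs(x - sum(values) / len(values))
        if best_dist is None or dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def leave_one_out_accuracy(samples):
    """Remove each sample in turn, refit on the rest, classify the removed one."""
    correct = 0
    for i, (x, label) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]
        if nearest_mean_class(train, x) == label:
            correct += 1
    return correct / len(samples)


# Made-up standardised peak areas with class labels (1 or 2).
samples = [
    (-1.2, 1), (-0.9, 1), (-0.7, 1), (0.3, 1),
    (0.8, 2), (1.0, 2), (1.3, 2), (-0.1, 2),
]
loo_accuracy = leave_one_out_accuracy(samples)
print(f"leave-one-out accuracy: {100 * loo_accuracy:.0f}%")
```

In this made-up example the two samples that sit close to the other class are misclassified when left out, giving 6 correct out of 8.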
Classification of the artefacts from Tate based on their VOC emissions was performed in the same way; however, ratios of VOCs rather than peak areas were used, as explained in the main paper. This was to allow for differences in object mass and in the volume of the headspace analysed.
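The reason a ratio helps can be shown with a small sketch. It assumes, as a simplification, that a larger object or a smaller headspace scales the peak areas of both VOCs by the same factor; the function name and the peak areas are illustrative.

```python
import math


def voc_ratio(area_a, area_b):
    """Ratio of two VOC peak areas measured in the same analysis."""
    return area_a / area_b


# Made-up peak areas for an organic acid and a plasticiser.
acid_area, plasticiser_area = 3.0e4, 6.0e5
scale = 7.5  # hypothetical common factor from object mass or headspace volume

small_object = voc_ratio(acid_area, plasticiser_area)
large_object = voc_ratio(acid_area * scale, plasticiser_area * scale)
print(small_object, large_object)
```

The common factor cancels, so the two ratios agree even though the absolute peak areas do not.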

Data from analysis of VOC emissions from modern polymeric samples

Data from analysis of VOC emissions from Cellulose propionate
The VOCs that were found to give the most accurate validations for cellulose propionate (CP) samples were propanoic acid and dimethyl phthalate. The results of LDA for these samples are summarised in Table 5. The data in the third row of results correspond to the results shown in Table 4.
[a] The test set corresponds to 6 samples, each taken from the same object but degraded for different lengths of time, either 0, 2, 4, 6, 8 or 10 weeks at 80°C and 65% RH.
The average classification accuracy was found to be 82% and the average validation accuracy was 78%. These figures are reported in the main paper.
It should be noted that 5 samples of cellulose acetate (CA) were also analysed using SPME-GC/MS analysis and the VOC data was processed in the same way as that for the CP samples. However, there were not enough artificially degraded samples to explore classification. Attempts at classifying the CA and CP samples as a grouped set were less successful than those for the CP samples on their own.

Data from analysis of VOC emissions from Cellulose nitrate
The VOCs that were found to give the most accurate validations for cellulose nitrate (CN) samples were furfural, a terpene identified as either 3-carene or β-terpinene, camphene and campholenal.
The results of LDA for these samples are summarised in Table 6.
[a] The test set corresponds to 6 samples, each taken from the same object but degraded for different lengths of time, either 0, 2, 4, 6, 8 or 10 weeks at 80°C and 65% RH.
The average classification accuracy was found to be 93% and the average validation accuracy was 83%. These figures are reported in the main paper.

Data from analysis of VOC emissions from Polyethylene
The VOCs that were found to give the most accurate validations for polyethylene (PE) samples were decane and camphor. The results of LDA for these samples are summarised in Table 7.
[a] The test set corresponds to 6 samples, each taken from the same object but degraded for different lengths of time, either 0, 2, 4, 6, 8 or 10 weeks at 80°C and 65% RH.
The average classification accuracy was found to be 63% and the average validation accuracy was 53%. These figures are reported in the main paper.

Data from analysis of VOC emissions from Polystyrene
The VOCs that were found to give the most accurate validations for polystyrene (PS) samples were cis-β-methylstyrene, acetophenone and (1-methylethyl)-benzene. The results of LDA for these samples are summarised in Table 8.
[a] The test set corresponds to 6 samples, each taken from the same object but degraded for different lengths of time, either 0, 2, 4, 6, 8 or 10 weeks at 80°C and 65% RH.
[b] This test set corresponds to samples taken from different objects, with one sample being taken from each.
The average classification accuracy was found to be 77% and the average validation accuracy was 62%. These figures are reported in the main paper.

Data from analysis of VOC emissions from Polyurethane
The VOCs that were found to give the most accurate validations for polyurethane (PUR) samples were camphor, phenol, pentanal, 3,5-dimethyloctane and styrene. The results of LDA for these samples are summarised in Table 9.
[a] The test set corresponds to 6 samples, each taken from the same object but degraded for different lengths of time, either 0, 2, 4, 6, 8 or 10 weeks at 80°C and 65% RH.
The average classification accuracy was found to be 87% and the average validation accuracy was 79%. These figures are reported in the main paper.

Data from analysis of VOC emissions from Poly(vinyl chloride)
The VOCs that were found to give the most accurate validations for poly(vinyl chloride) (PVC) samples were hexanal, 2-ethylhexanol and limonene. The results of LDA for these samples are summarised in Table 10.
[a] The test set corresponds to 6 samples, each taken from the same object but degraded for different lengths of time, either 0, 2, 4, 6, 8 or 10 weeks at 80°C and 65% RH.
The average classification accuracy was found to be 82% and the average validation accuracy was 50%. These figures are reported in the main paper.

Analysis at Tate
Results from the analysis at Tate are shown below. In the case of the CA objects, the ratio of acetic acid to dimethyl phthalate was used and compared with the ratio of propanoic acid to dimethyl phthalate from the CP samples analysed in the earlier part of the research. A comparison of these ratios is shown in Figure 4. These data show that the ratio for both objects from Tate is low and more similar to that of objects degraded for 0-4 weeks than for 6-8 weeks, indicating lower levels of organic acid emissions and higher levels of plasticiser content.
Using a classification based on Linear Discriminant Analysis and the ratios of the relevant peak areas, performed as described above, both Tate objects were classified as being part of Class 1.