Gas chromatography/flame ionisation detection mass spectrometry for the detection of endogenous urine metabolites for metabonomic studies and its use as a complementary tool to nuclear magnetic resonance spectroscopy



Metabonomics is a relatively new field of research in which the total pool of metabolites in body fluids or tissues from different patient groups is subjected to comparative analysis. Nuclear magnetic resonance (NMR) spectroscopy is the technology that is currently most widely used for the analysis of these highly complex metabolite mixtures, and hundreds of metabolites can be detected without any upfront separation. We have investigated in this study whether gas chromatography (GC) separation in combination with flame ionisation detection (FID) and mass spectrometry (MS) detection can be used for metabolite profiling from urine. We show that although GC sample preparation is much more involved than for NMR, hundreds of metabolites can reproducibly be detected and analysed by GC. We show that the data quality is sufficiently high – particularly if appropriate baseline correction and time-warping methods are applied – to allow for data comparison by chemometrics methods.

A sample set of urines from eleven healthy human volunteers was analysed independently by GC and NMR, and subsequent chemometrics analysis of the two datasets showed some similar features. As judged by NIST database searches of the GC/MS data some of the major metabolites that are detected by NMR are also visible by GC/MS. Since in contrast to NMR every peak in GC corresponds to a single metabolite, the electron ionisation spectra can be used to quickly identify metabolites of interest if their reference spectra are present in a searchable database. In summary, we show that GC is a method that can be used as a complementary tool to NMR for metabolite profiling of urine samples. Copyright © 2006 John Wiley & Sons, Ltd.

Over the last decade a significant amount of research has been targeted towards the ‘-omics’ technologies, namely, genomics and transcriptomics, which examine genetic complement and gene expression; proteomics, which involves the analysis of protein expression and cell signalling; metabolomics, which measures metabolic regulation and fluxes in individual cells or cell types; and metabonomics, which investigates systemic biochemical profiles and monitors the regulation of function in whole organisms by analysing biofluids and tissues. All these technologies are used to describe the biology of complex systems and the results have highlighted how the different levels of biomolecular organisation and control are interdependent, and can also be influenced by external factors such as environment or nutritional status.1, 2 A response to a stressor, such as a toxin, is sequential in its nature, but the onset and duration of changes in gene expression, protein synthesis and post-translational modification, and subsequent effects on metabolic processes differ significantly.1, 3 Therefore, measuring a biological system at only a single, fixed time point can be misleading, since metabolic fluxes occur very rapidly. Metabonomic technology has shown that it is possible to collect and analyse biological samples at multiple time points within a study efficiently and to relate the findings to biological endpoints such as pathology or disease.4

Nuclear magnetic resonance (NMR) spectroscopy has, until recently, been the analytical tool of choice for metabonomic studies.5–8 With automation, little sample preparation and simple data set conversion, the technique is relatively quick and provides a complete overview of a biological sample in one experiment. NMR-based metabonomics has been extensively used in the area of pre-clinical toxicology studies.9, 10 Inter-animal, diurnal variation, and other influences have been characterised,11 and first applications in human studies have been shown.12–14 Although NMR has many advantages including relatively high-throughput sampling, and new developments to improve sensitivity, such as cryo-cooled probehead technology, the technique has three major drawbacks:

  • Sensitivity is a significant issue, with many metabolites being present at sub-µg/mL concentrations which can make them difficult to detect by NMR methods.

  • Since there is no separation step involved in the NMR process, signals from water and from xenobiotics need to be removed so that they do not influence the post-acquisition statistical treatment. In the process of removing such signals there is a chance that potentially important metabolite signals are also removed.

  • With the type of high field instrument required for metabonomic studies, cost can be a major consideration in the development of a metabonomic screen.

More recently the use of chromatographic methods for the analysis of urinary metabolites has been proposed; being able to analyse individual metabolites in a complex mixture by mass spectrometry (MS) methods after separation can potentially be very useful in metabolite identification and quantitation. A few studies have shown that it is feasible to use LC/MS technology for the analysis of urine metabolites.15–18

Capillary gas chromatography (GC)19–24 has long been an established technique for the analysis of volatile components. In conjunction with flame ionisation detection (FID),25, 26 relative concentrations of analytes can be determined based on their total carbon content. Targeted screening of biological fluids and microbial matrices has been carried out by this technique making use of its inherent sensitivity and selectivity.27–29 Fully automated measurement of certain urine metabolites in a clinical environment by GC/MS has been described.30, 31 One study targeted at urinary organic acids as markers of a metabolic disorder has made use of principal component analysis (PCA) for data analysis.32

The majority of studies using combined GC/MS and chemometrics techniques have been in the field of metabolomics for plant studies.33–36 A few studies have used the technique to study biological matrices, including mouse tissue extracts,37 human blood plasma,38 and human serum.39 In these cases two-dimensional (2D) GC/time-of-flight (TOF) analysis has been utilised.

In recent years algorithms for feature alignment – so-called time-warping – have been improved to effectively deal with subtle shifts in elution time due to variations in injection timing, column age and recovery between runs, carrier flow rates, temperature variations, overloading by high concentration components, etc.40

In this study we have explored the use of GC coupled with both FID and MS detection for the ‘global’ metabolite analysis of human urine.



Urine samples were collected from healthy human male volunteers and were stored at −80°C until needed.

Phosphate buffer (0.2 M) was prepared using Na2HPO4 and NaH2PO4 (both Sigma-Aldrich, Schnelldorf, Germany) in water (including 20% D2O; Goss Scientific Instruments, Great Baddow, UK), containing 1 mM 3-(trimethylsilyl)propionic-2,2,3,3-d4 acid sodium salt (Sigma-Aldrich, Gillingham, UK) as a reference standard and 3 mM NaN3 (Sigma-Aldrich, Steinheim, Germany).

Active urease (Fisher Scientific, Loughborough, UK) was made up to 80 mg/mL in purified water (18 MΩ, Millipore UK, Watford, UK) forming a saturated solution.

Acetone (Riedel de Haën, Seelze, Germany) was used as supplied for protein precipitation.

N,O-Bis(trimethylsilyl)trifluoroacetamide (BSTFA) with 1% trimethylsilyl chloride (Sigma-Aldrich, St. Louis, MO, USA) and pyridine (Fluka, Buchs, Germany) were used as supplied for the silylation derivatisation process.

Test mix

GC test standards, J and W (part number 200-0310), were purchased from Fisher Scientific.

Sample preparation and derivatisation method

A saturated solution of 80 mg/mL urease was sonicated for 15 min and centrifuged (Thermo Electron, Basingstoke, UK) for 5 min at 15 000 rpm. The supernatant was used for the sample preparation.

A 120 µL aliquot of the urine sample was transferred into a v-bottomed Eppendorf tube (Eppendorf, Cambridge, UK) and centrifuged for 5 min at 15 000 rpm. A 100 µL aliquot of the supernatant was transferred to a v-bottomed Eppendorf tube and 12 µL of the urease solution were added. The sample was placed in an incubator (Heraeus, Hanau, Germany) at 37°C for 2 h. After this, 120 µL of acetone were added and the sample was placed in an ice bath for 30 min. The sample was centrifuged for 15 min at 15 000 rpm. A 100 µL aliquot of the supernatant was transferred into an HPLC vial.

A Cyclone drying unit (Presearch, Hitchin, UK) was used to dry the samples down. The Cyclone pressure was initially maintained at 400 mbar for 10 min, then reduced to 100 mbar for 20 min, before finally being reduced to 7 mbar for the remaining 45 min. The temperature was set at a constant 30°C throughout and the unit rotated at 900 rpm.

Pyridine (50 µL) and BSTFA (200 µL) were added to the sample. The sample vial was sealed and placed in an oven for 30 min at 110°C. The sample was mixed vigorously using a Whirlimixer (Fisherbrand, Loughborough, UK) for 30 s and returned to the oven for 30 min at 110°C. The sample was centrifuged at 15 000 rpm and the supernatant transferred to a suitable vial for analysis.

Control samples were prepared as for the urine sample preparation but with the substitution of 120 µL of water for the urine.

Each of the eleven samples was aliquoted into four wells to study the intra-sample variability introduced by the sample preparation process. Each of the wells was analysed three times to observe inter-injection variability. To reduce errors from trends, the order of the analysis was randomised, and 10 min blanks of 2 µL dichloromethane (DCM) were injected between samples to prevent carryover. A GC standard test mix and control samples were also run at regular points throughout the study.
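The injection scheme described in this paragraph can be sketched in a few lines. This is only an illustration of the design (eleven samples, four aliquots each, three injections per aliquot, randomised order); the tuple layout, the fixed seed and the function name `build_run_order` are assumptions made for the example and are not taken from the study.

```python
import random


def build_run_order(n_samples=11, n_aliquots=4, n_injections=3, seed=0):
    """Return a randomised list of (sample, aliquot, injection) tuples.

    The seed is a hypothetical choice so that the example is repeatable.
    """
    runs = [
        (sample, aliquot, injection)
        for sample in range(1, n_samples + 1)
        for aliquot in range(1, n_aliquots + 1)
        for injection in range(1, n_injections + 1)
    ]
    random.Random(seed).shuffle(runs)
    return runs


run_order = build_run_order()
print(len(run_order))  # 11 * 4 * 3 = 132 urine injections
```

Blanks, GC standards and control samples would be interleaved with this list; they are left out here to keep the sketch short.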


The samples were analysed on an Agilent 6890 gas chromatograph equipped with a flame ionisation detector and a 5973MSD mass spectrometer (Agilent Ltd., Stockport, UK). A volume of 2 µL of the derivatised sample was injected onto the system using a CTC CombiPAL (Presearch) equipped with a Peltier cooler set to 4°C.

Helium gas (Air Products, Crewe, UK) at a constant pressure of 25 psi was used as the carrier gas. The use of a twin hole ferrule (Agilent Ltd.) in the injector allowed the use of two 30 m ZB-5MS columns with an i.d. of 0.25 mm and a film thickness of 0.25 µm (Phenomenex, Macclesfield, UK).

The samples were introduced using the split mode at a ratio of 10:1. The injector temperature was set at 250°C. The GC oven temperature was initially maintained at 50°C for 1 min then programmed to 325°C at a rate of 3.27°C/min and maintained at 325°C for 5 min.
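As a quick check on the length of each run, the oven programme above can be converted into a total programme time. The sketch below only does the arithmetic on the values stated in the text; the function name is illustrative, and oven cool-down between injections is not included.

```python
def oven_run_time(initial_temp_c, initial_hold_min, final_temp_c,
                  ramp_c_per_min, final_hold_min):
    """Total time (min) of a single-ramp GC oven programme."""
    ramp_min = (final_temp_c - initial_temp_c) / ramp_c_per_min
    return initial_hold_min + ramp_min + final_hold_min


# 50 °C for 1 min, 3.27 °C/min to 325 °C, hold 5 min
total_min = oven_run_time(50.0, 1.0, 325.0, 3.27, 5.0)
print(f"{total_min:.1f} min")  # 90.1 min
```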

The flame ionisation detector was operated at 300°C throughout the analysis. Gas flows of 40 mL/min hydrogen, 450 mL/min air and 45 mL/min nitrogen were used for FID. The mass spectrometer transfer line was set at 325°C.

The mass spectrometer was operated in electron ionisation (EI) mode over a mass range of m/z 50–650, with a multiplier voltage of 2106 V and a data collection rate of 1.47 scans/s. The EI source was operated at 70 eV and a temperature of 230°C with the quadrupole region at 150°C.

NMR sample preparation

Three aliquots of urine from each sample were prepared by mixing 460 µL of urine with 230 µL of phosphate buffer (0.2 M Na2HPO4/NaH2PO4 solution in water (including 20% D2O), containing 1 mM 3-(trimethylsilyl)propionic-2,2,3,3-d4 acid sodium salt (TSP) as a reference standard and 3 mM NaN3, pH 7.4–7.5) in Eppendorf microcentrifuge tubes. After the solution had been spun for 10 min at 15 000 rpm (4°C), 610 µL of the supernatant were transferred into 5 mm NMR tubes for subsequent analysis.

NMR acquisition

The buffered urine samples were analysed on a VARIAN Inova (Varian Inc., Scientific Instruments, Yarnton, UK) 500 MHz instrument using a 5 mm Varian 500 ID PFG probe head operating at 27°C and an autosampler. A NOESY presat pulse sequence with 2 s presaturation delay and 100 ms mixing time was used. Sixty-four transients were collected into 32 k data points using a spectral width of 10 000 Hz. Prior to Fourier transformation a line broadening factor of 1 Hz was applied. The acquisition time was 4.5 min and, including a pre-acquisition delay of 3 min and automatic shimming, the total run time was 12 min per sample.

Data analysis


The GC-FID-MS data was split into MS and FID data. The chromatograms were aligned using an in-house algorithm41 that is similar to correlation-optimised warping (COW) 42 and has been adapted for chromatographic metabonomic data. In this procedure the assumption is made that the order of elution is the same for all samples and that the retention-time variations are small since the data is typically generated under identical conditions and over a short time period. Post-alignment, a segmented baseline correction was applied. Baseline points were selected to be common across all the aligned chromatograms and were placed in persistent noise regions between peaks or bands.
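The order of operations described here (align first, then subtract a segmented baseline) can be illustrated with a deliberately simplified sketch. It is not the in-house algorithm and it is not COW: instead of piecewise warping it applies one rigid shift found by cross-correlation, and the baseline is a straight line drawn between user-supplied baseline points. All function names and the toy traces are assumptions made for the example.

```python
def best_lag(reference, sample, max_lag):
    """Integer lag (in points) that maximises the overlap of sample with reference."""
    n = len(reference)
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += reference[i] * sample[j]
        if score > best_score:
            best, best_score = lag, score
    return best


def shift(sample, lag):
    """Move sample[i + lag] to position i, padding the ends with zeros."""
    n = len(sample)
    return [sample[i + lag] if 0 <= i + lag < n else 0.0 for i in range(n)]


def subtract_baseline(trace, baseline_points):
    """Subtract a piecewise-linear baseline drawn through the given indices."""
    corrected = list(trace)
    for left, right in zip(baseline_points, baseline_points[1:]):
        y0, y1 = trace[left], trace[right]
        for i in range(left, right + 1):
            frac = (i - left) / (right - left)
            corrected[i] = trace[i] - (y0 + frac * (y1 - y0))
    return corrected


# Toy traces: the sample shows the same two peaks, eluting 3 points late.
reference = [0.0] * 30
reference[10], reference[20] = 5.0, 2.0
sample = [0.0] * 30
sample[13], sample[23] = 5.0, 2.0

lag = best_lag(reference, sample, max_lag=5)
aligned = shift(sample, lag)

# Toy trace with a rising baseline and one peak at index 2.
drifting = [1.0, 2.0, 6.0, 4.0, 5.0]
flat = subtract_baseline(drifting, [0, 4])
```

A rigid shift is only adequate when the retention-time drift is the same along the whole chromatogram; the piecewise methods cited in the text exist precisely because that is usually not the case.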

The NMR spectra were phased, baseline corrected and automatically referenced to TSP.

Data reduction

Each GC-FID chromatogram was divided into regions of 0.1 min retention time width and the integrals over each of these regions were calculated. This data ‘bucketing’ simplifies statistical analysis and reduces the impact of small variations in retention time. This yielded 710 buckets.
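A minimal sketch of this bucketing step follows. It assumes the FID trace is sampled at 20 points/s (the rate quoted for the detector elsewhere in the paper), uses a simple rectangle-rule integral, and takes a 71 min retention window purely because 710 buckets of 0.1 min span 71 min; the window limits themselves are not stated in the text.

```python
def bucket_integrals(trace, rate_hz=20.0, width_min=0.1):
    """Integrate a trace over consecutive retention-time buckets.

    A trailing partial bucket, if any, is dropped.
    """
    points_per_bucket = round(width_min * 60.0 * rate_hz)  # 120 points at 20 Hz
    dt_min = 1.0 / (rate_hz * 60.0)                        # spacing between points, in min
    n_buckets = len(trace) // points_per_bucket
    return [
        sum(trace[k * points_per_bucket:(k + 1) * points_per_bucket]) * dt_min
        for k in range(n_buckets)
    ]


# Flat dummy trace covering an assumed 71 min window at 20 points/s.
trace = [1.0] * (71 * 60 * 20)
buckets = bucket_integrals(trace)
print(len(buckets))  # 710
```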

Similarly, each NMR spectrum was reduced to 198 buckets by integrating areas of 0.04 ppm width in the spectra from 0.2–10.0 ppm. The region of the water signal from 6.18–4.5 ppm was excluded because the intense water signal, which is affected by water pre-saturation, does not carry additional information and skews the analysis. At the same time, the buckets containing the citrate signals (2.74–2.5 ppm) were merged to further reduce the effects of pH-related variation in the chemical shifts.
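The figure of 198 buckets can be checked from the stated ranges. The sketch below assumes that the excluded water region and the merged citrate region each cover a whole number of 0.04 ppm buckets, and that the merged citrate region counts as a single bucket; the function name is illustrative.

```python
def n_buckets(low_ppm, high_ppm, width_ppm=0.04):
    """Number of buckets of the given width that span a chemical-shift range."""
    return round((high_ppm - low_ppm) / width_ppm)


full = n_buckets(0.2, 10.0)      # 245 buckets over the whole range
water = n_buckets(4.5, 6.18)     # 42 buckets excluded around the water signal
citrate = n_buckets(2.5, 2.74)   # 6 buckets merged into one

remaining = full - water - (citrate - 1)
print(remaining)  # 198
```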

All chromatograms and NMR spectra were normalised to a constant integrated intensity of 100 units, to compensate for large variations in urine concentration.
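Normalisation to a constant total can be sketched as follows. The example uses made-up bucket values for a concentrated and a twofold more dilute sample with the same relative composition; after scaling, both give the same profile.

```python
def normalise_to_total(buckets, total=100.0):
    """Scale bucket integrals so that they sum to a constant total."""
    current = sum(buckets)
    return [total * value / current for value in buckets]


# Made-up bucket integrals: same composition, different overall concentration.
concentrated = [2.0, 6.0, 12.0]
dilute = [1.0, 3.0, 6.0]

profile_a = normalise_to_total(concentrated)
profile_b = normalise_to_total(dilute)
```

Because every bucket is divided by the total, a single very large peak lowers the normalised values of all other buckets, which is worth keeping in mind when reading the loadings.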

Data analysis

The NMR and GC-FID data were mean-centred. Then, principal component analysis (PCA)43 was carried out separately for the NMR and GC-FID data to identify similarities and differences between the two acquisition methods.


All GC/MS data was initially analysed using Agilent Enhanced Chemstation software version C.00.00. Data formatting, data pre-processing for the NMR data and additional data pre-processing for the GC-FID-MS data were carried out using in-house software written in EXCEL and MATLAB. Chemometrics analysis was performed using either the in-house software written in MATLAB (version 7, The MathWorks, Natick, MA, USA) or SIMCA-P+ software (version 10.0, Umetrics AB, Umeå, Sweden).


FID versus MS detection of metabolites

Unlike the case with mass spectrometry, the signal produced by a flame ionisation detector is not dependent on the ionisation energy of a particular molecule. The response from FID is proportional to the total carbon content of the component and therefore provides a more realistic indication of the relative concentrations of components within a sample. Based on this argument the FID data was used for the statistical analysis.

FID has the added advantage of faster scanning speed, producing more data points for each peak than the quadrupole mass spectrometer used in this study. The flame ionisation detector scans at 20 scans/s and is capable of handling peak widths of 0.01 min in comparison with 1.47 scans/s for the quadrupole mass spectrometer. The resulting increased resolution is a significant advantage in the subsequent chemometrics analysis. The MS and FID signals were recorded simultaneously with the system described but, for the reasons mentioned above, FID was used for peak detection. The integrated signal intensities of the EI spectra are plotted as TICs (total ion chromatograms). Alignment of the FID and TIC signals for identification purposes is achieved through ‘signal alignment’ in the Chemstation software. Two-point alignment allows peaks to be matched. As will be described below, the MS data can be used to great advantage in the identification of metabolites.
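The difference in sampling density can be made concrete with the figures quoted above. The sketch takes 0.01 min as the width of a narrow peak and counts how many data points each detector would record across it; the function name is illustrative.

```python
def points_across_peak(peak_width_min, rate_per_s):
    """Number of data points recorded across a peak of the given width."""
    return peak_width_min * 60.0 * rate_per_s


fid_points = points_across_peak(0.01, 20.0)  # about 12 points
ms_points = points_across_peak(0.01, 1.47)   # fewer than 1 scan
print(f"FID: {fid_points:.1f}, MS: {ms_points:.2f}")
```

On these numbers, a peak only 0.01 min wide would be described by about twelve FID points but by less than one full MS scan.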

A typical FID trace that was obtained for the analysis of the human urine samples is shown in Fig. 1(a); a magnified image of the low-intensity region is shown in Fig. 1(b).

Figure 1.

(a) FID chromatogram obtained from urine sample from volunteer 5 over the time range 10–80 min. (b) Magnified FID chromatogram from volunteer 5 showing low-intensity region.

The full chromatogram (Fig. 1(a)) shows that a few peaks dominate the trace; at first glance the number of features does not seem particularly high. Magnifying the lower-intensity region of the FID trace (Fig. 1(b)) gives an impression of the number of individual peaks that can be observed using this technique. Two to three hundred peaks were typically observed for a human urine sample when the integration parameters were carefully chosen to prevent peak splitting. Only about five to ten peaks of comparable intensity were detected in the control samples (data not shown).

Reproducibility of GC-FID

The key element of metabonomics is the comparative analysis of metabolite pools and the detection of differences in metabolite concentrations between individual samples and groups of samples. It is therefore of utmost importance to investigate and minimise variability that is introduced through sample preparation and analytical procedures.

A big advantage of NMR over GC is the minimal degree of sample preparation that is necessary to obtain spectra of urine. Any sample preparation has the potential to introduce variation and we have investigated the impact of the GC sample preparation process on data quality.

The experiment was designed in such a way that it enabled both a comparison of the data for the eleven human urine samples and an investigation of the reproducibility of various processes. Four aliquots of each of the eleven samples were taken through the whole sample preparation process and each aliquot was then analysed in triplicate. The order of sample injection was randomised and 10 min blanks of 2 µL DCM were injected between samples to minimise carryover. Runs of a commercially available GC standard were acquired at various times during the analysis to monitor instrument and method performance over the 11-day duration of the analysis.

Two main causes of variability were investigated: (1) variability introduced by the instrument, e.g. injection volumes, column and instrument performance; and (2) variability introduced by the sample preparation process and by sample stability over the course of the analysis.

The variability described in (1) was investigated by monitoring a range of peaks from the GC standard and the volunteer urine samples. Retention times and peak areas were recorded and used to monitor chromatographic integrity (reproducibility of peak intensity, shift in retention times) over the analysis period. Seven peaks (retention times: 14.49, 16.85, 17.99, 19.43, 21.22, 24.22 and 26.92 min) that represent compounds with a range of elution times were chosen from the GC standard.

Another seven peaks (retention times: 10.91, 17.88, 27.55, 36.04, 43.01, 49.33 and 73.15 min) were chosen from the urine volunteer samples. The peaks were chosen to represent early to late eluting components and also a variety of intensities.

Table 1 shows the variability of peak area and retention times of peaks from the GC standard and peaks from two of the samples. The values shown for the ‘volunteer’ data are based on calculations from twelve injections, while those for the GC standard are based on ten injections. The statistical analysis shown in Table 1 was carried out prior to pre-processing. Although the selected peaks are different for the standard samples and the urine samples, the results allow for an informal comparison of the different degrees of variability introduced by the sample preparation of the urine matrix.

Table 1. Comparison of inter-injection and intra-sample variability prior to pre-processing
Source of variability                        RT (min)       Peak area
Inter-injection variability (GC standard)    0.032–0.037    3.08–13.75
Intra-sample variability (volunteer 11)      0.029–0.221    7.03–42.84
Intra-sample variability (volunteer 3)       0.010–0.025    7.85–39.22
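The text does not state which statistic the ranges in Table 1 report. As a sketch of how a spread of this kind is commonly summarised, the example below computes the sample standard deviation and the relative standard deviation (in %) for one peak over repeated injections. The peak areas are made-up numbers, and the choice of statistic is an assumption rather than the authors' stated method.

```python
import statistics


def spread(values):
    """Return (sample standard deviation, relative standard deviation in %)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # n - 1 in the denominator
    return sd, 100.0 * sd / mean


# Made-up peak areas for one peak over five repeated injections.
areas = [98.0, 102.0, 100.0, 96.0, 104.0]
sd, rsd = spread(areas)
print(f"SD = {sd:.2f}, RSD = {rsd:.2f}%")
```

Repeating the calculation for each monitored peak and quoting the smallest and largest value would give a range of the same form as the table entries.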

The two extremes for intra-sample variability are given, volunteer 3 showing the smallest and volunteer 11 showing the largest variability. The data shows the extent of the hardware variation (as judged from multiple injections of the GC standard, Table 1) and the additional variation which can be introduced by sample preparation. The goal of metabonomics experiments is to measure inter-sample variability, i.e. differences in metabolite concentrations between samples. It is therefore of key importance to minimise the ‘noise’ that is introduced by the variabilities shown in Table 1 before proceeding to chemometrics analysis. To achieve this we have performed three pre-processing steps: (i) baseline correction, (ii) time-warping, and (iii) normalisation.

Figure 2 shows the GC-FID traces from twelve injections, three injections from each of four separately prepared aliquots from the same volunteer. In this case there is variability – resulting in increased differences in peak areas – introduced through sample preparation.

Figure 2.

Variability of signal from twelve injections of volunteer 8. Three injections were made from each of four samples, prepared as individual aliquots.

As can be seen in Fig. 2, the reproducibility of retention times is fairly good (0.1 min). However, to reduce variability, time-warping and baseline correction were applied in order to generate the best possible dataset for subsequent multivariate analysis. Minimising any variability that may have been introduced during the process of analysis has ensured that differences seen between samples are actually due to differences in metabolite concentrations between the samples and not to instrumental variables.

Figure 3 shows the same twelve traces after processing. It can be seen that across the whole dataset the variation between peak retention times is negligible and for all samples from the same volunteer the peak areas are very similar.

Figure 3.

Signals from volunteer 8 after time-warping and baseline correction. Note the marked improvement in signal alignment with reduced variance between peaks.

We have shown that the combined inter-injection and intra-sample variability is small compared with the variability seen between urine samples from different human volunteers, particularly if baseline correction and time-warping methods are applied post-acquisition. This is a prerequisite for the successful application of GC in a metabonomics setting where differences between groups of samples are explored by chemometrics methods. Although there was less variation in the NMR data, it can be expected that by further minimising variability introduced by the GC sample preparation it will be possible to generate even ‘tighter’ datasets. One means of achieving this could be to use robotics for the sample preparation.

Analysis of differences in metabolite concentrations and metabolite identification

Significant information can be gained immediately simply by visual comparison of the FID chromatograms from all eleven volunteers. The information obtained in this way is useful for two purposes. First, by inspection of the overlaid traces it is possible to judge the quality of the dataset; certain ‘landmark’ peaks or sets of peaks should show little variation throughout the whole dataset. Second, metabolite peaks that show marked intensity differences between the volunteers can be detected at this stage. Figure 4 shows a stack plot of the traces from the eleven human volunteers. Examples of marked differences in peak intensities are shown by arrows: the metabolite giving rise to the peak at ∼72 min is highly abundant in volunteer 10 but is only present at a low level in volunteer 2, and the metabolite eluting at ∼38 min is only present in significant amounts in volunteer 5. It is well known that factors such as diet, time of sampling, etc., can contribute significantly to the variability of urine metabolite profiles. For this reason it will be important to minimise these effects through careful design of studies.

Figure 4.

Overlay of FID chromatograms from the urine samples taken from the eleven human volunteers. Examples for major differences between the volunteers are marked by arrows.

In contrast to NMR each metabolite gives rise to a single peak in GC-FID. This facilitates interpretation compared with NMR where the multiplicity of signals often gives rise to more signal overlap. In addition, the metabolites in each peak can be subjected to MS analysis post-column. We have used EI in this study. Although in most cases it is difficult to determine the mass of the intact metabolite using this ionisation technique, EI does produce a characteristic fragmentation pattern for each analyte. The fact that this pattern is highly reproducible and characteristic for an individual analyte has been used as a rationale to build comprehensive searchable databases containing EI spectra for large numbers of organic molecules; the NIST database44 is well established.

Figure 5 shows the spectrum extracted for the medium-intensity peak at ∼38 min from Fig. 4. The results from a NIST database search suggest that this compound is either mannitol or glucitol. These two sugars are epimers that produce identical EI spectra and hence our data did not allow us to discriminate between them; for simplicity this metabolite will be named mannitol throughout the rest of this publication. To confirm the identity of this component, standards could be run to determine whether a difference in retention time is observed. GC/PICI-MS could be used to obtain molecular weight information.

Figure 5.

EI spectrum for the peak eluting at 38 min in the FID trace. This peak clearly discriminates volunteer 5 from all the other volunteers (cf. Fig. 4). Chemometrics analysis (see below) also showed that this peak makes a major contribution to the large variation shown by volunteer 5 in comparison to the other volunteers.

In order to investigate which metabolites correspond to high-intensity peaks in the FID trace, EI spectra for the ten most intense peaks were used to search against the NIST database; the results of the database search are shown in Table 2. The quality of the match is indicated by the percentage where a 100% value indicates a complete match of all the ions and their intensities compared to a reference spectrum. All the components that have been tentatively identified in this way are well known urine metabolites.45
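To make the idea of a percentage match concrete, the sketch below scores two centroided spectra by their cosine similarity, scaled so that identical relative intensities give 100. This is a generic similarity measure chosen for illustration; it is not the scoring used by the NIST search software, and the spectra are invented numbers rather than library entries.

```python
import math


def match_score(query, reference):
    """Cosine similarity of two spectra (dicts of m/z -> intensity), scaled to 0-100."""
    mz_values = set(query) | set(reference)
    dot = sum(query.get(mz, 0.0) * reference.get(mz, 0.0) for mz in mz_values)
    norm_q = math.sqrt(sum(v * v for v in query.values()))
    norm_r = math.sqrt(sum(v * v for v in reference.values()))
    if norm_q == 0.0 or norm_r == 0.0:
        return 0.0
    return 100.0 * dot / (norm_q * norm_r)


# Invented spectra for illustration only.
reference = {73: 100.0, 147: 40.0, 205: 25.0, 319: 30.0}
query = {73: 200.0, 147: 80.0, 205: 50.0, 319: 60.0}   # same pattern, twice the signal
unrelated = {57: 100.0, 71: 60.0}                      # no ions in common

same_pattern = match_score(query, reference)
no_overlap = match_score(unrelated, reference)
```

The score depends only on relative intensities, so the doubled query spectrum still matches fully, whereas a spectrum sharing no ions with the reference scores zero.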

Table 2. The ten most intense peaks in the GC-FID trace (ranked by their integrated FID peak areas) are shown. MS data for these peaks were used to search the NIST database. All these peaks gave reasonably high scores in the database search and could be tentatively identified
Peak intensity (% total integrated area)    RT (min)    NIST match (/100)
19%                                         17.95       Phosphate tri-TMS (91.3)
13%                                         36.07       Hippuric acid TMS (89.2)
5%                                          43.77       Uric acid tetrakis-TMS (85.0)
6%                                          27.58       Creatinine enol tri-TMS (83.4)
3%                                          25.01       Aminomalonic acid tri-TMS (89.9)
4%                                          36.94       Citric acid tetrakis-TMS (91.7)
4%                                          38.84       Mannitol hexakis-TMS (91.0)
2%                                          27.6        Trihydroxybutyric acid tetrakis-TMS (86.9)
2%                                          10.23       Oxalic acid di-TMS (77.0)
2%                                          19.56       Aminobutanoic acid tetrakis-TMS (57.5)

Since the majority of the above urine metabolites contain more than one group that can be derivatised by BSTFA, incomplete derivatisation could give rise to several peaks per metabolite in the gas chromatogram. This could complicate the chemometrics analysis, particularly if the intensity ratios between multiple forms of one metabolite are not constant. A way of overcoming this will be to determine the identity of all metabolites in the GC chromatogram, and it will be highly desirable to build such a database in the future. However, no partially derivatised forms of the metabolites listed in Table 2 were found in the NIST database. In addition, no evidence for the incomplete derivatisation of the metabolites in Table 2 was observed when searching the chromatogram for characteristic fragment ions. Nevertheless, this issue is of high importance for the application of the methodology in metabonomics studies and will need further investigation.

A comparison of the dominant GC and NMR metabolites shows that creatinine and citrate also give rise to some of the most prominent signals in NMR spectra from human urine. This indicates that both methods, although they are based on very different detection principles, generate comparable results in terms of detection of the major urine metabolites. However, while the GC-FID signal of a metabolite is proportional to its total carbon content, the 1H-NMR signal intensity is correlated with the number of protons present in a particular metabolite. This is expected to have a significant influence on the relative signal intensities for the two analysis methods.

The use of headspace-GC could open up new avenues for the metabonomic analysis of biological samples that are difficult to analyse by other methods.46, 47 The performance of GC in the analysis of blood-derived samples will also need to be explored in the future.

Chemometrics analysis

As has been shown above (cf. Fig. 4) major differences between the urine GC-FID traces can be observed by visual inspection. However, to understand how the multiplicity of up/down-regulated signals affects the overall variance between the samples, chemometrics analysis must be applied to show how the volunteer sets relate to each other.

From the raw NMR data it was obvious that the metabolite concentrations in the urine sample from volunteer 10 were much lower than in all other volunteers. This was mirrored in the low intensities of the GC-FID signals. More dilute samples are more prone to water suppression effects such as phase distortions around the water resonance. To prevent these spectra from overly influencing the chemometrics analysis for the entire volunteer sample set, the NMR and GC-FID-MS data sets from volunteer 10 were both excluded from the chemometrics analysis.

PCA was used for the chemometrics analysis of the data. It was applied to the bucketed data (i.e. data sets of 198 variables for NMR and 710 variables for GC-FID-MS) and each of these data sets was reduced to 2 or 3 linear combinations of buckets. Each linear combination of buckets can be thought of as a direction in 198-dimensional or 710-dimensional space for NMR and GC-FID-MS, respectively. The first PCA component corresponds to the direction of maximum variance of the data. The second component corresponds to the direction of maximum variance subject to the constraint that it is orthogonal to the first, and so on. The coefficients that define how the buckets are combined to create the linear combinations are called ‘loadings’. The values for the linear combination of each sample are called ‘scores’. The scores plot of PC1, often called t1 (first component), against PC2 or t2 (second component) is a simple 2D visualisation of the main variation in the data and can be used to see which samples are similar and which are dissimilar. The corresponding loadings plots for each principal component indicate which parts of the spectra are responsible for the observed patterns in the scores plots.
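The relationship between mean-centring, loadings and scores described above can be sketched with a singular value decomposition. This is a generic NumPy illustration on random dummy data, not the in-house MATLAB or SIMCA-P+ implementation used in the study; the matrix size, the seed and the artificial offset added to one bucket are assumptions made for the example.

```python
import numpy as np


def pca(X, n_components=2):
    """PCA of a samples-by-buckets matrix via SVD of the mean-centred data."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                       # mean-centre each bucket
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]   # one row per sample
    loadings = Vt[:n_components].T                    # one row per bucket
    explained = (s ** 2) / np.sum(s ** 2)             # fraction of variance per component
    return scores, loadings, explained[:n_components]


# Dummy data: 10 samples, 6 buckets; bucket 2 is raised in the first 5 samples.
rng = np.random.default_rng(0)
X = rng.random((10, 6))
X[:5, 2] += 3.0

scores, loadings, explained = pca(X)
dominant_bucket = int(np.argmax(np.abs(loadings[:, 0])))
```

In this toy case the first component is dominated by the bucket that differs between the two groups, which is the kind of reading applied to the loadings plots in the text.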

A 2D PCA scores plot (t[1] vs. t[2]) for the GC-FID data is shown in Fig. 6. With very few exceptions (one data point each for volunteers 8 and 11) the samples cluster quite tightly. This highlights the high reproducibility of sample preparation and data generation for samples from the same volunteer. A data point from volunteer 4 (2B4C) was removed from the set as it was noted that the sample had not been injected.

Figure 6.

PCA scores plot (t1 vs. t2) from the FID data.

t[1] reflects both within-volunteer and between-volunteer variation, while t[2] reflects comparatively little within-volunteer variation and mainly captures between-volunteer variation. In particular, the metabolic profile of the samples from volunteer 5 is clearly different from those of all the other volunteers, as the cluster of data points for this volunteer deviates markedly from the clusters for the other volunteers.

The loadings plot for PC2 is shown in Fig. 7(a) (GC-FID). It shows that three peaks drive the separation: hippurate, citrate and mannitol, all identified by NIST database searches. The citrate and hippurate concentrations appear to be negatively correlated with the mannitol concentration: volunteers with high PC2 scores have relatively high concentrations of mannitol and uric acid and lower concentrations of citrate and hippurate. Conversely, volunteers with negative PC2 scores have relatively low concentrations of mannitol and uric acid and higher concentrations of citrate and hippurate.

Figure 7.

PCA loadings plots for the GC-FID (a) and NMR (b) data.

At this point it is interesting to compare these results with the results from the chemometrics analysis of the corresponding NMR dataset (Fig. 7(b)). Although the NMR PCA did not give exactly the same results as the GC PCA, PC2 from the GC analysis shows patterns similar to those of PC3 from the NMR analysis (cf. Figs. 7(a) and 7(b)).

Figure 8 shows 1D scores plots corresponding to the directions shown in Fig. 7(a), i.e. PC2 for the GC data, and Fig. 7(b), i.e. PC3 for the NMR data. Not only do the loadings plots for these two components contain similar information, but these scores plots show that the relative rankings of the volunteers are also broadly similar, with volunteers 5 and 2 having the maximum and minimum values, respectively.

Figure 8.

Comparison of PCA 1D scores plots for GC data (PC2, left-hand side) and NMR data (PC3, right-hand side).
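One way to put a number on how closely two such rankings agree is a Spearman rank correlation between the per-volunteer scores on the two components. The sketch below is not part of the original analysis. It uses synthetic values for ten hypothetical volunteers, the helper names `ranks` and `spearman` are illustrative, and ties are assumed to be absent. Because the sign of a principal component is arbitrary, only the magnitude of the coefficient is meaningful in such a comparison.

```python
import numpy as np

def ranks(values):
    """Return 1-based ranks of the values (assumes there are no ties)."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values)
    r = np.empty(len(values), dtype=float)
    r[order] = np.arange(1, len(values) + 1)
    return r

def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return float(np.corrcoef(ranks(a), ranks(b))[0, 1])

# Synthetic per-volunteer scores (NOT the study data): ten hypothetical
# volunteers, with the second platform equal to the first plus noise.
rng = np.random.default_rng(1)
gc_pc2_demo = rng.normal(size=10)
nmr_pc3_demo = gc_pc2_demo + 0.3 * rng.normal(size=10)

rho = spearman(gc_pc2_demo, nmr_pc3_demo)
print(f"Spearman rho = {rho:.2f}; |rho| is what matters, since PC signs are arbitrary")
```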


During the last decade the classical methods for the analysis of synthetic organic molecules, particularly mass spectrometry and NMR spectroscopy, have been increasingly used to analyse highly complex biological samples. In this context GC has been applied to the analysis of plant metabolites and to the screening of urine samples in the clinic. We have explored the use of GC for the ‘global’ analysis of human urine metabolites and have compared the results with data obtained by NMR spectroscopy.

Our results suggest that GC is a valuable tool that is complementary to 1H-NMR for the metabonomics analysis of urine samples. It has several strengths: (a) the ability to identify metabolites via MS information; (b) the very low sample consumption, which makes it well suited to biological samples that are available only in small amounts; (c) the higher inherent resolution of GC compared with LC; (d) the fact that FID is more universal than other separative detectors; and finally (e) its relatively low cost. Once databases with retention times for the urine metabolites become available, it will not even be necessary to have MS capabilities to perform basic metabonomics studies.

By applying baseline correction and time-warping methods post-acquisition, we have shown that the combined inter- and intra-sample variability is small compared with the variability seen between urine samples from different human volunteers. These results are very encouraging for the use of GC as a metabonomics analysis platform: it can be expected that the differences between groups in a typical study (diseased or treated versus control) will be much larger than the differences within the group of samples from healthy volunteers used in this study.
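The comparison of within-volunteer and between-volunteer variability can be made concrete by splitting the total sum of squares of a set of scores into a within-group part and a between-group part, as in a one-way analysis of variance. The sketch below is only an illustration of that decomposition under the assumption that each sample carries a volunteer label. The function name `within_between_ss` and the numbers are hypothetical and are not taken from the study.

```python
import numpy as np

def within_between_ss(scores, groups):
    """Split the total sum of squares of 1D scores into within-group and
    between-group contributions (one-way ANOVA decomposition)."""
    scores = np.asarray(scores, dtype=float)
    groups = np.asarray(groups)
    grand_mean = scores.mean()
    within = 0.0
    between = 0.0
    for g in np.unique(groups):
        s = scores[groups == g]
        within += np.sum((s - s.mean()) ** 2)             # spread around the group mean
        between += len(s) * (s.mean() - grand_mean) ** 2  # group mean vs. grand mean
    return within, between

# Hypothetical scores for replicate samples from two volunteers (not study data).
demo_scores = [1.0, 1.2, 0.8, 3.0, 3.1, 2.9]
demo_groups = [1, 1, 1, 2, 2, 2]

within, between = within_between_ss(demo_scores, demo_groups)
print(f"within = {within:.2f}, between = {between:.2f}")
```

A between-group value that is large relative to the within-group value corresponds to the tight per-volunteer clusters seen in a scores plot.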

We have also explored whether there were similarities in the results of the chemometrics analyses of the GC and NMR datasets. It was interesting to see that, although the NMR PCA did not give exactly the same results as the GC PCA, PC2 from the GC analysis had patterns similar to those of PC3 from the NMR analysis. The fact that GC-FID-MS and NMR have different underlying principles of detection could account for differences in the distribution of groups between the two analytical methods in the other principal components, and may therefore also influence the outcome of the PCA.

In summary, we have shown that GC can resolve well over 200–300 human urine metabolites in a 90 min run; this number compares well with what can be detected in a typical NMR experiment.48 In principle it will be possible to resolve and detect even more metabolites by increasing the analysis time and by further optimising the derivatisation procedures; however, long data acquisition times and complex sample preparation procedures are clearly drawbacks to using GC as a higher-throughput technique for analysing samples from large clinical trials. We believe that GC has the potential to become a powerful tool that is complementary to NMR for the analysis of metabonomics samples in exploratory biomarker work.