Cell-specific chemotyping and multivariate imaging by combined FT-IR microspectroscopy and orthogonal projections to latent structures (OPLS) analysis reveals the chemical landscape of secondary xylem


  • András Gorzsás,

    1. Umeå Plant Science Centre, Department of Forest Genetics and Plant Physiology, Swedish University of Agricultural Sciences (SLU), Umeå SE-90183, Sweden
    • To be considered as joint first authors.

  • Hans Stenlund,

    1. Computational Life Science Cluster (CLiC), Department of Chemistry, Umeå University, Umeå SE-90187, Sweden
    • To be considered as joint first authors.

  • Per Persson,

    1. Department of Chemistry, Umeå University, Umeå SE-90187, Sweden
  • Johan Trygg,

    1. Computational Life Science Cluster (CLiC), Department of Chemistry, Umeå University, Umeå SE-90187, Sweden
  • Björn Sundberg

    Corresponding author
    1. Umeå Plant Science Centre, Department of Forest Genetics and Plant Physiology, Swedish University of Agricultural Sciences (SLU), Umeå SE-90183, Sweden
      (fax +46 90 786 8165; e-mail bjorn.sundberg@slu.se).



Fourier-transform infrared (FT-IR) spectroscopy combined with microscopy enables chemical information to be acquired from native plant cell walls with high spatial resolution. Combined with a 64 × 64 focal plane array (FPA) detector, 4096 spectra can be simultaneously obtained from a 0.3 × 0.3 mm image; each spectrum represents a compositional and structural ‘fingerprint’ of all cell wall components. For optimal use and analysis of such a large amount of information, multivariate approaches are preferred. Here, FT-IR microspectroscopy with FPA detection is combined with orthogonal projections to latent structures discriminant analysis (OPLS-DA). This allows for: (i) the extraction of spectra from single cell types, (ii) identification and characterization of different chemotypes using the full spectral information, and (iii) further visualization of the pattern of identified chemotypes by multivariate imaging. As proof of concept, the chemotypes of Populus tremula xylem cell types are described. The approach revealed unknown features about chemical plasticity and patterns of lignin composition in wood fibers that would have remained hidden in the dataset with traditional data analysis. The applicability of the method to Arabidopsis xylem and its usefulness in mutant chemotyping is also demonstrated. The methodological approach is not limited to xylem tissues but can be applied to any plant organ/tissue also using other techniques such as Raman and UV microspectroscopy.


Plant cell walls are complex but well-organized biomaterial structures mainly composed of lignin, cellulose and hemicelluloses. The chemical composition and polymer structure of cell walls vary greatly between tissues and cell types. There is also considerable plasticity within a cell type in response to developmental and environmental cues, as well as between different genotypes and species. Secondary lignocellulosic cell walls are an important resource for biorefining for energy and biomaterials, and there is an increasing demand for cell wall chemotyping to explore the impact of the environment and genetic regulation on the plant material.

Chemical analysis of cell walls is inherently challenging due to their complex structures and chemical bonding between the different polymers, and is ideally approached by complementary techniques. Analyses after wet chemical extraction are important and informative, but will unavoidably lose information about the intact wall structure and are also prone to chemical modifications during extraction. As an alternative, spectroscopic techniques, such as FT-IR, have the potential to provide a chemical fingerprint of intact cell wall samples that is rich in information.

Fourier transform infrared (FT-IR) spectroscopy can provide rather high-throughput analysis and is widely used in studying various areas of the biology, degradation and physics of cell walls (Zhong et al., 2000; Moore and Owen, 2001; Boeriu et al., 2004; Maurer et al., 2004; Dokken et al., 2005a; Christensen et al., 2006). Combined with a microscope, FT-IR measurements provide spatially resolved information from specific tissues (microspectroscopy). The common setup uses a single element detector and apertures to focus on the area of interest (McCann et al., 1997; Wilson et al., 2000; Hori and Sugiyama, 2003; Mouille et al., 2003; Labbé et al., 2005). The smallest area that can be measured with this approach is normally around 50 × 50 μm. Smaller areas can be measured if the IR source has high brilliance, e.g. from synchrotrons (Dokken et al., 2005b; Yu, 2005; Gou et al., 2008), but this is technically much more challenging and time consuming. The focal plane array (FPA) detector is a recent tool that uses a traditional IR source but allows for high spatial resolution FT-IR microspectroscopy at higher speed (Kidder et al., 2002). The FPA consists of an array of miniature detector elements that simultaneously record thousands of spectra over an area, in a similar way to pixels building up an image. The spectrum can then be extracted from each detector element (‘pixel’) and analysed individually. In addition, the full dataset can be used for chemical imaging. Modern instruments are diffraction limited, with spatial resolutions in the 5–20 μm range for plant sections in transmission mode (depending on the wavenumber of the radiation; Lasch and Naumann, 2006).

An FT-IR spectrum provides a wealth of information and represents a compositional and structural ‘fingerprint’ of all wall components. Evaluation of large sets of FT-IR spectra is preferentially done by multivariate analysis. Principal components analysis (PCA; Trygg et al., 2006) is an unsupervised approach often used in plant science to give a comprehensive overview of systematic variation within complex sets of data, where clustering of samples with similar features as well as possible outliers can be detected. Supervised regression methods, such as partial least squares (PLS; Eriksson et al., 2006) and orthogonal projections to latent structures (OPLS; Trygg and Wold, 2002), can be used to improve both interpretability and predictability when a relation to a response variable (i.e. a specific class such as cell type) is needed. The OPLS technique can be particularly useful since it separates the systematic variation in the FT-IR data into two distinct parts – variation that is correlated (predictive) and variation that is not correlated (orthogonal) to the response variable (classes). Thus, the predictive variation that identifies the difference between classes (such as cell chemotypes) is separated from orthogonal variation that includes the experimental variation (e.g. sample thickness, growth conditions and edge effects) as well as within-class (sample) variation. The predictive variation between the classes can be shown as highly informative correlation-scaled loadings plots, where the relative importance of different FT-IR bands for sample separation can be assessed. In addition, the orthogonal variation can be evaluated to determine if it can provide useful information about chemical differences within the class, or if it just reflects the experimental variation. Recently, OPLS methods have been used to interpret complicated data sets in analytical plant sciences (Wiklund et al., 2008; Bylesjö et al., 2009; Hedenström et al., 2009).
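The split into predictive and orthogonal variation can be illustrated with a minimal sketch of a single-response OPLS step with one predictive and one orthogonal component, following the general scheme of Trygg and Wold (2002). This is not the software used in the study: the function name `opls_one_orthogonal`, the -1/+1 class coding and the returned dictionary keys are assumptions made for illustration, and the spectra are assumed to be pre-processed already.

```python
import numpy as np

def opls_one_orthogonal(X, y):
    """Illustrative single-response OPLS: one predictive, one orthogonal component.

    X: (n_spectra, n_wavenumbers) pre-processed spectra.
    y: (n_spectra,) class coding, e.g. -1 for vessels and +1 for fibers (assumed).
    """
    Xc = X - X.mean(axis=0)   # column-centre the spectra
    yc = y - y.mean()         # centre the response

    # Predictive weight: direction in X that covaries with the class coding
    w = Xc.T @ yc
    w = w / np.linalg.norm(w)
    t = Xc @ w
    p = Xc.T @ t / (t @ t)

    # Orthogonal weight: the part of the loading p that is not along w
    w_orth = p - (w @ p) * w
    w_orth = w_orth / np.linalg.norm(w_orth)
    t_orth = Xc @ w_orth
    p_orth = Xc.T @ t_orth / (t_orth @ t_orth)

    # Remove the class-uncorrelated variation, then compute predictive scores
    X_filt = Xc - np.outer(t_orth, p_orth)
    t_pred = X_filt @ w

    # Correlation-scaled loadings: correlation of each wavenumber with t_pred
    p_corr = (Xc.T @ t_pred) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(t_pred))
    return {"t_pred": t_pred, "t_orth": t_orth, "p_corr": p_corr, "y_centred": yc}
```

By construction the orthogonal scores are uncorrelated with the class coding, so variation such as section thickness ends up in `t_orth` while `t_pred` carries the between-class difference. A real analysis would add cross-validation and possibly more orthogonal components, which this sketch omits.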

There are only a few examples where the FPA detector has been used on plant tissues (Naumann et al., 2005; Naumann and Polle, 2006; Heraud et al., 2007; Müller and Polle, 2009; Brewer and Wetzel, 2010), but its full potential in combination with multivariate analysis has not yet been explored. Here, a standardization procedure for data handling and an optimized use of the FPA dataset by applying OPLS-discriminant analysis (DA) is presented. Examples are provided where chemotypes of xylem cells in Populus and Arabidopsis are identified, characterized and visualized to reveal cell wall plasticity and genotype differences. As a proof of concept the well-established difference between fiber and vessel chemistry was demonstrated. But the methodology also uncovered unknown features of wood, for example that lignin composition in Populus fibers is affected by their vicinity to vessel elements. Here xylem tissues from a tree and an herb were used to exemplify the method, but the approach can be applied to any kind of plant tissue and other microspectroscopy techniques such as Raman and UV.

Results and Discussion

Automated interactive script facilitates data handling

To handle the large amount of data recorded by the FPA detector, an automated and interactive script was developed that performs all spectral pre-treatments in a single application (Figure 1). In this study, a 64 × 64 FPA detector was used that produces 4096 spectra from an image area of about 0.3 × 0.3 mm (‘pixel’ size is approximately 5 μm). During measurement, a visual image of the area is taken as a reference picture (Figure 1a). The corresponding FT-IR image is created by visualizing the intensity of the band at about 1740 cm−1 (–C=O vibration), which is present in all spectra from cell walls and provides good spatial resolution because it is in the higher wavenumber region (Figure 1b). Empty pixels covering cell lumens are eliminated by setting a threshold limit for the absorbance at 1740 cm−1. This process is interactive, so the resulting FT-IR image can continuously be compared with the visual image. Because the spatial resolution of IR images is lower than that of visual images, there is a trade-off between keeping low-intensity spectra of pixels covering thin cell walls and excluding empty areas that are too small in size (Figure S1 in Supporting Information). In the next step, pixels are selected and colour-coded for classes of interest (such as fiber–fiber or vessel–vessel walls) in accordance with the visual image. The spectra from selected pixels are shown in the spectral overview image, where the spectral region to be used in the subsequent analyses is set (white background in Figure 1c). Finally, appropriate baseline correction and area normalization (and optional offset correction as well as spectral and image smoothing) are performed within the defined spectral region on all 4096 spectra of the image to limit experimental variation related to intensity and baseline variations within the image (Stenlund et al., 2008).
This is particularly important for plant sections that may vary in thickness and produce images that contain sharp edges due to empty spaces and possible cracks, since all these parameters will affect the properties of the acquired spectra. The dataset can now be used to evaluate chemical differences between selected classes (for example cell types) by multivariate analysis and to create chemical images using all 4096 spectra. It should be noted that due to area normalization, only relative differences in wall components between sample types can be determined, i.e. if the absorbance of a spectral region decreases, the absorbances of other spectral regions will increase. Absolute quantification would require some kind of internal standard, which in case of FT-IR analysis of tissue sections is not yet available.
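The pre-treatment sequence described above (threshold at the 1740 cm−1 band, restriction to a spectral region, baseline correction, area normalization) can be sketched as follows. This is a simplified illustration and not the custom script used in the study: the function name `preprocess_cube`, the region limits, the threshold value and the two-point linear baseline are all assumptions chosen for the example, and wavenumbers are assumed to be in ascending order.

```python
import numpy as np

def preprocess_cube(cube, wavenumbers, region=(900.0, 1800.0),
                    band=1740.0, threshold=0.05):
    """Illustrative FPA pre-treatment (all parameter values are hypothetical).

    cube: (rows, cols, n_points) absorbance values, e.g. 64 x 64 x n_points.
    wavenumbers: (n_points,) in ascending order.
    Returns (normalized spectra, keep-mask as image, wavenumbers of the region).
    """
    rows, cols, n_points = cube.shape
    spectra = cube.reshape(rows * cols, n_points)

    # 1. Exclude 'empty' pixels by thresholding the absorbance near 1740 cm-1
    i_band = np.argmin(np.abs(wavenumbers - band))
    keep = spectra[:, i_band] >= threshold

    # 2. Restrict to the spectral region used in subsequent analyses
    in_region = (wavenumbers >= region[0]) & (wavenumbers <= region[1])
    x = wavenumbers[in_region]
    s = spectra[:, in_region]

    # 3. Baseline correction: straight line through the two region end points
    #    (a stand-in for whatever baseline method is actually appropriate)
    slope = (s[:, -1] - s[:, 0]) / (x[-1] - x[0])
    baseline = s[:, [0]] + slope[:, None] * (x - x[0])
    corrected = s - baseline

    # 4. Area normalization: divide each spectrum by its trapezoidal area
    area = np.sum((corrected[:, 1:] + corrected[:, :-1]) * np.diff(x), axis=1) / 2.0
    normalized = np.full_like(corrected, np.nan)   # excluded pixels stay NaN
    normalized[keep] = corrected[keep] / area[keep, None]

    return normalized, keep.reshape(rows, cols), x
```

After this step two pixels that differ only in section thickness yield the same normalized spectrum, which is the intended effect of area normalization and also the reason why only relative differences between wall components can be assessed.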

Figure 1.

 Spectral pre-processing using custom-built software.
(a) Visual image of the area of an aspen wood cross-section. V, vessel; F, fiber; R, ray. Scale bar = 50 μm.
(b) The corresponding FT-IR image, using the intensity of the 1740 cm−1 band. Pixels selected for further analyses are marked (blue, vessels; red, fibers; green, rays). Orange indicates excluded pixels.
(c) Raw spectra of the selected pixels (blue, vessels; red, fibers; green, rays). The vertical orange line marks the 1740 cm−1 band used for creating the FT-IR image. The perpendicular bar (arrow) shows the threshold level for excluding a spectrum from the analysis (orange in b). The spectral region used in the subsequent analyses is marked with a white background.

Spectra sampling with high spatial resolution reveals chemical differences between xylem cell types

We used OPLS-DA analysis to compare cell wall chemistry between xylem cell types of aspen wood. The FT-IR spectra were acquired by the FPA detector from transverse wood sections representing two replicate trees. Spectra from vessel, fiber and ray cell walls were acquired from 12 images per tree, sampled at four radial positions from early- to latewood across the annual ring from the previous year (positions 1–4, Figure 2a). From the individual FT-IR images, five spectra were extracted for each cell type and subjected to a three-class OPLS-DA analysis that clearly separated the three cell wall types (Figure 2b). The largest difference was seen between fiber and vessel walls separated by Component 1, while Component 2 separated ray walls from the other two cell wall types. The loadings plots for these components (Figure 2c,d) show the spectral bands that are responsible for the separation between cell types and to what degree they contribute to the separation, taking into account the variation of 90 independent measurements per cell type (spectra from positions 1–3; three positions × three images × two trees × five spectra).

Figure 2.

 Sampling and chemotyping of xylem cell types.
(a) Visual image showing the 12 sampling areas (squares) from one of the two replicate trees. 1–4, radial sampling positions; EW, earlywood; LW, latewood. Scale bar = 200 μm.
(b) Orthogonal projections to latent structures discriminant analysis (OPLS-DA) scores plot showing the separation of spectra of vessel, fiber and ray cell walls from positions 1 to 3. Circles, fibers; diamonds, vessels; triangles, rays. n = 90 per cell type sampled from two replicate trees.
(c) Correlation-scaled loadings plot for Predictive Component 1 showing factors separating fibers from vessels. The marked negative bands are more intense in fibers. The marked positive bands are more intense in vessels and are all lignin-related.
(d) Correlation-scaled loadings plot for Predictive Component 2, showing factors separating rays (negative) from the other cell types (positive). The circle indicates negative bands up to about 1180 cm−1 dominated by vibrations associated with various sugar units (Table S1).

Interpretation of the chemical components underlying FT-IR spectra (and loadings plots) must be done with care, because the major cell wall polymers give rise to broad and overlapping bands (see Figure 5g later). Moreover, many functional groups are nearly identical in most of the cell wall biopolymers, the only difference being the rest of the molecule they are attached to. Although the molecular environment of the functional groups influences their bond distances and angles, and thus band positions in the spectra, these differences are small in relation to the total bandwidth. Therefore, individual bands are rarely diagnostic, and reliable conclusions from FT-IR spectra should preferentially be drawn from several bands, all indicating the same compositional changes. For interpretation of the spectra we used the literature and our own data for annotating bands to cell wall biopolymers (Table S1).

Figure 5.

 Four-class multivariate imaging of xylem chemotypes across a latewood to earlywood transition.
(a) Visual image from the annual ring (position 4). AP, terminal axial parenchyma. Scale bar = 50 μm.
(b) The corresponding FT-IR image using the intensity at 1740 cm−1. Pixels used for spectral extraction to build the model are marked. Red, latewood fibers; blue, latewood vessels; green, terminal axial parenchyma; pink, earlywood fibers.
(c) Orthogonal projections to latent structures discriminant analysis (OPLS-DA) 3D scores plot showing class separation between selected cell types along Predictive Components 1, 2 and 3.
(d) Pseudo-colored multivariate image based on the OPLS-DA model, using additive red–green–blue (RGB) color mixing and assigning each predictive component to a colour channel (red, green and blue).
(e) Pseudo-colored multivariate image based on the OPLS-DA model, using separate colors for each class: red, earlywood fibers; green, latewood fibers; blue, vessels; yellow, axial parenchyma. The visual image in (a) was used as a colour intensity filter, to eliminate empty areas (white).
(f) Correlation-scaled loadings for Predictive Component 3, showing bands separating terminal axial parenchyma (positive) from latewood fibers (negative).
(g) Reference spectra of cellulose (blue), xylan (green) and lignin (black) illustrating band positions, widths and overlaps. Red dashed lines mark lignin bands picked up by the positive loadings of Predictive Component 3, as being responsible for separation. Gray dashed lines show unspecific bands that are not contributing to class separation. Blue lines mark cellulose bands picked up by the negative loadings of Predictive Component 3. Bands above 1640 cm−1 are left unmarked because of disturbances from absorbed water and distortions from baseline and scattering effects (particularly for the C=O vibration).

The loadings plot for separation between vessel and fiber cell walls highlights bands that can be assigned to lignin (Figure 2c). Most notably, the negative band (more intense in fibers) around 1128 cm−1 has been assigned to S-lignin, and the positive band (more intense in vessels) around 1207 cm−1 to G-lignin (Table S1), indicating a higher S/G ratio for fibers than for vessels. In addition, the loadings plot indicates a significantly different intensity ratio for the aromatic skeletal vibrations at 1510 and 1595 cm−1 in fiber and vessel spectra. This can also be correlated to the S- to G-lignin ratio (Faix, 1991), although a more general interpretation is that it describes the degree of lignin cross-linking (Zhong et al., 2000). Taken together, the loadings plot agrees with established knowledge of a higher proportion of G-lignin in vessels, as determined by UV microscopy (Fergus and Goring, 1970), bromination electron microscopy energy dispersive x-ray analysis (EM-EDXA; Saka et al., 1988) and wet chemistry analysis (Hardell et al., 1980). Moreover, the relative proportion of lignin appears to be higher in vessels than in fibers, because both aromatic skeletal vibrations (1510 and 1595 cm−1) are on the positive side of the loadings plot. This also agrees with previous observations in white oak and birch based on Klason lignin, UV microscopy and different EDXA techniques (Fergus and Goring, 1970; Obst, 1982; Eriksson et al., 1988). Conversely, fibers contain proportionally more hemicellulose/cellulose as judged from bands at 1250 and 1370 cm−1 (Figure 2c, Table S1). The loadings plot for Component 2 shows the bands that separate rays from fiber and vessel walls (Figure 2d). In practice, all wavenumbers below 1180 cm−1 (various sugar ring vibrations) are in the negative region of the plot (more intense in rays) and virtually everything else is positive. Thus, ray cells appear to contain a much higher proportion of sugars. 
This may reflect the additional unlignified pectin-rich cell wall layer (the protective layer) that is formed inside the ordinary lignified secondary wall in ray cells, as described in the closely related Populus tremuloides (Chafe, 1974; Chafe and Chauret, 1974). It may also reflect stored starch in the living ray cells.

Spectral sampling from different images reveals the chemical plasticity of xylem cell types within the annual ring

By applying OPLS-DA analysis on FT-IR spectra extracted from the different individual images, the chemical plasticity of each cell wall type across larger areas, such as the annual ring, can be studied both in the radial direction from early- to latewood and in the tangential direction for cells formed at similar times during the growing season. Cell wall spectra sampled from fibers formed from early to late in the growing season (positions 1–4) show an orderly separation along the predictive component in both replicate trees, demonstrating a progressive change in fiber chemistry across the annual ring (Figure 3a). The corresponding loadings plot shows that this separation is caused by bands indicative of hemicellulose/cellulose on the positive side (more intense towards latewood) (1037, 1061, 1158 cm−1) and bands indicative of lignin on the negative side (more intense towards earlywood) (1270, 1462, 1510, 1595, 1710 cm−1; Figure 3b). Thus, the data indicate that the lignin to hemicellulose/cellulose ratio is decreasing in fibers during the growing season. One explanation for this could be an increase in the thickness of the S2 layer from early- to latewood. The proportion of the lignin-rich middle lamella (Donaldson et al., 2001; Koch and Kleist, 2001; Gierlinger and Schwanninger, 2006; Schmidt et al., 2009) to the total cell wall would then be much higher in earlywood fibers as compared with latewood fibers. Interestingly, the variation within each radial sample position (i.e. the variation within the class that is not contributing to class separation) described by the orthogonal component (Figure 3a) is not the result of using two different trees to build the model (i.e. spectra from a particular tree do not cluster only on one side of the orthogonal component). Instead, spectra from fiber walls from each individual image tend to cluster together, especially in the samples from positions 3 and 4 (Movie S1). 
This demonstrates that fiber cell walls vary in the tangential direction over very small distances (sub-millimeter), possibly induced by variation in micro-environmental conditions or positional cues.

Figure 3.

 Chemical plasticity of xylem cells within the annual ring.
(a, c, d) Orthogonal projections to latent structures discriminant analysis (OPLS-DA) scores plot of spectra acquired from (a) fibers, (c) vessels and (d) rays sampled across the annual ring. Green, position 1; blue, position 2; red, position 3; grey, position 4. Triangles, squares and circles represent each of the three tangential positions (images) within the same radial position. Empty and filled symbols represent two replicate trees.
(b) Correlation-scaled loadings plot of Predictive Component 1 for fiber spectra. The marked negative bands (more intense towards earlywood) are lignin-related. The marked positive bands are more intense towards latewood and are hemicellulose/cellulose-related.
(e) Correlation-scaled loadings plot of Predictive Component 1 for rays. The marked negative bands are more intense in earlywood, and the marked positive bands are more intense in latewood.

A similar analysis of vessel cell walls shows considerably more overlap between radial sampling positions from early- to latewood than in the case of fibers (Figure 3c). Thus, vessel walls are more uniform in their chemistry across the annual ring than fiber walls. This can be understood in the light of the more rapid development of vessels (Murakami et al., 1999), giving a smaller time frame to create chemical plasticity. However, the signal to noise ratio is normally worse for spectra from vessel walls than for fiber walls since the former are thinner and therefore less material is present in the area of one pixel. Thus, subtle differences could be harder to find in vessel walls than in the case of fiber walls.

Similarly to fibers, ray cells also show a clear trend along the predictive component (Figure 3d). The loadings plot corresponding to the predictive component shows that earlywood rays contain proportionally more aromatics (lignin or monolignols, 1510 and 1595 cm−1) as well as more –C-O-C– linkages (1158 cm−1) and a higher intensity –C-O vibration (1210 cm−1) of unclear origin. Latewood rays contain proportionally more –C-O/–C=O functional groups (1246, 1724 cm−1), which are characteristic of pectins and/or hemicelluloses, more absorbed water (1641 cm−1) and higher intensity –C-H vibrations (1317 cm−1) of unclear origin (Figure 3e, Table S1). This may reflect a variation in the pectin-rich protective layer typical for ray cells across the annual ring (Murakami et al., 1999). As with fibers, spectra from individual images from the same radial position cluster together, supporting the observation in fibers that xylem chemistry varies across small distances in a tangential direction.

Multivariate imaging reveals the chemical landscape of secondary xylem

The FT-IR spectra acquired by the FPA detector not only enable cell-specific analysis but also allow the possibility of chemical imaging. Traditionally, chemical imaging is performed by so-called heat mapping. This involves peak-picking of individual bands characteristic of specific compounds. For plant tissues, the aromatic skeletal vibration at about 1510 cm−1 or the –C-O vibration of syringyl units at about 1320 cm−1 have been used to visualize lignin (Labbé et al., 2005; Müller and Polle, 2009; Brewer and Wetzel, 2010). This approach is, however, not without its pitfalls. Even after spectral pre-processing, unavoidable experimental variations remain (such as scattering effects that distort band shapes and positions, causing peak shifts of up to 30 cm−1; Romeo and Diem, 2005) with the consequent problems of peak integration. Moreover, the poorly resolved FT-IR spectra with large spectral overlaps between major cell wall polymers (Figure 5g, Table S1) make individual bands hard to deconvolute and evaluate by integration. We demonstrate in Figure S2 how even the well-resolved and characteristic aromatic skeletal vibration at about 1510 cm−1 used for lignin mapping (Müller and Polle, 2009; Brewer and Wetzel, 2010) fails to visualize the higher lignin concentration in vessel walls of the aspen samples, and in practice just reflects the thickness of the sample. Briefly, problems are caused by different signal-to-noise ratios and baseline slopes between pixels, because tissue sections will unavoidably introduce varying pixel coverage and scattering (edge) effects. This implies that other wall polymers giving rise to less well-resolved peaks such as xylan and cellulose will be even more difficult to heat map with any accuracy. Similar problems have been encountered in biomedical applications of FT-IR microspectroscopy (Diem et al., 2004; Mohlenhoff et al., 2005; Romeo and Diem, 2005).
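The pitfalls of single-band heat mapping can be made concrete with a small sketch of band integration over a fixed window with a local linear baseline. The function name `band_area` and the integration limits of 1490–1530 cm−1 are assumptions made for illustration; published heat-mapping procedures may use other limits and baseline rules.

```python
import numpy as np

def band_area(spectrum, wavenumbers, lo=1490.0, hi=1530.0):
    """Integrate one band above a straight baseline drawn between the
    window end points (window limits are hypothetical)."""
    sel = (wavenumbers >= lo) & (wavenumbers <= hi)
    x = wavenumbers[sel]
    y = spectrum[sel]
    # Local linear baseline between the first and last point of the window
    baseline = y[0] + (y[-1] - y[0]) * (x - x[0]) / (x[-1] - x[0])
    height = y - baseline
    # Trapezoidal integration of the baseline-corrected band
    return np.sum((height[1:] + height[:-1]) * np.diff(x)) / 2.0
```

Two properties follow directly from this construction. The integrated area scales linearly with the raw absorbance, so a pixel covering twice as much material gives twice the 'lignin' signal regardless of composition. In addition, a band shifted by 30 cm−1 largely leaves a fixed integration window, so the integrated value no longer reflects the band intensity.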

As an alternative approach, multivariate data analysis can be used. Multivariate imaging takes into account the full information provided by the FT-IR spectra and will consequently map the chemotype of the cell wall. It was recently shown that OPLS-DA modeling is particularly suitable for chemical imaging of biomedical samples because of its ability to filter experimental variations and improve the interpretability of the models (Stenlund et al., 2008). Although the method is supervised, i.e. the chemotype classes have to be pre-selected, it will detect and visualize wall plasticity within cell types and can detect outlier chemotypes not included in the models. It is also open for iterative testing of hypotheses (e.g. testing whether certain cells or positions belong to a different chemotype), and any difference in chemical composition can be identified by loadings plots.

Multivariate imaging is first demonstrated using two-class models representing fiber and vessel walls. The models were constructed from selected pixels of individual images (Figure 4a, c) collected across the annual ring. Then all remaining pixels of each image were predicted into the corresponding model (Figure 4b). Depending on their position along the predictive component in the scores plot, the unclassified pixels are assigned a color value on a red–blue axis. A pseudo-colored image is then generated, using the assigned color value for each pixel (Figure 4d). The accuracy of the model is confirmed by the precision with which it assigns the correct color to the vessel and fiber walls not used to construct the model. This visualization of the chemical landscape of aspen wood reveals previously unknown features of the plasticity of fiber chemistry. A transition zone in fiber chemistry towards the vessels can be observed, so that the fiber walls around the vessels have a chemical composition resembling that of vessels more than that of fibers in fiber-rich areas (Figure 4d). The transition zone was observed in replicate samples from other positions across the ring, and also when ray cells were visualized in a three-class model (Figure S3).
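The red–blue colour assignment can be sketched as a linear mapping of each pixel's score along the predictive component onto two colour channels. The function name `red_blue_image`, the min–max scaling and the choice of white for excluded pixels are assumptions for illustration; the sign convention (positive scores fiber-like and red, negative scores vessel-like and blue) follows the description of Figure 4d.

```python
import numpy as np

def red_blue_image(scores, valid, shape=(64, 64)):
    """Map predictive scores to a red-blue pseudo-colour image.

    scores: (n_pixels,) score of every pixel along Predictive Component 1.
    valid:  (n_pixels,) boolean, False for excluded (empty) pixels.
    """
    lo = scores[valid].min()
    hi = scores[valid].max()
    # 0 = most negative (vessel-like), 1 = most positive (fiber-like)
    frac = np.clip((scores - lo) / (hi - lo), 0.0, 1.0)

    rgb = np.zeros((scores.size, 3))
    rgb[:, 0] = frac          # red channel grows towards fiber-like scores
    rgb[:, 2] = 1.0 - frac    # blue channel grows towards vessel-like scores
    rgb[~valid] = 1.0         # excluded pixels shown as white (assumed choice)
    return rgb.reshape(shape + (3,))
```

Pixels with intermediate scores receive nearly equal red and blue contributions and therefore appear purple, which is how the transition-zone fibers become visible in the image.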

Figure 4.

 Two-class multivariate imaging of fiber and vessel wall chemotypes.
(a) Orthogonal projections to latent structures discriminant analysis (OPLS-DA) scores plot showing the separation between fiber (red) and vessel (blue) spectra acquired from one image from an aspen cross-section at position 2.
(b) OPLS-DA scores plot showing the prediction of all 4096 spectra of the image into the model. Two clouds, one around vessels (blue) and a larger one around fibers (red) can be seen, with some spectra in between (‘transition zone’ fibers). The red–blue axis for colour-coding unassigned pixels is shown. Orange, excluded; gray, unclassified pixels.
(c) Visual image, showing pixels from where spectra were extracted to build the model (red, fiber; blue, vessel). Scale bar = 50 μm.
(d) Multivariate imaging of the same cross-section, based on the OPLS-DA model. Pixels are colored by their scores values along Predictive Component 1, from the most positive (red, fiber-like) to the most negative (blue, vessel-like). Pixels in between have varying contributions of red and blue. The intermediate pixels (‘transition zone’) have close to equal contributions of red and blue, and thus become purple (fibers close to vessels). Pixels inside small lumens should be empty but could not be excluded from the model without simultaneously excluding thin walls (Figure S1). These pixels have residual signal similar to vessels if only Predictive Component 1 is mapped (because of the effects of normalization and spectral properties) and therefore appear blue.

This transition zone in fiber chemistry reflects a gradual change in fiber lignin composition, because this was the major factor separating the models for vessel and fiber walls (Figure 2c). It should be mentioned that many species have specific cell types surrounding the vessels (such as paratracheal axial parenchyma or vasicentric tracheids), but in aspen (and most other Salicaceae) the vessels are surrounded by libriform fibers (Carlquist, 2001; Sano et al., 2008). Thus, the transition zone in xylem chemistry does not reflect other cell types around the vessels, but shows that the lignin composition in the fibers is influenced by their proximity to the vessels. The fibers encircling the vessels in Populus lignify after the vessels have matured and died (Murakami et al., 1999) and do not have any pit contacts with the vessels (Sano et al., 2008). A possible scenario underlying these chemical gradients in fibers could be that the monolignol units exported to the apoplast of the lignifying vessel wall further diffuse into the unlignified wall of differentiating neighboring fibers. These would eventually become incorporated into their lignin and make the wall more vessel-like. Taken together, the results demonstrate that multivariate imaging not only assigns known features to certain locations, but also reveals unknown properties that were concealed in the dataset, such as the existence of a transition zone in cell wall chemotype of fibers towards vessels.

Multivariate imaging was also applied across the latewood to earlywood transition using a four-class model including: (i) latewood fibers, (ii) earlywood fibers, (iii) vessels, and (iv) the cell layer of terminal axial xylem parenchyma that forms at the border of the annual ring (Chafe and Chauret, 1974; Chafe, 1974; Carlquist, 2001; Figure 5a,b). In this case, separation of the four classes is best visualized in a 3D scores plot (Figure 5c). Predictive Component 1 separated vessels from the other classes, Predictive Component 2 separated early- and latewood fibers (Figure S4a) and Predictive Component 3 separated the axial xylem parenchyma from latewood fibers (Figure S4b). With this model at hand, unclassified pixels of the image could be predicted (Figure S4c) and the results mapped back to generate a pseudo-color image, where each of the three predictive components was assigned to one of the RGB (red, green, blue) color channels (Figure 5d). Again, the model predicts the unclassified pixels with high accuracy; for example, the axial xylem parenchyma is visualized as a narrow light blue band at the border of the annual ring. However, when there are more than three classes in the model, component-based RGB coloring may assign a very similar color to pixels belonging to different classes (an explanation is given in Data S1). Therefore, a better way of pseudo-coloring in this case is to use a specific color for each class instead of a specific color for each component. The visual image is then superimposed as a color intensity filter, i.e. excluding all pixels that are empty in the visual image, to create the pseudo-colored image (Figure 5e). Thus, this method can be viewed as a ‘universal stain’, which can color for virtually any difference observable by FT-IR spectroscopy. The user has the freedom to choose any attributes at the beginning of the analysis to define the classes (e.g. different cell types, developmental regions, wounded or infected areas, etc.), analyse the differences between the classes by loadings plots and visualize each class by pseudo-coloring.
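
The two pseudo-coloring schemes described above can be illustrated with a minimal Python sketch. It is not the custom MATLAB code used in this work: the function names, the min-max scaling of scores, the palette and the multiplication by a visual-image intensity are all assumptions made for illustration.

```python
def scale_01(values):
    """Min-max scale a list of scores to [0, 1]; constant input maps to 0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def component_rgb(scores):
    """Component-based coloring: one predictive component per RGB channel.

    scores: list of (t1, t2, t3) tuples, one per pixel.
    Returns a list of (r, g, b) tuples with values in [0, 1].
    """
    channels = [scale_01([s[k] for s in scores]) for k in range(3)]
    return list(zip(*channels))


# Hypothetical palette: one fixed color per class (names are illustrative).
CLASS_COLORS = {
    "latewood_fiber": (1.0, 0.0, 0.0),
    "earlywood_fiber": (0.0, 1.0, 0.0),
    "vessel": (0.0, 0.0, 1.0),
    "axial_parenchyma": (0.0, 1.0, 1.0),
}


def class_rgb(labels, intensity):
    """Class-based coloring, dimmed by the visual image.

    labels:    predicted class name per pixel.
    intensity: visual-image intensity per pixel in [0, 1]; a value of 0
               (empty pixel) turns the pixel black.
    """
    return [
        tuple(c * w for c in CLASS_COLORS[lab])
        for lab, w in zip(labels, intensity)
    ]
```

With three channels available, component-based coloring can only keep classes apart as long as their scores differ along the displayed components, whereas a per-class palette can be extended to any number of classes.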

The power of the multivariate approach becomes apparent when comparing the loadings of Component 3 (Figure 5f), which separates the narrow band of axial parenchyma cells at the annual ring, with the spectra of reference compounds (Figure 5g). The loadings plot essentially displays all bands that are specific for lignin on the positive side (red dashed lines) and the few that are relatively specific for cellulose on the negative side (blue dashed lines), even if they are very small or appear only as shoulders. Furthermore, bands that are not specific for lignin but appear in other compounds too (e.g. xylan) do not appear on either side (grey dashed lines), meaning that they do not contribute to the separation. In other words, the analysis can virtually deconvolute complex spectra, effectively filtering out bands and showing only the relevant ones with great precision. The increased lignin found in the axial parenchyma probably reflects the thick primary wall and the additional wall layer(s) (the isotropic layer) that have been described for this cell type in P. tremuloides because they are both strongly lignified (Chafe, 1974; Chafe and Chauret, 1974).

Exploring cell-specific wall modifications in Arabidopsis mutants

Fourier transform IR microspectroscopy combined with OPLS-DA analysis is well suited for cell-specific chemotyping of mutant plants in functional genomics approaches. We demonstrate this on 12-week-old hypocotyls from Arabidopsis fra8 mutants deficient in xylan biosynthesis (Zhong et al., 2005; Pena et al., 2007). Secondary xylem of mature Arabidopsis hypocotyls consists of an inner part where vessels are embedded in what can be considered as non-lignified fibers (i.e. non-lignified xylem cells with pointed tips and thickened secondary walls) and an outer part where these fibers become lignified (Chaffey et al., 2002). Cell-specific FT-IR analysis of Arabidopsis xylem is more challenging than in Populus due to thinner cell walls that increase light scattering from edge effects and decrease signal-to-noise ratios. This limits the possibility of excluding empty areas, and the smaller cell sizes will also increase possible spill-over signal from neighboring cell types. Despite these limitations, differences in the cell wall chemistry between the investigated genotypes were successfully identified.

The visual image of the Arabidopsis xylem is less resolved than that from Populus, but extraction of cell-specific spectra from fibers and vessels was possible (Figure 6a,b). The OPLS-DA analysis of spectra acquired from vessels and fibers in the outer lignified xylem of wild-type and fra8 plants shows that the fra8 mutation affects wall chemistry in both cell types in a similar fashion as described by Predictive Component 1, whereas the difference between vessels and fibers is described by Predictive Component 2 (Figure 6c). This means that the difference caused by the mutation is larger than the difference between the two cell types. The loadings plot for Predictive Component 1 separating the wild type and fra8 shows the largest negative contribution for the –C=O stretching (1740 cm−1) prominent in xylan, along with the aromatic skeletal vibrations of lignin at 1510 and 1595 cm−1 (Figure 6d). Corresponding to the relative decrease in xylan and aromatics, the largest positive contribution comes from rather unspecific sugar ring vibrations, also including the asymmetrical –C-O-C– stretch (bands up to 1153 and 1250 cm−1). The loadings plot for Predictive Component 2, separating fibers from vessels (Figure 6e), showed very similar (although not identical) factors as in the case of the corresponding Populus model, indicating a similar chemical difference between fibers and vessels in the two species. Spectra were also extracted from xylem fibers in the inner non-lignified part of the hypocotyl. The OPLS-DA analysis showed that the fra8 mutation affected this cell type as well, with the corresponding loadings plot indicating very similar separating factors as for the lignified fibers in the outer xylem (except for the lignin bands at 1510 and 1595 cm−1 which are obviously absent in this cell type; Figure 6f,g). 
Thus, FT-IR microspectroscopy could demonstrate that the fra8 gene is important for xylan biosynthesis in both lignified and non-lignified fibers of the secondary xylem of the Arabidopsis hypocotyl. Because the intact wall is analysed, the overall effect of the mutation on polymer composition is described, and in the case of fra8 it is clear that in addition to xylan the proportion of lignin is also decreased.

Figure 6.

 Cell-specific chemotyping of the Arabidopsis hypocotyl xylem in the wild type and fra8 mutant.
(a) Visual image of the cross-section of a mature wild-type Arabidopsis hypocotyl grown under conditions promoting secondary growth. OX, outer lignified xylem; IX, inner non-lignified xylem. Scale bar = 50 μm.
(b) The corresponding FT-IR image, using the intensity of the 1740 cm−1 band. Orange, pixels excluded by threshold selection (empty areas and parts of the inner xylem due to low intensity spectra of very thin cell walls). Pixels used for extracting spectra are marked red (fiber) and blue (vessel).
(c) Orthogonal projections to latent structures discriminant analysis (OPLS-DA) scores plot showing the separation of xylem cell spectra extracted from outer lignified xylem of wild type and fra8 in Predictive Component 1, and between fibers and vessels in Predictive Component 2. Filled symbols, wild type; empty symbols, fra8 mutants; red squares, fibers; blue triangles, vessels. n = 30 per cell type and genotype sampled from three replicate plants.
(d) Correlation-scaled loadings plot for Predictive Component 1, showing factors separating xylem cell chemotypes of wild type and fra8 mutant. The marked negative bands are more intense in wild type and related to xylan and lignin. The marked positive bands are more intense in fra8 mutants and mostly related to sugar units.
(e) Correlation-scaled loadings plot for Predictive Component 2, showing factors separating Arabidopsis fibers and vessels. The marked negative bands are more intense in fibers and mostly related to (hemi)cellulose. The marked positive bands are more intense in vessels and related to lignin.
(f) The OPLS-DA scores plot showing the separation of spectra extracted from fibers of the inner non-lignified xylem of the wild type and fra8. Filled symbols, wild type; empty symbols, fra8. n = 30 per genotype sampled from three replicate plants.
(g) Correlation-scaled loadings plot for Predictive Component 1 showing factors separating non-lignified xylem fibers of the wild type and fra8 mutant. Marked negative bands are more intense in the wild type and characteristic of hemicelluloses. Marked positive bands are more intense in fra8 mutants and are related to sugars. Due to the lack of lignification, noise is upscaled by normalization in the spectral region above 1400 cm−1 where lignin would absorb.
(h) The OPLS-DA scores plot showing the prediction of all 4096 spectra from a wild-type Arabidopsis hypocotyl into the model separating fibers and vessels. Spectra used to create the model are colored in red (fibers) and blue (vessels). Orange, empty; gray, unclassified pixels. Unclassified pixels falling outside the confidence interval are also above the plane, described by Orthogonal Component 2 (not shown).
(i) Pseudo-colored multivariate image of the wild-type Arabidopsis hypocotyl based on the OPLS-DA model. Red, pixels with negative scores values along Predictive Component 1 (fibers); blue, pixels with positive scores values along Predictive Component 1 (vessels); yellow, pixels with high positive scores values along Orthogonal Component 1.

Multivariate imaging was applied on the Arabidopsis hypocotyls using a two-class model separating vessels and fibers in the outer lignified xylem. The OPLS-DA scores plot shows that a large portion of the unclassified pixels falls outside the confidence interval of the model (Figure 6h). This means that there was another major source of variation unaccounted for in the initial model. When the green color channel was assigned to Orthogonal Component 1 or 2 to mark outlier pixels, it was seen that the inner non-lignified xylem and the epidermis of the hypocotyl were the tissues that did not fit the initial model (Figure 6i). This example shows that non-predicted tissues and/or cell types will be detected when the difference from the assigned classes is large enough. Thus, the method is capable of detecting and localizing wall chemotypes that are unknown or left out when building the model (e.g. due to tension wood, fungal infection, wounding, etc.) even without a priori knowledge or input.
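
The outlier marking described above can be sketched in a few lines of Python. This is an illustration only: the function name and the cut-off `limit` are hypothetical, and a fixed threshold on the orthogonal score stands in for whatever criterion is used to decide that a pixel does not fit the model.

```python
def mark_outliers(t_pred, t_orth, limit):
    """Color pixels by predictive score sign; flag large orthogonal scores in green.

    t_pred: predictive score per pixel (negative = fiber, positive = vessel).
    t_orth: orthogonal score per pixel.
    limit:  hypothetical cut-off above which a pixel is treated as an outlier.
    Returns one (red, green, blue) tuple per pixel.
    """
    colors = []
    for tp, to in zip(t_pred, t_orth):
        red = 1.0 if tp < 0 else 0.0       # fiber side of the model
        blue = 1.0 if tp >= 0 else 0.0     # vessel side of the model
        green = 1.0 if to > limit else 0.0  # does not fit either class
        colors.append((red, green, blue))
    return colors
```

In additive RGB, a pixel on the fiber side that is also flagged in the green channel is displayed as yellow (red plus green).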

Concluding remarks

We demonstrate how OPLS-DA analysis combined with FT-IR microspectroscopy can be used to identify, characterize and visualize cell wall chemotypes in plant tissues. The approach is hypothesis driven and requires initial ideas about specific chemotypes. We used OPLS-DA of spectra from selected cell types to: (i) test the hypothesis that selected cell types are indeed different, (ii) identify the chemistry that separates the cell types, (iii) image how well all cell types in the sampled area fit into the predicted chemotypes, and (iv) identify intermediate chemotypes as well as chemotypes not falling into the selected classes. The method is iterative in that intermediates and outliers can in turn be assigned to a separate chemotype, their distribution visualized and the difference in spectral bands identified. The method is well suited for investigating chemical plasticity of cell walls and the effects of environmental or genetic factors on cell wall chemistry in specific cell types or tissues. It can be applied to any plant organ/tissue using not only FT-IR but also Raman and UV microspectroscopy.

Experimental Procedures

Plant material

Twenty-micrometer-thick sections from the previous year's annual ring of 15-year-old aspen trees (Populus tremula), growing in a natural environment outside Umeå, Sweden (63°47′N, 20°17′E), were made by cryosectioning, placed between two standard microscopy glass slides and dried in a desiccator for at least 48 h. Prior to FT-IR measurements, sections were transferred to IR-transparent polished rectangular BaF2 windows (30 × 15 × 4 mm; International Crystal Laboratories, Garfield, NJ, USA). Arabidopsis plants were grown in short days (8 h) for 12 weeks and hypocotyls from the wild type and fra8 mutants (Wu et al., 2010), both in the Columbia ecotype, were harvested and treated in the same way.

FT-IR microspectroscopy

Spectra were recorded on a Bruker Equinox 55 spectrometer equipped with a microscopy accessory (Hyperion 3000) including a 64 × 64 FPA detector (Bruker Optics, http://www.brukeroptics.com/), providing a maximum spatial resolution of approximately 5 μm at about 4000 cm−1. Visual photographs of the samples for spectral overlay were taken by a Sony Exwave HAD colour digital video camera (http://www.sony.com/) mounted on the top of the microscope. The sample tray was boxed, and the chamber was continuously purged with dry air. Spectra were recorded in transmission mode over the range of 850–3850 cm−1 with a spectral resolution of 8 cm−1. For each image, 64 interferograms were co-added to obtain high signal-to-noise ratios. Prior to sample measurements, background spectra were recorded for each sample at a nearby empty spot on the BaF2 crystal with the same number of scans. Spectral treatments included a two-point linear baseline correction between 900 and 1850 cm−1 and total sum (area) normalization over the same spectral range.
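
The two spectral treatments named above (two-point linear baseline correction and total sum normalization over 900–1850 cm−1) can be sketched as follows. This is a minimal illustration, not the routine used in this work; in particular, anchoring the baseline at the data points nearest to the two limits is an assumption.

```python
def preprocess(wavenumbers, absorbance, lo=900.0, hi=1850.0):
    """Two-point linear baseline correction and total-sum normalization.

    wavenumbers: increasing list of wavenumbers (cm^-1).
    absorbance:  absorbance value at each wavenumber.
    Returns (wavenumbers, corrected) restricted to the range between the
    data points nearest to lo and hi.
    """
    # Assumption: baseline anchors are the data points closest to lo and hi.
    idx = range(len(wavenumbers))
    i0 = min(idx, key=lambda i: abs(wavenumbers[i] - lo))
    i1 = min(idx, key=lambda i: abs(wavenumbers[i] - hi))
    x = wavenumbers[i0:i1 + 1]
    y = absorbance[i0:i1 + 1]

    # Straight line through the two end points, subtracted from the spectrum.
    slope = (y[-1] - y[0]) / (x[-1] - x[0])
    corrected = [yi - (y[0] + slope * (xi - x[0])) for xi, yi in zip(x, y)]

    # Total sum (area) normalization: intensities sum to 1 afterwards.
    total = sum(corrected)
    if total == 0:
        raise ValueError("zero total intensity after baseline correction")
    return x, [c / total for c in corrected]
```

After this step every spectrum has zero intensity at both ends of the range and unit total intensity, so that spectra from walls of different thickness can be compared on a common scale.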

The FT-IR spectroscopy of reference compounds

Reference spectra were obtained from lignin isolated from wild-type poplar (P. tremula × P. alba) according to Marita et al. (2001) (courtesy of J. Ralph, University of Wisconsin, Madison, WI, USA), xylan from birchwood (Sigma-Aldrich, http://www.sigmaaldrich.com/) and cellulose powder for thin layer chromatography (CAMAG, http://www.camag.com/). Ten milligrams of sample powder was manually ground with 390 mg of dry KBr (IR spectroscopy grade, Merck KGaA, http://www.merck.de/en/worldwide.html). Spectra were recorded over the range of 400–5200 cm−1 with a spectral resolution of 4 cm−1, using the diffuse reflectance method. The same Bruker Equinox 55 spectrometer was used, and spectra were baseline-corrected and normalized in the same manner as for the microspectroscopic measurements.

Multivariate analysis

Each hyperspectral image was unfolded to a two-dimensional data matrix (X) where each row constitutes an FT-IR spectrum at a specific (x,y)-coordinate in the microspectroscopic image. Each spectrum was then processed by minimum value and offset correction as specified previously (Stenlund et al., 2008).
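
The unfolding step can be illustrated with a short Python sketch (function names and the nested-list layout `cube[y][x][k]` are assumptions for illustration). A 64 × 64 image unfolds to 4096 rows, and keeping the coordinates alongside the rows allows scores or class predictions to be mapped back to the image.

```python
def unfold(cube):
    """Unfold cube[y][x][k] into rows of spectra plus their (x, y) coordinates."""
    rows, coords = [], []
    for y, line in enumerate(cube):
        for x, spectrum in enumerate(line):
            rows.append(list(spectrum))
            coords.append((x, y))
    return rows, coords


def refold(values, coords, width, height, fill=None):
    """Place one value per unfolded row back at its pixel position."""
    image = [[fill] * width for _ in range(height)]
    for v, (x, y) in zip(values, coords):
        image[y][x] = v
    return image
```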

The OPLS method (Trygg and Wold, 2002, 2003) is an extension to the supervised PLS (Eriksson et al., 2006) regression method featuring an integrated OSC-filter (Wold et al., 1998). The X data are decomposed by OPLS into three distinct parts (Eqn 1), using information in the response matrix Y. Tp and Pp denote the Y-correlated scores and loadings, To and Po denote the Y-uncorrelated scores and loadings and E denotes the residuals of X. Hence, OPLS regression has the ability to separately handle Y-correlated and Y-uncorrelated data in X, resulting in a more focused and more interpretable model:

$$X = T_p P_p^{T} + T_o P_o^{T} + E \qquad (1)$$

Analogously to PLS-DA, OPLS can be used for discrimination (OPLS-DA; Bylesjo et al., 2006). Discriminant analysis (DA) means that a ‘dummy’ response matrix is used for class modeling. One (1) defines that a sample belongs to a class and zero (0) defines that the sample does not belong to the class. Statistical details of the models are given in Table S2.
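
The dummy coding and the split of X into a Y-correlated part, a Y-uncorrelated part and residuals can be made concrete with a minimal two-class sketch in plain Python. It follows the general single-response OPLS scheme with one predictive and one orthogonal component; it is not the SIMCA-P+ implementation, omits scaling and cross-validation, and all function names are hypothetical. It also assumes that X contains some variation that is uncorrelated with the class labels (otherwise the orthogonal weight vector is zero and cannot be normalized).

```python
import math


def dummy_y(labels, positive):
    """Dummy response: 1 if the sample belongs to `positive`, else 0."""
    return [1.0 if lab == positive else 0.0 for lab in labels]


def center_columns(X):
    """Subtract the column mean from every column of X."""
    means = [sum(col) / len(X) for col in zip(*X)]
    return [[v - m for v, m in zip(row, means)] for row in X]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def unit(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def scores(X, w):
    """t = X w"""
    return [dot(row, w) for row in X]


def loadings(X, t):
    """p = X^T t / (t^T t)"""
    tt = dot(t, t)
    return [dot(col, t) / tt for col in zip(*X)]


def deflate(X, t, p):
    """X - t p^T"""
    return [[x - ti * pj for x, pj in zip(row, p)] for row, ti in zip(X, t)]


def opls_da(X, labels, positive):
    """One predictive and one orthogonal component for a two-class problem."""
    Xc = center_columns(X)
    y = dummy_y(labels, positive)
    y_mean = sum(y) / len(y)
    yc = [v - y_mean for v in y]

    w = unit([dot(col, yc) for col in zip(*Xc)])  # Y-correlated weights
    p = loadings(Xc, scores(Xc, w))
    # Orthogonal weights: the part of p that is orthogonal to w.
    w_o = unit([pj - dot(w, p) * wj for pj, wj in zip(p, w)])
    t_o = scores(Xc, w_o)
    p_o = loadings(Xc, t_o)
    X_f = deflate(Xc, t_o, p_o)  # remove Y-uncorrelated variation

    t_p = scores(X_f, w)         # predictive scores on the filtered data
    p_p = loadings(X_f, t_p)
    E = deflate(X_f, t_p, p_p)   # residuals
    return {"Xc": Xc, "yc": yc, "t_p": t_p, "p_p": p_p,
            "t_o": t_o, "p_o": p_o, "E": E}
```

By construction, the centered data equal t_p p_p^T + t_o p_o^T + E, and the orthogonal scores t_o are uncorrelated with the dummy response, so the class separation is carried by the predictive component alone. Models with more than two classes use one dummy column per class.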


Spectra were converted to data point tables using OPUS (version 5.0.53, Bruker Optik GmbH, http://www.brukeroptics.com/). All OPLS models were calculated using SIMCA®-P+ (12.0, Umetrics, Umeå, Sweden). Data processing and pseudo-coloring were performed by custom scripts written in MATLAB 7.0 (Mathworks, Natick, MA).


This work was financed through FORMAS (FuncFiber – a centre of excellence in wood science), Woven (wood formation under varying environmental conditions), the Kempe Foundation, the Swedish Governmental Agency for Innovation Systems, the Swedish Energy Agency, the Swedish Research Council and MKS Umetrics. The authors thank Dr John Ralph for providing the lignin reference material, Dr Ai-Min Wu for providing the fra8 mutant, Kjell Olofsson for technical assistance and Dr Ryo Funada for valuable discussions.