Using imaging spectroscopy to estimate integrated measures of foliage nutritional quality




1. Integrated measures of foliage nutritional quality that consider the influence of fibre and tannins on the digestibility of nitrogen (N) may provide a more meaningful estimate of forage quality than total foliar N for many herbivorous species. The ability to estimate available nitrogen (AvailN) on a landscape scale could have important applications for herbivore conservation and management. However, AvailN has never been modelled with imaging spectroscopy.

2. We collected hyperspectral remote sensing data (HyMap) over Eucalyptus trees in south-eastern Australia. Using a combination of laboratory near infrared spectrophotometry, a recently developed in vitro assay for AvailN and two powerful spectral transformation methods, continuum removal and derivative analysis, we developed linear regression models to scale concentrations of total foliar N, AvailN and digestible dry matter (DDM) from leaf to canopy-level.

3. The model estimates achieved R2-values between 0·55 and 0·64 for AvailN, 0·54–0·60 for N and 0·75–0·78 for DDM. Ninety-five per cent of the wavebands selected by the models corresponded to known absorption features. This and the large contribution of a small number of wavebands suggest that it may be possible to develop prediction algorithms based on a few wavelengths that could be extrapolated to other landscape types.

4. This is the first time that integrated measures of foliage nutritional quality have been estimated with imaging spectroscopy. In combination with appropriate crown-delineation techniques, our methodology will enable users to map variations in these foliar constituents across forest canopies.


Concentrations of nutrients and chemical defences in plant foliage are genetically and environmentally determined (Moore et al. 2004; Andrew et al. 2007). This results in a patchy distribution of forage quality within a landscape, even if it appears homogeneous in terms of species composition and density of cover (Mutanga et al. 2005; DeGabriel et al. 2009). Studies of animal foraging have found significant relationships between individual foraging decisions and plant nutrients and plant chemical defences (Behmer, Simpson, & Raubenheimer 2002; Marsh et al. 2003a; Provenza et al. 2003; Youngentob et al. 2011a). Likewise, herbivore distributions have been linked to variations in the chemical quality of forage (Pausas, Braithwaite, & Austin 1995; Ball, Kjell, & Sunesson 2000; Chapman et al. 2004). The ability to estimate particular chemicals in plant foliage on a landscape scale may provide an indication of habitat quality for some herbivorous species (Leyequien et al. 2007). In addition, landscape-scale measurements of plant chemistry can provide important information about ecosystem processes and functioning (Martin & Aber 1997; Ollinger & Smith 2005; Asner & Martin 2008).

Until recently, assessing plant chemistry on a landscape scale has been impractical because it required sampling thousands of leaves in the field for lengthy laboratory analyses. Recent technological advances in near-infrared spectrometry and hyperspectral remote sensing are opening the door to the rapid assessment of leaf chemical composition in the laboratory and across whole forest canopies (for reviews see, Kerr & Ostrovsky 2003; Majeke, van Aardt, & Cho 2008; Kokaly et al. 2009). Imaging spectroscopy builds upon the extensive laboratory near-infrared spectrometry research that has identified strong relationships between the absorption of electromagnetic radiation and various chemical constituents (Curran 1989; Ebbers et al. 2002). Molecular vibrations resulting from the rotation, bending and stretching of chemical bonds absorb electromagnetic radiation at frequencies that correspond to their energy state and create harmonics and overtones in the near-infrared (NIR) and shortwave infrared (SWIR) regions of the electromagnetic spectrum (Kokaly & Clark 1999). Variations in reflectance at wavelengths that correspond to specific molecular interactions can be used to identify and quantify the chemical composition of materials based on high spectral resolution data.

Nitrogen (N) is a limiting nutrient for many herbivores because of its low concentration in plant tissues (Kavanagh & Lambert 1990; Robbins 1993; White 1993). Nitrogen is often used as a measure of nutritional quality, plant productivity and plant health because of the important role that N plays in the production of chlorophyll and proteins (Cork & Catling 1996; Smith et al. 2002). However, several studies of vertebrate herbivores have suggested that more attention should be directed towards integrated measures of forage quality (e.g. nitrogen availability) rather than individual nutrients (Foley & Moore 2005; McArt et al. 2009). This led DeGabriel et al. (2008) to develop an in vitro assay to measure the nutritional quality of foliage that integrates the effects of fibre and tannins on the amount of N that is available for digestion. Tannins are a common plant secondary metabolite (PSM), which can bind to nitrogenous compounds in plant tissues and reduce the digestibility of plant proteins (Hagerman et al. 1992; Hanley et al. 1992). Tannins also can interfere with the ability of animals to detoxify other PSMs (Min et al. 2003). High levels of tannins can reduce forage intake by herbivores (Robbins et al. 1987; Marsh, Wallis, & Foley 2003b; Min et al. 2003). For the many herbivorous species that are sensitive to tannins, available nitrogen (AvailN) may be a more meaningful measure of forage quality than total foliar N (Foley & Moore 2005; DeGabriel et al. 2008; Wallis, Nicolle, & Foley 2010).

The link between concentrations of N and specific absorption features in the electromagnetic spectrum is well-established, and numerous studies have attempted to estimate foliar N from remote sensing data with varying success (for a review see, Majeke, van Aardt, & Cho 2008). However, using imaging spectroscopy to estimate N availability for herbivores would be a significant advance. In this paper, we combine two powerful spectral transformation methods, continuum removal and derivative analysis, with linear regression modelling to scale concentrations of total foliar N, AvailN and digestible dry matter (DDM)—a common in vitro measure of plant nutritional quality based on fibre and lignin—from leaf to crown-level with hyperspectral remote sensing data collected over Eucalyptus trees in south-eastern Australia. These three foliar constituents have the potential to provide a useful measure of forage quality for a wide-range of herbivorous species and are known to play a role in forage choice and habitat quality for several arboreal marsupial folivores indigenous to our study area (McIlwee et al. 2001; DeGabriel et al. 2009; Youngentob et al. 2011a).

Materials and methods

Study site

Our research site was 100 km west of Canberra near Tumut, New South Wales, Australia (148°30′E, 35°10′S; elevation 600–1200 m above sea level). The area includes a 50 000-ha exotic Pinus radiata plantation bordered by remnants of native Eucalyptus forest and grazed paddocks. Our research focused on the remaining native Eucalyptus forest and isolated eucalypt paddock trees in and around Buccleuch, Bungongo and Bondo State Forests and Brindabella National Park. The native forest types range from dry sclerophyll to tall, open and montane forests. The most prevalent Eucalyptus species in the region are manna gum (E. viminalis), mountain gum (E. dalrympleana), narrow-leaved peppermint (E. radiata), red stringybark (E. macrorhyncha), broad-leaved peppermint (E. dives), apple box (E. bridgesiana), mountain swamp gum (E. camphora) and snow gum (E. pauciflora). Temperatures average 15·3–29·3°C in summer and 3·0–11·5°C in winter (mean max and mean min; Burrinjuck Climatic Station). Precipitation is distributed evenly across the year and typically ranges from 785 to 1385 mm annually (BIOCLIM, Nix 1986). The underlying geology consists primarily of granite interbedded with sandstone and shale.

HyMap data collection and pre-processing

Eucalyptus (L’Herit) is a genus of broad-leaved evergreen trees that can produce new foliage throughout the year when conditions are favourable (Williams & Woinarski 1997). Owing to the seasonal climate in our study region, new growth is less common during the cooler months. Leaf-age can influence the concentrations of foliar chemicals in eucalypts (Kavanagh & Lambert 1990), so we timed our data acquisition to correspond with peak leaf maturity, at the end of the growing season. On 8 March 2007, Hyperspectral Mapper (HyMap; HyVista Corporation Pty Ltd, Australia) data were collected between noon and 3 pm, under clear conditions. HyMap is an airborne imaging spectrometer that provides a spectral range of 446·1–2477·8 nm at bandwidth intervals of 10 nm in the visible (VIS) and NIR wavelengths and 15–20 nm in the SWIR (Cocks et al. 1998). Flying altitude was approximately 1500 m, which provided an instantaneous field-of-view (IFOV) of 3–3·5 m with a swath width of 1·8 km (512 pixels).

Five adjacent, NE–SW flightlines (15–20 km in length) were flown in the northern region of the study area and six in the south. The HyMap data were provided by HyVista in a geo-corrected format based on positional data (UTM-WGS-84, Zone 55 S) and atmospherically corrected to reflectance using HyCorr, ATREM for HyMap. The apparent reflectance data were further corrected for residual noise using Empirical Flat Field Optimal Reflectance Transformation (EFFORT, Boardman 1998). We removed one corrupted waveband at 446·1 nm and two additional bands (1389·1–1403·9 nm) located in a spectral region that is strongly influenced by water vapour, resulting in 123 bands. The flightlines were then geo-referenced and mosaicked (nearest-neighbour re-sampling) using ENVI software (Research Systems, Inc., Boulder, CO, USA) and ephemeris data provided by HyMap.

Leaf sample collection

During the week of the over-flight in March 2007, we collected Eucalyptus leaf samples from isolated paddock trees that would be identifiable in the imagery. We collected 99 canopy-leaf samples by shooting one branch from the top-half of the canopy with a 0·222-calibre rifle. We collected about 50 g of fully expanded, adult foliage from each tree. We collected at least ten samples from each of the eight most prevalent canopy tree species in the region (see Study Site). Immediately following collection, leaf samples were placed in a portable freezer and transported to the laboratory where they were freeze-dried. The freeze-dried samples were then ground to pass a 1-mm sieve using a Cyclotec 1093 mill (Tecator, Sweden). The ground samples were further dried in an oven at 40°C for 12 hours to remove residual moisture and then cooled in a desiccator to maintain dryness. Following the methods described by Ebbers et al. (2002), we used a Foss-NIR System Model 6500 Scanning Spectrophotometer (Foss NIRSystems, Laurel, MD, USA) fitted with a spinning cup module to take spectral measurements (400–2500 nm at 2-nm intervals) from each of the freeze-dried, ground leaf samples.

Upon receiving the HyMap data, we were informed of a 1-hour time delay in data acquisition that resulted in strong cross-track illumination issues over the southern flightlines. We chose to focus only on the region included within the five northern flightlines (Fig. 1) to reduce any potential confounding factors associated with this variation in data quality. This diminished the number of paddock trees that were available for the training data set from 99 to 46 because only those trees that had been collected within the area under the northern flightlines could be used. Therefore, we returned to the field the first week of May 2007 to collect additional canopy-leaf samples (n = 34) from isolated eucalypt paddock trees under the northern flightlines following the same methods described earlier. The total number of trees chosen for sampling was based on previous research that established a precedent for the number of eucalypt tree canopies required to model nitrogen with hyperspectral remote sensing data (Huang et al. 2004) and the availability of isolated eucalypt paddock trees.

Figure 1.

 Mosaicked northern HyMap flightlines (black and white image) with an inset showing a few of the isolated Eucalyptus paddock trees that were used for model calibration.

An important caveat is that leaf samples collected from a few branches may not represent the foliar chemistry of an entire tree canopy. It is also possible that the concentrations of foliar chemicals may have changed between the time of the over-flight and the time when the leaf samples were collected, although we expect that any difference would be minimal because of the collection of adult foliage within a relatively short timeframe (Moore et al. 2004; Andrew et al. 2007). These factors could contribute to model error. We attempted to minimize these potential sources of error by collecting leaves from the top-half of the canopy (visible to the sensor) and through conscientious timing of leaf sample collection. The complicated logistics of remote sensing data acquisition in combination with concurrent field campaigns and the realistic limitations of canopy-leaf sampling mean that these potential sources of error could not be entirely eliminated.

Chemical assays

We used laboratory near infrared spectrophotometry (NIRS) to develop calibration equations to predict concentrations of foliar N, AvailN and DDM in our leaf samples based on reference values obtained from chemical assays (details below) performed on a subset of samples. NIRS calibration equations developed from reference values that are representative of the spectral diversity of the population can provide similar accuracies to traditional laboratory techniques, and this method consumes considerably less time and is less resource intensive than conducting wet-chemical assays on every sample (Foley et al. 1998; Shepherd et al. 2005). To capture the widest possible range of spectral and chemical diversity, we developed our NIRS calibration equation from a large collection of freeze-dried eucalypt leaf samples (n = 555). This collection comprised leaves collected for two studies: this one and a separate study on the feeding biology of Petauroides volans (a marsupial which browses exclusively on Eucalyptus leaves) conducted in the same area (Youngentob et al. 2011a). Both studies used identical leaf sample collection methods. We used the Mahalanobis distance calculation of spectral variation to identify the leaf samples that best captured the spectral diversity of our NIR spectral library (WinISI software; InfraSoft International, Port Matilda, PA, USA).

Chemical assays for N, AvailN and DDM were conducted on those samples identified by the Mahalanobis distance procedure as representing the spectral variation of the whole data set (n = 100). Using the freeze-dried, ground leaf material, we corrected for residual moisture by drying 1 g of material to constant mass at 50°C. The N concentration of foliage was determined on duplicate samples using a semi-micro Kjeldahl technique with a selenium catalyst and ammonium sulphate as a standard. We determined in vitro DDM and AvailN using the method described by DeGabriel et al. (2008). This method involved digesting the samples in filter bags (Ankom F57; Ankom Technology, Macedon, NY, USA), first with pepsin (for 24 hours) and then with cellulase (for 48 hours). We obtained DDM and AvailN values by comparing the sample weight and N concentration pre- and postdigestion. The assay was repeated if the coefficient of variation between duplicates exceeded 2% (N) or 7% (DDM and AvailN).
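The duplicate-agreement rule above can be expressed as a short check. This is a minimal sketch only: the function names are hypothetical, the readings are made-up values rather than data from the study, and the use of the sample standard deviation is an assumption, because the text does not state how the coefficient of variation was computed.

```python
from statistics import mean, stdev

# Thresholds quoted in the text: 2% for N, 7% for DDM and AvailN.
CV_LIMITS = {"N": 2.0, "DDM": 7.0, "AvailN": 7.0}


def coefficient_of_variation(values):
    """Return the coefficient of variation as a percentage.

    Assumption: sample standard deviation (n - 1 denominator).
    """
    return 100.0 * stdev(values) / abs(mean(values))


def needs_repeat(constituent, duplicates):
    """True when duplicate assays disagree by more than the allowed CV."""
    return coefficient_of_variation(duplicates) > CV_LIMITS[constituent]


# Hypothetical duplicate N readings (% DM), for illustration only.
print(needs_repeat("N", [1.20, 1.22]))  # False: CV is about 1.2%
print(needs_repeat("N", [1.10, 1.30]))  # True: CV is about 11.8%
```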

Following the established methods of Ebbers et al. (2002), we used the laboratory NIR spectra and the wet-chemistry values from the 100 sample subset to develop robust calibration equations to predict the chemical concentrations of N [R2 = 0·97, standard error of cross-validation (SECV) 0·06], AvailN (R2 = 0·96, SECV = 0·10) and DDM (R2 = 0·97, SECV = 0·03) in the remaining leaf samples. A separate independent validation of 20 samples for N resulted in excellent agreement between the predicted and analysed values (R2 = 0·96, standard error of prediction = 0·04) and suggested that the samples selected for the training algorithm by the Mahalanobis distance calculation were representative of the spectral range of our data.

Collection of Eucalyptus tree-canopy spectra from HyMap imagery for model calibration

Eucalyptus trees typically have an open-canopy architecture and pendulous leaves, which can result in mixed-pixels containing elements of leaves, bark and the ground beneath the tree. The selection of relatively pure canopy foliage pixels from the imagery was important for scaling reference values based on leaf spectra to canopy-level spectra (Huang et al. 2007). First, isolated Eucalyptus paddock trees from which canopy-leaf samples had been collected were located in the HyMap imagery (Fig. 1). Then, we displayed the HyMap reflectance data in three wavebands from the SWIR (1·65 μm), NIR (0·84 μm) and VIS red-edge (0·67 μm) regions of the electromagnetic spectrum (red, green and blue, respectively). Viewed in this combination, green pixels indicate high concentrations of chlorophyll-containing vegetation (e.g. canopy leaves) and purple, blue and white pixels are either not as photosynthetically active (e.g. bark and branches) or highly shaded. Following the methods described by Huang et al. (2004), we selected only those tree canopies from which we could collect at least four ‘good’ (green) pixels for our model training and testing data sets (n = 77 trees) and obtained a mean, median and maximum spectral value for each tree canopy. Duplicate pixels resulting from the nearest-neighbour re-sampling of image pixels were also removed from the analysis, leaving only unique spectra from each tree.
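The per-tree summary described above (unique pixels reduced to a mean, median and maximum spectrum) might look like the sketch below. It rests on stated assumptions: the function name is hypothetical, the maximum spectrum is taken band by band, and the four-pixel minimum is applied after duplicates are removed. The text does not specify either of the last two details.

```python
from statistics import mean, median


def summarise_canopy(pixels, min_pixels=4):
    """Collapse one tree's pixel spectra into mean, median and maximum spectra.

    pixels: list of equal-length reflectance spectra, one per 'good' pixel.
    Returns None when fewer than min_pixels unique spectra remain.
    """
    # Drop exact duplicates (e.g. from nearest-neighbour re-sampling), keeping order.
    unique = list(dict.fromkeys(tuple(p) for p in pixels))
    if len(unique) < min_pixels:
        return None
    bands = list(zip(*unique))  # transpose: one tuple of values per waveband
    return {
        "mean": [mean(b) for b in bands],
        "median": [median(b) for b in bands],
        "max": [max(b) for b in bands],  # assumption: band-wise maximum
    }


# Hypothetical two-band pixels; the third pixel duplicates the second.
tree = [[0.1, 0.4], [0.2, 0.5], [0.2, 0.5], [0.3, 0.3], [0.4, 0.6]]
print(summarise_canopy(tree)["max"])  # [0.4, 0.6]
```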

Spectral transformations

The effects of field-of-view and photon-scattering can influence the amount of radiance that reaches a sensor and negatively affect the signal-to-noise ratio of spectra collected with imaging spectrometers (Tsai & Philpot 1998; Richards & Jia 2006). Several methods, including scatter-corrections, derivative analysis, smoothing transformations and continuum removal analysis, have been developed to enhance signal components and reduce background effects in spectral data (Clark & Roush 1984; Dhanoa et al. 1994; Tsai & Philpot 1998). Following the methods described by Huang et al. (2004), we applied continuum removal analysis across the full spectrum to normalize reflectance values and to emphasize absorption features in our HyMap spectra. In continuum removal analysis (eqn 1), a convex hull is fitted over a spectrum to connect the points of maximum reflectance with straight-line segments. The reflectance value ($R_{\lambda}$) of a specific wavelength (λ) is then divided by the reflectance value of the continuum line ($R_{c\lambda}$) at the corresponding wavelength:

$$\mathrm{CR}_{\lambda} = \frac{R_{\lambda}}{R_{c\lambda}} \qquad \text{(eqn 1)}$$

The peak reflectance points where the continuum line meets the actual spectrum are standardized to unity, and CR values decrease towards zero as the distance between the continuum line and the original spectrum increases.
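The continuum-removal step of eqn 1 can be sketched with an upper convex hull and linear interpolation. This is an illustrative implementation under stated assumptions, not the routine used in the study: the function names are hypothetical, wavelengths must be strictly increasing, and reflectance must be positive.

```python
def _cross(o, a, b):
    """z-component of (a - o) x (b - o); negative means a clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def upper_hull(points):
    """Upper convex hull of points sorted by increasing x (monotone chain)."""
    hull = []
    for p in points:
        # Remove the last vertex while it lies on or below the chord hull[-2] -> p.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


def continuum_removed(wavelengths, reflectance):
    """Divide each reflectance value by the continuum line at that wavelength."""
    hull = upper_hull(list(zip(wavelengths, reflectance)))
    cr, seg = [], 0
    for x, r in zip(wavelengths, reflectance):
        # Advance to the hull segment that spans wavelength x.
        while seg < len(hull) - 2 and x > hull[seg + 1][0]:
            seg += 1
        (x0, y0), (x1, y1) = hull[seg], hull[seg + 1]
        continuum = y0 + (y1 - y0) * (x - x0) / (x1 - x0)
        cr.append(r / continuum)
    return cr


# Hypothetical five-band spectrum with absorption dips at 500 and 700 nm.
print(continuum_removed([400, 500, 600, 700, 800], [0.2, 0.1, 0.4, 0.2, 0.4]))
# approximately [1.0, 0.333, 1.0, 0.5, 1.0]
```

Points where the hull touches the spectrum return a value of 1, and the two dips fall below 1 in proportion to their depth beneath the continuum line.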

The continuum-removed reflectance values were then transformed into pseudo-absorption values by calculating log (1/CR) (Huang et al. 2004; Fig. 2a). To remove the effects of curvi-linearity and baseline shift, we detrended the log (1/CR) spectra by subtracting an individually fitted second-degree polynomial from each spectrum (Fig. 2b) and then applied a standard normal variate (SNV) scatter correction to remove unnecessary signal components (Barnes, Dhanoa, & Lister 1989; Fig. 2c). We tested various combinations of Savitzky–Golay derivative-based spectral smoothing functions provided by the WinISI software (InfraSoft International, Port Matilda, PA, USA), which have also been shown to improve model fit by emphasizing absorption features while reducing noise (Tsai & Philpot 1998; Fig. 2d).
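The first three transformations in this chain (pseudo-absorption, detrending and SNV) can be sketched in plain Python as follows. Several details are assumptions made for illustration: the logarithm is taken as base 10, SNV uses the sample standard deviation, all function names are hypothetical, and the derivative-based smoothing applied in WinISI is not reproduced here.

```python
import math
from statistics import mean, stdev


def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def pseudo_absorption(cr):
    """log(1/CR) for each continuum-removed value (assumption: base 10)."""
    return [math.log10(1.0 / v) for v in cr]


def detrend(wavelengths, values, degree=2):
    """Subtract a least-squares polynomial (second degree by default) from one spectrum."""
    # Centre and scale the wavelengths so the normal equations stay well conditioned.
    centre = mean(wavelengths)
    scale = max(wavelengths) - min(wavelengths)
    x = [(w - centre) / scale for w in wavelengths]
    a = [[sum(xi ** (i + j) for xi in x) for j in range(degree + 1)]
         for i in range(degree + 1)]
    b = [sum(v * xi ** i for xi, v in zip(x, values)) for i in range(degree + 1)]
    coef = solve(a, b)
    return [v - sum(c * xi ** k for k, c in enumerate(coef)) for xi, v in zip(x, values)]


def snv(values):
    """Standard normal variate: centre one spectrum and scale it to unit variance."""
    mu, sd = mean(values), stdev(values)
    return [(v - mu) / sd for v in values]


# Hypothetical continuum-removed spectrum, processed in the order given in the text.
wl = [400, 500, 600, 700, 800, 900]
cr = [1.0, 0.6, 0.9, 0.5, 0.8, 1.0]
transformed = snv(detrend(wl, pseudo_absorption(cr)))
```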

Figure 2.

 Spectral transformations applied to the average, maximum continuum-removed (CR) HyMap reflectance data from 77 eucalypt tree canopies: (a) pseudo-absorption spectra from the log(1/CR) data; (b) detrended log(1/CR) spectra; (c) standard normal variate (SNV) scatter correction applied to the detrended, log(1/CR) spectra; (d) a Savitzky–Golay derivative-based spectral smoothing routine (e.g. 2221) applied to the SNV, detrended log(1/CR) spectra: 2221 = the second derivative (2) was calculated with a primary smoothing of 2 nm (2) across a gap size of 2 nm (2) and no secondary smoothing (1).

Regression modelling

Calibration equations from the transformed mean, median and maximum HyMap spectra were developed using three common regression methods. The first was a partial least squares regression (PLSR) method based on all wavebands (Wold 1975). PLSR is a multivariate extension of multiple linear regression that determines the independent linear combinations of the predictor variables (i.e. wavebands) that explain the maximum covariation with the response variables (i.e. chemical concentrations). Thus, PLSR compresses the independent variables into factors, similar to a principal component regression. We used a modified PLSR (MPLS; Shenk & Westerhaus 1991a), which normalizes (i.e. zero mean, unit variance) the chemical concentrations and reflectance values at each wavelength. PLSR requires cross-validation to prevent over-fitting the model (described below). We also used step-up and stepwise regression to develop models based on a subset of wavebands. Step-up regression begins with a single waveband from the full spectrum and then adds subsequent wavebands to the regression model (Weisberg 1980). The waveband selected at each step is the one that results in the largest increase in model fit, which is assessed with the coefficient of determination, R2. The model is run with the added variable, and this process can be repeated to add additional terms. Stepwise regression is a variation of step-up regression that relies on an F-test of significance to determine whether a previous term can be removed once a new term is added to the model.
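The step-up procedure described above amounts to greedy forward selection on R2. The sketch below illustrates the idea with ordinary least squares in plain Python. It is not the WinISI routine: the function names and data are hypothetical, the F-test removal step of stepwise regression is omitted, and the solver assumes the candidate bands are not collinear and that samples outnumber terms.

```python
def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def r_squared(columns, y):
    """R2 of a least-squares fit of y on the given columns plus an intercept."""
    n = len(y)
    design = [[1.0] * n] + [list(c) for c in columns]
    a = [[sum(p * q for p, q in zip(u, v)) for v in design] for u in design]
    b = [sum(p * q for p, q in zip(u, y)) for u in design]
    coef = solve(a, b)
    fitted = [sum(c * u[i] for c, u in zip(coef, design)) for i in range(n)]
    y_bar = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def step_up(spectra, y, max_terms=6):
    """Add, one at a time, the waveband that raises R2 the most."""
    n_bands = len(spectra[0])
    bands = [[s[j] for s in spectra] for j in range(n_bands)]
    selected, history = [], []
    while len(selected) < min(max_terms, n_bands):
        candidates = [j for j in range(n_bands) if j not in selected]
        scores = {j: r_squared([bands[k] for k in selected + [j]], y)
                  for j in candidates}
        best = max(scores, key=scores.get)
        selected.append(best)
        history.append(scores[best])
    return selected, history


# Hypothetical data: six samples, three bands; y depends on bands 1 and 0 only.
spectra = [[1, 5, 2], [2, 3, 1], [3, 8, 4], [4, 1, 3], [5, 9, 7], [6, 2, 5]]
y = [1 + 2 * s[1] + 0.5 * s[0] for s in spectra]
print(step_up(spectra, y, max_terms=2)[0])  # [1, 0]
```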

Over-fitting is a common problem in linear regression models, resulting from a tendency of fitting procedures to exploit as large a number of predictor variables as possible to explain all the variation in a given training data set. While the fit to training data is very good, it is likely to result in a regression model that is too complex to have any real predictive power for independent validation data (Weisberg 1980). The number of terms selected for a model requires consideration of the sample size, the closeness of fit and the contribution of each additional term. To avoid over-fitting, we used a test-of-exact-fit to identify the maximum number of terms that could be expected to fit the population covariance matrix, based on the number of samples and wavebands (Bollen & Long 1993). Thus, model complexity was restricted to a maximum of six terms.

Model fit was further assessed using cross-validation (Elisseeff & Pontil 2002). Cross-validation provides an estimate of model error based on data re-sampling. Samples were split into 6 groups (so-called 6-fold cross-validation), and we trained the model six times on all but one group, which served as validation data. A SECV was obtained by pooling the residuals from each round of prediction and averaging the estimates of prediction error across the six repetitions. Obtaining an external estimate of prediction is not always feasible for small data sets. In these instances, cross-validation is appropriate because it enables a model to be trained and tested using all available data (Elisseeff & Pontil 2002). Two additional benefits of cross-validation are that it can be used to help identify the optimum number of terms for a model, and outliers are easily identified from the prediction residuals (Shenk & Westerhaus 1991b; Baumann 2003).
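A minimal sketch of the six-fold procedure follows. It makes several simplifying assumptions that are not taken from the text: a single-predictor linear model stands in for the multi-band models, samples are assigned to folds by index (i mod k), and SECV is computed as the root mean square of the pooled prediction residuals. The function names are hypothetical.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx


def secv(x, y, k=6):
    """Standard error of cross-validation from k-fold pooled residuals."""
    residuals = []
    for fold in range(k):
        # Assumption: interleaved fold assignment by sample index.
        train = [i for i in range(len(x)) if i % k != fold]
        test = [i for i in range(len(x)) if i % k == fold]
        slope, intercept = fit_line([x[i] for i in train], [y[i] for i in train])
        residuals += [y[i] - (intercept + slope * x[i]) for i in test]
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))


# Hypothetical data: an exactly linear relationship gives an SECV near zero.
x = [float(i) for i in range(12)]
y = [1.0 + 2.0 * v for v in x]
print(secv(x, y) < 1e-9)  # True
```

Every sample is predicted exactly once by a model that never saw it, which is why the pooled residuals give an error estimate that uses all available data.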

Although cross-validation is commonly used in spectrometry, some caution must be taken when interpreting model accuracy because the ability of the model to fit new data depends on how well the training data represent the entire population. An additional indication of model stability can be obtained by comparing the training and testing standard errors from the cross-validation, which should be similar. Another caveat is that MPLS, step-up and stepwise regression assume a linear relationship between leaf reflectance and concentrations of foliar biochemicals and ignore possible nonlinear interactions that can result from multiple scattering effects (Borel & Gerstl 1994).

Results and discussion

Foliar chemical composition

The laboratory NIR spectral analysis revealed that foliar concentration of N across all of the Eucalyptus leaf samples (n = 555) ranged from 0·78 to 1·86% dry matter (DM) (mean = 1·22% DM), AvailN from −0·23 to 1·20% DM (mean = 0·38) and DDM from 0·19 to 0·66 g/g DM (mean = 0·42). We found a positive correlation between AvailN and N (P < 0·001, = 13·18, = 0·49) and between AvailN and DDM (P < 0·001, = 28·91, = 0·78), but no relationship between DDM and N (P = 0·38, = 0·87, = 0·037) based on mature foliage. However, the AvailN concentration of leaves was significantly more variable than total foliar N based on a two-variance F-test (F = 2·83, > 0·001). In some samples, the foliage had negative AvailN. These findings are consistent with previous analyses of these constituents in Eucalyptus foliage and can be attributed to high concentrations of tannins (DeGabriel et al. 2008, 2009; Wallis, Nicolle, & Foley 2010). Total foliar N did not capture the range of nitrogen variation that could be encountered by herbivores, and this highlights the potential importance of measuring AvailN, in addition to total N as an indication of forage quality.

Estimating N, AvailN and DDM with imaging spectroscopy

The best-performing models (criterion of the lowest SECV) from the transformed HyMap spectra are presented in Table 1. The step-up and stepwise regression methods had lower SECV and generally higher R2 values than MPLS. Model fit reached R2 values of 0·60 for N, 0·64 for AvailN and 0·78 for DDM (Table 1 and Fig. 3). The MPLS regressions resulted in a lower agreement between predicted and analysed values for N (R2 = 0·54), AvailN (R2 = 0·55) and DDM (R2 = 0·75). The highest correlation between predicted and analysed values was obtained for DDM.

Table 1.   Results from modeling (MPLS, step-up and stepwise regression with cross-validation) the relationship between the foliar concentrations of nitrogen [N; % dry matter (DM)], available nitrogen (AvailN; % DM) and digestible dry matter (DDM; g/g DM) and the spectral characteristics of Eucalyptus canopies collected with an airborne hyperspectral sensor (HyMap)

| Constituent | Math treatment | Statistical method | SEC | SECV | R2 | Selected wavelengths* (nm) or number of terms (MPLS only) |
|---|---|---|---|---|---|---|
| N | 2,2,2,1 | MPLS | 0·9 | 0·11 | 0·54 | 2 terms |
| N | 2,1,1,2 | Step-up | 0·10 | 0·10 | 0·60 | 1662, 1785, 1476, 1419, 624, 908 |
| N | 2,1,1,2 | Stepwise | 0·10 | 0·10 | 0·58 | 624, 908, 1785, 1662, 1419 |
| AvailN | 2,4,4,1 | MPLS | 0·15 | 0·18 | 0·55 | 3 terms |
| AvailN | 2,1,1,1 | Step-up | 0·16 | 0·16 | 0·64 | 1675, 1330, 2174, 475, 810, 1971 |
| AvailN | 2,1,1,1 | Stepwise | 0·15 | 0·15 | 0·64 | 824, 475, 2174, 1330, 1675, 1462 |
| DDM | 2,4,4,1 | MPLS | 0·05 | 0·06 | 0·75 | 4 terms |
| DDM | 1,1,1,1 | Step-up | 0·06 | 0·07 | 0·77 | 1675, 954, 739, 652, 594, 2297 |
| DDM | 1,4,4,1 | Stepwise | 0·06 | 0·06 | 0·78 | 1558, 969, 652, 594, 1637 |

Models are based on the maximum spectra of pixels collected from individual tree canopies. Continuum-removal analysis and standard normal variate scatter correction and detrending (SNV-detrend) were applied to the full spectrum of all samples before modeling. Math treatment refers to the Savitzky–Golay derivative-based spectral smoothing and includes the derivatives and the number of data points across which the smoothing functions were calculated [e.g. 2,4,4,1 indicates that the second derivative (2) was calculated with a primary smoothing of 4 nm (4) across a gap size of 4 nm (4) and no secondary smoothing (1)]; SEC is the standard error of calibration and SECV is the standard error of cross-validation predictions. The degree of correlation between predicted and analysed values is indicated by the R2-value.

*Wavelengths are listed in the order of selection.

Figure 3.

 Predicted vs. analysed foliar concentrations of nitrogen (N; % dry matter (DM)), available nitrogen (AvailN; % DM) and digestible dry matter (DDM; g/g DM) using step-up (for N) and stepwise (for AvailN and DDM) regression models applied to the transformed HyMap maximum spectra. The degree of correlation between the predicted and analysed values is provided by R2.

The maximum spectra resulted in better models than the mean or median spectra for all foliar constituents, although the mean spectra performed nearly as well in some models (results for maximum spectra only reported in Table 1). The open-canopy structure and pendulous leaves that are common characteristics of eucalypt tree canopies can create variability in albedo. Huang et al. (2004) suggested that the maximum spectra may provide a better representation of the canopy foliage spectra in these situations, and this was consistent with our findings.

Derivative analysis improved the performance of all of our models. We tested multiple combinations of Savitzky–Golay derivative-based math treatments, and no single treatment was identified as clearly superior to other possible combinations for all constituents. This variability in optimal derivative and smoothing treatments among models is reported in other studies that used similar spectrometry methods with laboratory and imaging spectra (e.g. Ebbers et al. 2002; Huang et al. 2004). This is sensible because the reflectance characteristics that correspond to particular foliar constituents have unique signatures that will interact differently with the various derivative and smoothing treatments according to their band depth, location and width.

Although MPLS regression is a favoured method for creating prediction equations with laboratory spectra (e.g. Ebbers et al. 2002), we found that the step-up and stepwise regression resulted in better prediction accuracies with the remotely sensed data. Partial least squares regression compresses the wavelengths into a new small subset of factors (features), and this transformation works well with data that have a high signal-to-noise ratio (SNR; e.g. laboratory data) because the result of extracted information is derived from the complete set of wavebands. However, PLSR does not guarantee the best performance because the transformation is not directly target-oriented. This is particularly an issue with data that have a lower SNR (i.e. remotely sensed data) because the PLS data compression does not remove all of the irrelevant information. Stepwise regression uses the original, best subset of wavelengths for predicting foliar content, which has the advantage of capturing the sensitive bands directly. When the selected bands are of good quality and ignored bands are insignificant, this method can perform better, as appears to be the case with our results.

Wavelength selection in relation to recognized absorption features

Stepwise and step-up regression methods are often criticized for selecting bands that do not correspond with known absorption features (Curran 1989; Huber et al. 2008). However, 95% of the wavebands selected by our stepwise and step-up regression models correspond to causal absorption features reported in the literature (Table 2). Only one waveband (1330 nm) was selected that was not within 25 nm of a recognized absorption feature (Huang et al. 2004), and the majority were within the 12 nm range suggested by Curran, Dungan, & Peterson (2001) as consistent with causal absorption. Notably, the only nonassociated waveband at 1330 nm also was identified in a study that investigated the use of imaging spectroscopy to classify the two major eucalypt subgenera, Eucalyptus (‘monocalypts’) and Symphyomyrtus (‘symphyomyrtles’) (Youngentob et al. 2008, 2011b). This is relevant to our research because Wallis, Nicolle, & Foley (2010) found that monocalypt foliage contains less AvailN on average than symphyomyrtle foliage. The selection of this waveband by AvailN models may relate to a spectral difference between those two subgenera that also correlates to foliar concentrations of AvailN (e.g. Asner & Martin 2008).
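The proximity test used in this comparison (a selected waveband lying within 12 nm or 25 nm of a recognized absorption feature) is simple to express in code. The sketch below uses a small subset of the feature positions listed in Table 2. The function names are hypothetical, and treating both limits as inclusive is an assumption.

```python
def nearest_feature(selected_nm, known_features_nm):
    """Return the known absorption feature closest to the selected waveband."""
    return min(known_features_nm, key=lambda f: abs(f - selected_nm))


def classify(selected_nm, known_features_nm, strict=12, loose=25):
    """Label a waveband by its distance to the nearest known feature.

    Assumption: both distance limits are inclusive.
    """
    gap = abs(nearest_feature(selected_nm, known_features_nm) - selected_nm)
    if gap <= strict:
        return "within 12 nm"
    if gap <= loose:
        return "within 25 nm"
    return "unassociated"


# A subset of the known feature positions (nm) from Table 2.
known = [460, 570, 640, 910, 1420, 1645, 1675, 1780]
for band in (908, 475, 1330):
    print(band, classify(band, known))
# 908 within 12 nm
# 475 within 25 nm
# 1330 unassociated
```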

Table 2. Relationship between the wavelengths selected by the step-up and stepwise regression models reported in Table 1 and associated absorption features within 25 nm

| Selected wavelength (nm) | Known absorption feature (nm) and related biochemical(s) | Absorption mechanism and reference |
| --- | --- | --- |
| 475 | 460 Chlorophyll b | Electron transition (Curran 1989) |
| 594 | 570 Chlorophyll and nitrogen | Electron transition (Penuelas et al. 1994) |
| 624 | 640 Chlorophyll b | Electron transition (Curran 1989) |
| 652 | Chlorophyll a | Electron transition (Curran 1989) |
| 739, 810, 824 | Red-edge (680–800); 800 Tannin | Shift from photo absorption by chlorophyll to photon reflectance by mesophyll (Curran 1989; Filella & Penuelas 1994; Soukupova, Rock, & Albrechtova 2002; Ferwerda, Skidmore, & Stein 2006) |
| 908 | 910 Nitrogen, protein | C-H stretch, 3rd overtone (Curran 1989) |
| 954 | 930 Oil; 948 Tannin | C-H stretch, 3rd overtone (Curran 1989; Ferwerda, Skidmore, & Stein 2006) |
| 969 | 970 Water, starch | O-H bend, 1st overtone (Curran 1989) |
| 1419 | 1420 Lignin | C-H stretch, C-H deformation, O-H stretch, 1st overtone (Curran 1989) |
| 1462 | 1450 Sugar, starch, water, lignin; 1456 Tannin | C-H stretch, C-H deformation, O-H stretch, 1st overtone (Curran 1989; Soukupova, Rock, & Albrechtova 2002) |
| 1476 | 1470 Lignin, tannin; 1480 Cellulose, lignin; 1490 Cellulose, sugar | C-H stretch, O-H stretch, 1st overtone (Curran 1989; Soukupova, Rock, & Albrechtova 2002; Ferwerda, Skidmore, & Stein 2006) |
| 1558 | 1560 Cellulose, lignin | O-H stretch, 1st overtone (Elvidge 1990) |
| 1637 | 1640 Nitrogen, tannin | N-H stretch, 1st overtone, NH3+ NH deformation, 3rd overtone, C-H stretch, 1st overtone (Murray & Williams 1987; Curran 1989; Ferwerda, Skidmore, & Stein 2006) |
| 1662 | 1645 Nitrogen | NH3+ NH deformation, 3rd overtone (Murray & Williams 1987) |
| 1675 | 1675 Tannin; 1690 Lignin, starch, protein | C-H stretch, 1st overtone (Curran 1989; Soukupova, Rock, & Albrechtova 2002) |
| 1785 | 1780 Cellulose, sugar, starch | C-H stretch, 1st overtone, O-H stretch, H-O-H deformation (Curran 1989) |
| 1971 | 1960 Starch, sugar; 1980 Protein | O-H stretch, O-H rotation, N-H asymmetry (Curran 1989) |
| 2174 | 2172 & 2180 Protein, nitrogen; 2175 Tannin; 2179 Phenolic compounds | N-H rotation, C-H stretch, C=O stretch, aromatic C=C bond, 3rd overtone (Curran 1989; Kokaly 2001; Soukupova, Rock, & Albrechtova 2002; Ferwerda, Skidmore, & Stein 2006) |
| 2297 | 2280 Starch, cellulose; 2287 Lignin; 2300 Protein, nitrogen | C-H stretch, CH2 deformation, C-H rotation, C=O stretch, N-H stretch (Curran 1989; Soukupova, Rock, & Albrechtova 2002) |
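The proximity criteria used in this section (within 12 nm for a likely causal association, within 25 nm for an associated feature) can be expressed as a simple lookup. The sketch below uses only a subset of the features listed in Table 2, and the function names and labels are hypothetical; with a partial feature list, a band may be labelled as not associated simply because its nearest feature was left out.

```python
# Known absorption features (nm) from a few rows of Table 2; illustrative subset.
KNOWN_FEATURES = {
    910: "nitrogen, protein",
    970: "water, starch",
    1420: "lignin",
    1560: "cellulose, lignin",
    1640: "nitrogen, tannin",
    1645: "nitrogen",
}


def nearest_feature(band_nm, features=KNOWN_FEATURES):
    """Return (feature_nm, offset_nm) for the closest known feature."""
    feature = min(features, key=lambda f: abs(f - band_nm))
    return feature, abs(feature - band_nm)


def classify(band_nm, causal_nm=12, associated_nm=25):
    """Label a band by its distance to the nearest known feature."""
    _, offset = nearest_feature(band_nm)
    if offset <= causal_nm:
        return "within 12 nm"
    if offset <= associated_nm:
        return "within 25 nm"
    return "not associated"


# Selected wavebands discussed in the text, including the 1330 nm band.
selected = [908, 969, 1419, 1558, 1637, 1662, 1330]
labels = {band: classify(band) for band in selected}
```

With this subset, 1662 nm falls in the 25 nm class (17 nm from the 1645 nm nitrogen feature) and 1330 nm is labelled as not associated, in line with the text.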

The mechanisms (i.e. chemical bonds) responsible for many absorption features have been identified and attributed to their presence in particular chemical compounds, such as tannins, protein and cellulose (e.g. Curran 1989). However, it is important to recognize that similarities in the molecular structure of many chemicals can result in similar associated absorption features (Soukupova, Rock, & Albrechtova 2002). The strong relationship between chemical concentrations and reflectance at some wavelengths also can result from strong intercorrelations among chemicals, rather than direct association with the absorption mechanism (Curran 1989; Soukupova, Rock, & Albrechtova 2002). Many plant chemical compounds (e.g. tannins) have complex structures, and the links between structure and function (e.g. protein precipitating capacity) are not readily discerned. For these reasons, models often select wavelengths that correspond to chemical constituents that differ from the one(s) being estimated. This was true for some of our models as well.

The selection of wavebands that largely correspond to known absorption features by our stepwise and step-up regression models is particularly encouraging because Huber et al. (2008) found that such models performed better when they were extrapolated to new regions than models based on wavelengths that were not associated with known absorption features. Ideally, our models would have been tested on the entirely independent data set collected from tree canopies within our southern flightlines. However, the problems we experienced with the quality of the HyMap spectral data collected over those flightlines required us to omit this external estimate of prediction accuracy. The acceptably low standard errors of cross-validation (SECV), relatively high R2-values, selection of recognized absorption features and general stability among different models for the same constituents indicate that we were able to train models to predict integrated measures of forage quality (e.g. AvailN and DDM) based on canopy spectra collected with an airborne imaging spectrometer. Our SECV and R2-values for N were similar to those reported in the only other study that has estimated N across multiple tree species at an individual crown-level with hyperspectral remote sensing data (Huber et al. 2008).

Many of the wavebands selected by the models occur in the NIR and SWIR regions of the electromagnetic spectrum. A sensor that incorporates these wavelength regions may be important for modelling these foliar constituents with similar levels of accuracy. Although we reported only our best-performing models according to SECV and R2, models with as few as four wavebands achieved R2-values over 0·40 for AvailN, N and DDM. In situations where less accurate measures of these foliar constituents are acceptable (e.g. when assessing relative variations in forest productivity or forage quality across large scales or at coarse resolution), it may be possible to use very simple models composed of a few wavebands (Huang et al. 2004). For example, several vegetation indices have been developed from the combination of two or three wavelengths that correlate with foliar nitrogen and chlorophyll in a number of plant types (Huete et al. 2002; Ferwerda, Skidmore, & Mutanga 2005). Although a few studies have investigated the use of spectral indices for measures of in vitro DDM (e.g. Starks, Zhao, & Brown 2008), more research is needed to determine whether there are wavelengths that are consistently associated with DDM and AvailN in spectra collected from a range of plant types and species.
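As an illustration of what a model built from only two wavebands can look like, the sketch below computes a generic normalized-difference index from a canopy spectrum. The band pair (810 and 652 nm), the reflectance values and the function name are hypothetical; this is not an index validated for AvailN, N or DDM, only the general form that many two-band vegetation indices share.

```python
def normalized_difference(reflectance, wavelengths, band_a_nm, band_b_nm):
    """(R_a - R_b) / (R_a + R_b) using the bands nearest the requested centres."""
    def nearest(target):
        # Index of the sensor band whose centre is closest to the target (nm).
        return min(range(len(wavelengths)), key=lambda i: abs(wavelengths[i] - target))

    r_a = reflectance[nearest(band_a_nm)]
    r_b = reflectance[nearest(band_b_nm)]
    return (r_a - r_b) / (r_a + r_b)


# Hypothetical canopy spectrum (reflectance as a fraction) at five band centres.
wavelengths = [652, 739, 810, 1637, 2174]
spectrum = [0.05, 0.30, 0.45, 0.25, 0.10]
index = normalized_difference(spectrum, wavelengths, 810, 652)
```

An index of this kind would still need to be calibrated against measured foliar chemistry before it could be used to rank crowns or stands.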


Imaging spectroscopy has received considerable attention as a potential tool to estimate plant chemical composition on a landscape scale (Kerr & Ostrovsky 2003; Leyequien et al. 2007; Skidmore et al. 2010). Much research in this area has focused on the ability to estimate foliar N, because of its direct relationship to photosynthetic activity and because N is believed to be a limiting nutrient for many herbivorous species (Kavanagh & Lambert 1990; White 1993; Huang et al. 2004; Majeke, van Aardt, & Cho 2008). However, the total quantity of foliar N does not necessarily reflect the amount of N that can be utilized by herbivores. AvailN is an in vitro measure of forage quality that integrates the influence of tannins and fibre on the amount of foliar N that is available for digestion by herbivores. AvailN may be a more meaningful measure of forage quality than total N for the many herbivorous species that are sensitive to the effects of tannins (Foley & Moore 2005; DeGabriel et al. 2009; McArt et al. 2009; Youngentob et al. 2011a). Our research demonstrated that it is possible to estimate AvailN and DDM, a common measure of forage quality based on foliar concentrations of fibre and lignin, from individual tree-canopy spectra collected with an airborne imaging spectrometer. To our knowledge, only one other study has estimated concentrations of foliar chemicals at an individual tree crown-scale across multiple tree species with imaging spectroscopy (Huber et al. 2008) and ours is the first study to estimate integrated measures of forage quality (AvailN and DDM) in this manner.

Our models were based on the maximum spectrum of multiple pixels manually selected from individual tree crowns. Future research will focus on combining crown-delineation software that incorporates a spectral similarity metric (i.e. Spectral Angle Mapper, Kruse et al. 1993) to identify individual tree crowns within forest canopies (Held et al. 2001) and a recently developed method for automated canopy-pixel selection (Huang et al. 2007). This should enable us to map variations in these foliar constituents across forest canopies and investigate relationships between these measures of forage quality and the presence and abundance of arboreal folivores, such as the greater glider (Petauroides volans).
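The spectral similarity metric mentioned above, the Spectral Angle Mapper, treats each spectrum as a vector and measures the angle between a pixel and a reference spectrum, so that brightness differences alone do not change the result. The following is a minimal sketch of that calculation; the threshold of 0.1 radians, the example spectra and the helper `similar_pixels` are assumptions for illustration, not part of any crown-delineation software.

```python
import math


def spectral_angle(spectrum, reference):
    """Angle in radians between two spectra treated as vectors."""
    dot = sum(s * r for s, r in zip(spectrum, reference))
    norm_s = math.sqrt(sum(s * s for s in spectrum))
    norm_r = math.sqrt(sum(r * r for r in reference))
    cosine = max(-1.0, min(1.0, dot / (norm_s * norm_r)))  # guard against rounding
    return math.acos(cosine)


def similar_pixels(pixels, reference, max_angle=0.1):
    """Indices of pixels whose angle to the reference is below the threshold."""
    return [i for i, p in enumerate(pixels) if spectral_angle(p, reference) < max_angle]


# Hypothetical four-band spectra (reflectance as a fraction).
reference = [0.05, 0.30, 0.45, 0.25]
pixels = [
    [0.10, 0.60, 0.90, 0.50],  # same shape as the reference, twice as bright
    [0.30, 0.28, 0.25, 0.22],  # flat spectrum with a different shape
]
crown = similar_pixels(pixels, reference)
```

Here the brighter pixel with the same spectral shape is retained, while the flat spectrum is rejected, which is the property that makes the metric attractive for grouping sunlit and partly shaded pixels of the same crown.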


The authors thank Dr. Karen Marsh and Dr. Ian Wallis for their assistance with the laboratory component of this research. We thank Patrick Schmidt, Nicole Coggan, Sarah Ugalde, Jeffery Alexander, Paul Daniel, Stewart Archer and Jeff Whiting for their assistance in the field. We also extend considerable appreciation to Dr. Zhi Huang for assisting us with model development using WinISI software. Two anonymous reviewers provided helpful comments that improved earlier versions of this manuscript. This research was made possible by the generous support of The Hermon Slade Foundation, The Wilderness Society, Ecological Society of Australia, NSW Department of Environment and Climate Change, The Australian Commonwealth Scientific and Research Organization (CSIRO) Division of Marine and Atmospheric Research, and The Fenner School of Environment and Society and Research School of Biology at The Australian National University. This research was conducted with the permission of Forests New South Wales (permit CO32438) and National Parks (permit S12036).