Automated estimation of foliage cover in forest understorey from digital nadir images

Authors

  • Craig Macfarlane

    Corresponding author
    1. CSIRO, Private Bag No. 5, Wembley, WA 6913, Australia
    2. School of Plant Biology, Faculty of Natural and Agricultural Sciences, The University of Western Australia, 35 Stirling Hwy, Crawley, WA 6009, Australia
  • Gary N. Ogden

    1. CSIRO, Private Bag No. 5, Wembley, WA 6913, Australia

Corresponding author. E-mail: craig.macfarlane@csiro.au

Summary

1. Understorey vegetation can contribute significantly to the foliage cover and gas exchange of forest and impact the accuracy of remotely-sensed vegetation indices. Visual assessment is a rapid means of estimating cover, but its drawbacks include bias and inconsistency between observers and observation periods and the inability of observers to distinguish between cover intervals smaller than 10%. Consequently, visual assessment is unlikely to reliably detect changes in cover smaller than 10%. Hence, there is a need for rapid and accurate alternatives for estimating understorey cover in forests.

2. Nadir (downward facing) photography is an alternative method that has been applied successfully in agriculture but not tested in forests. Detection of green foliage in images of forest understorey is far more challenging than in images of agricultural crops owing to the heterogeneity of the vegetation, the background and the lighting conditions. We tested several image analysis approaches to classifying and quantifying foliage cover in images taken from a height of 3–4 m above the ground using a digital SLR camera on an extendible pole. To reduce the impact of lighting variations on image processing, we converted red, green and blue (RGB) digital numbers from the RGB image to green leaf algorithm (GLA) values and to the CIE L*a*b* colour model prior to analysis.

3. The most successful classification method (LAB2) utilised the GLA, a* and b* values of each pixel to classify green vegetation using a minimum-distance-to-means classifier. A histogram-corner detection method (Rosin) was superior to other methods when cover was <10%. Between-class variance methods were the least accurate methods. Owing to the spectral complexity of the forest floor, it was necessary to filter noise from the classified images. Further work is needed to separate shades of gray from hues of green in images with sparse cover and coarse woody debris.

4. Synthesis and Applications. We propose that the LAB2 method be used to quantify foliage cover in nadir images of understorey with cover >10% but that the Rosin method be used for cover <10%. The automated methodology that we have developed yields estimates of foliage cover in forest understorey from digital photography that are rapid, objective and at least as accurate as visual assessment (i.e. within 5%). User intervention is limited to quality control. The improved method could be extended to indirect estimation of leaf area index for studies of forest water balance and productivity.

Introduction

Understorey vegetation can contribute significantly to the total foliage cover of forest. For example, Macfarlane et al. (2010) showed that, following forest thinning and regeneration burning, the leaf area of understorey may exceed that of overstorey, and other studies have also demonstrated a significant contribution of understorey to both stand leaf area and gas exchange (e.g. Jarosz et al. 2008; Serbin, Gower & Ahl 2009; Urban et al. 2009). Failure to account for understorey cover can also reduce the accuracy of remotely-sensed estimates of overstorey leaf area (Eriksson et al. 2006). To quantify the contribution of understorey to foliage cover, there is a need for rapid, non-destructive and accurate methods to estimate understorey cover. The most rapid means is visual assessment, but the drawbacks of this method are well documented and include bias and inconsistency between observers and observation periods (Sykes, Horrill & Mountford 1983; Kennedy & Addison 1987), as well as the inability of most observers to distinguish between cover intervals smaller than 10% (Hahn & Scheuring 2003). Hence, the accuracy of visual assessment can be ±10–20% (Sykes, Horrill & Mountford 1983), and repetitive sampling is unlikely to be able to reliably detect changes smaller than 10% (Kennedy & Addison 1987). Forest understorey poses further challenges in that plants may be larger and taller than typically found in agriculture and some rangelands, such that observers must view vegetation side-on rather than from above, making estimation of ‘vertical’ cover problematic. In an extreme example of the errors associated with visual estimates of cover, Crombie (1992) reported visual estimates of foliage projective cover of 50% in jarrah forest understorey but estimated understorey leaf area index (LAI) of only 0·35: Crombie (1992) correctly noted that foliage cover cannot exceed LAI.

In this article, we describe a photographic method for estimating understorey foliage cover from nadir images in forests. Our objective was to develop a method that was at least as accurate as visual assessment but rapid, automated and objective. The method is based around consumer-grade CMOS or CCD-based camera equipment that produces images using the red, green and blue (RGB) colour model. Much previous research into ‘nadir’ photography (i.e. camera aimed towards the ground) has concentrated on agricultural crops, especially for weed detection (Tellaeche et al. 2008; Zheng, Zhang, & Wang 2009) or on rangelands (Booth et al. 2004, 2005, 2006). The size of understorey vegetation also poses challenges for photographic cover measurement; the camera needs to be positioned high above the ground, and a large area needs to be photographically captured to represent the vegetation; hence, the use of frames or shading devices (e.g. Booth et al. 2004) is impractical. Unlike monoculture crops, forest understorey can contain many different plant species in close proximity or overlapping which, combined with the presence of a litter layer, coarse woody debris, charcoal, bare ground and rock, creates a spectrally complex background against which to detect plant objects. Tall plants increase the heterogeneity of illumination, in addition to that caused by overstorey shading, resulting in even greater spectral variation within object classes. Hence, it is essential that image processing and analytical methods for detecting plant pixels in forest understorey deal robustly with variable scene brightness.

One such method for dealing with variable illumination in images of vegetation cover is the green leaf algorithm (GLA) (Louhaichi, Borman & Johnson 2001; Booth et al. 2005), which is calculated as follows:

GLA = (2G − R − B)/(2G + R + B)    (eqn 1)

where R, G and B are the digital numbers (DN) of the RGB channels of the RGB image. The numerator of GLA is also referred to as the Excess Green Index (e.g. Richardson et al. 2009; Zheng, Zhang & Wang 2009). An alternative way to reduce scene brightness variations is to convert images from RGB to colour models that separate chromaticity from brightness (see Gonzalez, Woods & Eddins 2009 for a comparison of colour models). For example, the hue, saturation and luminance colour model was used by Graham et al. (2009) to track budburst and leaf expansion of forest understorey using digital camera images. Separating green pixels from non-green pixels using the hue requires that two thresholds be detected in a tri-modal histogram: one to separate greens from reds and the other to separate greens from blues. This is a more complex and potentially error-prone task to automate than detecting a single threshold. In contrast, the CIE L*a*b* colour model separates a single luminance channel (L*) from chromaticity in two channels, a* and b*, and has been used to quantify greenness of leaves in laboratory studies (Madeira et al. 2003; Koca, Karadeniz & Burdurlu 2006).
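As an illustrative sketch of the index (the study itself used matlab; Python is used here only for compactness), the GLA of Louhaichi, Borman & Johnson (2001) can be computed per pixel as:

```python
def gla(r, g, b):
    """Green leaf algorithm: (2G - R - B) / (2G + R + B).

    r, g and b are the 8-bit digital numbers of one pixel; the result
    lies in [-1, 1]. The denominator is zero only for a pure-black
    pixel, which is mapped to 0 here (such pixels are dark background
    in any case).
    """
    num = 2 * g - r - b  # the Excess Green Index
    den = 2 * g + r + b
    return num / den if den else 0.0
```

A pure-green pixel, gla(0, 255, 0), yields 1·0, and any achromatic pixel (R = G = B) yields 0, consistent with the background rule described later.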

Whatever the colour model, the greatest obstacle to an objective and rapid method is to automatically detect a threshold value that separates vegetation and non-vegetation pixels in images. The GLA has been implemented in the software VegMeasure (Booth et al. 2005, 2006), but this software requires the user to manually determine a threshold value, either for each image or for a subset of images collected in similar conditions. The widely used Otsu method (Otsu 1979) yielded poor results with images for which the similar Excess Green Index had been calculated (Zheng, Zhang & Wang 2009). Our initial investigations revealed that histograms of gray-scale images derived from the GLA algorithm are usually unimodal rather than bimodal, which explains the failure of methods such as Otsu to correctly classify pixels within these images, although Baradez et al. (2004) found the Otsu method suitable for classifying confocal microscopy images with unimodal histograms. Many classification methods have been developed for bimodal histograms with an even distribution of pixels between foreground and background, but they typically provide poor classifications of images with either unimodal histograms or strongly asymmetric pixel distributions (Rosin 2001).

As an alternative to maximum between-class variance methods, we tested a threshold-detection method that was specifically developed for unimodal histograms (corner-detection method, Rosin 2001). We also tested whether it was possible to detect green pixels using the L*a*b* colour model and minimum-distance-to-means algorithms. Early tests revealed that the green and the red channels of some RGB images were saturated in brightly lit regions of images with sharp contrasts between light and shade and that all of the classification methods misclassified some background as foreground owing to the highly variable spectral quality of the background in forest understorey. Hence, we also tested whether underexposing images could reduce saturation and improve image classification, and we evaluated a morphology-based noise filter on estimates of foliage cover.

Materials and methods

The steps involved in the estimation of foliage cover, from image acquisition to pixel classification, are summarised in Fig. 1 and outlined in detail below. All images were processed and analysed using matlab R2010b (Mathworks, Natick, Massachusetts, US) and the Image Processing Toolbox. The matlab script is available as supplementary material (Table S1).

Figure 1.

 Illustrative flow chart of image processing methods tested in this study.

Image acquisition

Digital photographic images of low, ground vegetation were collected in natural jarrah forest at 31 Mile Brook (32°15′S 116°10′E) and in planted native gardens at King’s Park Botanic Gardens (31°58′S 115°50′E) in south-western Australia. Ground vegetation at 31 Mile Brook was relatively sparse (foliage cover < 50%), whereas images of denser ground vegetation were obtained at the botanic gardens. Background at 31 Mile Brook was mainly leaf litter with occasional coarse woody debris, charcoal and rocks. Background at the botanic garden was mainly wood-chips. Nadir images were collected at two exposure settings (automatic and underexposed by one stop) to test the effect of exposure on method performance. In total, 100 images were collected at 50 sites.

To obtain nadir images, a Nikon D80 DSLR camera with an AF Nikkor 24 mm f/2·8D lens (Sendai Nikon Corp., Otawara, Tochigi, Japan) was attached to the top of an extendable aluminium pole via an angled steel bracket such that the camera pointed straight down when the pole was comfortably held at arm’s length with the base of the pole between the operator’s feet (Fig. S1). A bubble level at chest height allowed the operator to check the level before capturing images. Depending on the pole’s extension, the camera was 3·5–4 m above the ground and photographed a ground area of 5–6 m2. The shutter was operated by an infra-red remote control (ML-3) with a 2 s delay. The lens was set to f/11, and the camera was set to Aperture Priority mode; all images were captured using autofocus, automatic white balance and matrix metering at maximum resolution as FINE quality JPEG. The camera’s default colour space, sRGB 1a, was used. An example nadir image is presented in Fig. S2.

Identification of training sets

Four groups of pixels were automatically detected using logical tests based on the DNs of the original RGB image:

Group 1. Background pixels: In the form presented in eqn 1, the GLA yields values between −1 and +1. Following Louhaichi, Borman, & Johnson (2001), we classified any pixel with GLA ≤ 0 as definite background. This is equivalent to classifying as background any pixels satisfying the following rule:

2G ≤ R + B    (eqn 2)

Regardless of subsequent processing, these pixels were always classified as background in the classified image.

Group 2. Dark pixels: We classified pixels with G ≤ 25 as dark background pixels. These pixels were either charcoal or very dark shadows in which it was not possible to reliably distinguish foliage from background; these pixels were also classified as background in the final output.

Group 3. Foreground pixels: We initially classified pixels that satisfied the following rule:

(eqn 3)

as probable foreground (vegetation) pixels. This classification was not final because close inspection revealed that this group contained some background pixels.

Group 4. Uncertain pixels: Pixels that were not members of any group above were initially classified as ‘uncertain’.

We calculated the fractions of background, probable foreground, uncertain, dark and saturated pixels in each image. Although saturated pixels are normally defined as having all three channels (R, G and B) equal to 255, we defined saturated pixels as having G = 255 and either R = 255 or B = 255, because it is only necessary for the green channel and one other channel to saturate for pixel classification to be significantly impacted.
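The channel-based rules above reduce to simple comparisons of digital numbers; a Python sketch (the function names are ours, not from the paper’s script, and the Group 3 foreground rule of eqn 3 is not reproduced here):

```python
def is_background(r, g, b):
    """Group 1 rule (GLA <= 0, equivalent to eqn 2): 2G <= R + B."""
    return 2 * g <= r + b

def is_dark(r, g, b):
    """Group 2 rule: dark background pixels (charcoal or deep shadow)
    have a green digital number of 25 or less."""
    return g <= 25

def is_saturated(r, g, b):
    """Saturation test used in the study: the green channel is clipped
    (G = 255) together with at least one other channel."""
    return g == 255 and (r == 255 or b == 255)
```

Note that under this definition a pixel with G = 255 alone is not counted as saturated, because classification is only significantly affected when green and one other channel clip together.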

Transformation of the RGB image to GLA and L*a*b*

The GLA value of each pixel was calculated in floating-point format from eqn 1 and then rescaled from the interval [−1, 1] to [0, 1]. Throughout the remainder of the study, it is these rescaled GLA values that we refer to as GLA. The mean GLA value of each image was calculated. The GLA histogram is not only frequently unimodal but also very narrow, i.e. it lacks dynamic range; thus, the resulting classified image is very sensitive to the choice of threshold. To increase the dynamic range and the contrast between foreground and background, we applied a linear contrast stretch to the images prior to conversion to 8-bit (Fig. 2). Throughout the remainder of the paper, these contrast-stretched 8-bit GLA values are denoted GLA’.
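The rescaling and stretch can be sketched as follows in Python; a simple min-max stretch is assumed here, as the exact stretch limits used in the study are not stated:

```python
def rescale_gla(v):
    """Rescale a raw GLA value from [-1, 1] to [0, 1]."""
    return (v + 1) / 2

def stretch_to_8bit(values):
    """Linear contrast stretch of rescaled GLA values to 8-bit GLA'.

    Assumes a min-max stretch: the smallest value maps to 0 and the
    largest to 255, widening a narrow unimodal histogram so that a
    threshold can be placed more reliably.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    return [round(255 * (v - lo) / (hi - lo)) for v in values]
```

The stretch changes only the spacing of the histogram, not the ordering of pixels, so any threshold found on GLA’ corresponds to a unique threshold on GLA.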

Figure 2.

 Prior to image classification, the green leaf algorithm (GLA) values of nadir images were contrast stretched then converted to 8-bit resolution (denoted GLA’). The histogram of the GLA images had little dynamic range whereas the GLA’ images had a wide dynamic range. Histograms of images of sparse vegetation were unimodal; a second mode was evident in histograms of images of dense vegetation.

Unlike RGB, an additive model that describes colours by the intensities of the red, green and blue primaries, the CIE L*a*b* colour model (McLaren 1976) separates grayscale information from colour information in three channels: L*, which represents brightness; a*, which represents colour on a green–magenta axis; and b*, which represents colour on a blue–yellow axis (Gonzalez & Woods 2008). Images were converted from Nikon’s sRGB 1a colour space to L*a*b* using algorithms provided in the matlab Image Processing Toolbox and assuming the default white point setting. Note that the objective of the transformation was to separate luminance information from chromaticity information for the purposes of image classification, not to accurately describe the colours in any absolute colour space; the absolute values of a* and b* were not important for this purpose.
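For readers without access to the toolbox, the standard sRGB → XYZ → L*a*b* pipeline (D65 white point) can be sketched in Python; this is the generic textbook conversion, not the toolbox code, and only relative a* and b* values matter for the classification:

```python
def rgb_to_lab(r, g, b):
    """Convert one 8-bit sRGB pixel to approximate CIE L*a*b* (D65)."""
    def linearize(c):
        # undo the sRGB gamma encoding
        c /= 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = linearize(r), linearize(g), linearize(b)
    # linear sRGB to CIE XYZ (D65 primaries)
    x = 0.4124 * rl + 0.3576 * gl + 0.1805 * bl
    y = 0.2126 * rl + 0.7152 * gl + 0.0722 * bl
    z = 0.0193 * rl + 0.1192 * gl + 0.9505 * bl

    def f(t):
        # CIELAB cube-root compression with its linear toe
        return t ** (1 / 3) if t > (6 / 29) ** 3 else t / (3 * (6 / 29) ** 2) + 4 / 29

    fx, fy, fz = f(x / 0.95047), f(y / 1.0), f(z / 1.08883)
    return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)
```

A white pixel maps to L* ≈ 100 with a* and b* near zero, while a pure-green pixel yields a strongly negative a*, which is the property the classifiers below exploit.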

Image classification

Between-class variance

We tested Otsu’s (1979) method for detecting a threshold in a grayscale image based on maximising the between-class variance (equivalent to minimising the weighted sum of the intra-class variances). We used the MIP toolbox function ‘mipbcv.m’ of Demirkaya, Asyali & Sahoo (2008) (http://biomedimaging.org/BookMFiles.aspx). Baradez et al. (2004) found that maximising the between-class variance was a simple and rapid way to classify cell images with unimodal histograms, but Zheng, Zhang & Wang (2009) found that Otsu’s method performed poorly on nadir images of agricultural crops. We applied the method firstly to the whole image (denoted BCV1) and secondly excluding pixels from Groups 2 and 3 (denoted BCV2). If, in the latter case, this resulted in a threshold that lay within the GLA’ range of Group 1 pixels, then the method was re-applied to the pixels in Group 4 only, thus ensuring that a threshold was detected within the range of possible values.
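For reference, a generic Python implementation of Otsu’s between-class variance maximisation over a 256-bin histogram (a sketch, not the ‘mipbcv.m’ code itself):

```python
def otsu_threshold(hist):
    """Return the bin index t that maximises the between-class variance
    of a 256-bin histogram; pixels <= t form the first class."""
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    w0 = 0          # cumulative count of the first class
    sum0 = 0.0      # cumulative intensity sum of the first class
    best_t, best_var = 0, -1.0
    for t in range(256):
        w0 += hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        sum0 += t * hist[t]
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t
```

On a well-separated bimodal histogram the maximum lies between the two modes; on the unimodal GLA’ histograms described above, the criterion is flat and the chosen threshold is unreliable, which motivates the corner-detection alternative in the next section.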

Corner detection algorithm

The corner detection algorithm detects the ‘corner’ of a histogram and is fully described in Rosin (2001); the method is also known as the ‘triangle method’ (Coudray, Buessler & Urban 2010). A straight line is drawn from the histogram maximum to the last empty bin, and the threshold is taken as the point of maximum deviation of the line from the histogram curve. The method has been successfully used to detect homogeneous regions in zenith images of forest canopy (Macfarlane 2011). This method was applied to the histogram of Group 4 pixels only, not the whole image histogram, because initial tests revealed that this increased the likelihood of the method detecting an appropriate threshold, especially in images with dense foliage. The method is denoted ‘Rosin’ throughout the study.
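A compact Python sketch of the corner (triangle) detection, assuming the histogram is supplied as a list of bin counts with at least one non-empty bin; constant factors in the point-to-line distance are dropped because only the argmax matters:

```python
def rosin_threshold(hist):
    """Rosin's unimodal 'corner' (triangle) method.

    Draw a straight line from the histogram peak to the first empty
    bin past the tail, then take as threshold the bin with the
    greatest perpendicular distance to that line.
    """
    peak = max(range(len(hist)), key=lambda i: hist[i])
    end = max(i for i, h in enumerate(hist) if h > 0) + 1  # first empty bin past the tail
    x1, y1 = peak, hist[peak]
    x2, y2 = end, 0
    best_t, best_d = peak, -1.0
    for t in range(peak, min(end, len(hist))):
        # unnormalised distance from (t, hist[t]) to the peak-tail line
        d = abs((y2 - y1) * t - (x2 - x1) * hist[t] + x2 * y1 - y2 * x1)
        if d > best_d:
            best_d, best_t = d, t
    return best_t
```

For a decaying histogram such as [100, 50, 25, 12, 6, 3, 1] the maximum deviation falls at the ‘corner’ where the steep slope flattens into the tail, which is exactly the behaviour that suits images in which foreground pixels are a small minority.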

Minimum-distance-to-means

Mean values of a* and b*, as well as mean values of GLA and GLA’, were calculated for the foreground and background training sets (Groups 1 and 3). Pixels were then classified as foreground or background based on the Euclidean distance from the background and foreground means using three different minimum-distance-to-means classifiers (Lillesand, Kiefer & Chipman 2008 p. 551):

  •  The first classifier was based on the means of the GLA’ only and was denoted GLA’.
  •  The second classifier was based on the means of a* and b* only and was denoted LAB1.
  •  The third classifier was based on the means of a*, b* and GLA and was denoted LAB2.
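All three classifiers share the same core operation; a minimal Python sketch of a two-class minimum-distance-to-means rule over an arbitrary feature vector (one element for GLA’, two for LAB1, three for LAB2; the function name is ours):

```python
def min_distance_classify(pixel_features, fg_mean, bg_mean):
    """Assign a pixel to whichever class mean (computed from the
    training sets, Groups 1 and 3) is nearer in Euclidean distance.

    Squared distances are compared, since the square root does not
    change which mean is closer.
    """
    d_fg = sum((p - m) ** 2 for p, m in zip(pixel_features, fg_mean))
    d_bg = sum((p - m) ** 2 for p, m in zip(pixel_features, bg_mean))
    return "foreground" if d_fg < d_bg else "background"
```

In the one-dimensional GLA’ case this rule reduces to thresholding at the midpoint between the two training-set means, which is why the GLA’ classifier is so cheap to compute.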

Noise filtering

Following classification of pixels via the above methods, any Group 1 or 2 pixels classified as foreground were re-classified as background. To both quantify and correct misclassification of background pixels as foreground, we applied an object-based high-frequency filter to the classified images; foreground objects smaller than 0·05% of the image size were reclassified as background. The size criterion was selected after visual examination of images to determine the minimum object size that was likely to contain living plants rather than misclassified background.
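The object-based filter can be sketched in Python as a flood-fill over a binary mask; 4-connectivity is assumed here (the paper does not state the connectivity used), and the demonstration threshold in the test is arbitrary, while 0·05% is the criterion used in the study:

```python
from collections import deque

def filter_small_objects(mask, min_frac=0.0005):
    """Reclassify 4-connected foreground objects smaller than min_frac
    of the image area as background. mask is a list of lists of 0/1;
    a new mask is returned and the input is left unmodified."""
    rows, cols = len(mask), len(mask[0])
    min_size = min_frac * rows * cols
    out = [row[:] for row in mask]
    seen = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                # flood-fill one connected foreground object
                obj, queue = [], deque([(r, c)])
                seen[r][c] = True
                while queue:
                    y, x = queue.popleft()
                    obj.append((y, x))
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < rows and 0 <= nx < cols \
                                and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                if len(obj) < min_size:
                    for y, x in obj:
                        out[y][x] = 0
    return out
```

Isolated single-pixel specks, the dominant form of noise over a spectrally complex forest floor, are removed, while contiguous foliage objects above the size criterion are retained.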

Image analysis

Following classification, foliage cover was calculated as the fraction of foreground pixels in each classified image. Although we refer to cover estimates as ‘foliage cover’, the field of view was not strictly vertical. Unlike Macfarlane, Grigg & Evangelista (2007) who used a 50-mm lens to obtain nearly vertical cover, we selected a lens with twice that field of view as a compromise between ‘verticality’ and adequate sample size. Note that foliage cover differs from ‘crown cover’ defined as the proportion of ground area covered by the vertical projection of solid crowns (Walker & Tunstall 1981).

Jarrah forest case study

To test the methodology, we took advantage of twelve 40 × 40 m monitoring plots established in jarrah (Eucalyptus marginata) forest at 31 Mile Brook south-west of Perth, Western Australia. In April 2009, sixteen nadir images were collected in each plot c. 10 m apart, along four transects spaced 10 m apart.

Criteria for performance evaluation

No attempt was made to visually estimate foliage cover of the original RGB images owing to the difficulty that observers have in distinguishing between more than ten intervals (Hahn & Scheuring 2003). Instead, the best classification of each image was judged visually, and the foliage cover of the image that best represented the foliage in the original RGB image was taken as the best estimate of foliage cover. The GLA’ image, which highlighted green vegetation, was also used to clarify the existence of green foliage in the original images. If several classified images were indistinguishable (foliage cover usually within 0·05), then the average of their foliage covers was taken as the best estimate. As a result, the reference foliage cover from the best estimate was not independent of foliage cover from the photographic methods and involved some subjective judgement, but it was our experience that the foliage cover of the best classified images was far more accurate and precise than a visual estimate of cover from the original image.

Foliage cover from individual methods was compared against the ‘best estimate’ according to the following criteria:

  •  Fraction of estimates that differed from the best estimate by more than 5%, 10% and 20%.
  •  Root-mean-squared error.
  •  Linear regression: Deviation of the slope from one and the intercept from zero and correlation coefficient.

All statistical analyses were performed using Minitab release 16. Methods were compared on an image basis and, for the case study, also on a plot-average basis using orthogonal regression with the error variance ratio set equal to one.
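The first two criteria are straightforward to compute; an illustrative Python sketch (function name ours), where covers are expressed as fractions:

```python
def evaluate(estimates, best):
    """Return the RMSE of the method's cover estimates against the
    best estimates, plus the fractions of images whose cover differs
    from the best estimate by more than 0.05, 0.10 and 0.20."""
    n = len(estimates)
    rmse = (sum((e - b) ** 2 for e, b in zip(estimates, best)) / n) ** 0.5
    def frac_gt(d):
        return sum(abs(e - b) > d for e, b in zip(estimates, best)) / n
    return rmse, frac_gt(0.05), frac_gt(0.10), frac_gt(0.20)
```

These are the quantities reported as RMSE, Δ0·05, Δ0·1 and Δ0·2 in the results tables (expressed there as percentages of images).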

Results

Noise filtering

Noise, defined as the fraction of small foreground objects, generally decreased with increasing foliage cover from c. 0·04 at ff = 0·2 to <0·01 at ff = 1·0 (Fig. 3). However, for foliage cover < 0·2, there was a tendency for the LAB1 and LAB2 methods to have more noise than the other methods; as much as 9% of the background was incorrectly classified as foreground in extreme cases (e.g. Fig. 4). Photographic exposure had little impact on the amount of noise remaining following classification. Given the susceptibility of most methods to noise in images of sparse vegetation, further comparisons of method performance were based on filtered images.

Figure 3.

 The fraction of small foreground objects (<0·005% of image size) in each image after classification. For clarity only the results of the BCV2 and LAB2 methods, as applied to automatically exposed images, are illustrated.

Figure 4.

 Example of noise in a nadir image taken above sparsely vegetated forest floor, after classification by the BCV2 and LAB2 methods. The original red, green and blue (RGB) image and contrast-stretched green leaf algorithm (GLA)’ image are shown for reference. Foliage cover (ff) is indicated above each classified image. Note also the shadow of the camera operator in the RGB image that is not present in the GLA’ image.

Fractions of uncertain, saturated and dark pixels

Between 10% and 40% of pixels were uncertain in most images (i.e. initially classified into Group 4) unless cover was either very small or very large, in which case <10% of pixels were uncertain. Other than at the extremes, there was little relationship between cover and the fraction of uncertain pixels. Underexposed images contained 3% more uncertain pixels on average than automatically exposed images (P < 0·001, one-way ANOVA). The majority of automatically exposed images contained c. 0·5% saturated pixels, with some images containing up to 2% saturated pixels; underexposed images all contained <0·5% saturated pixels. Only three of the 50 automatically exposed images contained more than 12% dark pixels (green DN < 25), whereas the majority of images contained 10–30% dark pixels when underexposed by one stop.

Comparison of classification methods

The between-class variance method performed poorly unless the image was sub-sampled to remove likely foreground pixels prior to detecting a threshold value. The slope of the BCV1 method (whole image used to detect threshold) was c. 0·5, and the root-mean-squared error (RMSE) was very large compared with other methods (Table 1). In contrast, sub-sampling the images resulted in estimates of foliage cover that were comparable to the other methods tested. Based on the slope and intercept of regressions against the best estimate of foliage cover (Table 1), the BCV2 method was one of the three best methods; however, it had a larger RMSE than the other methods and produced a large number of images whose apparent cover differed from the best estimate by more than 5–10% (Δ0·05 and Δ0·1 in Table 1). The best method overall was the LAB2 method, at least when applied to automatically exposed images, followed by the Rosin method. Although the GLA’ and LAB1 methods had small RMSE, these methods had regression slopes that differed significantly from one, which indicated bias in the methods.

Table 1.   Regression equations (slope and intercept) of foliage cover from each individual method vs. the ‘best estimate’ of foliage cover for the test images, for both automatically exposed and underexposed nadir images (n = 50). All regressions were highly significant. The superscript ‘ns’ indicates slopes that were not different from 1 and intercepts that were not different from zero (P > 0·05). The root-mean-squared error (RMSE) is also given along with the percentage of images whose cover differed by more than 0·05 (Δ0·05), 0·1 (Δ0·1) and 0·2 (Δ0·2) from the best estimate
Method   Slope     Intercept   RMSE    Δ0·05   Δ0·1   Δ0·2

Automatic exposure
 BCV1    0·54      0·040       0·159   72      46     22
 BCV2    1·05ns    −0·013ns    0·046   24      6      0
 Rosin   1·04ns    −0·013ns    0·042   16      4      0
 GLA’    0·92      0·010       0·032   12      2      0
 LAB1    0·93      0·024       0·034   18      2      0
 LAB2    0·97ns    0·013ns     0·027   10      0      0
Underexposed
 BCV1    0·51      0·039       0·164   72      42     24
 BCV2    0·85      0·008ns     0·079   36      10     2
 Rosin   0·98ns    −0·009ns    0·063   22      4      2
 GLA’    0·82      0·017ns     0·076   28      10     2
 LAB1    0·79      0·040       0·079   28      14     2
 LAB2    0·85      0·024ns     0·069   24      6      2

GLA, green leaf algorithm.

Underexposing images did not improve image classification. In fact, the slopes of most regressions deviated further from one and the RMSE increased for all methods, as did Δ0·05 and Δ0·1. Three of the underexposed images were not adequately classified by any of the six methods, whereas at least one method produced a good classification of every automatically exposed image. The Rosin method was the least sensitive to exposure variation, and the BCV2 method was the most sensitive. Closer investigation of images revealed that the LAB1 method, and to a lesser extent the LAB2 method, was poor at detecting green pixels in dark shadows; hence, the smaller slope and greater number of poorly classified images when these methods were applied to underexposed images (Table 1). The GLA’ image often revealed green foliage in shadows but the LAB1 method failed to detect these pixels; this included very dark shadows occasionally cast by the camera operator standing between the sun and the sample area. Incorporating the GLA value as a third chromaticity channel (LAB2 method) somewhat improved the detection of foliage in shadows.

There was a strong relationship between the mean GLA value of automatically exposed nadir images and the best estimate of foliage cover for that image (eqn 4). However, there was considerable scatter in the relationship at large cover values, which appeared to result from variations in ‘greenness’ of the vegetation owing to species differences and the developmental stage of the foliage. Bright green foliage yielded a larger GLA than the same area of darker foliage; some images with similar cover differed in their GLA by more than 0·05. The total range of GLA values was only 0·45–0·65. Underexposure resulted in a small increase in GLA for images with dense cover but not sparse cover (eqn 5).

(eqn 4)
(eqn 5)

Jarrah forest case study

Based on the results from the test images, only three classification methods were applied in the case study: Rosin, GLA’ and LAB2. The BCV methods were omitted owing to their large RMSE, and the LAB1 method was omitted because it was inferior to the similar LAB2 method. In the natural jarrah forest, there was a smaller range of foliage cover than in the test images: only one image out of 192 contained more than 40% cover (Fig. 5). The Rosin method tended to underestimate foliage cover unless cover was small (<10%), while the LAB2 method yielded slightly larger estimates of cover than the GLA’ method (Table 2). Both the LAB2 and GLA’ methods occasionally overestimated foliage cover in the range 0–10%. Similar trends were evident in regressions of plot averages, unless one plot with very little foliage cover was omitted. In the latter case, the LAB2 and GLA’ methods agreed very closely with the best estimate of foliage cover (Table 3 and Fig. 5). Overall, the LAB2 method performed best, inasmuch as it displayed the least bias (for cover > 10%), the largest correlation coefficient and the smallest RMSE (Table 3).

Figure 5.

 Relationship of foliage cover from three methods to that of the best estimate for individual images (top) and for twelve plots (bottom) at 31 Mile Brook. Standard errors of the plot means are shown.

Table 2.   Regression equations (slope and intercept) of foliage cover from each method vs. the ‘best estimate’ of foliage cover (n = 192) for each image from natural jarrah forest. All regressions were highly significant. All slopes were significantly different from 1 and intercepts were different from zero (P < 0·05). The root-mean-squared error (RMSE) is given along with the percentage of images whose cover differed by more than 0·05 (Δ0·05) or 0·1 (Δ0·1) from the best estimate
Method   Slope   Intercept   RMSE    Δ0·05   Δ0·1
 Rosin   0·91    −0·009      0·039   11      2
 GLA’    0·89    0·033       0·041   9       4
 LAB2    0·96    0·025       0·037   8       3

GLA, green leaf algorithm.
Table 3.   Regression equations (slope and intercept) of foliage cover from each method vs. the ‘best estimate’ of foliage cover (n = 192) for each plot from natural jarrah forest. All regressions were highly significant. Slopes that were significantly different from 1 (P < 0·05) are indicated with an asterisk. The plot with the smallest foliage has been excluded for the GLA’ and LAB2 methods (n = 11) but not the Rosin method (n = 12). The root-mean-squared error (RMSE) is also given
Method   Slope   Intercept   R2      RMSE
 Rosin   0·85*   0·000       0·93    0·027
 GLA’    0·92    0·020       0·97    0·026
 LAB2    0·99    0·016       0·99    0·023

GLA, green leaf algorithm.

Discussion

We demonstrated in this study that image analysis of nadir images is at least as accurate and reliable as visual estimation of foliage cover of forest understorey. Importantly, foliage cover estimated from the LAB2, Rosin and GLA’ methods differed from that of the best estimate by more than 10% in <5% of images, whether test images or from the case study. The LAB2 method is the most accurate method overall, but the Rosin method provides more robust estimates of foliage cover when it is <10%. We propose that foliage cover estimates from the LAB2 method be used unless the Rosin method yields foliage cover of <10%, in which case the Rosin estimate should be used. This yields a method that is largely automated and at least as accurate as visual assessment. As with overstorey canopy photography, it is unlikely that any method will be 100% accurate and quality control of image classification will always be necessary, but the method described in this manuscript removes the subjectivity of the operator and increases the precision of understorey foliage cover estimates.

The success of Rosin’s corner-detection method on images with sparse cover was to be expected, given that the method was designed to classify unimodal histograms of images in which foreground pixels make up a small fraction of all pixels. In contrast, the LAB2 and GLA’ methods were sensitive to small variations in background chromaticity in images of very sparse vegetation and were likely to classify gray objects as foreground in such images. The success of the LAB2 method reflects the greater amount of information used to classify images; the method took advantage of pixel chromaticity (a* and b*) as well as the GLA value of each pixel. A surprising result was the good performance of the GLA’ method, which simply used the midpoint on the GLA’ histogram between the mean GLA’ values of the training sets to classify pixels. The similarity of the performance and results of the GLA’ and LAB2 methods indicates that the GLA’ value and the a* and b* chromaticities captured similar information, and lends support to the use of the GLA algorithm for detecting foliage in RGB colour images. The GLA’ method was one of the quickest and simplest of all the methods tested and provided similar results for far less computational effort than the LAB2 method, whose small improvements in performance came at considerable computational cost. Woebbecke et al. (1995) also concluded that indices based on ‘2G-R-B’ were as effective as more computationally intensive methods at separating green foliage from background. The tendency for both methods to detect gray leaves and wood as foreground in images of sparse vegetation is not a significant drawback, because the misclassification is easily detected and foliage cover from the Rosin method can be used as an alternative. Minimum-distance-to-means classifiers are noted for misclassification where spectral classes are close to one another (Lillesand, Kiefer & Chipman 2008, p. 551), which is the case in images of sparse vegetation.
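As an illustration of the midpoint-threshold idea behind the GLA’ method, the following minimal sketch assumes one common form of the green leaf algorithm, (2G − R − B)/(2G + R + B); the exact GLA’ rescaling and training procedure used in the study may differ, and all function names and sample values here are hypothetical:

```python
def gla(r, g, b):
    """Green leaf algorithm value for one pixel: (2G - R - B) / (2G + R + B).

    Returns 0 for black pixels, where the index is undefined.
    """
    denom = 2 * g + r + b
    return (2 * g - r - b) / denom if denom else 0.0

def midpoint_threshold(fg_training, bg_training):
    """Threshold halfway between the mean GLA of foreground and background sets."""
    mean = lambda xs: sum(xs) / len(xs)
    return 0.5 * (mean(fg_training) + mean(bg_training))

def classify(pixels, threshold):
    """Label each (r, g, b) pixel as foliage (True) if its GLA exceeds threshold."""
    return [gla(r, g, b) > threshold for r, g, b in pixels]

# Training GLA values sampled from foliage and background regions (illustrative).
fg = [0.20, 0.25, 0.18]           # green leaves
bg = [0.00, -0.02, 0.03]          # soil, litter
t = midpoint_threshold(fg, bg)    # about 0.107
print(classify([(60, 160, 60), (120, 118, 115)], t))  # [True, False]
```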

The poorer performance of the classification methods in the case study was the result of certain background and vegetation features that were not present in the test images. The occasional gross over-estimation of cover by the LAB2 and GLA’ methods resulted from the presence of coarse woody debris that was gray from weathering and had a GLA value above 0·5. In images with very little green cover, gray wood was detected as green owing to its chromatic similarity to the little green cover present. Gray leaves on the forest floor were also detected as green in this situation. Coarse woody debris was not common in the images and only appeared to be misclassified when there was little vegetation cover. The underestimation of foliage cover by the Rosin method was the result of gray-green foliage that did not show up strongly on the GLA’ image. The LAB2 method detected this foliage (perhaps owing to the aforementioned tendency to detect gray pixels as green), but the Rosin method classified it as background because the GLA’ values of this foliage were similar to those of background pixels and lay to the left of the corner in the GLA’ histogram. This foliage was common in some of the case study monitoring plots, and it was difficult even to assess visually from the images whether this foliage was actually green. Booth et al. (2006) also noted the difficulty of differentiating grays from hues of green in nadir images, and this should be the subject of further investigation to improve automated classification of ground vegetation in digital images.

Although the method we developed coped well with variations in lighting, we still recommend capturing images in overcast conditions where possible to improve image processing. Underexposing images did not improve image processing. There were typically far fewer saturated pixels than dark pixels in automatically exposed images, and underexposing increased the number of dark pixels substantially, which had deleterious effects on image analysis using most methods. As an alternative to underexposing images, we suggest processing 12-bit RAW images rather than 8-bit JPEG images to access the full dynamic range of captured images (Verhoeven 2010). The benefits of this greater dynamic range for processing of canopy images remain to be explored.

It was necessary to contrast-stretch GLA images prior to conversion to 8-bit to reduce the sensitivity of classification to the choice of threshold value. The total range of raw GLA values is quite small: 0·45–0·65 at the extremes for foliage cover ranging from nearly zero to nearly 100%. A similar range was observed by Richardson et al. (2009), who reported a total variation of <70 for the Green Excess Index (the numerator of the GLA algorithm) from complete defoliation to dense canopy, which is equivalent to a maximum range of 0·13 for the GLA if the denominator is assumed to be c. 500 (mid-gray of 127 multiplied by four). Contrast stretching spreads this narrow range across the full 8-bit scale, so that small variations in the choice of threshold value have less effect on estimated foliage cover, providing more robust estimates. To obtain accurate estimates of foliage cover, we also needed to apply an object-based noise filter to remove small misclassified background objects. This is a drawback of the method because the user must specify the object size below which objects initially classified as foreground are reclassified as background; it re-introduces some subjectivity into the method but was essential for accurate separation of vegetation and background. The size criterion will be specific to the lens focal length, the distance of the camera from the vegetation, and the vegetation itself (i.e. leaf size and shape and the degree of clumping of foliage). Hence, it is necessarily a subjective criterion that requires consistent protocols for image acquisition and is vegetation-specific. Users would need to perform their own initial assessment of noise in images to determine whether such filtering is needed and the appropriate threshold object size.
The need for noise filtering is likely to depend on the spectral complexity of the background; in early tests using images of bright green plants in garden beds with homogeneous backgrounds (similar to nadir images of agricultural crops, e.g. Liu & Pattey 2010), we found that no noise filtering was needed and that all methods produced very similar and accurate results.
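The two post-processing steps described above, contrast stretching and object-based noise filtering, can be sketched as follows. This is an illustrative implementation only, assuming a binary foreground mask and 4-connectivity; the stretch limits, size criterion and connectivity used in the study itself may differ:

```python
from collections import deque

def stretch_to_8bit(values, lo, hi):
    """Linearly stretch values in [lo, hi] to integers 0-255, clipping outliers."""
    out = []
    for v in values:
        s = (v - lo) / (hi - lo)
        out.append(max(0, min(255, round(255 * s))))
    return out

def remove_small_objects(mask, min_size):
    """Reclassify 4-connected foreground blobs smaller than min_size as background."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    out = [[False] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            if mask[i][j] and not seen[i][j]:
                # Flood-fill one connected blob, collecting its pixels.
                blob, queue = [], deque([(i, j)])
                seen[i][j] = True
                while queue:
                    y, x = queue.popleft()
                    blob.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < rows and 0 <= nx < cols \
                                and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                # Keep the blob only if it meets the size criterion.
                if len(blob) >= min_size:
                    for y, x in blob:
                        out[y][x] = True
    return out

# Raw GLA values spanning roughly 0.45-0.65 stretch to the full 8-bit range.
print(stretch_to_8bit([0.45, 0.55, 0.65], 0.45, 0.65))  # [0, 128, 255]

# A lone misclassified pixel is removed; the 3-pixel blob survives.
mask = [[True, False, False],
        [False, True, True],
        [False, True, False]]
print(remove_small_objects(mask, min_size=2))
```

Libraries such as scikit-image offer equivalent, faster routines for both steps on real images; the sketch above simply makes the logic explicit.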

A photographic system has numerous advantages over visual assessment. The digital images are a permanent record that can be re-analysed with improved methods as they become available (Booth et al. 2006), and the quality of the cover estimates can be checked by observers who were not present in the field. In contrast, visual assessments in the field are unrepeatable measurements with no opportunity for quality control. In forest understorey, the simple photographic system we developed has another great advantage: the vegetation is viewed from above rather than side-on. Ecologists usually define cover in terms of the vertical projection of foliage, but in tall (1–3 m) vegetation a human observer views the vegetation from the side and must ‘imagine’ the vertical projection of foliage. This is likely to result in even greater errors and biases in cover estimation than in short vegetation, such as some crops and pastures, where the observer can view the vegetation from above. Hence, even the relatively wide-angle lens we used in this study is preferable to judging vertical cover from a side view, even if cover is later assessed visually from photographs in the laboratory rather than by automated image analysis. The greatest source of error will frequently be inadequate sampling (Booth et al. 2006); hence, a system of rapid image collection combined with automated image analysis can produce large amounts of data quickly, saving time in the field and in the laboratory. The operator need not spend time evaluating each quadrat in the field and can therefore sample a larger number of quadrats, nor need the operator evaluate every image in the laboratory, although we recommend that classified images be viewed alongside the originals to check that classification is not grossly in error. As such, the role of the operator is reduced to post-analysis quality control only.

Understorey foliage cover in the 31 Mile Brook catchment rarely exceeded 50%, which is comparable to foliage cover in some rangelands (Booth et al. 2005, 2006). Although we have demonstrated that foliage cover can be obtained accurately from the photographic system, hydrological models and other applications may require LAI rather than cover as an input. Macfarlane, Grigg & Evangelista (2007) described a method for obtaining both foliage cover and crown cover from the same image, and then using a modified version of Beer’s law to estimate LAI. This approach is problematic in understorey images, in which individual plants can be small and well separated with poorly defined crown boundaries, unlike cover images of overstorey, which typically contain a few clumped crowns with relatively dense foliage. Furthermore, the wide variety of life-forms, canopy architectures and leaf angles makes derivation of a single light extinction coefficient problematic. We conclude that in biodiverse understorey it is preferable to develop empirical relationships relating biomass and LAI to foliage cover. Macfarlane et al. (2010) developed such relationships between biomass and LAI, but not cover. Typical biomass of jarrah forest understorey is up to 4 t ha−1 (Boer et al. 2008), from which we calculate an understorey LAI of up to 0·8 based on those models. To a first approximation, LAI can be derived from foliage cover by assuming a ‘typical’ value for crown porosity (porosity = 1 − foliage cover/crown cover). Porosity typically ranges from 0·1 to 0·4 (Pekin & Macfarlane 2009). Assuming a porosity of 0·25 and a light extinction coefficient of 0·6 for understorey (Vertessy et al. 1996) yields an LAI of up to 0·8 for a foliage cover of 0·25, the maximum plot average we measured. This is in broad agreement with values based on biomass from the study of Boer et al. (2008) and gives us confidence that understorey foliage cover estimated from photographic methods can provide hydrologically useful information. We recommend further work to strengthen the relationships between foliage cover, biomass and LAI in eucalypt forest.
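The approximation above can be reproduced as a short worked calculation. We assume the modified Beer’s law takes the form LAI = −(crown cover/k) × ln(porosity), with crown cover recovered from foliage cover and porosity; this is one plausible reading of the Macfarlane, Grigg & Evangelista (2007) formulation:

```python
from math import log

def lai_from_cover(foliage_cover, porosity=0.25, k=0.6):
    """Estimate LAI from foliage cover via a modified Beer's law.

    crown_cover = foliage_cover / (1 - porosity)
    LAI = -(crown_cover / k) * ln(porosity)
    """
    crown_cover = foliage_cover / (1.0 - porosity)
    return -(crown_cover / k) * log(porosity)

# Maximum plot-average cover of 0.25 with porosity 0.25 and k = 0.6
# gives an LAI of about 0.8, consistent with the text.
print(round(lai_from_cover(0.25), 2))  # 0.77
```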
