Principles and methods for automated palynology

Summary

Pollen grains are microscopic, so their identification and quantification have, for decades, depended upon human observers using light microscopes: a labour-intensive approach. Modern improvements in computing and imaging hardware and software now bring automation of pollen analyses within reach. In this paper, we provide the first review in over 15 yr of progress towards automation of the part of palynology concerned with counting and classifying pollen, bringing together literature published from a wide spectrum of sources. We consider which attempts offer the most potential for an automated palynology system for universal application across all fields of research concerned with pollen classification and counting. We discuss what is required to make the datasets of these automated systems as acceptable as those produced by human palynologists, and present suggestions for how automation will generate novel approaches to counting and classifying pollen that have hitherto been unthinkable.

Introduction

Almost two decades ago, Stillman & Flenley (1996) published a concise consideration of the needs of palynology, many of which they envisaged could be satisfied by partial or complete automation. Their ideas were presented largely from a Quaternary palaeoecological perspective; however, the majority of potential benefits highlighted are highly relevant to the other fields of pollen analysis. These needs and benefits fall into four key areas, summarized in Table 1. These needs persist today in most areas of palynology, and if anything are stronger than ever before. Any technique that can improve the speed, efficiency and volume of material processed in research is obviously advantageous. An automated system has the potential to not only save time and money, but also open up further opportunities to propel pollen-based research in directions that would not otherwise be possible.

Table 1. Summary of the four key areas in palynology in need of improvement (after Stillman & Flenley, 1996) and the resultant benefits

Need: Volume
Benefits:
• More pollen grains counted per sample
• More samples counted per ‘batch’ (e.g. increased temporal resolution for vegetation reconstructions or airborne pollen samples, or increased replicates for pollen production or pollen biology studies)
• More sites sampled (where relevant)

Need: Speed/time
Benefits:
• Generate data sooner
• Generate data more efficiently, reduce labour hours

Need: Objectivity and consistency
Benefits:
• Reduce intra- and inter-analyst biasing
• Improve consistency across all levels of pollen analysis, including within individual samples (intra-analyst), within a suite of samples from the same site (typically intra-analyst), between suites of samples from different sites or projects (most likely inter-analyst) etc.

Need: Taxonomic resolution
Benefits:
• Increased precision in pollen-based vegetation reconstructions
• Applications in ‘alpha taxonomy’ (MacLeod et al., 2010)

Since the late 1960s, when Flenley (1968) first highlighted the need for automated palynology, many groups have attempted to automate some part of the quantification and/or identification of pollen grains within a sample. Table 2 presents a selected summary of known published work representing various attempts at partial or complete automation of palynology. Typically, each attempt has been prompted by a technological advance which has offered new potential for grasping the ‘holy grail’ of palynology.

Table 2. Selected attempts at automated palynology published since 1996

Study | Discriminate pollen from nonpollen or locate pollen | Classify (+/− count) | Count only | No. of taxa | Primary technique for data capture | Classifier type (where relevant) | Intended application | Success rate (where given) in percent correct or r value
Bechar et al. (1997) |  |  | x | 1 | Image processing (RL) |  | Plant biology |
Li & Flenley (1999) |  | x |  | 4 | Image processing (TL) | NN | General palynology | 100%
France et al. (2000) | x | x |  | 3 | Image processing (TL) | NN | General palynology | c. 82%
Parker et al. (2000) | x | x |  | 6 | LAMS | Multivariate patch algorithm | Aerobiology |
Boucher et al. (2002) | x | x |  | 4 | Image processing (TL) |  | Aeropalynology | 77%
Ronneberger et al. (2002) |  | x |  | 26 | Image processing (FL) | SVM | Aeropalynology | 92–97%
Li et al. (2004) |  | x |  | 4–13 | Image processing (TL) | LDA, NN | General palynology | 100%
De Sa-Otero et al. (2004) | x | x |  | 3 | Image processing (TL) | MDC | Aeropalynology | > 86%
Treloar et al. (2004) |  | x |  | 12 | Image processing (TL) |  | General palynology | 81–100%
Zhang et al. (2004) |  | x |  | 5 | Image processing (TL) | NN | General palynology | > 97%
Chun et al. (2006) |  | x |  | 3 | Image processing (TL) | many | Aeropalynology | 97%
Ileva et al. (2005) |  |  |  | 4 | Raman microscopy | many | Aeropalynology |
Allen (2006); Holt et al. (2011) | x | x |  | 6 | Image processing (DF) | NN | General palynology |
Kawashima et al. (2007) | x | x |  | 3 | Laser scattering | Scatter plots | Aeropalynology | 0.8
Costa & Yang (2009) |  |  | x | 2 | Image processing (RL) |  | Plant biology | 86%
Dell'Anna et al. (2009) |  | x |  | 11 | FT-IR | k-nearest neighbour, hierarchical cluster analysis | Aeropalynology | 84%
Mitsumoto et al. (2009) | x | x |  | 9 | Flow cytometry & FL |  | Aeropalynology |
Ticay-Rivas et al. (2011) |  | x |  | 17 | Image processing (TL) | NN | Biodiversity, pollination biology | 96%
Surbek et al. (2011) |  | x |  | 8 | Elastically scattered light |  | Aeropalynology | 60–100%
Punyasena et al. (2012) |  | x |  | 2 | Image processing (FL) | Bias-optimisation | Quaternary palynology | > 93%
Kaya et al. (2013) |  | x |  | 19 | Image processing (TL) | Rough Set | Quaternary palynology and taxonomy | 91%
Johnsrud et al. (2013) | x |  |  | 3 | Image processing (TL, FL & AP) |  | General palynology | 97%
Dhawale et al. (2013) |  | x |  | 10 | Image processing (SEM) | NN | General palynology | 85%
Hoshimiya (2013) | x |  | x | 1 | Photoacoustic imaging |  | Aeropalynology |
Nguyen et al. (2013) |  | x |  | 9 | Image processing (TL) | Adaptive boosting | General palynology | 92%

Abbreviations: TL, transmitted light microscopy (bright field); DF, dark field microscopy; RL, reflected light; FL, fluorescence; AP, apotome; FT-IR, Fourier transform infrared; LAMS, laser ablation mass spectrometry; NN, neural network; SVM, support vector machine; LDA, linear discriminant analysis; MDC, minimum distance classifier.

At the time of publication of Stillman & Flenley (1996), progress with automated systems was largely limited to efforts using computer-based texture analysis to classify a limited number of pollen types (Mirkin & Bagdasaryan, 1972; Witte, 1988; Langford et al., 1990; Vezey & Skvarla, 1990). Images used were typically sourced from either a light microscope or a scanning electron microscope, and the classifiers were either statistical or machine learning. Since the start of the 21st century, there has been an acceleration in the amount of research directed at automating palynology. The strategy of image processing of microscope images combined with a statistical or machine learning classifier has been prominent among this more recent research, but other techniques have also been employed. These include fluorescence microscopy, flow cytometry, photoacoustic microscopy, fluorescence spectroscopy, Raman microscopy, Fourier transform infrared (FT-IR), laser ablation mass spectrometry (LAMS) and elastically scattered light.

Progress

Given the wealth of research and development accomplished since the publication of Stillman & Flenley (1996), are we any further along in being able to satisfy the goals that they presented?

Taxonomic diversity and volume

The existing studies on automated classification cited by Stillman & Flenley (1996) had only attempted to classify a maximum of six pollen types. This number was regarded, rightly, as being too small to realistically represent the complexity of pollen counting. Stillman & Flenley (1996) suggested that any automated system would need to be able to classify c. 40 types to be useful (in Quaternary palaeoecology). Despite this, Table 2 shows that only limited progress has been made in extending the number of classes an automated system can cope with. Out of all attempts published post-1996, over 50% have worked with ≤ 6 taxa. Only seven have attempted classification of 10 or more types. Many of the systems are proof-of-concept, whereby the aim was to see if it was even possible to classify pollen by the chosen method, but more progress towards higher numbers of classes might have been expected by now.

Obviously, success rates are also important but, interestingly, they are only loosely correlated with the number of pollen types across these studies. In general it is expected that success rate would be inversely related to the number of taxa, as identifying more types (and more individuals) brings greater opportunities for misclassifications. The highest success rates (100%) have been achieved using image processing combined with a neural network classifier, with a maximum of 13 pollen types (Li et al., 2004). Although this classifies more pollen types than the majority of systems in Table 2, it is still much lower than Stillman & Flenley's (1996) target. Impressive results were produced by Ronneberger et al. (2002), who classified 26 different types of pollen at 92% success rate. More recently, Ticay-Rivas et al. (2011) have achieved even better success rates (>94%) in a 17-type pollen set.

Although 17–26 taxa are an improvement on the 1996 maximum of six, we are still far from the target of 40. Despite this shortcoming, it is possible that within a few years, if not already, automated palynology systems could be applied to certain tasks which will improve the volume of material processed and extend the range of questions addressed, even if the number of identifiable types remains relatively low. Some of these tasks are simple, but are often particularly time-consuming for the human palynologist. Examples include large counts (> 10 000 grains) of a few pollen types, and/or a large number of samples (> 1000) for a low number of taxa. This facility would be useful in fields of palynology where simple information on pollen abundance and/or concentration (regardless of taxon) is needed (e.g. melissopalynology and pollination biology). It may also be useful when working with samples where the analyst is interested only in counting certain key taxa/indicator taxa.

Stillman & Flenley (1996) commented on the need to improve the speed of pollen counting, mostly because of the costs involved. However, minimizing the time it takes to generate datasets obviously contributes to enhancing the rate at which discoveries can be made and reducing time to publication. In terms of progress with speed, it is too early to make firm statements, as few of the systems in Table 2 make any comment on the speed at which counting and classification takes place. In many cases the image processing and classification may, in fact, be quick, but if the capture of the images must be done manually, then there is still some time involved here. Also, because so many of these systems are experimental and may not cope with appropriate numbers of pollen types, and are not yet directly applicable, their current speed is irrelevant. Ideal automated systems could be operated 24/7, providing a potential gain in time for a project over human systems by nearly an order of magnitude (168 h wk−1 for an automated system, 5 d of 4 h each for a human analyst).
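The weekly-hours comparison above can be checked with a line of arithmetic. The sketch below uses only the figures quoted in the text (continuous operation for the machine, 5 d of 4 h each for the human analyst); the variable names are illustrative.

```python
# Weekly counting time available, using the figures quoted in the text.
machine_hours_per_week = 24 * 7   # continuous operation: 168 h
human_hours_per_week = 5 * 4      # 5 days of 4 h each: 20 h

# Ratio of available counting time (machine relative to human).
gain = machine_hours_per_week / human_hours_per_week
print(f"Potential gain in counting time: {gain:.1f}x")
```

The ratio is about 8.4, which is the sense in which the gain is "nearly an order of magnitude". It describes available time only, and says nothing about grains counted per hour.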

In the ideal situation, an automated system should also be faster than a human palynologist, in the sense of counting and classifying more grains per unit time. However, even if an automated system performs at the same speed as a human palynologist, or slower, it should still provide benefits in the time and money saved through the freeing of the palynologist for other tasks.

Higher taxonomic resolution

The need to enhance taxonomic resolution exists in virtually all fields of palynology, and is arguably stronger than ever before. For example, in palaeoecology it is well known that pollen of members of the same genus or even family often exhibits very little morphological variation (e.g. Poaceae), yet these taxa may have vastly different environmental requirements; this often limits the accuracy of palaeoenvironmental reconstructions. It is envisaged that higher taxonomic resolution would be facilitated by the potential for computer-based systems to be more sensitive than the human eye to subtle but systematic morphological variation. This technology is also being called for in general taxonomy and species identification as the volumes of material gathered for identification and taxonomic classification far exceed the capacity to perform the task (MacLeod et al., 2010; Gaston & O'Neill, 2011).

There has been progress in improving taxonomic resolution for problem pollen groups through improved microscopy techniques (Sivaguru et al., 2012), but these are difficult to apply in routine pollen counting. The automated systems, hitherto, have been applied to sets of morphologically distinctive types, usually from separate genera, if not separate families. However, Rodriguez-Damian et al. (2006), Punyasena et al. (2012) and Kaya et al. (2013) have attempted to differentiate between members of the same genus using different techniques, with successes of 89%, > 93% and > 90%, respectively. There is room for improvement here, but potential has been demonstrated for computer-based classifiers to differentiate between morphologically similar taxa, as well as morphologically distinct taxa.

Objectivity and consistency

The need for improved objectivity and consistency is crucial to all branches of science. Palynology is a highly skilled discipline, requiring patience, observation and an eye for detail. Human palynologists do their utmost to ensure that they are consistent, both in terms of the individual palynologist remaining consistent through time, and inter-analyst/inter-lab consistency, facilitated through the use of reference collections, databases and counting protocols (to minimise missing or re-counting grains). Yet there is still a degree of subjectivity in pollen counting – the ‘personal equation’ of Stillman & Flenley (1996). Further to this, inconsistency in the performance of the individual analyst can arise through issues like fatigue and over-familiarity with their samples (MacLeod et al., 2010). By its very nature, a machine-based system should be objective and consistent both in terms of individual performance and inter-system performance. Machines do not suffer from fatigue and cannot change their minds.

Consistency in performance of the system (i.e. repeatability) should be fundamental, yet few reports have been published (but see Holt et al., 2011). Where repeatability has been assessed, machine systems have always been shown to be more consistent than human palynologists.

There is also potential for improved consistency across the discipline, currently an assumed but untested notion. It is expected to increase through removing the influence of the ‘personal equation’, and through the ability to standardize the reference material used as the basis of identification. In nearly all assessed systems (Table 2), classification is facilitated through comparison of ‘data’ that represent the unknown pollen type with equivalent ‘data’ obtained from known (cf. reference) pollen types. Human palynologists operate in the same way, except reference information is a mixture of books, databases and reference slides (of fresh or vouchered herbarium material). Reference slides are probably the best option, but the extent to which they can be shared is limited, with implications for consistency of identification. Automated systems have the potential to base their identifications on exactly the same reference data, shared electronically. Reference data (images) generated by systems located in labs across the world would be uploaded to a global database in a standardized file format, and a ‘master’ library or reference file for that taxon created, downloadable and usable by everyone for classification.

Automated systems can collect and store images of all grains located and classified. This offers the potential of recounting in the future, based on updated reference data, including the potential for finer discrimination, bringing older sites up to the same identification standard as current sites. This possibility scarcely exists at present, as few human analysts have the time or inclination to re-count older in preference to newer material. For very large projects, spread over many years, the possibility of maintaining all counts to the same standard regardless of when first analysed may provide a significant improvement in consistency across the whole dataset.

There has been clear progress towards developing automated palynology systems which will be able to address the needs in palynology described by Stillman & Flenley (1996). Other areas of potential benefit have also been highlighted. Current systems are still only about halfway to the target of 40 taxa, but there has been considerable improvement on the six-taxon classification problems of 1996. Attempts to classify morphologically similar types from the same family/genus have seen promising results. Speed and consistency are not yet of concern, as it does not matter whether systems are fast or consistent if the classification problems they are dealing with are not yet fully representative of ‘real-world’ problems.

What is the ideal system?

The automated systems in Table 2 were typically developed to target specific uses. While some could only be applied in their particular field, many are equally useful in others. Particularly attractive systems are considered to be those that could be universally adopted across the majority of fields of palynology and within all labs. Systems that can be applied to conventional preparations mounted on slides are also preferable, by offering the potential to apply the capacity for increased volume and enhanced taxonomic resolution retrospectively, by re-visiting and re-analysing existing archives of slides.

At face value, number of taxa classified and success rate would be the simplest metrics by which to judge the potential of a pollen counting system. Highest success rates have generally been achieved with a combination of image processing and a machine learning-based (neural network) classifier (e.g. Li et al., 2004; Ticay-Rivas et al., 2011). However, the highest number of taxa classified so far was achieved by Ronneberger et al. (2002), using processed images of fresh pollen collected through a combination of fluorescence and confocal microscopy. This combination is theoretically attractive as 3D volume images overcome issues with viewing angle, images are free from light noise (refraction/diffraction) and the use of fluorescence helps distinguish pollen grains from nonbiological debris and isolates them from their background. Ronneberger et al. (2002) developed their system for an aeropalynological application, with the aim of coupling it to an automatic sampler to produce a real-time pollen monitor. This has now been realised, in the production of the MICROBUS bioaerosol monitor (see http://world-of-photonics.net/link/en/20250274). However, although use of fluorescence is a significant advance in the old problem of separating pollen and nonpollen, the MICROBUS system may not be suitable for automated analysis of pre-existing pollen slides prepared under different techniques/treatments.

Several of the other aeropalynology-focussed systems presented in Table 2 perform direct measurements, and count and classify the pollen continuously as it is drawn out of the air, using techniques such as flow cytometry and autofluorescence. These systems have obvious advantages for their targeted field (quick, real-time counts) but offer limited potential in other areas of palynology where counting and identification is typically slide-based. Airborne samples captured via more traditional methods (e.g. Hirst-type samplers; Hirst, 1952) and mounted on slides are storable and can be re-analysed at a later date where necessary. Such samples can potentially contribute to important research into the impact of climate change on flowering phenology and consequently allergenicity (Ziska et al., 2011). At present, however, the majority of flow-cytometry based systems do not offer the opportunity to (easily) retain and re-analyse airborne pollen samples, nor are these systems readily open to other fields where sample material is extracted from a medium other than air. While there is undoubtedly a pressing need for real-time pollen monitoring and reporting, there is currently still a place for ‘slide-based’ aeropalynology, so that airborne pollen samples can be relevant for more than just aeroallergen research.

If palynology generally is to remain largely slide-based, then it seems probable that the system that is most likely to satisfy the greatest range of needs will be one that combines automatic slide scanning with image processing routines to detect and locate pollen, as well as capturing morphological information from pollen grains, and then classifying them through some sort of artificial classifier.

Any system for automated palynology should, we suggest, be one which requires very little modification of existing techniques for collection and preparation of pollen samples, and should impart no new limitations to what can be done. For example, if a particular automated system were to require that absolutely no nonpollen debris were present in a sample for it to produce accurate results, then the time saved in counting would be absorbed or even exceeded in the extra preparation time needed to clean the samples. However, it may be that some compromises will need to be made to facilitate optimum results from a given system and these may not be too unreasonable. For example, the staining method might be changed to facilitate better feature resolution in an image-processing based system.

Protocols and standards

The whole scientific system of obtaining, evaluating and publishing pollen analyses has developed by practice over a century or more. Automated pollen counts lack this accumulated authority, but any system, to be truly usable, needs to be acceptable to the scientific community, including editors and reviewers. This means establishing protocols to provide quality assurance.

Accuracy and reliability

The goal of such protocols should be to make automated counts at least as acceptable in scientific literature as human counts. Potential systems need to be thoroughly evaluated, their accuracy, precision and consistency assessed, and results of these evaluations published. Automated counts can then be used against a background of known limitations, in exactly the same way as the limitations of human counts are known and accepted. In reality, neither is nor will be perfect, but each represents an approach towards the true, but unknown, actual composition of a sample. Machine counts are more likely to reach these ‘true’ values simply through being able to obtain higher counts. As described by Maher (1980), the confidence interval narrows as the number of grains counted (n) increases. To halve the width of a confidence interval requires increasing n fourfold. This is of no consequence to a machine-based system, assuming material is available in ample supply.
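The relationship between count size and confidence interval width can be illustrated numerically. The sketch below is a minimal example that assumes a simple normal-approximation interval for a single pollen proportion; this is a stand-in chosen for brevity, not the method of Maher (1980), and the function name and the proportion of 0.10 are hypothetical.

```python
import math

def ci_half_width(p: float, n: int, z: float = 1.96) -> float:
    """Half-width of a normal-approximation 95% CI for proportion p from n grains."""
    return z * math.sqrt(p * (1.0 - p) / n)

p = 0.10  # hypothetical proportion of one taxon in the sample
for n in (500, 2000, 8000):  # each count is four times the previous one
    print(n, round(ci_half_width(p, n), 4))
```

Under this approximation the half-width scales with 1/sqrt(n), so each fourfold increase in the count halves the interval, in line with the statement above.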

The performance of a pollen counter (human or machine) can be broken down into two key aspects: finding the pollen grains (discriminating pollen from nonpollen), and identifying them (classification). Failure in one or both of these areas introduces error into the final results. Classification performance for machine systems is easy to assess. But what is an acceptable minimum rate of success for an artificial classifier? 100% is ideal, but arguably unrealistic. It is highly unlikely that an artificial classifier could be developed that would be 100% correct, 100% of the time, simply because it is most likely impossible to train for all possible appearances of a given taxon, particularly damaged or clumped grains. An accuracy of 95% is in our opinion a far more reasonable expectation. An ideal system would also be able to rate the strength of its classifications, with the results provided to the analyst for scrutiny. The analyst would neither need (nor want) to check the classification of every grain. However, a system with an average accuracy of 95% could provide only those 5% of images with the poorest classification strength values to the analyst for checking and reassignment to correct classes if necessary. For example a sample count of 500 grains would require that only 25 grains actually be examined by the analyst. Doubling or quadrupling the count would still only require 50–100 grains to be examined manually (in contrast to 2000 for the same degree of precision). Although this is a departure from complete automation (sensu stricto), it offers a degree of quality control that should be retained in palynology, particularly for instances where novel/unknown taxa occur.
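The review step described above, in which only the weakest classifications are passed to the analyst, might be sketched as follows. This is an illustration under stated assumptions: each grain is assumed to carry a single numeric classification score where higher means stronger, and the function name, the 5% default and the example scores are all hypothetical.

```python
from typing import List

def grains_for_review(scores: List[float], review_fraction: float = 0.05) -> List[int]:
    """Return indices of the grains with the weakest classification scores."""
    n_review = round(len(scores) * review_fraction)
    # Rank grain indices from weakest to strongest score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i])
    return ranked[:n_review]

# Hypothetical scores for a 500-grain count (a shuffled set of values in [0, 1)).
scores = [(i * 37 % 500) / 500 for i in range(500)]
flagged = grains_for_review(scores)
print(len(flagged))  # 25 grains, i.e. 5% of 500
```

A fixed fraction is only one possible rule. A fixed score threshold would instead let the number of grains checked vary with the difficulty of the sample.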

Performance in the area of finding pollen grains is potentially more difficult to quantify accurately, depending on the nature of the system in use. The true challenge here lies in knowing what is present in a test sample in the first place. Human counts can provide a baseline, but have their own inherent errors (MacLeod et al., 2010). Precision manufacturing of designer test slides is a possible option here. But in practice, the best way of assessing accuracy and precision will likely be thorough comparison with human counts of the same material. However, as human counts are also < 100% accurate, human and machine counts of the same material are always likely to differ by some amount. The key is minimizing this difference. If a system can be shown to produce counts in which the difference between its counts and the human counts falls within the standard error of the sample counted, then it should be regarded as performing as accurately as a human palynologist, with the potential added benefit of narrower confidence intervals through increased counts.
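One possible reading of the acceptance criterion above is sketched below for a single taxon. It assumes that counts are compared as proportions and that "the standard error of the sample counted" is the binomial standard error of the human proportion; both are interpretive assumptions, and the function name and example counts are hypothetical.

```python
import math

def within_standard_error(machine_taxon: int, machine_total: int,
                          human_taxon: int, human_total: int) -> bool:
    """True if the machine proportion lies within one SE of the human proportion."""
    p_machine = machine_taxon / machine_total
    p_human = human_taxon / human_total
    # Binomial standard error of the human estimate for this taxon.
    se_human = math.sqrt(p_human * (1.0 - p_human) / human_total)
    return abs(p_machine - p_human) <= se_human

# Human count: 100 of 500 grains; machine counts: 410 and 500 of 2000 grains.
print(within_standard_error(410, 2000, 100, 500))  # True: proportions differ by 0.005
print(within_standard_error(500, 2000, 100, 500))  # False: proportions differ by 0.05
```

In practice the comparison would be repeated for every taxon in the sample, and a multi-taxon test could be substituted for this per-taxon check.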

Data handling protocols

Once a system has been suitably tested and adopted, protocols for data storage, handling and sharing will need to be followed. Reference images and image libraries should be shared globally, facilitating the realization of maximum objectivity and consistency, as previously discussed. These libraries need to be developed using consistent (or convertible) image file formats, so that all systems can share the same base reference material, contributing to the goal of increased consistency. The same libraries should also store the images collected from pollen counts, so that they can be re-run in the future, perhaps with improved hardware or software for analysis. All images collected (reference or otherwise) should be tagged with as much information as possible. This would include data produced through the counting and classification process (i.e. co-ordinates of object on slide, classification/ID, classification score), as well as a wealth of additional information which would vary depending on the type of sample the image is from (Table 3). Access to this information for what would eventually be millions of individual images would open the door to addressing many important research questions through simple database querying. An example of such a question is ‘how does pollen morphology vary through space and time?’ There have been reported instances of intraspecific variation in pollen morphology which could be related to geographic or temporal factors (Dajoz, 1999; Ejsmond et al., 2011) and, as suggested by Flenley (2003) using the example of Alnus, such cases may offer opportunities for using pollen morphology itself as an additional proxy for environmental change. Databases for pollen morphological information and imagery already exist (www.paldat.org; Bush & Weng, 2007), but image capture, data tagging and upload are very much manual processes. Machine systems could do this automatically, for every grain in a sample if necessary, greatly increasing the amount of data available for analysis.
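The kind of tagged record and simple query described above could look something like the following. This is a hypothetical sketch, not an existing database schema: the class, field names, file names and sample values are all invented for illustration, and a real system would presumably use a proper database rather than an in-memory list.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class GrainRecord:
    """One imaged object, tagged with classification output and sample metadata."""
    image_file: str
    slide_x_um: float          # co-ordinates of the object on the slide
    slide_y_um: float
    taxon: str                 # classification/ID
    score: float               # classification score
    sample_info: Dict[str, str] = field(default_factory=dict)  # e.g. Table 3 items

def query(records: List[GrainRecord], taxon: str, **sample_filters: str) -> List[GrainRecord]:
    """Return records of a taxon whose sample metadata match every given filter."""
    return [r for r in records
            if r.taxon == taxon
            and all(r.sample_info.get(k) == v for k, v in sample_filters.items())]

records = [
    GrainRecord("g0001.png", 1200.0, 340.5, "Alnus", 0.98,
                {"discipline": "paleopalynology", "site": "Lake A"}),
    GrainRecord("g0002.png", 1310.2, 355.0, "Poaceae", 0.71,
                {"discipline": "paleopalynology", "site": "Lake A"}),
    GrainRecord("g0003.png", 88.0, 912.3, "Alnus", 0.93,
                {"discipline": "aeropalynology", "site": "Roof sampler"}),
]
alnus_paleo = query(records, "Alnus", discipline="paleopalynology")
print([r.image_file for r in alnus_paleo])
```

Keeping the sample-level metadata as key-value pairs lets each field of palynology attach its own items without changing the record structure.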

Table 3. Information that should be tagged to images/data of pollen grains captured by an automated system for different fields of palynology. This information could be entered to the system at the start of analysis of a sample and then tagged on to every digital record (i.e. image of a pollen grain) produced from that sample

Information type | Paleopalynology | Melissopalynology | Aeropalynology | Pollination biology | Reference material | Forensic palynology
Temporal info | Likely age of sample (e.g. from 14C dating); date of sample collection | Date of collection from hive; time period honey produced over | Date and time of collection | Date of collection | Date of collection | Date of collection
Spatial info (lat., long., elevation) | Sample site location; site type | Hive location | Sampler location | Plant/pollinator sample location | Plant location | Sample location
Preparation | Chemical treatments, mounting medium, staining | Chemical treatments, mounting medium, staining | Chemical treatments, mounting medium, staining | Chemical treatments, mounting medium, staining | Chemical treatments, mounting medium, staining | Chemical treatments, mounting medium, staining
Field-specific info (examples) | Sediment characteristics (peat, lake mud etc.) | Honey characteristics (monofloral, mixed, etc.) | Weather conditions during time sample collected | Pollinator type (bee, bird, etc.) | Herbarium info, voucher number | Case number; crime details; origin of sample

A protocol for publication of automatically generated pollen data is also required, which should enable exact repetition of the analysis through accessing the samples or images and libraries, and running with the same (or alternate) equipment and/or software. This would require providing basic information on the hardware and software used, along with reference library and sample information (Table 4), which would, again, ideally be available as supplementary information. With respect to libraries upon which automatic identifications are based, images included in such libraries should all be independent (of different grains, not different views of the same grain), preferably including grains collected from a number of different flowers, plants and localities, and should be tagged with appropriate information (as per Table 3).

Table 4. Information to accompany publication of automatically generated pollen data

Counting system
• Hardware used (manufacturer, version, serial numbers)
• Software used (developer, version)

Library
• Details of the ‘library’ used as the basis for pollen identification, including:
  • The number of images in the library
  • Source information for each image (as per Table 3)

Samples
• Unique slide/sample identifiers
• Physical location of slides and residues
• Ideally, link to online repository where images/data from sample scans can be obtained

Conclusions

The automation of pollen counting appears now to be within reach. A system which is fast and can deal with an unlimited number of taxa, as well as broken, deformed and clumped pollen, is perhaps still some way off, but certain simple, time-consuming tasks can already be carried out, and more complex problems are likely to be addressed within the next few decades.

Automated systems should be treated as tools to maximise research capability, rather than to replace the human palynologist. They should allow the widening of the scope of pollen analysis into areas that the human palynologist cannot reach, but the ability for external accuracy checking by human analysts is a key feature of any such system (Holt et al., 2011). Now is the time to begin thinking about the protocols that may need to be adopted in order to maximize the benefits offered by automated systems, particularly those in the realm of consistency.

Acknowledgements

We wish to thank three anonymous reviewers for their comments and suggestions which improved the manuscript.
