From leaf to label: A robust automated workflow for stomata detection

Abstract Plant leaf stomata are the gatekeepers of the atmosphere–plant interface and are essential building blocks of land surface models as they control transpiration and photosynthesis. Although more stomatal trait data are needed to significantly reduce the error in these model predictions, recording these traits is time‐consuming, and no standardized protocol is currently available. Some attempts were made to automate stomatal detection from photomicrographs; however, these approaches have the disadvantage of using classic image processing or targeting a narrow taxonomic entity which makes these technologies less robust and generalizable to other plant species. We propose an easy‐to‐use and adaptable workflow from leaf to label. A methodology for automatic stomata detection was developed using deep neural networks according to the state of the art and its applicability demonstrated across the phylogeny of the angiosperms. We used a patch‐based approach for training/tuning three different deep learning architectures. For training, we used 431 micrographs taken from leaf prints made according to the nail polish method from herbarium specimens of 19 species. The best‐performing architecture was tested on 595 images of 16 additional species spread across the angiosperm phylogeny. The nail polish method was successfully applied in 78% of the species sampled here. The VGG19 architecture slightly outperformed the basic shallow and deep architectures, with a confidence threshold equal to 0.7 resulting in an optimal trade‐off between precision and recall. Applying this threshold, the VGG19 architecture obtained an average F‐score of 0.87, 0.89, and 0.67 on the training, validation, and unseen test set, respectively. The average accuracy was very high (94%) for computed stomatal counts on unseen images of species used for training. 
The leaf‐to‐label pipeline is an easy‐to‐use workflow for researchers of different areas of expertise interested in detecting stomata more efficiently. The described methodology was based on multiple species and well‐established methods so that it can serve as a reference for future work.


| INTRODUCTION
The study of ecosystem functioning requires a thorough understanding of the physiological processes of organisms occurring at the individual level. Organisms can be defined in terms of their functional traits, which are the phenotypic characteristics that are related to the fitness and performance of an organism. The spatial distribution of these functional traits in combination with environmental conditions constitutes the global diversity in ecosystem functioning and is therefore an essential building block of land surface models (LSM). LSM are essential for estimating transpiration and photosynthesis from vegetated surfaces (Jefferson, Maxwell, & Constantine, 2017), the dominant component of global land evapotranspiration, and are a key component in models for operational predictions of the near-climate (Kushnir et al., 2019; Bertolino, Caine, & Gray, 2019). Transpiration in an ecosystem, in essence, occurs at the individual leaf surface where stomata function as "gates" between deep-soil water reservoirs and the atmosphere. Leaf stomata are microscopic pores surrounded by two guard cells ranging from approximately 10 to 100 µm in length. They control the balance between water loss and CO2 uptake by the leaves and therefore have an important effect on the global carbon and hydrologic cycle (Berry, Beerling, & Franks, 2010; Steinthorsdottir, Woodward, Surlyk, & McElwain, 2012; Wang et al., 2015). Moreover, as stomatal traits show a clear response to environmental parameters such as climate (Liu, He, et al., 2018) and atmospheric carbon dioxide concentrations (e.g., Woodward, 1987; Tanaka, Sugano, Shimada, & Hara-Nishimura, 2013), they are key proxies of environmental change (Hetherington & Woodward, 2003).
Stomatal conductance (gs), defined as the uptake rate of carbon dioxide or water vapor loss through the stomata of a leaf, is an elemental parameter in the LSM linking plant water use and carbon uptake (Kala et al., 2016) and is constrained by and derived from the size and density of the leaf stomata (Drake, Froend, & Franks, 2013). It is well known that (maximum and minimum) stomatal conductance, as well as stomatal size, density, and rate of response, varies widely across plant species. Recent efforts have mapped stomatal behavior globally (e.g., Lin et al., 2015), yet more detail is needed, as including more interspecific trait variation in climate models could significantly reduce the error in model predictions (Butler et al., 2017; Wolz, Wertin, Abordo, Wang, & Leakey, 2017). To be useful in global-scale mapping, functional traits should be relatively easy and inexpensive to measure in a large number of taxa using a standardized protocol (Cornelissen et al., 2003; Dawson et al., 2019; Moretti et al., 2017; Pérez-Harguindeguy et al., 2013). Recording stomatal traits is widely considered to be labor-intensive and time-consuming and is, to this day, mostly performed manually (e.g., counting stomata through the microscope) and therefore not replicable. The aspects of the methodologies currently used that add to the cost and intensity of the labor are (a) the preparation of the leaves to be viewed with a microscope, (b) the number of replicates needed to account for the intraindividual variation in stomatal traits, and (c) the measurements themselves, either counts or size measurements.
Few methods to automate the detection of and measurement on stomata have been reported in the literature, and in most cases, they consist of conventional image processing using algorithms that have to be tweaked to the specific task at hand. Scarlett, Tang, Petrie, and Whitty (2016), for instance, apply maximally stable extremal regions to detect potential ellipses of stomata on microscope images of vine leaves, while da Silva Oliveira et al. (2014) use Gaussian filtering and a series of morphological operations to detect stomata on optical microscope imagery of five different plant species. Duarte et al. (2017) use wavelet spot detection in tandem with standard image processing tools to segment stomata on one plant species, and Higaki et al. (2014) combine a genetic algorithm and self-organizing maps, coined clustering-aided rapid training agent, for the detection of stomata on fluorescently labeled cell contour images of the leaf epidermis of Arabidopsis leaves. A series of other papers relies on classifiers for detecting stomata. Vialet-Chabrand and Brendel (2014) report on the use of a cascade classifier for rapid assessment of the density and distribution of stomata on the leaves of two oak species. After training a Haar feature-based classifier with exemplary stomata, stomata can be detected with high accuracy on SEM microphotographs. Jayakody, Liu, Whitty, and Petrie (2017) use a cascade object detection learning algorithm to correctly identify multiple stomata on rather large microscopic images of grapevine leaves, but also apply a combination of image processing techniques to estimate the pore dimensions of the stomata that were detected with the cascade object detector. Typically, the applied classic image processing techniques are based on handcrafted features for the detection and segmentation of the desired stomata. While these techniques perform well on one specific plant species, they do not generalize to other species.
An answer to the limitations of classical image processing techniques came from the field of neural networks with the introduction of deep learning. In a significant breakthrough, Krizhevsky, Sutskever, and Hinton (2012) showed that deep learning was capable of achieving record-breaking results for object recognition.
Deep learning allows computational models that are composed of multiple processing layers to learn representations from raw data with multiple levels of abstraction (LeCun et al., 2015; Najafabadi et al., 2015). Since then, deep learning has been quickly adopted by the vision community, which led to state-of-the-art results for the prediction of galaxy pictures (Dieleman, Willett, & Dambre, 2015), face recognition (Parkhi, Vedaldi, & Zisserman, 2015), or the detection of anatomical structures (Shen, Wu, & Suk, 2017; Hoo-Chang et al., 2016). Its application is now being explored in different fields of biology including plant phenotyping (e.g., Pound et al., 2017) and taxonomy (e.g., Wäldchen and Mäder, 2018). The very recently published work of Fetter, Eberhardt, Barclay, Wing, and Keller (2019), covering 82 angiosperm families, is a good example of the potential of using deep learning (DL) for stomata counting. LeCun et al. (2015) state that all of these successes in deep learning can be explained by the increase in computing power via GPUs, the ease with which data can be collected, and various improvements in neural network techniques.
Moreover, with the advent of deep learning toolboxes such as Keras (Chollet, 2015), deep learning also became accessible for noncomputer scientists. Although deep learning can outperform other machine-learning algorithms, training data are needed. Despite the important function of stomata, no standardized methodology has yet been described to measure stomatal traits such as stomatal density and size. The handbook of protocols for the measurement of plant functional traits (Cornelissen et al., 2003) highlights the importance of stomata as hard functional traits; however, it does not include any standardized advice on how to prepare, image, and count them, while there is a clear need for such advice in the framework of global efforts on the one hand (Lin et al., 2015), and to feed our deep learning networks on the other hand.
Finally, the recent paper by Christin, Hervet, and Lecomte (2019) highlights the importance of guidelines and recommendations to help ecologists get started with deep learning. Although deep learning has proven its potential in a lot of disciplines, developing a deep learning solution is not yet a trivial task. The authors strongly advocate a stronger interaction between computer scientists and ecologists.
However, given the diversity of stomatal shapes and sizes among plants (there are more than 400 families in the angiosperms alone; Haston, Richardson, Stevens, Chase, & Harris, 2009), the variation in techniques for making stomatal impressions (Gitz & Baker, 2009), and the different kinds of imaging techniques available (optical, fluorescence microscopy, and SEM), there is a clear need for researchers to be able to understand the entire process "from leaf to label" and to tweak this to their own needs. Here, we describe a step-by-step guide of a pipeline of actions we have developed from leaf preparation to microscope imaging that is easy, inexpensive, and yields sufficient image quality to train and use the DL network. The objective of this paper is therefore twofold.
(a) We provide a methodological protocol aimed at standardizing sample preparation as well as imaging of stomata. The rationale is to facilitate comparability and usability across studies for revealing patterns and mechanisms by increasing the reliability and predictive power of stomatal counts. More specifically, we outline an accessible methodology to obtain stomatal counts "from leaf to label" that can be applied beyond a laboratory setting and is also suitable for educational purposes. (b) We present a tutorial-styled, detailed, and replicable methodology for automatic stomata detection with deep neural networks and show its applicability across the phylogeny of the angiosperms. Our aim is to motivate researchers from the ecology and evolution community to consider deep learning techniques for the automation of their workflows.

| Specimens and species
We used mounted specimens from the African herbarium collection of Meise Botanic Garden which contains approximately 500,000 herbarium specimens from Burundi, Rwanda, and Democratic Republic of the Congo, representing more than 80% of the existing collections from these countries (Stoffelen P., pers. comm., 2019).
Five fully developed leaves per specimen were carefully detached and remounted afterward. The species for algorithm training were selected in the context of studying the effects of global change on the central African forest vegetation (Bauters et al., 2020). The specimens used here mainly came from common tropical (timber) tree species such as Cola griseiflora, Mammea africana, and Erythrophleum suaveolens, which are well represented in the collection and were recurrently collected throughout the last century (1902-2013; for complete species list, see Table S1). These herbarium specimens were collected at the Yangambi Biosphere Reserve, situated within the Congo River Basin west of the city of Kisangani in the Democratic Republic of the Congo.

| Leaf prints
Surfaces of plant leaves are very variable in texture, so depending on the species of study one may want to try different approaches for visualization of the stomata. Many methods for looking at stomata exist and can be categorized into two classes: the use of fresh leaf material or imprints. Important criteria for choosing a method are as follows: toxicity, availability of a laboratory space, negative versus positive image, effect on stomatal movements, preferred or available microscopic visualization technique (e.g., light microscopy, scanning electron microscopy), slide preservation, damage to the leaf tissue, and ease of use. For this study, we opted for the traditional nail polish method because we needed a nondestructive technique for generating leaf prints, as we are dealing with valuable historical dried plant material. However, for some thin-leaved species, this method was not suited. Also, for our purposes, the preserving and archiving of impressions were of minor importance compared to the ease of use, as was the need for a positive replica or the effect of the technique on stomatal movements. For a comprehensive comparison of the quality of different techniques for creating stomatal impressions in combination with brightfield microscopy, we refer to Gitz and Baker (2009). Epidermal leaf impressions were made from the abaxial side of the leaves in the middle of the leaf, between the midvein and edge. This region of the leaf has been shown to contain guard cell lengths and stomatal densities comparable to the means of the entire leaf (Beaulieu, Leitch, Patel, Pendharkar, & Knight, 2008, and references therein). Transparent nail polish (Bourjois Crystal ball) was used to make the impressions which, once dried, were mounted pointing upward with double-sided tape (Scotch™) on a microscope slide.
A single photomicrograph was created by stacking of several digital images taken at different focal planes to increase the depth of the resulting image. All stomata that fell entirely within the view field were counted and converted to stomata per square millimeter to obtain stomatal density.
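The conversion from a per-image count to stomatal density can be sketched as follows. This is a minimal illustration only: the function name and the field-of-view dimensions in the example are hypothetical and are not taken from the microscope settings used in this study.

```python
def stomatal_density(count, fov_width_um, fov_height_um):
    """Convert a per-image stomatal count to stomata per square millimeter.

    The field-of-view dimensions are given in micrometers; the values used
    below are hypothetical and must be replaced by the calibrated size of
    the actual view field.
    """
    if fov_width_um <= 0 or fov_height_um <= 0:
        raise ValueError("field-of-view dimensions must be positive")
    area_mm2 = (fov_width_um / 1000.0) * (fov_height_um / 1000.0)
    return count / area_mm2


# Hypothetical example: 42 stomata counted in a 400 µm x 300 µm view field.
density = stomatal_density(42, 400.0, 300.0)
print(f"{density:.1f} stomata per mm^2")
```

Only stomata lying entirely within the view field enter the count, so the area used here should be the full imaged area rather than a cropped region.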

| Deep learning approach
A basic deep learning architecture is depicted in Figure 1c. It consists, from left to right, of an input layer, multiple stacked convolutional and pooling layers, a fully connected feedforward neural network, and an output layer. By alternating convolutional and pooling layers, the (raw) input (e.g., an RGB image) is progressively transformed into more abstract representations. To this end, the convolutional layers convolve the input feature maps with a set of learnable filters (i.e., nonlinear transformations) to produce a stack of output feature maps (Zeiler & Fergus, 2014). The pooling layers are used to reduce the dimensionality of the feature maps by computing some aggregation function (typically the maximum or the mean) across small local regions of the input (Boureau et al., 2010). This results in a hierarchical set of features where higher-level (more abstract) features are defined in terms of lower-level (less abstract) features (Najafabadi et al., 2015).
The resulting feature maps are then concatenated and fed into a stack of fully connected neural layers to map these features onto the desired output.
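To make the alternation of convolution and pooling concrete, the following sketch runs a single filter over a toy image in plain Python. It is an illustration under simplifying assumptions (one channel, one hand-set filter, no learning) and not the implementation used in this work, which relies on Keras; all names are hypothetical.

```python
def conv2d_valid(image, kernel):
    """Slide one kernel over a 2-D image without padding.

    As in most deep learning libraries, this is technically a
    cross-correlation (the kernel is not flipped).
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


def relu(fmap):
    """Element-wise nonlinearity applied after the convolution."""
    return [[max(0.0, float(v)) for v in row] for row in fmap]


def max_pool(fmap, size=2):
    """Non-overlapping max pooling; incomplete border windows are dropped."""
    rows, cols = len(fmap) // size, len(fmap[0]) // size
    return [[max(fmap[r * size + i][c * size + j]
                 for i in range(size) for j in range(size))
             for c in range(cols)]
            for r in range(rows)]


# Toy 5x5 image containing a vertical edge, and a filter responding to it.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
kernel = [[-1, 1],
          [-1, 1]]
features = max_pool(relu(conv2d_valid(image, kernel)))
print(features)
```

The 5-by-5 input is reduced to a 4-by-4 feature map by the convolution and to a 2-by-2 map by the pooling step, with the strongest response at the location of the edge.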
Deep neural networks come with a lot of trainable parameters: an order of magnitude of a hundred million parameters is not uncommon. In order to properly adjust the weights, gradient descent in combination with the backpropagation procedure can be used (LeCun et al., 2015). By applying the chain rule on the stacked layers on both the convolutional and fully connected layers, the gradient of the objective with respect to the input can be computed. The backpropagation equation can be applied repeatedly to propagate gradients through all modules, starting from the output at the top (where the network produces its prediction) all the way to the bottom (where the external input is fed) (LeCun et al., 2015). Today, adapted versions of the gradient descent optimization algorithm are used (see Ruder, 2016, for an overview). A particularly popular optimizer is Adam (Kingma & Ba, 2014), an adaptive learning-rate method with bias correction and momentum.
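The Adam update rule can be written down in a few lines. The sketch below applies it to a single scalar weight and a toy quadratic objective; the default hyperparameters follow Kingma and Ba (2014), whereas the objective, the learning rate of 0.05, and the number of steps are arbitrary choices made for illustration.

```python
import math


def adam_step(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar weight w at (1-based) step t."""
    m = beta1 * m + (1 - beta1) * grad          # momentum: running mean of gradients
    v = beta2 * v + (1 - beta2) * grad * grad   # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)                # bias correction of both moments
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (math.sqrt(v_hat) + eps)  # adaptive step size
    return w, m, v


# Toy objective f(w) = (w - 3)^2 with gradient 2 * (w - 3).
w, m, v = 0.0, 0.0, 0.0
for t in range(1, 2001):
    grad = 2.0 * (w - 3.0)
    w, m, v = adam_step(w, grad, m, v, t, lr=0.05)
print(w)
```

In a deep network the same update is applied to every weight, with the gradient supplied by backpropagation instead of the closed-form expression used here.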
Because of their proven capabilities and state-of-the-art results in many domains, deep neural networks are popular. However, due to their huge amount of trainable parameters, overfitting to data remains a major challenge. A toolbox of techniques to avoid overfitting exists, including the reduction of the model complexity by reducing the number of hidden layers or units, layer-wise pretraining and fine-tuning (Bengio, Lamblin, Popovici, & Larochelle, 2007), dropout (Srivastava, Hinton, Krizhevsky, Sutskever, & Salakhutdinov, 2014), and data augmentation (Simard, Steinkraus, & Platt, 2003).
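As an illustration of data augmentation, the sketch below generates the eight flip and rotation variants of a square patch. Which transformations are appropriate depends on the data; the choice of flips and 90-degree rotations is an assumption made for this example and is not a description of the augmentation applied in this study.

```python
def flip_horizontal(patch):
    """Mirror a patch (list of rows) left to right."""
    return [list(reversed(row)) for row in patch]


def rotate_90(patch):
    """Rotate a patch clockwise by 90 degrees."""
    return [list(row) for row in zip(*patch[::-1])]


def augment(patch):
    """Return the 8 flip/rotation variants of a square patch."""
    variants = []
    current = [list(row) for row in patch]
    for _ in range(4):
        variants.append(current)
        variants.append(flip_horizontal(current))
        current = rotate_90(current)
    return variants


# Tiny 2x2 "patch" with distinct values so that every variant is different.
patch = [[1, 2],
         [3, 4]]
variants = augment(patch)
print(len(variants))
```

Because a stoma can occur at any orientation on a leaf print, such label-preserving transformations enlarge the training set without additional manual labeling.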
FIGURE 1 From leaf to label: a simple deep learning approach for automatic stomata detection. A photomicrograph (a) is divided into small overlapping patches (b) by using a sliding window approach. The deep learning architecture (c) is trained to label these patches. Positively labeled patches of a photomicrograph (d) are clustered, which results in the detection (e)

| Detection of stomata with deep learning
In this work, we assessed the performance of deep learning for the detection of stomata. While this task can be broadened to a generic object detection task for which multiple efficient methodologies were proposed (see Liu, Ouyang, et al., 2018, for an extensive review), we focus on a simple methodology across multiple species which models the stomata detection task as a classification task within a fixed window. This baseline approach is illustrated in Figure 1.
For generating the training set, we used herbarium specimens of 19 common tropical tree species belonging to 12 flowering plant families and eight orders (Figure 2, Table S1). The choice of training set was made in the context of an ongoing research project (COBECORE) to investigate the change in stomatal density and function over time in Central African tropical rainforest (Bauters et al., 2020). A total of 431 micrographs were used for training, 1-53 training images per species, 3-115 per family, and 14-126 per order.
In order to detect the stomata in a picture, we applied a simple patch-based method (Cruz-Roa et al., 2014; Hou et al., 2016). To this end, we divided each picture into multiple overlapping patches of size 120-by-120 pixels. This patch size is based on the average stomatal size observed in the training set. The patches were labeled as being positive or negative by an expert (Figure 3). Note the variability of the stomata in the training set as well as the variability of the negative patches due to the occurrence of different artifacts in the data. In total, we extracted more than 12 thousand positively labeled patches and 72 thousand negative patches from the training set.
Due to the apparent larger variability in the negative patches, more negative patches than positive patches were included.
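The division of a picture into overlapping patches can be sketched with a sliding window. The patch size of 120 pixels is taken from the text; the stride of 60 pixels (50% overlap) and the image size in the example are assumptions for illustration, as is the function name.

```python
def patch_origins(width, height, patch=120, stride=60):
    """Top-left corners of all patches that fit entirely inside the image.

    A stride smaller than the patch size yields overlapping patches.
    """
    if width < patch or height < patch:
        return []
    xs = range(0, width - patch + 1, stride)
    ys = range(0, height - patch + 1, stride)
    return [(x, y) for y in ys for x in xs]


# Hypothetical 300 x 240 pixel image.
origins = patch_origins(300, 240)
print(len(origins), origins[0], origins[-1])

# Each origin (x, y) defines the crop image[y:y + 120, x:x + 120],
# which is then passed to the classifier.
```

A smaller stride increases the chance that some patch is well centered on each stoma, at the cost of more patches to classify per image.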
The obtained patches were then used to train three different deep learning models: two basic architectures with three convolutional layers followed by two dense layers and an output layer […] are relatively closely related to the species from the training set (see Table S1). Lastly, all positively labeled patches are clustered by using mean shift clustering (Comaniciu & Meer, 2002). This technique groups neighboring (or even overlapping) positively labeled patches from which the resulting stoma coordinates are derived.
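The clustering step can be illustrated with a compact flat-kernel variant of mean shift applied to the centers of positively labeled patches. This is a simplified sketch and not the implementation used in this work; the bandwidth of 60 pixels (half the patch size) and the patch centers are hypothetical.

```python
def mean_shift(points, bandwidth, max_iter=50, tol=1e-3):
    """Flat-kernel mean shift on 2-D points.

    Every point is moved repeatedly to the mean of all input points within
    `bandwidth` of its current position; converged positions (modes) that
    lie within `bandwidth` of each other are merged into one center.
    """
    modes = []
    for px, py in points:
        x, y = float(px), float(py)
        for _ in range(max_iter):
            near = [(qx, qy) for qx, qy in points
                    if (qx - x) ** 2 + (qy - y) ** 2 <= bandwidth ** 2]
            if not near:  # defensive guard against rounding at the boundary
                break
            nx = sum(q[0] for q in near) / len(near)
            ny = sum(q[1] for q in near) / len(near)
            moved = ((nx - x) ** 2 + (ny - y) ** 2) ** 0.5
            x, y = nx, ny
            if moved < tol:
                break
        modes.append((x, y))

    centers = []
    for mx, my in modes:
        if all((mx - cx) ** 2 + (my - cy) ** 2 > bandwidth ** 2
               for cx, cy in centers):
            centers.append((mx, my))
    return centers


# Hypothetical centers of positive patches around two stomata.
positives = [(95, 100), (100, 105), (105, 98), (298, 222), (302, 219)]
centers = mean_shift(positives, bandwidth=60)
print(centers)
```

Each returned center is interpreted as the coordinate of one detected stoma, so several overlapping positive patches covering the same stoma are counted only once.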
All software was implemented in Python 3.6. Keras (Chollet, 2015) and TensorFlow (Abadi et al., 2016) […] stomatal shape and size (Zhang et al., 2012). As stomatal shape can vary in relation to climate even between species within a genus (e.g., Pautov et al., 2017; Yukawa, Ando, Karasawa, & Hashimoto, 1992), we sampled for this dataset three species within each genus, with one species from tropical rainforest, one from tropical moist deciduous forest, and one from tropical shrubland and desert (but only two climate regions for the Asparagaceae and one for Orchidaceae), to average precision measures and be able to compare genera by controlling for provenance. Precision indices for the training set were calculated to assess the performance as a function of the number of stomata used per species for training and to compare performance to the "unseen beyond the scope of training" set.
The output of the developed model for stomatal detection con- […] Accuracy is defined as the ratio of the number of correctly classified items to the total number of items (Michie et al., 1994). Goodness of fit was determined by calculation of the coefficient of determination (R2) from a linear regression between computed and manual counts.
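Both measures can be computed with a few lines of code. The sketch below implements accuracy as the fraction of correctly classified items and obtains R2 from an ordinary least-squares fit of computed on manual counts; the labels and counts in the example are made up for illustration and are not data from this study.

```python
def accuracy(predicted, actual):
    """Fraction of items whose predicted label equals the true label."""
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)


def linear_fit_r2(x, y):
    """Least-squares line y = intercept + slope * x and its R^2."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical patch labels (1 = stoma, 0 = background).
acc = accuracy([1, 0, 1, 1], [1, 0, 0, 1])

# Hypothetical manual and computed stomatal counts for four images.
manual = [10, 20, 30, 40]
computed = [11, 19, 32, 41]
slope, intercept, r2 = linear_fit_r2(manual, computed)
print(acc, slope, intercept, r2)
```

A slope close to 1 and an intercept close to 0 indicate that computed counts follow the 1:1 reference line, while R2 measures how tightly the points scatter around the fitted line.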

| Efficacy of the nail polish method
A total of 49 species were sampled from the African herbarium of Meise Botanic Garden (Table S1). The nail polish method was successfully applied in 78% of the species sampled. Generating impressions failed in 16% of the species due to hairy or velvety leaf surfaces. In 7% of the species for which we managed to get leaf prints, we were unable to detect the stomata visually.

| Model selection and evaluation
The performance of all three architectures on an unseen dataset (i.e., the validation set) is depicted in Figure 4, illustrating the precision and recall for varying thresholds (0.05-0.95), as well as the F-score, which is an indication of the overall performance. One can observe that there is a trade-off between precision and recall. For example, one can choose to obtain maximal precision with very low recall or vice versa. From […] 250 stomata or more were used for training, precision, recall, and F-score values of 0.8 and higher were obtained (Figure 6).
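The trade-off between precision and recall can be explored by sweeping the confidence threshold. The sketch below uses the standard definitions of precision, recall, and F-score (their harmonic mean); the confidence scores and labels are invented for illustration, and only the threshold range of 0.05-0.95 is taken from the text.

```python
def precision_recall_f(scores, labels, threshold):
    """Precision, recall, and F-score when scores >= threshold count as detections."""
    tp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 1)
    fp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 0)
    fn = sum(1 for s, l in zip(scores, labels) if s < threshold and l == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical model confidences and ground-truth labels (1 = stoma).
scores = [0.95, 0.85, 0.75, 0.65, 0.40, 0.20]
labels = [1, 1, 0, 1, 0, 0]

thresholds = [i / 100 for i in range(5, 100, 5)]  # 0.05, 0.10, ..., 0.95
best = max(thresholds, key=lambda t: precision_recall_f(scores, labels, t)[2])

for t in (0.2, 0.5, 0.9):
    p, r, f = precision_recall_f(scores, labels, t)
    print(f"threshold={t:.2f} precision={p:.2f} recall={r:.2f} F={f:.2f}")
print("threshold with highest F-score:", best)
```

Low thresholds favor recall and high thresholds favor precision; the threshold that maximizes the F-score on a validation set is a reasonable default when neither error type is more costly than the other.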

| Accuracy
The accuracy was calculated for 70 images of species within the scope of the training set to compare results of stomatal densities between computed and manual counts. Average accuracy was high (94%), and a strong correlation between the computed counts and the manual counts was observed among all the images (Figure 7, […]). Figure 7 shows the reference line (1:1), whose intercept of 0 lies within the 95% confidence interval (CI) of the intercept of the linear regression (−4.46 to 0.86) and whose slope of 1 lies slightly outside the 95% CI of the regression slope (1.01-1.11). For images containing many stomata (>60), stomatal number tends to be underestimated (Figure 7).

| Generalization to other species
In Figure

| DISCUSSION
In this work, we developed a leaf-to-label workflow that allows detecting stomata on light microscope images from dried plant material such as that of herbarium specimens. Even though mostly used in fresh plant material (e.g., Wu & Zhao, 2017), the nail polish method proves to be a reliable, noninvasive, easy, and inexpensive method that can obtain good-quality leaf impressions from dried leaves on the majority of species (78%). We trained a deep learning architecture for the detection of stomata in focus-stacked images of high resolution. However, we believe that traditional light microscopy could also be used for imaging, given that the entire field of view is in focus. We illustrated that, even with a simple deep learning approach in which we model the object detection problem as a classification problem with a fixed patch size, an F-score of 0.89 can be reached on unseen taxa on the condition that they are in the phylogenetic scope of the training set. This is in line with the average results (F-score: 0.87) on the training set. The model on average did not perform better on unseen species within the same angiosperm order as the training set (F-score: 0.64-0.79) as compared to its performance on unseen species of other angiosperm orders (F-score: 0.57-0.80). This result seems to indicate that the variation in stomatal structure and shape within flowering plant orders is similar to the variation between them. However, note that this test set includes at most a few species representing an angiosperm order and therefore does not include all variation within genera, families, and orders. The training focused mainly on taxa belonging to the core eudicots and one species of the basal angiosperms, Polyalthia suaveolens (Magnoliales). The model performed on average better on unseen species from the core eudicots (F-score: 0.77) than on unseen species from the monocots (F-score: 0.59).
The difference in stomatal shape between monocots and the dicots is apparent: especially the grasses (Poaceae), represented here by Chloris species, are known for their particular dumbbell-shaped guard cells as compared to kidney-shaped cells of dicots (Rudall, Chen, & Cullen, 2017; Zeiger, Farquhar, & Cowan, 1987). Also, the orchid species included in the test set, Cyrtorchis chailluana, has a stomatal shape not easily detected by our model (Figure 8), probably because of its particular circular shape and round opening [cf. stoma type II in Dendrobium (Yukawa et al., 1992)]. Also note that we did not include species with extremely large stomata typical for, for example, the Liliaceae, as this will decrease the performance of the model to detect the stomata (but see below).
FIGURE 5 Precision-recall curve for the fine-tuned VGG19 architecture on the three different species (zoom on the right)
While the model performs relatively well over a broad taxonomy, our approach has room for further improvement. First, the model performance is highly related to the variation (Figure 8), the quantity (Figure 6), and quality of training images. The network presented in this paper is not trained to handle low-quality images. Therefore, one should aim for high-quality images to enable the network to perform optimally (see also Fetter et al., 2019). The quality (contrast, blurriness, etc.) of a set of pictures can be quantified using the image histogram and using PyImageQualityRanking software for ranking the images in a set and detecting outliers (Koho, Fazeli, Eriksson, & Hänninen, 2016). Based on this ranking, one can choose to leave out low-ranked images due to their insufficient quality. If low-quality images should still be processed, then the network should be trained accordingly.
Second, […] images and total number of stomata seen during training (results not shown), as was the case for the information retrieval (IR) standard measures (Figure 6). For example, only 69 stomata or three images for the species Irvingia grandifolia were used in training (Table S1), rendering an average accuracy of 64%, that is, a reduction or increase in stomatal density of 36%. Since a 28% reduction in stomatal density in transgenic poplars is enough to cause a 30% drop in transpiration (Bertolino et al., 2019; Wang et al., 2016), we recommend at least 250 stomata for training, depending on the level of difference in stomatal density one wants to detect. If small differences in stomatal density within a species are targeted, the general protocol described in this paper can be used. In order to obtain more accurate results, the threshold of the deep learning model (cf. Section 3.2) can be adjusted for each species separately. Furthermore, the accuracy for an individual species can be increased by fine-tuning the model by training the dense layers of the deep learning model.
Third, our patch-based approach is constrained by a patch size of 120-by-120 pixels, which corresponds to a window of 25 by 25 µm using the microscope settings as described above. Angiosperms on average have a stomatal length or guard cell length of 31 µm (cf. Beaulieu et al., 2008; Hodgson et al., 2010; Figure S1). Although this patch size could be successfully applied to the majority of angiosperm species, the patch size limits both the aspect ratio and the scale of the input image. The simplest solution is to adjust magnification during data collection, by increasing the magnification when stomata are too small to be detected and decreasing the magnification when stomata extend beyond a patch size of 120-by-120 pixels.
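The relation between physical stomatal size, pixel size, and magnification can be worked out with simple arithmetic. The sketch below uses the correspondence of 120 pixels to 25 µm stated above; the target length of 90 pixels (the middle of the 60-120 pixel range) and the function names are assumptions for illustration.

```python
PATCH_PX = 120                      # patch edge length in pixels (from the text)
PATCH_UM = 25.0                     # corresponding edge length in micrometers
UM_PER_PX = PATCH_UM / PATCH_PX     # about 0.208 µm per pixel


def length_in_pixels(length_um, um_per_px=UM_PER_PX):
    """Length of a stoma in pixels at the given pixel size."""
    return length_um / um_per_px


def magnification_factor(length_um, target_px=90.0, um_per_px=UM_PER_PX):
    """Relative change in magnification that maps length_um onto target_px.

    Values below 1 mean the magnification should be decreased,
    values above 1 mean it should be increased.
    """
    return target_px / length_in_pixels(length_um, um_per_px)


# A stoma of 31 µm (the angiosperm average cited above) spans about
# 149 pixels and therefore exceeds the 120-pixel patch.
print(length_in_pixels(31.0))
print(magnification_factor(31.0))

# A 12.5 µm stoma spans 60 pixels, the lower end of the well-detected range.
print(length_in_pixels(12.5))
```

Under these assumptions, a 31 µm stoma would require lowering the magnification to roughly 60% of the current setting to fall in the middle of the size range that the patch accommodates.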
In our model, stomata between 60 and 120 pixels are best detected. Another more elegant way of handling this problem is by including some region of interest pooling layer as discussed by Dai, Li, He, and Sun (2016) and He, Zhang, Ren, and Sun (2014), which would allow moving from the patch-based method to detect all stomata in a spatial hierarchical way. This object detection pipeline can be improved further with Fast R-CNN (Girshick, 2015) and Faster R-CNN (Ren, He, Girshick, & Sun, 2015), which combine the idea of using a spatial hierarchical pooling with region-based convolutions into an end-to-end trainable deep learning model. Furthermore, if processing speed is an issue, one can opt for a single shot multibox detector approach (SSD). SSD discretizes the output space of bounding boxes into a set of default boxes over different aspect ratios and scales per feature map location. At prediction time, the network generates scores for the presence of each object category in each default box and produces adjustments to the box to better match the object shape.
[…] The numbers in the central frame denote the performance indices: precision, recall, and F-score. Average precision, recall, and F-score for the training set are 0.84, 0.91, and 0.87, respectively. Images (b, c, and d) visualize the performance of the network on unseen taxa belonging to the test set with (b) Cyrtorchis chailluana (Orchidaceae, Asparagales), (c) Lannea schweinfurthii (Anacardiaceae, Sapindales), and (d) Ipomoea eriocarpa (Convolvulaceae, Solanales). Green crosses denote the actual stomata, red x's the stomata recognized by the network with a confidence of 0.7 or higher. Color gradient from green (low confidence) to yellow (high confidence) by the model.
Fourth, with the current advances in deep learning, the object detection pipeline can be improved further by using novel convolutional neural network architectures such as Xception (Chollet, 2017) or ResNeXt (Xie, Girshick, Dollár, Tu, & He, 2017) as a backbone for feature extraction. See Bianco et al. (2018) for an in-depth analysis of the majority of the deep neural network architectures that deviate from the idea that simply stacking convolutional layers is sufficient.
To summarize, we illustrated that by using a simple deep learning architecture one can work out a simple leaf-to-label workflow that allows detecting stomata on light microscope images from dried plant material such as that of herbarium specimens. Our approach can be optimized depending on the availability of the data as well as by using more recent object detection pipelines. We rec-

| CONCLUSIONS
The entire leaf-to-label pipeline presented here could be of use in different research areas with the need for stomatal count data of many specimens. It will allow ecologists to focus on the ecological questions rather than on the technical aspects of data analysis and more specifically deep learning, and computer scientists to pave new roads on some of the biological world's most complex units, such as ecosystems (Christin et al., 2019). Large-scale studies using stomata of fossils to reconstruct a changing environment in deep time (e.g., Franks, Berry, Lombardozzi, & Bonan, 2017; McElwain, Beerling, & Woodward, 1999), as well as work on the anthropogenic effect on stomatal density and size in agricultural crops (Zheng et al., 2013), could benefit from such an approach, that is, the use of a general deep learning model that can be tweaked and expanded for the detection of other objects such as epidermal cells. Especially the information locked in the archives of herbaria, the result of century-long efforts of collecting, has been shown to be of great value in several studies, as the digitization of herbarium specimens has the potential to produce data to facilitate the study of the natural world (Goodwin, Harris, Filer, Wood, & Scotland, 2015). The leaf-to-image approach described here is easy to perform, and given that imaging technology is becoming faster and can be partially automated, the exploration of these sleeping beauties is within reach.

ACKNOWLEDGMENTS
We are grateful to Filip Vandelook, Steven Janssens, and Piet

CONFLICT OF INTEREST
None declared.

DATA AVAILABILITY STATEMENT
The trained model is accessible to use at https://kiks.ilabt.imec.be/ for the upload of individual images. The do-it-yourself tutorial for training and detection as well as an example image set can be accessed on GitHub by following this link: http://github.