Identifying microbial life in rocks: Insights from population morphometry

The identification of cellular life in the rock record is problematic, since microbial life forms, and particularly bacteria, lack sufficient morphologic complexity to be effectively distinguished from certain abiogenic features in rocks. Examples include organic pore‐fillings, hydrocarbon‐containing fluid inclusions, organic coatings on exfoliated crystals and biomimetic mineral aggregates (biomorphs). This has led to the interpretation and re‐interpretation of individual microstructures in the rock record. The morphologic description of entire populations of microstructures, however, may provide support for distinguishing between preserved micro‐organisms and abiogenic objects. Here, we present a statistical approach based on quantitative morphological description of populations of microstructures. Images of modern microbial populations were compared to images of two relevant types of abiogenic microstructures: interstitial spaces and silica–carbonate biomorphs. For the populations of these three systems, the size, circularity, and solidity of individual particles were calculated. Subsequently, the mean/SD, skewness, and kurtosis of the statistical distributions of these parameters were established. This allowed the qualitative and quantitative comparison of distributions in these three systems. In addition, the fractal dimension and lacunarity of the populations were determined. In total, 11 parameters, independent of absolute size or shape, were used to characterize each population of microstructures. Using discriminant analysis with parameter subsets, it was found that size and shape distributions are typically sufficient to discriminate populations of biologic and abiogenic microstructures. 
Analysis of ancient, yet unambiguously biologic, samples (1.0 Ga Angmaat Formation, Baffin Island, Canada) suggests that taphonomic effects can alter morphometric characteristics and complicate image analysis; therefore, a wider range of microfossil assemblages should be studied in the future before automated analyses can be developed. In general, however, it is clear from our results that there is great potential for morphometric descriptions of populations in the context of life recognition in rocks, either on Earth or on extraterrestrial bodies.

This characterization also facilitated the determination of the indigenous character of a carbonaceous microfossil, enabling the exclusion of modern organic contaminants and modern endoliths.
We extend here the morphologic approach to entire populations of microstructures. This will specifically bring more light and precision to one of the aforementioned criteria of biogenicity: the association of individual structures representing a biologic population. A critical question to address is whether populations of abiogenic structures such as interstitial spaces and biomorphic mineral aggregates can be quantitatively distinguished from populations of micro-organisms. If it can be shown that this is possible, preliminary tests of biogenicity can be made without the use of complex and expensive in situ analytical techniques.
In theory, using an assemblage of example images for biogenic and biomimicking populations, machine learning could be used to empirically distinguish populations of micro-organisms from abiogenic populations. However, machine learning approaches usually necessitate a large number of images, which may be difficult to obtain for the populations of relevance here. Additionally, many machine learning algorithms make it difficult to discretely identify the decisive criteria that distinguish the systems.
Very recently, a study of the morphologic variation in two specific populations of organic microstructures from the 3.4 Ga Strelley Pool Formation was conducted, indicating significant differences between populations and the existence of distinct sub-populations, with potentially important paleoecological implications.
To the best of our knowledge, there is no general survey of the potential of statistical morphometrics for life recognition. One of the most important outcomes of morphometric studies is the construction of morphospaces: theoretical spaces in which axes represent continuous morphology-describing parameters. As far as we know, morphospaces have been used in paleontology for individuals of a specific biologic group. For instance, in the seminal work conducted on ammonoid shells by Raup (1967), an individual is represented by a point. It has been found that, in any morphospace, the regions occupied by individuals from a specific biologic group are restricted by developmental and adaptive constraints. In this study, we extend the concept of morphospace to the population level; that is, each population is represented by a single point in these "population" morphospaces. We hypothesize that the occupancy of population morphospaces by microbial communities depends on individual-scale processes (development, adaptation) but also on population-scale processes (ecologic relationships, environmental forcings). In this study, we critically test whether populations of specific abiogenic objects show fundamentally different morphometric characteristics in comparison with populations of modern micro-organisms (single-strain and multiple-strain communities). Two different types of abiogenic objects were chosen: interstitial spaces between clasts in sedimentary rocks and silica-carbonate biomorphs. The importance of silica-carbonate biomorphs for micropaleontology has been put forward in previous studies (García-Ruiz, Carnerup, Christy, Welham, & Hyde, 2002; García-Ruiz et al., 2003; Rouillard et al., 2018). They display a wide range of life-like morphologies; during the Archean, they may have been formed, and preserved, in the same hydrothermal environments where life emerged and first evolved. As a consequence, they represent a material of choice for the current study.
Interstitial spaces, especially those in spherulitic or botryoidal chert fabrics, display a continuum of shapes that may also be mistaken for degraded microfossils (Brasier et al., 2002, 2005). In this proof-of-concept study, and for practical reasons, the system of clastic interstitial spaces was chosen. They are genetically different from interstitial spaces in spherulitic or botryoidal chert fabrics, but they have important morphologic similarities. These three systems (interstitial spaces, biomorphic mineral aggregates, and microbial cells) are compared using the statistical distributions of size and shape-describing parameters (circularity, solidity), and pattern-describing parameters (fractal dimension and lacunarity). In addition, correlation studies and multivariate analyses are conducted in order to maximize discrimination between the three systems. The method devised is then tested on well-preserved, silicified microfossil assemblages from the 1.0 Ga old Angmaat Formation (Bylot group, Baffin Island, Canada).
Based on this test, the limits and potential applications of statistical morphometry are discussed.

| Description of the images used
Three populations of microstructure are compared in the context of this study: (a) interstitial spaces, (b) silica-carbonate biomorphs, and (c) microbial populations. This study was performed using representative images for each system (presented in full in Figures S1-S6). Examples of images from the different systems are shown in Figures 1-3.

| System 1: Interstitial spaces in clastic sedimentary fabrics
This system is represented by eight images taken on five different samples of sandstone and one sample of limestone (Figures S1 and S2).
The first three sandstones contain ferruginous cement within interstitial pore space, the opacity of which allows ready identification of pore space under plane-polarized light. The first sandstone (Figure 1a; Figure S1A) consists of a fine- to medium-grained detrital sediment composed of angular quartz, together with mica.
Micaceous grains are commonly deformed by compaction of quartz.
The second sandstone (Figure 1b; Figures S1C and S2C,D) consists of a medium-grained detrital sediment composed of subangular quartz and feldspar. The third sandstone (Figure S2E) corresponds to a medium- to coarse-grained detrital sediment composed of angular to subangular quartz, feldspar, and mica mineral grains, as well as lithic fragments composed of volcanic rock and glass.
The next two sandstone samples introduce additional complexity.
The fourth sandstone (Figure S2A) consists of a medium- to coarse-grained detrital sediment composed of angular to subangular quartz and feldspar mineral grains, detrital calcite, with macrocrystalline calcite cement filling most of the interstitial spaces between the grains. In plane-polarized light, the presence of calcite as both detrital grains and as interstitial cement complicates the distinction between the grains and the initial porosity. Similarly, the last sandstone (Figure S1B) consists of a medium-grained clastic sediment mainly composed of subangular quartz, feldspar, and mica mineral grains, a variety of lithic fragments, and (rare) bioclasts consisting of echinoderm and bryozoan skeletal fragments. Bioclasts are broken and abraded and do not retain their primary biologic shape. A macrocrystalline calcite cement fills most of the interstitial spaces between the grains. Here again, in plane-polarized light, a number of clasts appear very similar to the cemented pores, which complicates the distinction between the grains and the initial porosity. These two samples were also used to investigate the robustness of the method regarding segmentation quality (see Sections 2.2 and 4.2.2).
Finally, a single limestone sample (Figure S2B) was analyzed. This sample consists of an oolitic and pelletal grainstone, associated with scarce gastropod bioclasts. The bioclasts are strongly abraded and rounded, and do not retain their initial biologic shape. A sparitic cement fills the interstitial spaces between the grains, and fractures within clasts are filled with a similar calcitic cement. In plane-polarized light, the clasts typically appear darker than the cemented pores and fractures, which are easily identifiable.

| System 2: Silica-carbonate biomorphs
This system is represented by 11 images of silica-carbonate biomorphs grown in gel and in solution (Figures S3 and S4). Here, silica-carbonate biomorphs were grown using laboratory chemicals.
In silica gel, silica-carbonate biomorphs form over a few days to several weeks, upon diffusion of a concentrated barium solution (solutions of [Ba] = 0.5 and 1 M here) through an alkaline silica gel in the presence of carbonate ions (Melero-García, Santisteban-Bailón, & García-Ruiz, 2009). They display a crystallization gradient along the direction of diffusion. A total of 10 images of gel-grown biomorphs, which were synthesized in the context of a previous study (Rouillard et al., 2018), were used here (Figure 2; Figures S3 and S4A,B,D,E). Images were taken with an optical microscope using plane-polarized light, at different distances from the diffusion boundary, therefore representing different regimes of growth (Melero-García et al., 2009).
In a silica-rich alkaline solution, silica-carbonate biomorphs form within a few hours upon addition of barium ([Ba] = 10 mM was used here) and carbonate (supplied by diffusion of atmospheric CO2). Solution-grown biomorphs are represented by one image (Figure S4C), synthesized in the context of a previous study (Rouillard et al., 2018). The image was taken with an optical microscope using plane-polarized light.

| System 3: Microbial communities
This system is represented by 10 images of a single strain of cyanobacteria and of a natural microbial community (Figures S5 and S6).
Single-strain bacteria culture: One image was taken on a culture of coccoid cyanobacteria grown in the laboratory (Figure 3a; Figure S5E). This culture consists of a strain of Synechocystis sp. from the Pasteur Cyanobacteria Collection (PCC6803) grown with standard BG-11 medium to a stationary growth phase. The image was taken using an optical microscope with contrast enhanced using differential interference contrast (DIC) optics.

Microfossil assemblage
A well-preserved microfossil assemblage from the 1.0 Ga Angmaat Formation (Baffin Island, Canada) was used here as a test. The Angmaat Formation represents deposition in a peritidal, episodically restricted microbial flat with diverse, well-documented fossil microbial mats preserved in early diagenetic chert. Communities commonly contain both filamentous and coccoidal microfossils (Hofmann & Jackson, 1991; Kah & Knoll, 1996; Kah, Sherman, Narbonne, Knoll, & Kaufman, 1999; Knoll, Worndle, & Kah, 2013; Manning-Berg, Wood, Williford, Czaja, & Kah, 2019). The thin sections used in the context of this study contain an assemblage of coccoidal taxa (Eogloeocapsa sp., Myxococcoides sp., Eoentophysalis sp., Gloeodiniopsis sp. and Polybessurus sp.). Two specific microfossil-rich areas were imaged in these thin sections (image mosaics are shown in Figure 4a and Figure S7). In order to test the effect of taphonomic variability (Manning-Berg et al., 2019), a smaller, well-preserved area of the mosaic shown in Figure 4a was also used in the study (Figure 4b).

FIGURE 2 Examples of biomorph images used in this study (full list of images shown in Figures S3 and S4). Silica-witherite biomorphs shown here were grown by diffusion in gels (method described in detail in Melero-García et al., 2009). (b) is taken further away from the diffusion source than (a). Images were taken using an optical microscope in plane-polarized mode; a stack of ~30 images taken along the depth was processed to reconstruct these images. The corresponding binarized images, obtained by treatment and segmentation, are shown below

FIGURE 3 Examples of images of microbial populations/communities used in this study (full list of images shown in Figures S5 and S6). (a) Single-strain Synechocystis sp. population grown in laboratory and imaged during the stationary phase. (b) Stromatolite-dwelling microbial community from Alchichica crater Lake (Mexico; Gérard et al., 2013). (a) was taken using an optical microscope with differential interference contrast optics, and (b) was taken using a confocal microscope. Colors in (b) are due to the natural autofluorescence of the bacteria (photosynthetic pigments). The corresponding binarized images, obtained by treatment and segmentation, are shown below [Colour figure can be viewed at wileyonlinelibrary.com]

| Image analysis
All images were processed individually using ImageJ (Abramoff, Magalhaes, & Ram, 2004; Collins, 2007). The images were binarized to separate the populations of microstructures of interest (i.e., interstitial spaces, silica-carbonate biomorphs, and micro-organisms) from their surroundings. For all images, except those corresponding to the two sandstone samples where the cement is difficult to distinguish, the binarization was obtained using the threshold algorithm of ImageJ (the type of threshold was set to "default"). For the images of the two sandstone samples where the interstitial cement is difficult to distinguish from some clasts, the "Trainable Weka Segmentation

| Characterization of particles in populations
For every particle in an image, the normalized area A (in square pixels) and simple shape descriptors, circularity C and solidity S (dimensionless), were measured. Circularity is defined as:

$$C = \frac{4\pi A}{P^{2}} \quad (1)$$

where P (in pixels) is the circumference of the particle. This descriptor ranges from 0 to 1, with decreasing values corresponding to increasingly elongated particles (circularity values of example particles are shown in Figure 5a). Solidity is defined as:

$$S = \frac{A}{\alpha} \quad (2)$$

where α is the area (in square pixels) within the convex hull of the particle. The convex hull of a particle is defined by the ensemble of straight segments joining the outermost points of the particle (convex hulls of example particles are drawn in brown on Figure 5b). Solidity ranges therefore from 0 to 1, with higher values representing more convex particles (see Figure 5b). Measurements of circularity and solidity of particles in digital images have been used in various contexts, for example, for quantifying the changes of shape of a fungal strain

FIGURE 4 (a) Mosaic of a microfossil-rich area in a thin section realized on a chert sample from the 1.0 Ga Angmaat Formation (Baffin Island, Canada). The mosaic was taken using an optical microscope in plane-polarized mode. The red rectangle indicates the region corresponding to (b). Another mosaic is shown in Figure S7. (b) Close-up view of the region outlined in (a). Note the strong degradation of the assemblage in some areas, in which individual microfossils are difficult to segment or even not recognizable anymore. The corresponding binarized images T2 and T2-sub, obtained by treatment and segmentation, are shown on the right of (a) (T2) and below (b) (T2-sub)
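As an illustration of the two shape descriptors, the following minimal Python sketch computes circularity (4πA/P²) and solidity (area divided by convex-hull area) for particles given as polygon outlines. It is not the ImageJ implementation: the function names and the two example outlines are hypothetical, and ImageJ measures area and perimeter on pixelated particles, so values obtained on real images will differ slightly.

```python
import math

def polygon_area(pts):
    """Shoelace formula; pts is a list of (x, y) vertices in order."""
    s = 0.0
    for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]):
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0

def polygon_perimeter(pts):
    """Sum of edge lengths of the closed outline."""
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:] + pts[:1]))

def convex_hull(pts):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(pts))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def circularity(pts):
    """C = 4*pi*A / P**2 (1 for a circle, lower for elongated or rough outlines)."""
    return 4.0 * math.pi * polygon_area(pts) / polygon_perimeter(pts) ** 2

def solidity(pts):
    """S = A / alpha, with alpha the area inside the convex hull."""
    return polygon_area(pts) / polygon_area(convex_hull(pts))

# Hypothetical example outlines: a convex square and a concave L-shaped particle.
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
l_shape = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]
print(circularity(square), solidity(square))
print(circularity(l_shape), solidity(l_shape))
```

The concave outline scores lower on both descriptors than the square, which mirrors the ordering of example particles described for Figure 5a,b.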

| Characterization of populations
In order to describe the overall geometry of populations in the binary images, the parameters of fractal dimension and lacunarity were calculated. A plugin developed for ImageJ (FracLac -Karperien, A., FracLac for ImageJ) was employed.
Generally, the fractal dimension (D, dimensionless) relates detail (N, which may be defined differently depending on the context) with scale (ɛ, representing a length), by the following relationship:

$$N \propto \varepsilon^{-D} \quad (3)$$

with ∝ denoting a proportionality relationship. For 2D objects, D ranges between 1 (fractal dimension of a line) and 2 (fractal dimension of a homogeneous surface). D is commonly estimated via the box-counting method (see Figure 5c for an example of application of this method).

FIGURE 5 Illustration of the key morphologic descriptors used in this paper. (a, b) Circularity and solidity values of some example particles. The particles are ordered with decreasing circularity or solidity values from left to right. The convex hulls (ensemble of straight segments joining the outermost points of the particle) of each particle are drawn in brown in (b). The solidity values decrease when white areas inside the convex hulls are more important. Note that, because the two parameters describe different aspects of the shape, the order from left to right is not exactly the same in (a) and (b). (c) Explanation of the measurement of fractal dimension by box-counting method on an example pattern. Grids of boxes with different sizes (ɛ1, ɛ2, and ɛ3) are overlaid on the pattern. The size of boxes represents scale in this method. The number of boxes containing black pixels (green) in the different grids is given as N1, N2, and N3, and represents the amount of detail in this method. A rough estimate of the fractal dimension is first given by a ratio of logs. A better estimate of the fractal dimension is then found using a power-law regression of N versus ɛ. (d) Lacunarity values of some example patterns. The lacunarity value increases when the overall heterogeneity of the pattern increases
In this method, a grid is positioned on the image. The length of box sides in the grid (in pixels) represents the scale, ɛ, while the amount of detail N at scale ɛ is estimated by the number of boxes in the grid containing foreground pixels (black pixels here; see green boxes in Figure 5c). This process is repeated several times with different ɛ values. If N2 and N1 are the amounts of detail for two different scales ɛ2 and ɛ1, one can make a rough estimate of the fractal dimension according to the following formula:

$$D \approx \frac{\log(N_2 / N_1)}{\log(\varepsilon_1 / \varepsilon_2)} \quad (4)$$

An example application of Equation (4) is illustrated in Figure 5c.

Lacunarity quantifies the heterogeneity and the importance of "gaps" in a binary image (see Figure 5d to see how these features relate to the value of lacunarity in example patterns). Using a box-counting method similar to the one used for fractal dimension, the lacunarity λ at a given scale ɛ can be estimated by

$$\lambda(\varepsilon) = \left(\frac{\sigma(\varepsilon)}{\mu(\varepsilon)}\right)^{2} \quad (5)$$

where σ(ɛ) and μ(ɛ) represent respectively the standard deviation and the average of the pixel count in the boxes of a grid (with boxes of side length ɛ). For the images of the three systems, a general lacunarity value L was calculated as an average of the λ values found with different ɛ. The lacunarity value of a homogeneous image is 0, since σ(ɛ) is equal to 0. There is theoretically no upper limit for the value of lacunarity. For the measure of both D and L, grids of 12 different ɛ (ranging from 1 pixel to 45% of the image size, with a fixed increment) were used.
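The box-counting procedure can be sketched in a few lines of Python. This is a simplified illustration, not the FracLac implementation: it uses a single grid position per scale, crops image edges that do not fill a whole box, assumes a non-empty binary image, and assumes that lacunarity at each scale is the squared ratio of standard deviation to mean of the box pixel counts. The function names, box sizes, and test patterns are hypothetical.

```python
import math

def box_pixel_counts(img, eps):
    """Foreground-pixel count in each eps x eps box of a grid laid over img.

    img is a list of rows of 0/1 values; partial boxes at the edges are cropped.
    """
    rows, cols = len(img) // eps, len(img[0]) // eps
    return [
        sum(img[r * eps + i][c * eps + j] for i in range(eps) for j in range(eps))
        for r in range(rows)
        for c in range(cols)
    ]

def fractal_dimension(img, sizes):
    """Minus the least-squares slope of log N versus log eps (N ~ eps**-D)."""
    xs = [math.log(e) for e in sizes]
    ys = [math.log(sum(1 for n in box_pixel_counts(img, e) if n > 0)) for e in sizes]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return -slope

def lacunarity(img, sizes):
    """Average over scales of (sd / mean)**2 of the box pixel counts."""
    values = []
    for e in sizes:
        counts = box_pixel_counts(img, e)
        mean = sum(counts) / len(counts)
        var = sum((n - mean) ** 2 for n in counts) / len(counts)
        values.append(var / mean ** 2)
    return sum(values) / len(values)

# Hypothetical 32 x 32 test patterns: a homogeneous surface and a single line.
sizes = [1, 2, 4, 8, 16]
filled = [[1] * 32 for _ in range(32)]
line = [[1 if r == 16 else 0 for _ in range(32)] for r in range(32)]
print(fractal_dimension(filled, sizes), lacunarity(filled, sizes))
print(fractal_dimension(line, sizes), lacunarity(line, sizes))
```

The two patterns reproduce the limiting cases quoted in the text: D = 2 and zero lacunarity for a homogeneous surface, D = 1 and non-zero lacunarity for a line.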

| Description of statistical distributions
Individual distributions were represented graphically as bar histograms. In order to facilitate the comparison between them, the width of the bins is the same in each histogram.
For a parameter X observed in a population of G particles, with a mean µ(X) and a standard deviation σ(X), the shape of the distribution can be described quantitatively (independently of its graphical representation) by three parameters: the mean divided by the standard deviation (Mean/SD), µ(X)/σ(X), which describes the relative width of the distribution (compare Figure S9A,B); skewness, which describes the asymmetry of the distribution; and kurtosis, which describes the weight of the tails of the distribution. These three parameters were used to quantify the shapes of the distributions of size, circularity, and solidity in all populations.
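The three distribution descriptors can be sketched as follows. The exact estimators used in the study are not recoverable from the text, so this sketch assumes population (uncorrected) moments and excess kurtosis (zero for a normal distribution); the function name is hypothetical.

```python
import math

def distribution_shape(values):
    """Mean/SD, skewness, and excess kurtosis of a list of measurements.

    Assumption: population moments (division by G, no bias correction).
    """
    g = len(values)
    mean = sum(values) / g
    sd = math.sqrt(sum((x - mean) ** 2 for x in values) / g)
    skew = sum(((x - mean) / sd) ** 3 for x in values) / g
    kurt = sum(((x - mean) / sd) ** 4 for x in values) / g - 3.0
    return {"mean_sd": mean / sd, "skewness": skew, "kurtosis": kurt}

# Hypothetical symmetric sample: skewness is zero and the tails are light.
print(distribution_shape([1, 2, 3, 4, 5]))
```

Because all three quantities are ratios of moments, they are unchanged when every measurement is multiplied by a constant, which is what makes them independent of absolute size.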

| Assessment of sampling bias
The qualitative and quantitative descriptions of statistic distributions are subject to error resulting from imperfect sampling of the populations. In order to assess the biases resulting from insufficient sampling, a study of the effect of sampling size was conducted on several test distributions. Size distributions were plotted as bar histograms and characterized quantitatively for different sample sizes.
The results are shown in Figure S10
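The effect of sample size on a distribution descriptor can be explored with a simple resampling experiment. The sketch below is only an illustration of the idea, not the procedure used for Figure S10: the lognormal test distribution, the sample sizes, the number of repeats, and the function names are all assumptions.

```python
import math
import random

def mean_over_sd(values):
    """Mean divided by the population standard deviation."""
    g = len(values)
    mean = sum(values) / g
    sd = math.sqrt(sum((x - mean) ** 2 for x in values) / g)
    return mean / sd

def subsample_spread(values, sample_size, repeats=200, seed=0):
    """Smallest and largest Mean/SD obtained over repeated random subsamples."""
    rng = random.Random(seed)
    estimates = [mean_over_sd(rng.sample(values, sample_size)) for _ in range(repeats)]
    return min(estimates), max(estimates)

# Hypothetical test distribution: 2,000 lognormally distributed sizes.
rng = random.Random(1)
population = [rng.lognormvariate(0.0, 0.5) for _ in range(2000)]
for n in (20, 100, 500):
    print(n, subsample_spread(population, n))
```

The range of estimates narrows as the sample size grows, which gives a rough feeling for how many particles an image must contain before a descriptor becomes stable.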

| Parametric correlations between populations
The linear correlation coefficient r between two parameters X and Y, measured on a population of G points, can be calculated according to

$$r = \frac{\sum_{i=1}^{G} (x_i - \mu(X))(y_i - \mu(Y))}{\sqrt{\sum_{i=1}^{G} (x_i - \mu(X))^{2}} \, \sqrt{\sum_{i=1}^{G} (y_i - \mu(Y))^{2}}}$$

where x_i and y_i are the values of X and Y for point i. The significance of the correlation p (or probability that the two parameters are not correlated) varies with the number of points G and can be calculated with a two-tailed t test according to

$$t = r \sqrt{\frac{G - 2}{1 - r^{2}}}$$

with p obtained from Student's t distribution with G − 2 degrees of freedom.

The shape of the statistical distribution of each of the three parameters size, circularity, and solidity was described using the three parameters mean/SD, skewness, and kurtosis. Together they constitute nine parameters describing the distributions in each population. Combined with the two parameters of fractal dimension and lacunarity, a total of 11 parameters are obtained that characterize each population. Linear correlation coefficients r and their associated significances p were calculated between all the pairs of these parameters for each system and are given in Tables S1, S2, and S3.
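A minimal sketch of the correlation calculation is given below. It computes the Pearson coefficient r and the associated t statistic, assuming the standard form t = r·sqrt((G − 2)/(1 − r²)); converting t into a two-tailed p value requires the Student's t distribution with G − 2 degrees of freedom, which is left to a statistics library. The function names and the example values are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Linear (Pearson) correlation coefficient between two equal-length lists."""
    g = len(xs)
    mx, my = sum(xs) / g, sum(ys) / g
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, g):
    """t value for testing r against zero with g points (g - 2 degrees of freedom)."""
    return r * math.sqrt((g - 2) / (1.0 - r * r))

# Hypothetical values of two population-describing parameters in five populations.
skewness_of_size = [1, 2, 3, 4, 5]
kurtosis_of_size = [2, 1, 4, 3, 5]
r = pearson_r(skewness_of_size, kurtosis_of_size)
print(r, t_statistic(r, len(skewness_of_size)))
```

With only a handful of populations per system, even a fairly high r can correspond to a modest t value, which is why the significance has to be reported alongside the coefficient.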

| Discrimination of populations: comparison of different sets of parameters
Multivariate analysis allows the exploration of the combined data from several describing variables (Bishop, Fienberg, & Holland, 2007). Discriminant analysis is a type of multivariate analysis that finds the best linear combinations of variables to maximize the variance between different groups and minimize the variance inside each group. The rate of correct classification of test populations was calculated as

$$\text{rate} = \frac{n(c)}{n(c) + n(u)}$$

with n(c) being the number of test populations attributed correctly to their system of origin and n(c) + n(u) the total number of test populations. This process was repeated 1,000 times for a given size of the training group. The efficiency of the analysis was assessed by the mean rate of correct classification over these 1,000 repetitions. The different sets of parameters were compared by looking at the evolution of this rate as the size of the training group increased.
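The repeated training and testing scheme can be sketched as follows. To keep the example short and dependency-free, a nearest-centroid classifier is used as a simple stand-in for discriminant analysis; it is not the method used in the study. The function names, the per-system training size, and the toy parameter vectors are hypothetical.

```python
import random

def centroid(rows):
    """Component-wise mean of a list of parameter vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def nearest_label(x, centroids):
    """Label of the centroid closest to x (squared Euclidean distance)."""
    return min(
        centroids,
        key=lambda lab: sum((a - b) ** 2 for a, b in zip(x, centroids[lab])),
    )

def mean_correct_rate(populations, n_train_per_system, repeats=1000, seed=0):
    """Mean of n(c) / (n(c) + n(u)) over repeated random training/test splits.

    populations maps a system label to a list of parameter vectors; each system
    must contain more vectors than n_train_per_system.
    """
    rng = random.Random(seed)
    rates = []
    for _ in range(repeats):
        centroids, tests = {}, []
        for label, rows in populations.items():
            shuffled = rows[:]
            rng.shuffle(shuffled)
            centroids[label] = centroid(shuffled[:n_train_per_system])
            tests += [(label, x) for x in shuffled[n_train_per_system:]]
        n_c = sum(1 for label, x in tests if nearest_label(x, centroids) == label)
        rates.append(n_c / len(tests))
    return sum(rates) / len(rates)

# Hypothetical, well-separated toy data: three systems, two parameters each.
toy = {
    "interstitial": [[0.0, 0.1], [0.1, 0.0], [0.2, 0.1]],
    "biomorphs": [[5.0, 5.0], [5.1, 4.9], [4.9, 5.2]],
    "microbes": [[10.0, 0.0], [10.2, 0.1], [9.9, -0.1]],
}
print(mean_correct_rate(toy, n_train_per_system=2))
```

Plotting the returned rate against the training size for each parameter set gives the kind of comparison described in the text.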

| Size distribution
The size distributions of all populations from the three systems are shown in Figure S11, and representative distributions are shown in Figure 6. The parameters describing the size distributions for each population are given in Table 1.

System 1: Interstitial spaces
All populations of interstitial spaces show a similar size distribution: a frequency that decreases monotonically with size, comparable to an exponential or lognormal law (Figure 6a; Figure S11, red histograms).
This could be due to the insufficient sampling in these populations (Figure S11B).

System 3: Microbial communities
The microbial communities display a wide variety of size distributions. The culture of a single non-colonial cyanobacteria strain (Synechocystis sp., Figure 6d) displays a narrow, slightly positively skewed unimodal distribution (mean/SD of 6.30, skewness of 0.72, kurtosis of 0.32; Table 1). In contrast, the natural rock-dwelling microbial communities display various shapes of size distributions including unimodal, monotonically decreasing, or multimodal distributions (Figure 6e,f; Figure S11, green histograms). Consequently, the values of mean/SD, skewness, and kurtosis vary significantly between these populations (average mean/SD of 2.61, ranging from 1.45 to 4.00; average skewness of 2.05, ranging from 1.14 to 2.99; average kurtosis of 7.85, ranging from 2.63 to 18.21; Table 1).

| Relationship between size and shape
Circularity (C) and solidity (S) (Figure 5a,b) are used here to characterize the variation in particle shape in the populations. For all the studied populations, C and S are plotted against normalized area in Figures S12 and S13, and plots for representative populations are compared in Figure 7. The parameters describing the circularity and solidity distributions for each population are given in Table 1.

System 1: Interstitial spaces
In the system of interstitial spaces, the solidity values decrease with size, converging from a range of 0.6-0.9 at smaller sizes to a range of 0.4-0.6 at larger sizes (Figure 7a, red triangles; Figure S12). The circularity values also decrease markedly with size, converging from a range of 0.2-0.8 at smaller sizes to a range of 0.1-0.3 at larger sizes (Figure 7b, red triangles; Figure S13). Based on these plots, large pores are more concave and elongated than small pores.

System 3: Microbial communities
In images of bacterial communities, the circularity/size plots vary significantly between populations (Figure S13), with large ranges of circularity (0.2-1.0). The ranges of solidity are somewhat narrower, from 0.7 to 1.0 (Figure S12). Notably, in all these populations, a subpopulation (or the entire population) displays solidity and circularity values independent of size (Figure 7a,b, green circles; Figures S12 and S13), with upper values of circularity and solidity staying at 0.9-1.0 for normalized areas increasing from 0 to 1.0-1.5.

| Fractal dimension and lacunarity
The results of fractal dimension and lacunarity measurements by box-counting method, describing the overall geometry of the different populations from the three systems, are given in Table 1 and are represented in Figure 8a

FIGURE 6 Representative distributions of sizes plotted as bar histograms in the three studied systems. The vertical axes represent frequency and the horizontal axes represent size. The sizes are measured as proportional to a radius and normalized to the mean of each distribution. The scale on the x-axis is the same for all populations. In order to facilitate the comparison between histograms, the width of the bins is the same for all studied populations. Red: interstitial spaces populations. Blue: Biomorph populations. Green: Microbial populations

| Parametric correlations between populations
Three dimensionless parameters that describe the shape of distributions (mean/SD, skewness, and kurtosis) were determined for size distributions, but also for circularity and solidity distributions.
Every population can consequently be described by these nine parameters, and by fractal dimension and lacunarity (11 parameters in total; Table 1). The three systems may be distinguished by distinct ranges for some of these parameters (e.g., in Figure 8b,c). For example, the different populations of interstitial spaces display a large

| Discrimination of populations: comparison of different sets of parameters
A quantitative study using discriminant analysis was made to test if populations from the three systems could be efficiently differenti-

A discriminant analysis run with Set 3 on all populations shows again that the biomorph populations overlap with microbial populations (Figure 9c). When the number of populations considered in the discriminant analyses increases, the rate of correct classification follows a trend very close to that observed for Set 2 (Figure 9f, gray stars).

A discriminant analysis run with Set 5 on all populations separates the three systems even more efficiently than Set 4 (Figure 9e). When the number of populations considered in the discriminant analyses increases, the rate of correct classification follows a trend almost identical to that identified on Set 4 (Figure 9f, black triangles).

| Size distribution
In the three images from the Angmaat Formation, distributions show a monotonically decreasing size trend (Figure 10b; Figure S15) similar to the size distributions of interstitial spaces, but somewhat narrower (Mean/SD ranging between 1.01 and 1.48).

| Relationship between size and shape
Similar to silica-carbonate biomorphs and micro-organisms, solidities are high (0.6-1.0) and independent of particle size (Figure 10a, black squares and hollow stars; Figure S15). The evolution of circularities with size is, overall, similar to that observed for interstitial spaces (Figure S15). However, similar to observations of modern microbial communities, circularity values remain high at small sizes (below 0.5 in normalized area).

| Parametric correlations between populations
On diagrams plotting population-describing parameters, parameters describing size distributions appear unable to attribute fossil microbial assemblages of the Angmaat Formation to one of the three systems (e.g., see skewness of size distributions in Figure 10d). However, the parameters describing shape distributions show clearly that populations of the Angmaat Formation can be split into two groups. The two large mosaics T1 and T2 have circularity and solidity distributions similar to interstitial spaces and may even show the same correlations (Figure 10e).
T2-sub, which represents only a part of T2, has circularity and solidity distributions similar to microbial communities (Figure 10d,e).

| Discrimination of populations
In discriminant analyses, microfossil assemblages of the Angmaat Formation are not systematically attributed to one of the three systems (Figure S15J-L). For example, in a discriminant analysis using parameter Set 5 (size and shape distributions, plus fractal dimension and lacunarity; Figure S15L), the three populations (T1, T2, and T2-sub) plot between the three systems.

| Size distribution
In the system of interstitial space populations, all parameter measurements display a monotonically decreasing distribution within the population (Figure 6a; Figure S11). These results are comparable to the lognormal or power-law distributions found in previous measurements of interstitial spaces in clastic sedimentary rocks (Crisp & Williams, 1971; Curtis, Sondergeld, Ambrose, & Rai, 2012; Diamond, 1970; Fusseis et al., 2012; Kuila & Prasad, 2013; Loucks, Reed, Ruppel, & Jarvie, 2009). They are also consistent with results obtained in a study on hydrodynamic modeling of the pore structure of sandstones (Ioannidis & Chatzis, 1993). Since a broad variety of sandstones, as well as one limestone, was selected for this study, it can be inferred that the nature of their clasts or their packing characteristics do not significantly affect the overall shape of the size distributions.
In the system of silica-carbonate biomorphs, a unimodal, positively skewed size distribution is typically observed (Figure 6b; Figure S11), confirming the results of a previous study by Rouillard et al. (2018).
Size distributions of crystals grown experimentally or observed in nature can be explained and modeled by specific regimes of nucleation and growth (Eberl, Drits, & Srodon, 1998;Kile, Eberl, Hoch, & Reddy, 2000). According to these studies, the size distributions observed here for biomorphs correspond to a nucleation with decaying rate followed by surface-controlled growth. The decaying rate of nucleation is consistent with the gel environment used here for growing silicacarbonate biomorphs; the diffusion of barium is slow in this medium and their supply is limited (Eberl et al., 1998;Kile et al., 2000). The surface-controlled growth is also consistent with previous measurements that the growth rate of silica-carbonate biomorphs varies linearly with time (Zhang, 2015). However, silica-carbonate biomorphs are aggregates rather than single crystals, and the pH-and therefore saturation state of the growth medium-oscillates locally during biomorph growth (Montalti et al., 2017). As a consequence, silica-carbonate biomorph size distributions are not readily described by the same processes which explain the growth of single crystals. Overall, the size distribution of biomorphs is consistently distinct from the size distribution of interstitial spaces.
In the system of micro-organism populations, the single-strain, non-colonial population displays a unimodal size distribution clearly distinct from that of interstitial space populations and narrower than that of silica-carbonate biomorph populations (Figure 6d; Figure S11m10; line PCC in Table 1). This kind of distribution has been reported and modeled in previous studies (Harvey & Marr, 1966; Katz et al., 2003; Koch, 1966; Uysal, 2001). On the other hand, the size of colonial, filamentous strains is not only controlled by the growth of individual bacteria, but also by the number of cellular divisions; the resulting size distributions may [...]
FIGURE 10 Application of the analytic protocol to microfossil assemblages from the Angmaat Formation (1.0 Ga, Baffin Island, Canada). (a) Evolution of the solidity of the particles with their size in T2 (cf. Figure 4a; black squares in Figure 10a) and in a subset of T2, T2-sub (cf. Figure 4b; hollow stars in Figure 10a) [...]
[...] should be made before their interpretation. One potential solution to overcome these issues may be to look at the variability of size distributions within a sample (see further, Section 4.1.4).

| Relationship between size and shape
In the system of interstitial space populations, a clear trend is seen in the relationship between size and shape. The solidity (Figure 7a) and circularity (Figure 7b) of interstitial spaces appear to decrease with their size. In other words, their shapes become more complex as their size increases. We hypothesize that this phenomenon is due to the constraining of the shape of interstitial spaces by their surrounding mineral matrix.
In the system of silica-carbonate biomorphs (gel-grown populations and solution-grown population), the solidity appears to be nearly independent of size (Figure 7a), while circularity decreases with size (Figure 7b). This decrease in circularity with size is less pronounced than that observed for interstitial spaces.
In the system of microbial communities, by contrast, it appears that solidity (Figure 7a) and circularity (Figure 7b) are independent of size. This may be ascribed to the presence of non-colonial strains. Indeed, the shape of an individual cell is regulated through complex intracellular mechanisms (Ausmees, Kuhn, & Jacobs-Wagner, 2003; Ingber, 2003; Jones, Carballido-López, & Errington, 2001; Pinho, Kjos, & Veening, 2013). The relationship between size and shape, as quantified here by circularity and solidity (Figure 7; Figures S12 and S13), has in most cases enabled a distinction between the three systems. This relationship is therefore a useful criterion for discriminating non-colonial biologic populations from certain abiogenic systems such as interstitial spaces and silica-carbonate biomorphic crystal aggregates. This relationship is not systematically verified, however, in communities that are strongly dominated by colonial strains. For these strains, the relationship between shape and size depends first on the way cells associate in the colony. For example, filamentous colonies display more elongated and sinuous shapes as they increase in size; they therefore display trends similar to those of interstitial spaces. In contrast, some colonies may actually preserve their shape during growth, such as colonies with no preferential orientation during cellular divisions, which preserve spheroidal shapes (e.g., colonial chlorophyceae Sphaerocystis sp.; Tsarenko, 2006).
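As an illustration of the two shape parameters discussed above, the following minimal sketch computes circularity and solidity for a particle outline. It assumes the widely used definitions circularity = 4π·area/perimeter² and solidity = area/convex-hull area; whether these match the exact implementation used in this study is an assumption. The function names and the two toy outlines (`square`, `l_shape`) are hypothetical and serve only to show that a more complex outline yields lower values of both parameters.

```python
import math

def polygon_area(pts):
    """Area of a simple polygon (shoelace formula); pts is an ordered list of (x, y)."""
    s = 0.0
    for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]):
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0

def polygon_perimeter(pts):
    """Sum of edge lengths of the closed outline."""
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:] + pts[:1]))

def convex_hull(pts):
    """Convex hull by Andrew's monotone chain, returned in counter-clockwise order."""
    pts = sorted(set(pts))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def circularity(pts):
    """Assumed definition: 4*pi*area / perimeter**2 (equals 1 for a perfect circle)."""
    return 4.0 * math.pi * polygon_area(pts) / polygon_perimeter(pts) ** 2

def solidity(pts):
    """Assumed definition: area / area of the convex hull (equals 1 for convex shapes)."""
    return polygon_area(pts) / polygon_area(convex_hull(pts))

# Hypothetical outlines: a compact convex particle and a more complex, concave one.
square = [(0, 0), (4, 0), (4, 4), (0, 4)]
l_shape = [(0, 0), (4, 0), (4, 1), (1, 1), (1, 4), (0, 4)]

for name, outline in (("square", square), ("l_shape", l_shape)):
    print(name, round(circularity(outline), 3), round(solidity(outline), 3))
```

Plotting these two values against particle area for every particle in a population would give diagrams of the kind discussed here, under the stated assumptions.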

| Fractal dimension and lacunarity
[...] It must also be noted that these two parameters are strongly affected by the density (especially for lacunarity) and the dispersion or aggregation (especially for fractal dimension) in populations. As a consequence, larger ranges of values would probably be found for the different systems if they were represented by populations from more diverse sources. More studies are needed to establish whether controls specific to each system exist for these parameters.

| Parametric correlations between populations
The shapes of statistical distributions illustrate underlying processes independent of the absolute size or shape of particles in the population [...] Tables S1, S2, and S3). Interestingly, correlations also appear to exist between different types of distributions in the three systems. For example, the shapes of the circularity and solidity distributions seem related, although, here again, the correlation is not the same for interstitial space populations and microbial populations (Figure 8f; Tables S1 and S3).
Overall, the differences in statistical correlations at the system level allow microbial populations to be distinguished from the two abiogenic systems in our study and constitute a potential biosignature.

| Discrimination of entire populations: comparison of different sets of parameters
It can be seen in Figure 9f and Figure S14 that the total number of parameters used in the discriminant analyses (increasing from Set 1 to Set 5) is not the only factor controlling the efficiency of discrimination. Indeed, the discrimination achieved with Set 2 (five parameters) is similar to, or even slightly higher than, that reached with Set 3 (six parameters). One explanatory hypothesis is that while Set 3 uses parameters that describe only one aspect of population morphometry, the distributions of shape (solidity and circularity), Set 2 uses parameters that describe two rather different aspects: the distribution of size and general geometry (fractal dimension and lacunarity). Overall, it appears from this study that parameters describing various aspects of population morphometry should be used in order to improve the efficiency of these classification procedures.
The rate of correct classification reaches high values (around 90%); discriminant analyses could therefore be applied in the future to reliably classify populations of microstructures based on their population morphology.
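The classification step can be sketched as follows. This is a minimal illustration, not the procedure of this study: it assumes a linear discriminant rule with a pooled within-group covariance and equal priors, which amounts to assigning a population to the system whose mean is closest in Mahalanobis distance, and it uses only two descriptors so that the 2×2 covariance can be inverted by hand. All numerical values in `training` are invented for illustration and are not measurements from this work; the function names are hypothetical.

```python
def fit_discriminant(training):
    """training: dict mapping a system label to a list of (d1, d2) descriptor pairs.
    Returns the group means and the inverse of the pooled within-group covariance."""
    means = {}
    sxx = syy = sxy = 0.0
    n_total = 0
    for label, pts in training.items():
        mx = sum(p[0] for p in pts) / len(pts)
        my = sum(p[1] for p in pts) / len(pts)
        means[label] = (mx, my)
        for x, y in pts:
            sxx += (x - mx) ** 2
            syy += (y - my) ** 2
            sxy += (x - mx) * (y - my)
        n_total += len(pts)
    dof = n_total - len(training)          # pooled degrees of freedom
    sxx, syy, sxy = sxx / dof, syy / dof, sxy / dof
    det = sxx * syy - sxy ** 2             # must be non-zero for the inverse to exist
    inverse = ((syy / det, -sxy / det), (-sxy / det, sxx / det))
    return means, inverse

def classify(point, means, inverse):
    """Assign a population to the system with the smallest squared Mahalanobis distance."""
    def squared_distance(label):
        dx = point[0] - means[label][0]
        dy = point[1] - means[label][1]
        return (inverse[0][0] * dx * dx
                + 2.0 * inverse[0][1] * dx * dy
                + inverse[1][1] * dy * dy)
    return min(means, key=squared_distance)

# Invented descriptor pairs (e.g., two distribution-shape parameters per population).
training = {
    "microbial":    [(0.5, 4.0), (0.7, 4.1), (0.4, 4.4), (0.6, 3.9)],
    "biomorph":     [(1.5, 2.5), (1.7, 2.6), (1.4, 2.9), (1.6, 2.4)],
    "interstitial": [(3.0, 1.0), (3.2, 1.1), (2.9, 1.4), (3.1, 0.9)],
}
means, inverse = fit_discriminant(training)
print(classify((0.6, 4.0), means, inverse))
print(classify((1.6, 2.7), means, inverse))
print(classify((3.0, 1.2), means, inverse))
```

With more than two descriptors, a general matrix inverse (or a statistics library) would replace the hand-written 2×2 inversion; the assignment rule itself would stay the same under the stated assumptions.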

| Application to test microfossil assemblages
Size distributions in microfossil assemblages have been used previously as a biogenicity criterion in micropaleontological studies (Barghoorn & Tyler, 1965; Butterfield & Chandler, 1992; Knoll & Barghoorn, 1977; Köhler & Heubeck, 2019; Schopf & Barghoorn, 1967; Schopf et al., 2010; Sugitani et al., 2007, 2010, 2013; Wacey et al., 2011). In these studies, the absolute range of sizes and the relative width of size distributions (quantified in our study by the mean/SD parameter) were particularly discussed. Qualitatively, the similarity and continuity of shapes (shape distributions in our study) in a population were also proposed as a criterion of biogenicity (Buick, 1990; Schopf et al., 2010). However, as noted by Brasier and Wacey (2012) [...]
In order to test the contribution of these problems, a zoom-in image T2-sub was selected from T2. This image represents a region that is less heterogeneous in terms of state of degradation and has a higher contrast between microfossils and matrix, which improves the quality of the segmentation. We note, however, that microfossils are still tightly clumped in this area, which in places prevents their recognition as individual particles. For this population, the distributions of shapes are very close to those of modern microbial communities (Figure 10a,d,e, hollow stars), confirming the potential influence of degradational heterogeneities and/or contrast issues. In contrast, the size distribution still differs from the one expected for modern microbial communities (Figure S15). It appears that the clumping of microfossils affected size distributions more than shape distributions.
Discriminant analyses do not attribute the three studied populations of the Angmaat Formation to the system of microbial communities (Figure 10f; Figure S15J,K,L).

| Sensitivity to image treatment
Image treatment is a critical step in population morphometry.
Problems linked to image segmentation can lead to (a) imperfect reproduction of the outlines of the objects, (b) clumping of different objects (e.g., when they are overlapping) into a single particle, and/or (c) erroneous segmentation of objects/areas of the image that do not belong to the system of relevance ("contamination"). All of these issues lead to a discrepancy between observed objects and segmented particles.
The purpose of segmentation is to separate all and only the objects of interest from their background. In the current study, the quality of segmentation depends mainly on the choice of (a) the graylevel threshold and (b) the size threshold. During image segmentation, these thresholds are adjusted manually. Optimal segmentation is reached when there is a maximal equivalence between observed objects and segmented particles. The sensitivity of segmentation (and subsequently, of morphometric measurements) to threshold adjustment was evaluated for one specific image in Figure S16 (segmentation) and Figures S17 and S18A,B (morphometric measurements). Figure S16 illustrates that an increase in graylevel threshold creates "contaminant" particles and increases clumping in the segmented image. Increasing the size threshold removes small contaminant particles, but increases the risk of removing objects of interest. Subsequent morphometric measurements are affected by changes in threshold (an example for size distributions is shown in Figure S17). Due to the clumping into large particles and the appearance of small contaminant particles, increases in graylevel threshold stretch the distribution to lower and higher values (evolution of histograms from left to right in Figure S17). The increase in size threshold also modifies the shape of the size distribution by removing smaller particles (evolution of histograms from bottom to top in Figure S17).
The comparison between the source image and the segmented images in Figure S16 shows that the highest segmentation quality is obtained for graylevel thresholds of 100 and 150 and a size threshold of 500 square pixels. At these values, the sensitivity of the segmentation and of morphometric measurements to changes in graylevel threshold is small (compare the two central columns in Figures S16 and S17; compare values at 100 and 150 in Figure S18B,C). However, the segmentation and morphometric measurements appear more sensitive to changes in size threshold (compare lines 1 to 3 in Figures S16 and S17; compare cyan, green, and red curves in Figure S18A,B). In general, these observations, and in particular the threshold intervals for which morphometric measurements are less sensitive, vary among the images. Sensitivity to the threshold choice depends strongly on the source image and on the initial contrast between the objects of interest and their background. The segmentation of microfossil assemblage images, in particular, is more sensitive to changes of thresholding than most others (see Figure S18C-F). However, in the current study, it must be noted that the two images of interstitial spaces for which the segmentation was imperfect (see Section 2.1.1) display the same morphometric trends as the other images of interstitial spaces (Table 1; Figure 8; Figures S1, S2, S11, S12 and S13). This means that final morphometric measurements, to a certain extent, are robust regarding the quality of segmentation.
The threshold that is eventually chosen is the one that optimizes the equivalence between observed objects and segmented particles (Figure S16). In the future, in order to improve the quality of image segmentation, it will be necessary to use (a) alternative imaging techniques, for example, fluorescence-based ones, which increase the contrast between the objects of interest and their surroundings, or 3D imaging, which would separate overlapping particles, and (b) specific image treatment methods, such as local thresholding.
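The effect of the two thresholds can be reproduced on a toy image. The sketch below is not the segmentation software used in this study; it assumes dark objects on a light background (a pixel is foreground when its gray level is below the threshold, so that raising the threshold adds pixels), 4-connected particles, and a size threshold expressed as a minimum area in pixels. The function `segment` and the toy image are hypothetical.

```python
def segment(image, gray_threshold, min_area):
    """Return the sorted areas (in pixels) of the particles kept after
    graylevel thresholding and removal of particles smaller than min_area."""
    height, width = len(image), len(image[0])
    mask = [[image[y][x] < gray_threshold for x in range(width)] for y in range(height)]
    seen = [[False] * width for _ in range(height)]
    areas = []
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or seen[y][x]:
                continue
            # Flood fill one 4-connected particle and count its pixels.
            stack = [(y, x)]
            seen[y][x] = True
            area = 0
            while stack:
                cy, cx = stack.pop()
                area += 1
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < height and 0 <= nx < width and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            if area >= min_area:
                areas.append(area)
    return sorted(areas)

# Toy image: two dark objects (50) joined by a mid-gray bridge (120),
# one isolated noise pixel (90), light background (200).
toy = [
    [200, 200, 200, 200, 200, 200, 200],
    [200,  50,  50, 120,  50,  50, 200],
    [200,  50,  50, 120,  50,  50, 200],
    [200, 200, 200, 200, 200, 200, 200],
    [ 90, 200, 200, 200, 200, 200, 200],
]

print(segment(toy, 80, 1))    # two objects, correctly separated
print(segment(toy, 100, 1))   # a higher threshold adds a contaminant particle
print(segment(toy, 100, 2))   # the size threshold removes the contaminant
print(segment(toy, 150, 2))   # a still higher threshold clumps the two objects
print(segment(toy, 80, 5))    # too large a size threshold removes the objects
```

Sweeping both thresholds over a grid and recording the resulting size distributions would give a sensitivity map of the kind described above, under the stated assumptions.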

| General guidelines and limits of population morphometry for life detection
The parameters used in this study to characterize particles or populations of particles (size, circularity, solidity, fractal dimension, lacunarity) are universally applicable. Several criteria, such as the shape of distributions (mean/SD, skewness, kurtosis), or the relationship between shape and normalized size, are also independent of absolute shapes or sizes. As a result, the approach of this study is highly relevant to studies of early life or extraterrestrial life, since those are contexts where sizes and shapes are difficult to predict.
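The independence from absolute size can be checked numerically. The sketch below computes mean/SD, skewness, and kurtosis from population moments and applies them to a set of sizes and to the same sizes multiplied by a constant factor. The choice of population (rather than sample) estimators and of non-excess kurtosis is an assumption, as the exact estimators are not restated here; the size values are invented for illustration.

```python
import statistics

def distribution_shape(values):
    """Scale-free descriptors of a distribution, from population moments."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    m3 = sum((v - mean) ** 3 for v in values) / n   # third central moment
    m4 = sum((v - mean) ** 4 for v in values) / n   # fourth central moment
    return {
        "mean_over_sd": mean / sd,
        "skewness": m3 / sd ** 3,
        "kurtosis": m4 / sd ** 4,
    }

# Invented particle sizes with a long right tail, and the same population
# imaged at a ten times larger scale.
sizes = [1, 2, 2, 3, 3, 3, 4, 8, 15]
original = distribution_shape(sizes)
rescaled = distribution_shape([10 * s for s in sizes])

for key in original:
    print(key, round(original[key], 6), round(rescaled[key], 6))
```

The three descriptors agree for both populations up to floating-point rounding, because each is a ratio of quantities that scale with the same power of the size unit.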
Although the objects considered in this study are three-dimensional, two-dimensional pictures were used. An important part of the structural information is therefore lost: (a) overlapping particles are not separated and (b) a single plane does not properly describe an anisotropic population. This study, however, is a first proof of concept. For practical reasons, 2D images were used, since they are much easier to acquire and to treat than, for example, 3D tomograms. Moreover, micropaleontology data reported in the literature most often consist of 2D images. The goal is thus to create a rapid tool that can be readily applied to images in the literature, or directly to any microscope image. In order to minimize the issue of particle separation during picture analysis, pictures with few overlapping particles were specifically chosen for this study. The issue of anisotropy in a population, however, could not easily be solved. The general methodology described here will be extended to three-dimensional data in the future.
In order to apply this kind of statistical analysis to a population of particles in an image, several conditions must be fulfilled. The objects of interest must be discernible from each other. If all the objects are clumped together in a large mass, no reliable information can be derived. Therefore, individual particles must be recognizable.
Other types of analysis (e.g., network analysis) must be developed for very dense communities such as bundles formed by filamentous strains. (c) In order to facilitate interpretation, the considered objects should all originate from the same system. A mixture of objects coming from different systems will obscure the signal and/or lead to erroneous interpretations. (d) A sufficient number of particles must be available in the image for statistical analysis. In this study, it was found that ~100 particles constitute a lower boundary; below this number, the statistical distributions are inaccurate representations of the population. (e) For now, spatial heterogeneities should be minimal. The influence of the scale of study on statistical morphometry was not systematically assessed in the current study and will be the object of future research.
The methodology presented here discriminates specific biologic and abiogenic systems: microbial communities of cells, silica-carbonate biomorphs, and interstitial spaces in clastic fabrics.
This choice of systems is of course not absolute. The methodology presented here could be expanded to other systems. In particular, it would be of interest to study populations of other types of organic and/or mineral biomorphs (carbon-sulfur aggregates, manganese oxide precipitates, core-shell aggregates; Cosmidis & Templeton, 2016; Liu et al., 2011; Muscente et al., 2018) or other types of interstitial spaces, such as those in botryoidal or spherulitic chert fabrics (Brasier et al., 2002, 2005). In general, it is critical to have geological knowledge of a sample before applying this method. The abiogenic and biologic hypotheses must be known and defined carefully before statistical approaches can be used to discriminate them. Moreover, in this study, although samples of different nature and origins were chosen (see Section 2.1), only a limited number of images were used to describe every system (10/11 images for each system). Therefore, the entire morphometric range of each system is certainly far from being entirely represented. For example, microbial communities with a fundamentally different composition and/or ecology may present different morphometric characteristics. This limitation is mainly due to the difficulty of accessing (and treating) a large amount of geologically/biologically relevant data from various sources. An increase in the volume and the diversity of treated images would certainly improve the robustness of an automated discrimination algorithm as presented in this study.

| CONCLUSION
Morphometric characteristics (size, circularity, solidity, fractal dimension, and lacunarity) were determined for example populations of microstructures of two abiogenic systems (interstitial spaces in clastic rocks, silica-carbonate biomorphs) and one biologic system (microbial communities). At the scale of single populations, it appears that the relationship between the shape of particles (circularity and solidity) and their size allows the consistent distinction of the three systems. At the scale of several populations, the shapes of statistical distributions of size, circularity, and solidity, described here by their mean/SD, skewness, and kurtosis, show significant differences between the three systems.
Some correlations between these parameters call for future exploration. It is found that discriminant analyses performed with the distribution descriptors of size and shape efficiently separate the three groups of populations in 2D spaces. Using these discriminant analyses, populations from these three systems can be classified automatically with great accuracy. The same morphometric characterization was applied to assemblages of microfossils from the well-preserved 1.0 Ga Angmaat Formation, Baffin Island, Canada. In these assemblages, biologic size and shape distributions are affected by (spatially heterogeneous) diagenesis and by the presence of abiogenic objects such as interstitial spaces.
However, the relationship between size and shape appears to be a biogenic characteristic well preserved in this context. Given the wide applicability of the performed measurements and the very general nature of the observed trends, statistical morphometric analyses appear promising for the identification of microbial remnants in various contexts; they could complement other existing lines of evidence (geologic setting, composition, etc.). This approach can potentially be applied to identify traces of life on other planets, such as Mars. Notably, this study extends the concept of morphospace (Raup, 1967) to the population level.
Population- or system-scale morphometry could help in the future to get a better understanding of the morphogenetic controls specific to microbial life.
We thank four anonymous reviewers for their useful comments which greatly improved a previous version of the manuscript. This is IPGP contribution n°4098.

CONFLICT OF INTEREST
The authors declare that there is no conflict of interest regarding the publication of this article.