Shape matters: cell geometry determines phytoplankton diversity



Abstract
Organisms' size and shape have a profound influence on ecophysiological performance and evolutionary fitness, suggesting a link between morphology and diversity. While unimodal relationships between size and species richness have been found for many taxa (1-4), much less is known about how richness is related to shape, in particular in the microbial realm. Here we analyse a novel, globally extensive data set of marine unicellular phytoplankton, the major group of photosynthetic microbes, which exhibit an astounding diversity of cell sizes and shapes (5). We quantify the variation in size and shape and explore their effects on taxonomic diversity (6,7). We find that cells of intermediate volume exhibit the greatest shape variation, with shapes ranging from oblate to extremely elongated forms, while very small and large cells are mostly compact (e.g., spherical or cubic). We show that cell shape has a strong effect on phytoplankton diversity, comparable in magnitude to the effect of cell volume, with both traits together explaining up to 92% of the variance in phytoplankton diversity. Species richness decays exponentially with cell elongation and displays a log-normal dependence on cell volume, peaking for compact cells of intermediate volume. Our findings highlight different selective pressures and constraints on phytoplankton of different geometry and improve our understanding of the evolutionary rules of life.
Phytoplankton are major aquatic primary producers that form the base of most marine food webs and are vital to the functioning of marine ecosystems. Marine unicellular phytoplankton exhibit an enormous morphological richness, with cell volumes spanning many orders of magnitude and dozens of different shape types, from simple spherical to extremely complex cells (5). This huge variety in phytoplankton cell volumes and shapes presents a unique opportunity for investigating evolutionary constraints on morphological traits and their connection to taxonomic richness, because the geometry of a phytoplankton cell plays an important role in its adaptation to the environment. Cell size and shape affect most aspects of phytoplankton survival, from grazing by zooplankton (8,9) to sinking (10) and diffusion (11), diffusive transport limitation (12) and nutrient uptake (13-16). While the role of cell size in determining phytoplankton fitness and diversity has been documented previously (6,7), little is known about the role of cell shape.
Here, we characterize broad patterns in cell shapes and their relationship with cell volume and taxonomic richness across the main phyla of unicellular marine phytoplankton and heterotrophic dinoflagellates (together called phytoplankton below, for brevity). We compiled one of the most comprehensive data sets of phytoplankton in terms of sizes, shapes and taxonomic diversity from seven globally distributed marine areas: North Atlantic (Scotland), Mediterranean Sea (Greece and Turkey), Indo-Pacific (the Maldives), South-western Atlantic (Australia), Southern Atlantic (Brazil) and Baltic Sea (see Methods). The data comprise 5,743 cells of unicellular phytoplankton from 402 genera belonging to 16 phyla. We classified each cell as one of 38 fundamental geometric shapes, such as spheres, cylinders and prisms, measured cell linear dimensions and calculated the surface area and volume for each cell (17-19) (see Methods). Cell volumes span almost 10 orders of magnitude, from 0.065 μm³ for the cyanobacterium Merismopedia to $5 \times 10^{8}$ μm³ for Noctiluca (Dinophyceae).
The degree of shape elongation can be expressed as the aspect ratio and the relative surface extension (see Methods). The aspect ratio, $r$, characterizes the linear dimension of cell elongation; it is less than 1 for oblate (flattened at the poles) shapes and greater than 1 for prolate (stretched) shapes. We also define a shape as compact if $2/3 < r < 3/2$. The surface extension, $\epsilon$, shows the relative gain in surface area of a cell compared to a sphere with the same volume. The minimum level of surface extension is shape-type specific and equals 1 for spheres, 1.14 for cylinders, 1.24 for cubes, and 1.09 for double cones (see Methods). The two measures of shape elongation are related, and the logarithm of the aspect ratio changes approximately with the square root of surface extension (Extended Data Fig. 2).

Variation in cell shape
We found that the taxonomic diversity across different phyla varies with cell shape type and elongation (Fig. 1A). Most Bacillariophyta (diatoms) are cylindrical or prismatic, while other phyla are mostly ellipsoidal, with additional shapes, e.g., conic or of a more complex geometry, being relatively rare. In our database, 46% of genera are prolate, 38% compact and only 16% oblate (Fig. 1B). These proportions vary across phyla and shapes (Extended Data Fig. 1). For instance, more than half of the genera classified as elliptical cells have a compact shape, while for other shapes more than half of the genera have prolate cells. Oblate shapes comprise up to 20% of genera in diatoms, dinoflagellates (Miozoa), Haptophyta, Charophyta, Cryptophyta, and Euglenozoa, but are rarer (< 10%) in other phyla. Half-shapes such as half-spheres or half-cones are more often dominated by oblate forms.
Shape elongation is hypothesized to influence phytoplankton fitness. Several studies argued that elongation is beneficial for volume-specific nutrient uptake and, therefore, that large cells should be elongated to increase the surface-to-volume ratio (20,21). However, our analysis, based on an order of magnitude more cell measurements than previous studies (20,21), shows that cell surface area increases with volume approximately to the power of 2/3 (Fig. 2A), indicating that cell dimensions scale on average isometrically with volume, and there is no evidence for shape elongation with increasing volume.
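The link between the 2/3 exponent and isometric scaling can be made explicit with a short worked relation. This is a standard geometric argument added here for illustration, not taken from the Methods; $\lambda$ is a hypothetical uniform scale factor applied to all linear dimensions of a cell with surface area $S$ and volume $V$.

```latex
L_i \to \lambda L_i \ (i = 1,2,3)
\quad\Longrightarrow\quad
V \to \lambda^{3} V, \qquad S \to \lambda^{2} S
\quad\Longrightarrow\quad
S \propto V^{2/3}.
```

If larger cells were systematically more elongated, surface area would grow faster than this, and the fitted exponent would exceed 2/3.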
By contrast, the variation in cell elongation strongly depends on cell volume (Fig. 2C, Extended Data Fig. 3). The distribution of the surface extension as a function of cell volume is approximately hump-shaped, with a peak of cell elongation at intermediate volumes (between $10^{3}$ and $10^{4}$ μm³), where the cell surface area can exceed the surface area of a sphere with an equivalent volume up to 5-fold. In contrast, for cells of very small or large volume, surface extension approaches its minimum values, implying that these cells have a compact shape that minimizes their surface area. The hump-shaped pattern is also seen in the 75% and 90% quantiles (Fig. 2C), giving evidence that this is not a sampling artefact. The same pattern emerges for the aspect ratio, which reaches 100 for prolate cells and drops to 0.025 for oblate cells (Extended Data Fig. 3). This pattern also holds across different trophic guilds (autotrophic, mixotrophic or heterotrophic); however, the maximum cell elongation is reached only by the autotrophs, while in heterotrophs and mixotrophs the maximum aspect ratio equals 10 and the maximum surface extension equals 2 (Extended Data Fig. 4), likely because these two groups need to swim actively.

Influence on taxonomic diversity
Both cell volume and surface extension have a strong effect on taxonomic diversity, $D$, measured here as richness of genera. It follows a lognormal function of volume (Fig. 2D, $V_0 = 1100 \pm 90$ μm³, $R^2 = 0.98$) and decreases exponentially with shape surface extension (Fig. 2E, $D \sim e^{-1.43\epsilon}$, $R^2 = 0.97$). Both relationships vary across cell shapes (Extended Data Fig. 5, 6). For ellipsoidal cells, the diversity peaks at the smallest volume compared to other shapes ($V_0 = 320 \pm 40$ μm³, $R^2 = 0.91$) and has the fastest rate of decrease with surface extension ($D \sim e^{-2.3\epsilon}$, $R^2 = 0.8$), with 54% of genera exceeding the surface area of a sphere by less than 10%. This suggests that in ellipsoidal cells, the reduction of surface area may be advantageous and might lead to a greater diversification. For cylindrical cells (mainly diatoms), diversity peaks at the largest volume compared to other shapes ($V_0 = 8{,}700 \pm 800$ μm³, $R^2 = 0.98$) and declines more slowly with surface extension ($D \sim e^{-1.5\epsilon}$, $R^2 = 0.99$). There is a comparable effect of surface extension on diversity for conic shapes ($D \sim e^{-1.5\epsilon}$, $R^2 = 0.93$), whereas the effect is weaker for prismatic ($D \sim e^{-0.8\epsilon}$, $R^2 = 0.57$) and complex shapes ($D \sim e^{-0.7\epsilon}$, $R^2 = 0.63$). Both classes include mainly diatoms. The secondary peaks of diversity at $\epsilon$ between 1.5 and 3 for these shapes suggest that in some cases cell elongation might have a nonmonotonic effect on cell fitness, such that both compact and elongated cells can have high fitness (13).
bioRxiv preprint doi: https://doi.org/10.1101/2020.02.06.937219; this version posted February 7, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-ND 4.0 International license.
In summary, both cell volume and surface extension are important drivers of taxonomic diversity, which can be approximated as the product of a lognormal function of volume and a decreasing exponential function of surface extension, $D \sim \exp\left[-\frac{(\log V - \log V_0)^2}{2\sigma^2} - k\epsilon\right]$. As shown in Fig. 3, this function describes the dependence of diversity on both cell volume and extension remarkably well, with $V_0 = 1{,}000 \pm 200$ μm³, $\sigma = 1.74 \pm 0.08$ and $k = 1.47 \pm 0.06$ explaining 92% of the variation of phytoplankton diversity for the entire dataset. Across shape types, the fit parameters vary in the same way as above: the best match is obtained for ellipsoidal, cylindrical and conic shapes (Fig. 3B-D), and a poorer fit for prismatic and other shape types (Fig. 3E-F). Diversity of species with elliptical cells peaks at smaller volumes compared to other shape types and has the fastest decrease with surface extension. This implies that the composition of shapes should also change with $V$ and $\epsilon$. A comparison of the predicted and the observed diversity shows that the fit is unbiased across all shapes, and also in the group of ellipsoidal, cylindrical and conic shapes (Extended Data Fig. 7A-D). However, it overestimates the taxonomic diversity of prismatic and other shapes for the ranges of volume and surface extension where the observed diversity is low (Extended Data Fig. 7E-F).
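The combined dependence described above can be sketched as a small function. This is a minimal illustration under stated assumptions, not the authors' fitting code: the parameter names `v0`, `sigma` and `k` are labels chosen here for the three fitted values quoted in the text, the result is unnormalised (only ratios are meaningful), and the base of the logarithm is not recoverable from the text, so base 10 is an assumed default that can be swapped.

```python
import math


def relative_diversity(volume_um3, surface_extension,
                       v0=1000.0, sigma=1.74, k=1.47, log=math.log10):
    """Unnormalised diversity: lognormal in volume times exponential decay in extension.

    v0, sigma and k are hypothetical names for the fitted values quoted in the
    text; `log` (base 10 by default) is an assumption, not taken from the source.
    """
    if volume_um3 <= 0:
        raise ValueError("volume must be positive")
    if surface_extension < 1.0:
        raise ValueError("surface extension cannot be below 1 (a sphere)")
    # Distance from the optimal volume, measured in units of the lognormal width.
    z = (log(volume_um3) - log(v0)) / sigma
    return math.exp(-0.5 * z * z - k * surface_extension)


# Compare a compact cell at the optimal volume with an elongated one.
compact = relative_diversity(1000.0, 1.0)
elongated = relative_diversity(1000.0, 2.0)
print(compact / elongated)  # equals exp(k): one unit of extension costs a factor e^k
```

Because the two factors multiply, the cost of one extra unit of surface extension is the same factor at every volume in this sketch.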
Similarly, the diversity can be related to cell volume and aspect ratio (Methods). However, the aspect ratio has a more complicated functional relationship with taxonomic diversity, which is likely due to the non-linear relationship between these two parameters (Extended Data Fig. 2). Although, on average, the diversity predictions obtained using aspect ratio are poorer than those based on surface extension, aspect ratio is easier to measure with automated plankton monitoring (22).

Discussion
Our study shows that cell surface area increases on average isometrically with cell volume, but the variation in cell elongation exhibits a hump-shaped dependence on cell volume. Surprisingly, the shapes of cells of intermediate volume range from oblate to extremely prolate forms, while cells of both large and small volumes are compact (mostly spherical). To what extent can this pattern be explained by the constraints on cell dimensions? Linear cell dimensions range from 0.5 μm to 1,000 μm (Fig. 2B). The minimum cell size is likely constrained by the size of organelles; for instance, for autotrophs the minimum chloroplast size equals 1 μm (23,24). The maximal feasible cell size can be limited by the scale of diffusive displacement of proteins in cytoplasm during the cell cycle (see Methods). Thus, the minimal (or maximal) cell volume can only be realized in a compact geometry in which all three linear dimensions are approximately equal. A model based on these constraints correctly predicts that the smallest and largest cells should be compact, while cells of intermediate volumes can have a diverse geometry (Extended Data Fig. 3). The model, however, overestimates the measured range in surface extension, yielding values of $\epsilon > 10$ for prolate cells and $\epsilon > 30$ for oblate cells. This discrepancy indicates the existence of further physiological constraints on cell geometry. In an improved model we assume that cell aspect ratios can vary from 0.025 to 100 only (Fig. 2B). As $L_{max} < 1000$ μm, the allowed range of $r$ reduces with increasing $L_{min}$ and approaches 1 when $L_{min}$ approaches 1,000 μm (Fig. 2B, solid line). This constraint may reflect limitations due to mechanical instability, material transport needs within a cell, or reduced predator defence experienced by extremely prolate or oblate cells. With this constraint, the model and the data agree well for prolate cells, but the theoretical model still overestimates the potential surface extension for oblate cells of large volumes (Extended Data Fig. 3). This suggests that there may be unknown additional constraints that prevent the evolution of extremely wide oblate cells of large volume.
Our study shows that cell shape, along with cell size, is an important predictor of taxonomic diversity. Diversity is a lognormal function of volume and decreases exponentially with cell surface extension. As diversity typically increases with abundance (2), we hypothesize that species with compact cells of intermediate volume have the highest fitness among unicellular plankton. Thus, the so-called equivalent cell radius is not an adequate cell characteristic on its own, because elongated and spherical cells with the same volume are not equivalent in terms of potential diversity and, likely, fitness. In particular, small or large spherical cells can have the same or even higher richness than elongated cells of intermediate volume.
We also conclude that for all phyla, except for prismatic and complex shapes (mainly diatoms), the minimization of cell surface area is a beneficial strategy independent of cell volume. Reducing cell surface area likely reduces the cost of the cell wall, which may be expensive, and makes a cell less vulnerable to predators. In contrast, maintaining a non-spherical shape is easy only for species with a rigid cell wall, such as diatoms (25,26). This can explain why for prismatic and complex shapes we observe secondary peaks of richness for elongated shapes, resulting in significant diversity of diatom shapes across a wide range of cell elongation. This suggests that the appearance of silica cell walls in diatoms is a major evolutionary innovation that allows diatoms to achieve an unusually great shape diversity, which may have contributed to the ecological success of this group (27,28).
The surprisingly good prediction of global taxonomic richness of marine plankton by cell volume and relative surface extension implies either a fundamental metabolic relationship between these parameters and speciation rates or a specific global distribution of niches favouring oblate and prolate shapes in competition with compact shapes, as the environment can select certain cell morphologies (29,30). In particular, strongly elongated shapes occur in deep waters (31). Our study suggests that this phenomenon can have another explanation, as elongated shapes might dominate at depth because building a complex cell wall is cheaper under the high nutrient conditions characteristic of deeper layers, compared to the low nutrient conditions of the upper layer. A link between phytoplankton diversity and morphology has not been explored much, and previous studies on the topic did not show a consistent pattern. In particular, local species richness showed either a hump-shaped function or was independent of cell volume (7), or decreased as a power function of volume (6). There may be several explanations for the discrepancy between our and previous results. First, unlike previous studies, we consider cell surface extension as an important driver and separate its effects from the effects of cell volume. Second, our study includes a wider range of cell volumes. Third, it includes samples from world ocean ecosystems of various typology and at different times, so this global pattern may be different from the local patterns influenced by specific environmental conditions, such as nutrient or light levels, grazing, species sorting or mass effects.
Our results show the complex effects of cell size and shape on taxonomic richness and open new avenues of research. In particular, there may be an indirect effect of temperature and nutrients on diversity arising from their effect on cell volume (32,33), which would be important to investigate in the context of rapid environmental change. We do not yet know what selection pressures favour prolate or oblate cells, what environmental niche those cells occupy, or whether there is a link between cell elongation and generalist or specialist strategies. Answering these questions would help us understand the ecological and evolutionary constraints on phytoplankton diversity in the ocean.

Databases
We combined two databases on biovolumes and size-classes of marine unicellular phytoplankton (see Data Availability statement).

Baltic Sea
The first database includes information on phytoplankton species and heterotrophic dinoflagellates, covering a total of 308 genera found in different parts of the Baltic Sea from the 1980s to 2018 (PEG_BVOL, http://www.ices.dk/marine-data/Documents/ENV/PEG_BVOL.zip). The measurements were prepared by the HELCOM Phytoplankton Expert Group (PEG) and originally described in more detail by Olenina et al. (18). The phytoplankton samples were taken in accordance with the guidelines of HELCOM (1988) as integrated samples from the surface 0-10 m or 0-20 m water layer using either a rosette sampler (pooling equal water volumes from discrete depths of 1, 2.5, 5, 7.5 and 10 m) or a sampling hose. The samples were preserved with acid Lugol's solution (34). For phytoplankton species identification and determination of abundance and biomass, the inverted microscope technique (35) was used. After concentration in a 10, 25 or 50 ml sedimentation chamber, phytoplankton cells were measured for the further determination of species-specific shape and linear dimensions. All measurements were performed under high microscope magnification (400-945 times) using an ocular scale.

Different ecoregions around the globe
The second database includes a biogeographical snapshot survey of natural phytoplankton and heterotrophic dinoflagellate communities obtained by the Ecology Unit of Salento University (https://www.lifewatch.eu/web/guest/catalogue-of-data) (36). The data cover a total of 193 genera and were sampled in five different coastal ecoregions: North Atlantic Sea (Scotland), Mediterranean Sea (Greece and Turkey), Indo-Pacific Ocean (the Maldives), South-Western Atlantic Ocean (Australia) and Southern Atlantic Ocean (Brazil). The data cover 23 ecosystems of different typology (coastal lagoons, estuaries, coral reefs, mangroves and inlets or silled basins) that were sampled during the summer period in the years 2011-2012. Three to nine ecosystems per ecoregion and three locations for each system were sampled, yielding a total of 116 local sites replicated three times. Phytoplankton were collected with a 6 μm mesh plankton net equipped with a flow meter for determining filtered volume. Water samples for phytoplankton quantitative analysis were preserved with Lugol (15 mL/L of sample). Phytoplankton were examined following Utermöhl's method (35). Phytoplankton were analysed by inverted microscope (Nikon T300E, Nikon Eclipse Ti) connected to a video-interactive image analysis system (L.U.C.I.A Version 4.8, Laboratory Imaging). Taxonomic identification, counting and linear dimension measurements were performed at the individual level on 400 phytoplankton cells for each sample. Overall, the present data set comprises 142,800 cells. The data on the dimensions of the same species were averaged for each replicate and, for the present analysis, to reduce the effect of intraspecific variability, the data were averaged for each genus and local site. Phytoplankton were identified to species or genus level; each cell was associated with a species-specific geometric model, and its relative linear dimensions were measured. Detailed information about the sampling design, sampled environments and taxonomic list of phytoplankton can be found on the website of the project (http://phytobioimaging.unisalento.it/) (36).

Combined data set
Combining both data sets, we obtained a database that contains information on phytoplankton cell shape type and linear dimensions for a total of 402 genera of unicellular marine phytoplankton (phytoplankton and heterotrophic dinoflagellates) from 7 locations: Baltic Sea, North Atlantic Ocean (Scotland), Mediterranean Sea (Greece and Turkey), Indo-Pacific Ocean (the Maldives), South-Western Atlantic Ocean (Australia) and South Atlantic Ocean (Brazil). Phyla were identified according to www.algaebase.org (37).

Cell volume and surface area
We calculated cell volume and surface area based on formulae published earlier(17, 19, 38) and http://phytobioimaging.unisalento.it/AtlasofShapes. To standardize the calculations for both databases and automate the process, we have rederived all formulae using Maple software and corrected some formulae, yielding a list of analytic expressions for cell volume and cell surface area for each of the 38 shape types (see Supplementary material for the entire list of rederived formulae and a Maple script, which can be used as a tool for further derivations).

Cell dimensions
To characterize cell linear dimensions in 3D space in addition to cell microscopic characteristics, which can include up to 10 measurements of different cell parts, we use 3 orthogonal dimensions of each cell, characterizing the minimal, middle and maximal cell dimensions, which are denoted as $L_{min}$, $L_{mid}$ and $L_{max}$. For most shapes, such as a sphere, ellipsoid, cube or cone, the meaning of these dimensions is clear. For some asymmetrical cells with, for instance, different horizontal extents at the top and bottom, we used the larger of these two extents, because the smaller one (or the average) does not properly describe the geometric limitations. For instance, a truncated cone is characterized by the height and the radii at the top and bottom. However, the top radius is typically extremely small and is not related to the geometric constraints. Thus, for such shapes we used the height as one dimension and the doubled bottom radius as the other two dimensions. For more complex shapes, consisting of a few parts measured separately (e.g., half ellipsoid with a cone), we used the sum of the linear dimensions of these parts as projected onto each orthogonal axis (see Supplementary material for the details for each shape type).

Measures of cell elongation
To characterize cell elongation, we used the aspect ratio and the relative surface extension (calculated as the inverse shape sphericity). For cells with axial symmetry the aspect ratio is defined as the ratio between the principal axis of revolution and the maximal diameter perpendicular to this axis. It indicates the linear cell elongation and is greater than one for prolate shapes, equal to one for shapes with equal linear dimensions (cubes, spheres, cones with equal height and bottom diameter, etc.), and less than one for oblate shapes. To generalize the definition of aspect ratio for cells without axial symmetry, we classify a cell as prolate if $L_{mid} < \sqrt{L_{min} L_{max}}$, so that $L_{mid}$ is closer to the minimal dimension in terms of geometric averaging, and as oblate if $L_{mid} > \sqrt{L_{min} L_{max}}$. For prolate cells the aspect ratio equals $L_{max}/L_{min}$; for oblate cells we use the inverse value. Note that due to intraspecific and intragenus variability, cells of the same genus can be attributed to various elongation types.
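The generalized rule for cells without axial symmetry can be sketched as follows. This is an illustrative helper, not the authors' code: the function name is hypothetical, the three inputs are the orthogonal cell dimensions (minimal, middle, maximal), the 3/2 cut-off is the compactness threshold used elsewhere in the text, and the handling of an exact tie between the middle dimension and the geometric mean is an assumption because the text does not specify it.

```python
import math


def classify_elongation(l_min, l_mid, l_max, compact_limit=1.5):
    """Return (elongation_type, aspect_ratio) for three orthogonal cell dimensions."""
    lo, mid, hi = sorted((l_min, l_mid, l_max))
    if lo <= 0:
        raise ValueError("cell dimensions must be positive")
    # Prolate if the middle dimension lies below the geometric mean of the extremes.
    # An exact tie is treated as prolate here, which is an assumption.
    prolate = mid <= math.sqrt(lo * hi)
    # Prolate cells get max/min (> 1), oblate cells the inverse (< 1).
    aspect_ratio = hi / lo if prolate else lo / hi
    # Cells whose largest dimension is under 150% of the smallest count as compact.
    if hi / lo < compact_limit:
        return "compact", aspect_ratio
    return ("prolate" if prolate else "oblate"), aspect_ratio


print(classify_elongation(1.0, 1.0, 10.0))   # needle-like cell
print(classify_elongation(1.0, 10.0, 10.0))  # disc-like cell
print(classify_elongation(2.0, 2.0, 2.0))    # equal dimensions
```

Returning the aspect ratio even for compact cells keeps the value available for the regression on aspect ratio described later in the Methods.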
The relative surface area extension, $\epsilon$, shows the gain in surface area due to the deviation from a spherical shape and is calculated as the ratio of the surface area $S$ of a cell with a given morphology to the surface area of a sphere with the same volume $V$, $\epsilon = S / (\sqrt[3]{36\pi}\, V^{2/3})$. Mathematically it can also be termed the inverse shape sphericity.
Values of surface extension are bounded from below by a shape-dependent boundary. For an ellipsoid the minimal surface extension $\epsilon_{min} = 1$ is achieved when all semi-axes are equal, that is, if the ellipsoid is a sphere. For a cylinder $\epsilon_{min} = (3/2)^{1/3} \approx 1.14$, when its height equals its diameter; for a parallelepiped or prism on a square base $\epsilon_{min} = (6/\pi)^{1/3} \approx 1.24$ (cube).
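The lower bounds quoted above can be checked numerically from the definition of surface extension as cell surface area divided by the surface area of an equal-volume sphere. The sketch below is illustrative only: the helper names are hypothetical, and the area and volume formulae are the textbook ones for a sphere, a closed right circular cylinder and a cube, not the rederived Maple expressions from the Supplementary material.

```python
import math


def surface_extension(area, volume):
    """Surface area relative to a sphere of equal volume (inverse sphericity)."""
    return area / ((36.0 * math.pi) ** (1.0 / 3.0) * volume ** (2.0 / 3.0))


def sphere(radius):
    """Return (area, volume) of a sphere."""
    return 4.0 * math.pi * radius ** 2, 4.0 / 3.0 * math.pi * radius ** 3


def cylinder(diameter, height):
    """Return (area, volume) of a closed right circular cylinder."""
    area = math.pi * diameter ** 2 / 2.0 + math.pi * diameter * height
    volume = math.pi * diameter ** 2 * height / 4.0
    return area, volume


def cube(side):
    """Return (area, volume) of a cube."""
    return 6.0 * side ** 2, side ** 3


print(surface_extension(*sphere(3.0)))          # 1 up to rounding
print(surface_extension(*cylinder(2.0, 2.0)))   # (3/2)^(1/3), about 1.14
print(surface_extension(*cube(5.0)))            # (6/pi)^(1/3), about 1.24
print(surface_extension(*cylinder(1.0, 50.0)))  # a long thin rod, well above the minimum
```

The measure is scale-free: multiplying all dimensions by the same factor leaves the result unchanged, which is why it separates shape from size.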

Prolate, oblate and compact cells
Strictly speaking, only cells with an aspect ratio of 1 are neither prolate nor oblate and can therefore be identified as compact. However, the aspect ratio changes over four orders of magnitude, and cells with a small difference in linear dimensions are closer to the compact shapes than to extremely oblate or prolate cells. To separate these groups, we define a cell to be compact if $L_{max}/L_{min} < 3/2$, so that the maximal cell dimension is less than 150% of the minimal dimension. This choice of the border between compact, prolate and oblate cells is motivated by the specific dependence between the aspect ratio and surface extension (Extended Data Fig. 2). As shown in the Extended Data figure, for cells with small $\epsilon$, the aspect ratio changes much faster than surface extension. As the border of the aspect ratio can be approximated as $\log r = \pm 1.3\sqrt{\epsilon - 1}$, an aspect ratio of 3/2 (or 2/3) can correspond to only a 2% increase in the surface area relative to a sphere.
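The 2% figure can be recovered by inverting the approximate border relation between aspect ratio $r$ and surface extension $\epsilon$. The worked arithmetic below is added for illustration; reading the logarithm as base 10 is an assumption, made because it reproduces the quoted value, whereas a natural logarithm would not.

```latex
\epsilon - 1 = \left(\frac{\log_{10} r}{1.3}\right)^{2}
             = \left(\frac{\log_{10}(3/2)}{1.3}\right)^{2}
             \approx \left(\frac{0.176}{1.3}\right)^{2}
             \approx 0.018,
\qquad\text{whereas}\qquad
\left(\frac{\ln(3/2)}{1.3}\right)^{2} \approx 0.097 .
```

So a cell at the compactness border has, at most, roughly 2% more surface area than a sphere of the same volume under the base-10 reading.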

Using aspect ratio for predicting biodiversity
Like the surface extension, the aspect ratio can be used as a predictor of taxonomic diversity. The regression analysis based on volume and aspect ratio gives $R^2 = 0.89$ across all data and $R^2$ ranging from 0.23 to 0.86 for specific shapes (Extended Data Fig. 8). The reduced $R^2$ values compared to the fitting based on surface extension probably occur because of a more complicated functional dependence of diversity on aspect ratio (Extended Data Fig. 9). For instance, for ellipsoidal prolate shapes diversity monotonically decreases with aspect ratio but shows a peak for oblate shapes at $r \approx 1/2$. For cylinders the picture is even more complicated, with two peaks of diversity at $r \approx 3$ and $r \approx 1/3$.
The discrepancy between the dependence of diversity on the surface extension and on the aspect ratio likely arises from the nonlinear relationship between these parameters (Extended Data Fig. 2). The logarithm of the aspect ratio changes approximately as $\sqrt{\epsilon - 1}$, implying an extremely high rate of change of aspect ratio with $\epsilon$ for compact shapes, and a much smaller rate for elongated shapes. Consequently, projecting diversity onto the surface extension axis results in an exponential decrease, while projecting it onto the aspect ratio axis results in a bimodal distribution with a local minimum of diversity for $r = 1$. However, the projections show only a part of the entire picture. As shown in the bivariate plot (Extended Data Fig. 2A), the diversity peaks for spherical cells (both surface extension and aspect ratio of around 1) and then decreases with deviation from this shape towards prolate or oblate forms. This decrease is asymmetric and occurs faster for oblate shapes.

Diffusion constraints on the cell's longest linear dimension
The mean diffusive displacement in 3D space equals $\sqrt{\langle x^2 \rangle} = \sqrt{6Dt}$, where $D$ is the diffusion coefficient and $t$ the time interval. The maximal cell size can be limited by the mean diffusive displacement of molecules in the cell cytoplasm during one life cycle. For instance, the diffusion coefficient of proteins in the cytoplasm of the bacterium Escherichia coli ranges from 0.4 to 7 μm²/s (Ref. (39)). Diffusion rates in cytoplasm presented in the Cell Biology by the Numbers database (http://book.bionumbers.org/what-are-the-time-scales-for-diffusion-in-cells/) also lie in this range. According to these data, the mean diffusive displacement in the cell cytoplasm during one day (a typical reproduction time scale for phytoplankton) should range from 455 to 1900 μm. These values are close to the maximal cell size of 1000 μm found in our study.
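The displacement range quoted above follows directly from the formula. The snippet below is a minimal check, assuming diffusion coefficients in μm²/s and a time interval of one day in seconds; the function name is hypothetical.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60  # one day, taken as a typical reproduction time


def rms_displacement_um(diff_coeff_um2_per_s: float, time_s: float) -> float:
    """Root-mean-square 3D diffusive displacement sqrt(6 D t), in micrometres."""
    return math.sqrt(6.0 * diff_coeff_um2_per_s * time_s)


# Lower and upper protein diffusion coefficients quoted in the text (assumed um^2/s).
low = rms_displacement_um(0.4, SECONDS_PER_DAY)
high = rms_displacement_um(7.0, SECONDS_PER_DAY)

print(round(low), round(high))  # 455 1905
```

Both bounds match the range given in the text once the upper value is rounded to two significant figures.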
bioRxiv preprint doi: https://doi.org/10.1101/2020.02.06.937219; this version posted February 7, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-ND 4.0 International license.
Fig. 1. Diversity distribution across phyla (rows) and shape types (columns). The area of each figure is proportional to the number of genera (shown next to or within it). See Extended Data Fig. 1 for detailed analysis.
Table 1 for fitting parameters.
Table 1 for regression results, and Extended Data Fig. 7 for comparison between predicted and observed diversity.

Extended Data Fig. 2. Bivariate effect of cell surface extension and aspect ratio on diversity. (A) Distribution of taxonomic diversity (shown by colour) over aspect ratio $r$ (logarithmic binning) and surface extension $\varepsilon$. The grey line shows a parabola, $\log r = \pm 1.3\sqrt{\varepsilon - 1}$, fitted to the upper boundary of the aspect ratio for a given surface extension. Horizontal red lines at $r = 3/2$ and $r = 2/3$ show the borders between compact, oblate and prolate cells, as defined in Methods. Diversity peaks for compact cells with the smallest surface extension ($\varepsilon = 1$, $r = 1$) and decreases both with increasing surface extension and with the absolute value of the logarithm of the aspect ratio. (B) Distribution of taxonomic diversity over aspect ratio. When projected on this axis, the distribution of diversity shows peaks for cells with $r = 2$ and $r = 1/2$. We suppose that these peaks occur due to the specific shape of the distribution in panel A, where aspect ratios of compact shapes can change very fast with a small increase in surface extension, so the distribution is strongly stretched in the vertical direction, resulting in a local minimum at $r = 1$. (C) Distribution of taxonomic diversity over surface extension. In this projection the diversity distribution decays exponentially.

Extended Data Fig. 3. Prediction of surface extension and aspect ratio using various constraints on linear cell sizes. To make a theoretical prediction of the potential variation in aspect ratio and surface extension across cells, we calculate the surface area and volume of ellipsoidal cells based on two models: (i) assuming that all three linear dimensions are log-uniformly distributed in the range from 1 to 1000 μm, and (ii) additionally assuming that the aspect ratio is constrained, $L_{max}/L_{min} < 100$ for prolate cells and $L_{max}/L_{min} < 40$ for oblate cells. (A) Comparison of the aspect ratio of prolate (red circles) and oblate (blue circles) cells with the outer hulls for volume and aspect ratio in the first model (black solid line) and in the additionally constrained model (black dashed line). (B, C) The same for combinations of volume and surface extension for prolate (B) and oblate (C) cells. The first model, assuming only that cell dimensions can vary from 1 to 1000 μm, reproduces the hump-shaped dependence of maximal aspect ratio and elongation on volume (black solid lines show the outer hulls across 50,000 ellipsoids with randomly chosen linear dimensions), but it strongly overestimates the maximal possible aspect ratio (which ranges from $10^{-3}$ to $10^{3}$) and surface extension (which reaches 20 for prolate ellipsoids and more than 30 for oblate ellipsoids). The second model, with an additional constraint on cell aspect ratio, predicts the variation of aspect ratio and surface extension as a function of volume relatively well for prolate cells, but it overestimates the aspect ratio and surface extension for oblate cells. In particular, the model predicts that surface extension for oblate species can reach 5, while the observed maximal surface extension for oblate cells equals 2 for cells of intermediate volume.
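The two sampling models described in this caption can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' code: the surface area uses Knud Thomsen's approximation for a general ellipsoid (the caption does not say how the area was computed), a single aspect ratio cap replaces the separate prolate and oblate limits, and all function names are hypothetical.

```python
import math
import random

P = 1.6075  # exponent in Knud Thomsen's approximation (assumption of this sketch)


def ellipsoid_surface_area(a: float, b: float, c: float) -> float:
    """Approximate surface area of an ellipsoid with semi-axes a, b, c."""
    mean = ((a * b) ** P + (a * c) ** P + (b * c) ** P) / 3.0
    return 4.0 * math.pi * mean ** (1.0 / P)


def surface_extension(l1: float, l2: float, l3: float) -> float:
    """Surface area relative to a sphere of equal volume (1 for a sphere)."""
    a, b, c = l1 / 2.0, l2 / 2.0, l3 / 2.0
    sphere_radius_sq = (a * b * c) ** (2.0 / 3.0)
    return ellipsoid_surface_area(a, b, c) / (4.0 * math.pi * sphere_radius_sq)


def sample_cells(n, l_lo=1.0, l_hi=1000.0, max_ratio=None, seed=0):
    """Draw n ellipsoids with log-uniform linear dimensions (model i).

    If max_ratio is given, draws with L_max / L_min >= max_ratio are rejected
    (a simplified version of model ii). Returns (volume, ratio, extension).
    """
    rng = random.Random(seed)
    lo, hi = math.log10(l_lo), math.log10(l_hi)
    cells = []
    while len(cells) < n:
        dims = [10.0 ** rng.uniform(lo, hi) for _ in range(3)]
        ratio = max(dims) / min(dims)
        if max_ratio is not None and ratio >= max_ratio:
            continue
        volume = math.pi / 6.0 * dims[0] * dims[1] * dims[2]
        cells.append((volume, ratio, surface_extension(*dims)))
    return cells


def outer_hull(cells):
    """Maximal aspect ratio and surface extension per decade of volume."""
    hull = {}
    for volume, ratio, eps in cells:
        key = math.floor(math.log10(volume))
        best_ratio, best_eps = hull.get(key, (1.0, 1.0))
        hull[key] = (max(best_ratio, ratio), max(best_eps, eps))
    return dict(sorted(hull.items()))


unconstrained = sample_cells(50_000)
constrained = sample_cells(50_000, max_ratio=100.0)

for decade, (ratio, eps) in outer_hull(unconstrained).items():
    print(f"V ~ 1e{decade}: max ratio {ratio:.0f}, max extension {eps:.1f}")
```

Binning by decade of volume is a coarse stand-in for the outer hulls drawn in the figure. Cells at the smallest and largest volumes must have all three dimensions near the same limit, so the hull narrows at both ends.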
) to the histogram (see Extended Data Table 1 for fitting parameters).
Extended Data Fig. 6. Distribution of taxonomic diversity as a function of surface extension for the most common shape types partitioned by phyla. Solid lines show a least-squares fit of a linear function of surface extension $\varepsilon$, $\ln D = \mathrm{const} - k\varepsilon$, to the log-transformed histogram of diversity $D$ (see Extended Data Table 1 for fitting parameters).
Extended Data Fig. 7. Comparison of observed diversity and diversity predicted by the nonlinear regression models in Fig. 3 (blue dots). Black dashed lines show the 1:1 diagonals and solid lines are linear regressions through the data points. The closer the solid line is to the dashed line, and the smaller the scatter of data points around it, the better the prediction of diversity $D$ by the model function $D = a\exp\left(-(\log V - V_0)^2/(2\sigma^2) - k\varepsilon\right)$ in Fig. 3 (main text). An increase in the variation of the predicted diversity in the range of small $D$ can partly be explained by the fact that observed $D$ is bounded below by 1, while predicted values can be less than 1. The regression analysis shows that the predictions for ellipsoidal (B), cylindrical (C) and conic (D) shapes are unbiased, because the solid and dashed lines are almost parallel. By contrast, for prismatic (E) and other shapes (F) the regression lines deviate from the diagonals, and the model is biased, as predictions of diversity in the range of small observed $D$ are overestimated. However, as prismatic and other shapes are relatively rare, the model also provides a good and unbiased prediction of diversity across all shapes (A).
Extended Data Fig. 8. Effects of volume and aspect ratio on richness. Bivariate histogram of taxonomic diversity as a function of aspect ratio and volume. To reduce the number of fitting parameters, the aspect ratio here is measured as $L_{max}/L_{min}$, so that no distinction between prolate and oblate cells is made. Note that, owing to intraspecific and intragenus variability, cells of the same genus can contribute to diversity in different bins. To provide a better fit for prismatic and other shapes (E, F), where diversity peaks at intermediate values of the aspect ratio, we also assumed a Gaussian dependence on the aspect ratio. See Extended Data Table 1 for the results of regression analysis.
Extended Data Fig. 9. The same as in Extended Data Fig. 6 but plotted as a function of the logarithm of the cell aspect ratio.

0.98 155±4 1100±100 1.36±0.04