Quantitative photography for rapid, reliable measurement of marine macro‐plastic pollution

Plastics are now ubiquitous in the environment and have been studied in wildlife and in ecosystems for more than 50 years. Measurement of size, shape and colour data for individual fragments of plastic is labour-intensive, unreliable and prone to observer bias, particularly when it comes to assessment of colour, which relies on arbitrary and inconsistently defined colour categorisations. There is a clear need for a standard method for data collection on plastic pollution, particularly one that can be readily automated given the number of samples involved. This study describes a new method for standardised photography of marine plastics in the 1–100 mm size range (meso- and macro-plastics), including colour correction to account for any image-to-image variation in lighting that may impact colour reproduction or apparent brightness. Automated image analysis is then applied to detect individual fragments of plastic for quantitative measurement of size, shape, and colour. The method was tested on 3793 fragments of debris ingested by Flesh-footed Shearwaters (Ardenna carneipes) on Lord Howe Island, Australia, and results were compared for photos taken in two separate locations using different equipment. Photos were acquired of up to 250 fragments at a time with a spatial resolution of 70 μm/pixel and were colour-corrected using a reference chart to ensure accurate reproduction of colour. The automated image analysis pipeline was found to have a 98% success rate at detecting fragments, and the different size and shape parameters output by the pipeline were compared in terms of usefulness. The evidence shown in this study should strongly encourage the uptake of this method for cataloguing macro-scale plastic pollution, as it provides substantially higher quality data with accurate, reliable measurements of size, shape and colour for individual plastics that can be readily compared between disparate datasets.


| INTRODUCTION
Since the development of cheap, durable plastics in the 1950s, plastic debris has become a major source of anthropogenic pollution found in polar regions (Obbard et al., 2014), on remote oceanic islands (Lavers & Bond, 2017), mountains (Napper & Thompson, 2020), abyssal regions (Chiba et al., 2018), and the atmosphere (Brahney et al., 2020). Plastics are present throughout the entire ocean, with an estimated 170 trillion particles of plastic floating at the surface (Eriksen et al., 2014, 2023). The impact of plastic on the marine environment has become a topic of high concern, particularly on the health of wildlife (Kühn & Van Franeker, 2020; Laist, 1997). However, marine plastic pollution also has negative implications for tourism (Krelling et al., 2017; Zielinski et al., 2019), habitat stability (Lavers et al., 2021; Zhang et al., 2022), incidence of disease (Lamb et al., 2018) and marine biodiversity as a whole (Gall & Thompson, 2015).
The wide variety of sizes, shapes and polymer densities means that plastic items can disperse throughout the entire water column, from the ocean surface to benthic sediment (Wang et al., 2019).
Consequently, plastics affect a wide range of fauna occupying different trophic levels and habitats within the marine environment (Cole et al., 2016; Unger et al., 2016). In the last 25 years the number of marine species known to be impacted by plastics has increased significantly to over 4000 species (Laist, 1997; Tekman et al., 2023). Wildlife is affected by plastic pollution through entanglement and ingestion (Gall & Thompson, 2015). Currently, 44% of seabird species, 56% of marine mammal species and 100% of sea turtle species have been reported to ingest plastic (Kühn & Van Franeker, 2020). Plastic ingestion leads to increased morbidity and mortality via starvation, tissue damage and scarring, and transfer of toxic co-pollutants such as trace metals (Charlton-Howard et al., 2023; Lavers & Bond, 2016a; Lavers et al., 2014; Szabo et al., 2021).
The impact of ingested plastics on marine fauna may be strongly influenced by the size, shape, and colour of individual objects: large items may displace real food or become trapped in the digestive tract, highly angular objects may perforate tissues, the release of toxic chemicals and co-pollutants will depend in part on surface area, and there is mounting evidence that fauna selectively target certain colours of plastic during foraging (Duncan et al., 2019; Lavers & Bond, 2016b; Okamoto et al., 2022; Ryan, 1987). Size, shape and colour parameters may also provide indications of a plastic object's history, for example, fragmentation of large primary plastics into smaller secondary plastics (Hartmann et al., 2019), and discolouration of plastic by environmental weathering or biological activity (Andrady, 2017; Provencher et al., 2017). As plastics in the environment continue to accumulate, it is becoming increasingly important to have reliable, consistent, and accurate methods for quantifying not only the abundance but also the size, shape and colour of plastic pollution if we are to understand its impact on the environment and individual species (Barrows et al., 2017).
Yet, to date, there is no universally accepted method for categorising plastic colour and particle size that can be readily deployed. Studies have adopted their own ad-hoc methods, leading to many colour and size categories that are neither comparable nor easily reproduced (see Tables S1–S3), hindering a wider synthesis of patterns and trends across space, time, and species (Provencher et al., 2017). Such a synthesis requires detailed, accurate and consistent data on the quantity, size, shape and colour of individual pieces of plastic (Cowger et al., 2020; Provencher et al., 2017). When these properties are reported, most are determined by hand: researchers manually counting pieces of plastic, measuring their size with callipers or by sieving, and categorising colour based on visual inspection (Lavers et al., 2021; Provencher et al., 2017). Even with training, such methods can be slow, laborious and prone to human error or unintended bias (Kotar et al., 2022). The assessment of colour is particularly subjective; the apparent colour of an object will vary significantly depending on lighting and context and may be categorised differently by different people, which necessitates a relatively small number of colour categories for reproducibility. Several studies have highlighted the disadvantages of using non-standardised methods for data collection and reporting; however, despite efforts to draw attention to this important issue, there continues to be poor uptake of recommended standards (Choi et al., 2021; Lavers et al., 2022). This has made it difficult to identify the impacts of plastic on vulnerable species, ecosystems, and the Earth as a whole (Avery-Gomm et al., 2018; Villarrubia-Gómez et al., 2018).
Digital photography and automated image processing offer a potentially valuable method for rapidly and accurately counting and measuring plastics. Digital photography can be used to take consistent images of plastics that, with appropriate processing, can be analysed by open-source software to detect individual objects and measure their size, shape and colour. This method combines well-established techniques such as thresholding and shape analysis that have been in use in other research fields such as microbiology and computer vision for many years (Lamprecht et al., 2007) but are readily applicable to the challenge of rapidly and reliably detecting hundreds of pieces of plastic in a single photograph. We will also discuss how to account for variations in camera and lighting conditions to ensure consistent analysis of colour, similar to methods now being used in visual ecology (Troscianko & Stevens, 2015; van den Berg et al., 2020). We will discuss how this method can be used to characterise approximately 3800 pieces of plastic and pumice that were removed from the stomachs of seabirds, with the goal of demonstrating and testing the methodology rather than interpreting the results from this sample set in terms of implications for plastic ingestion by seabirds, which will be addressed in subsequent work. By developing a new data collection and analysis pipeline specifically for plastic pollution, we can provide a consistent method for environmental scientists that improves both the quantity and quality of data they collect from fieldwork samples, which is urgently needed to inform better policies regarding the manufacturing, use and safe disposal of plastic materials.

were counted separately. The bird samples varied in size between 7 and 216 objects, for a total of ~670 objects. To remove potential bias, observers completed the sorting exercise separately and did not know the outcome of the image analysis method. The time taken to count and sort the sample sets was recorded for three out of nine observers.

| Photography
Photographs were taken of the material obtained from each bird.
Where possible, material was grouped based on whether it was removed from the proventriculus or gizzard. Samples from 2021 were photographed at the NHM using a Canon EOS 4000D digital single-lens reflex (DSLR) camera with an EF-S 18–55 mm III lens. The camera was in manual mode with the following settings: exposure time of 1/60 s, f-number f/5.6, focal length 18 mm, and an ISO speed of 200. Photos of samples from 2022 were taken at UTAS using a Canon EOS 1500D DSLR with an EF-S 18–55 mm f/3.5–5.6 III lens, and the following settings: exposure time 1/60 s, f-number f/5.6, focal length 36 mm, ISO speed of 200. In both cases, the camera was mounted above the samples, which were laid on crushed black velvet. Illumination was provided by an RB 5020 DS2 LED Lighting Unit (colour temperature: 6000 K) with two light banks mounted at roughly 45° either side of the camera stand (Figure 1). An X-Rite (current brand name: Calibrite) ColorChecker Classic mini reference chart with scale bar was included in all photographs. The field of view was estimated from image size and scale bars to be 36.4 × 24.3 cm (a resolution of 0.0703 mm/pixel) for the 2021 NHM photos, and 40.7 × 27.1 cm (0.0678 mm/pixel) for the 2022 UTAS photos. Photos were saved in lossless RAW digital format as .CR2 files.
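The reported resolutions follow directly from the in-frame scale bar. As a minimal sketch (the pixel measurements below are hypothetical stand-ins, not values taken from the actual photos), the conversion factor can be computed as:

```python
# Estimating spatial resolution from the reference chart's 5 cm scale bar.
# scale_bar_px is a hypothetical measured value, not from the real photos.
scale_bar_mm = 50.0
scale_bar_px = 712.0                      # hypothetical: bar length in pixels
mm_per_px = scale_bar_mm / scale_bar_px   # ~0.0702 mm/pixel

# The field of view then follows from the image's pixel dimensions
# (5184 x 3456 is a plausible DSLR sensor size, assumed for illustration):
width_cm = 5184 * mm_per_px / 10
height_cm = 3456 * mm_per_px / 10
print(round(mm_per_px, 4), round(width_cm, 1), round(height_cm, 1))
```

Any object's pixel dimensions can then be converted to millimetres by multiplying by this single factor, which is why a scale bar in every frame is essential.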

| Image pre-processing
Photos were imported into Adobe LightRoom (version 6.3.1) for lens distortion correction. Lens-corrected images were saved in TIFF format using the sRGB colour space (IEC 61966-2.1). All processing times were measured for a MacBook Pro M1 Max with 10 cores and 32 GB RAM.

[Figure 1: (a) The laboratory camera set-up used at the Natural History Museum (NHM) in Tring, UK. (b) A typical photo acquired using the NHM set-up. This photo (#0226) shows 147 fragments removed from the proventriculus of bird BW-2021-19 along with the colour reference chart, which includes a 5 cm scale bar, six greyscale squares, and 18 colour squares.]

| Colour correction
A custom Python script was used to find the 24 squares of the reference chart in each photo and calculate the most effective 3D transformation that would align observed RGB values with reference values for the (post-2014) ColorChecker Classic card (Colorchecker Data, 2023). The transformation was described by a 3rd order Vandermonde matrix whose parameters were optimised via linear least squares reduction of the difference between observed and reference values, using the "colour" Python package (Mansencal et al., 2022). The transformation was applied to the original photo, which was then saved in TIFF format for subsequent analysis. This script used the following packages: numpy (Harris et al., 2020), pandas (McKinney, 2010), scipy (Virtanen et al., 2020), skimage (Van Der Walt et al., 2014), and colour (Mansencal et al., 2022). Colour conversion to/from CIE LAB assumes a D50 standard illuminant and standard observer (Schanda, 2007). The colour correction Python script is available online for people to freely use (Razzell Hollis, 2023a), and colour-corrected images are available on the NHM Data Portal (Razzell Hollis et al., 2023).
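The fitting step can be illustrated without the "colour" package. The sketch below is a simplified stand-in for the authors' script, assuming a per-channel polynomial expansion with no cross-terms (the actual 3rd-order Vandermonde construction in the "colour" package includes additional terms); the function names are our own:

```python
import numpy as np

def vandermonde_expand(rgb, degree=3):
    """Expand N x 3 RGB values into per-channel polynomial terms up to
    the given degree, plus a constant column (a simplified Vandermonde
    design matrix; cross-terms such as r*g are omitted here)."""
    terms = [np.ones(len(rgb))]
    for d in range(1, degree + 1):
        for channel in range(3):
            terms.append(rgb[:, channel] ** d)
    return np.stack(terms, axis=1)

def fit_colour_correction(observed, reference, degree=3):
    """Least-squares fit mapping observed chart RGBs onto reference RGBs."""
    A = vandermonde_expand(observed, degree)
    coeffs, *_ = np.linalg.lstsq(A, reference, rcond=None)
    return coeffs

def apply_colour_correction(rgb, coeffs, degree=3):
    """Apply a fitted correction to any N x 3 array of RGB values."""
    return vandermonde_expand(rgb, degree) @ coeffs

# Demo: a simple global distortion (gain + offset) applied to 24 swatches
rng = np.random.default_rng(0)
reference = rng.uniform(0.0, 1.0, (24, 3))
observed = 0.8 * reference + 0.05
coeffs = fit_colour_correction(observed, reference)
corrected = apply_colour_correction(observed, coeffs)
print(np.allclose(corrected, reference, atol=1e-6))
```

Because the 24 chart squares give far more equations than free parameters, the fit is overdetermined and robust to noise in any single swatch; the same fitted coefficients are then applied to every pixel of the photo.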

| Object detection
Pre-processed images were batch-imported into CellProfiler (version 4.2.1), an open-source image analysis software package (Lamprecht et al., 2007). Each image was converted to greyscale, cropped to the area containing only plastic fragments, and objects detected using global Otsu two-class thresholding with a minimum value of 0.18 to reduce the likelihood of false positives in images with very few fragments (Otsu, 1979). Objects with equivalent circle diameters outside of 20–1000 pixels (1.41–70.3 and 1.35–67.8 mm for 2021 and 2022 photos respectively) were automatically discarded, and any holes within detected objects were filled in. The following parameters were calculated for each detected object: maximum and minimum Feret diameters (in mm), major and minor axis lengths (mm), area (mm²), elliptical eccentricity, compactness and solidity (all in arbitrary units) and mean red, green and blue values. Compactness and solidity are bounded between 0 and 1, where 1 corresponds to a perfect circle and lower values indicate increasing non-circularity; eccentricity is also bounded between 0 and 1, but takes a value of 0 for a circle and approaches 1 as a shape elongates. RGB colours were averaged across all pixels of each object to minimise variance due to noise (e.g. grain), as well as unwanted colour artefacts such as colour fringes along object edges that are introduced by chromatic aberrations in the lens. All measured values were output by CellProfiler in a spreadsheet, and each object was saved as a separate image after expanding its perimeter by 20 pixels to provide a buffer in case of incomplete detection. Example CellProfiler scripts are available online (Razzell Hollis, 2023b).
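The core of the detection step, global Otsu thresholding with a minimum-threshold floor and an equivalent-diameter filter, can be sketched in plain numpy. This is an illustrative re-implementation on a synthetic image, not CellProfiler's own code:

```python
import numpy as np

def otsu_threshold(grey, bins=256):
    """Global two-class Otsu threshold: pick the histogram cut that
    maximises between-class variance (Otsu, 1979)."""
    hist, edges = np.histogram(grey, bins=bins, range=(0.0, 1.0))
    p = hist / hist.sum()
    centres = (edges[:-1] + edges[1:]) / 2
    w0 = np.cumsum(p)                    # weight of the lower class
    w1 = 1.0 - w0                        # weight of the upper class
    mu = np.cumsum(p * centres)          # cumulative mean intensity
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu[-1] * w0 - mu) ** 2 / (w0 * w1)
    return centres[np.nanargmax(sigma_b)]

# Synthetic image: dark velvet background with one bright 20 x 20 px object
img = np.full((100, 100), 0.05)
img[40:60, 40:60] = 0.9

# Apply the threshold with the 0.18 floor used in the pipeline
mask = img > max(otsu_threshold(img), 0.18)

# Discard objects whose equivalent circle diameter falls outside 20-1000 px
area_px = mask.sum()
eq_diam = 2.0 * np.sqrt(area_px / np.pi)
print(area_px, round(eq_diam, 1), 20 <= eq_diam <= 1000)
```

The floor of 0.18 matters on near-empty images: with few bright pixels, an unconstrained Otsu cut can fall inside the background noise and generate spurious detections.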

| Colour analysis
Mean RGB values exported from CellProfiler were used to assess the spread of fragment colours, using custom Python scripts to conduct Principal Component Analysis (PCA) and generate figures. These scripts used the numpy, pandas, scipy, sklearn (Pedregosa et al., 2012) and colour packages (see above for references). PCA was conducted on colour values in sRGB space using two components. To compare RGB values to typical observer-driven colour categorisation, one of the authors manually categorised a total of 333 fragments from 2021 into the same six colour categories used in the observer trial.
Colour/size analysis and visualisation scripts are available online (Razzell Hollis, 2023b).
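The dimensionality reduction itself can be reproduced with a plain SVD. The sketch below is our own minimal implementation (the published scripts use sklearn), projecting N mean-RGB triplets onto their first two principal axes:

```python
import numpy as np

def pca_2d(rgb):
    """Project N x 3 mean RGB values onto their first two principal
    components and report the fraction of variance each explains."""
    X = rgb - rgb.mean(axis=0)                  # centre the data
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    scores = X @ Vt[:2].T                       # 2-component scores
    explained = (S ** 2) / (S ** 2).sum()       # variance fractions
    return scores, explained[:2]

# Synthetic demo: colours spread mostly along the white-to-black axis,
# with a little chromatic noise (mimicking the dominant trend reported)
rng = np.random.default_rng(1)
t = rng.uniform(0.0, 1.0, 500)
rgb = np.stack([t, t, t], axis=1) + rng.normal(0.0, 0.02, (500, 3))
scores, explained = pca_2d(rgb)
print(scores.shape, explained.sum() > 0.99)
```

When most variance lies along a single lightness axis, the first component captures nearly all of it and the second picks up the residual chromatic spread, which matches the structure of the real data described in the Results.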

| Statistical analysis
Size, shape and colour parameters were analysed statistically in Python using the numpy, pandas, scipy and sklearn packages (see above for references). Log-normality tests of major and minor axis lengths were done by taking the natural logarithm of the length data and applying D'Agostino and Pearson's test for normality. For each measured size/shape parameter, datasets from 2021 and 2022 were compared using 2-sample Kolmogorov–Smirnov tests to determine if their distributions were statistically similar. Due to the relatively low number of tests involved (N = 8), p-values were not rescaled, in order to preserve statistical power (Moran, 2003). Linear correlation between different parameters was assessed using Pearson's standard correlation coefficient.
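These three tests map directly onto standard scipy.stats calls. The sketch below uses synthetic stand-in data, since the real measurements live in the CellProfiler output:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
# Hypothetical stand-ins for major axis lengths (mm) from the two years
lengths_2021 = rng.lognormal(mean=1.5, sigma=0.5, size=1000)
lengths_2022 = rng.lognormal(mean=1.6, sigma=0.5, size=1000)

# Log-normality: D'Agostino and Pearson's test applied to log-lengths
k2, p_norm = stats.normaltest(np.log(lengths_2021))

# Year-to-year comparison: 2-sample Kolmogorov-Smirnov test on the
# raw distributions (low p suggests the two years differ)
ks_stat, p_ks = stats.ks_2samp(lengths_2021, lengths_2022)

# Linear association between two parameters: Pearson's r
r, p_r = stats.pearsonr(lengths_2021, np.sqrt(lengths_2021))
print(round(r, 2))
```

Note that `normaltest` is applied to the *logged* data, so a high p-value is consistent with log-normality of the original lengths rather than normality.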

| RESULTS
We analysed 3793 fragments (a mixture of plastics and pumice) that had been removed from the stomachs of 129 individual birds: 1453 fragments from 2021 and 2340 fragments from 2022. Pumice represents 11% of total objects; pumices were not separated during photography and so were analysed together with plastics. For validation, fragment counts were compared to the original tallies taken in the field, which show some variation due to fragments that were missed in either the first or second counts, fragmentation of plastics during transport, or simply variability in individual ability to distinguish plastics and pumices. The number of objects removed from each bird varied between 1 and 216; the median count was six objects per bird, with just 11 birds having >100 objects (see Supplementary Information Figure S1 for histograms). This uneven, varying distribution is consistent with previous reports on trends in plastic ingestion by shearwaters (Lavers et al., 2021).

| Observer trial
Although observers were given the same set of samples, the total fragment count varied between six human observers (644 to 684 pieces).
Total counts did not increase over time, so we assume that the variation is the result of human error rather than fragmentation due to repeated handling. The most abundant colour category was white, accounting for 50%–75% of fragments counted, with a large variance in overall white count between observers (Figure 2a). Samples with a large variance in white count also had large variance in yellow count, suggesting that these two colours are difficult to discriminate. This is supported by a high Pearson correlation coefficient between variances of white and yellow (Figure 2b). There was a similarly strong co-variation between blue and green, albeit at lower overall counts. It took the observers 53–67 min to count and categorise ~670 fragments.

| Photography
Each sample contained 1–250 individual fragments, which we arranged such that no two fragments overlapped, producing a total of 170 images (Figure 1b). Arranging the fragments and capturing a photo took between 1 and 20 min per sample depending on the user and the number of fragments, averaging approximately 5 min per photo at NHM and 12 min per photo at UTAS.

| Colour correction
Figure 3a shows the image of the reference chart before recalibration and compares the observed colours to their reference values (shown as digitally added circles overlaid on each square). Even though LEDs were used to provide pure white light (colour temperature: 6000 K), there is still a visible discrepancy in both observed colour and brightness that impacted the observed colours of the plastics.
By converting RGB coordinates to CIELAB coordinates, we can separately plot observed versus reported lightness (Figure 3c) and chromaticity (Figure 3d) for each of the 24 squares shown on the reference chart in image #0226, exposing how lighting and colour balance each vary across the colour gamut.
By comparing observed and reference RGB values for the colour chart in each photo, we can determine the necessary correction that will adjust that photo's RGB values to better match their 'true' values.
This correction results in the reference chart in Figure 3a becoming visually indistinguishable from its reference colours (Figure 3b), and substantially reduces the absolute error in terms of both brightness (Figure 3c) and chromaticity (Figure 3d). To quantify this improvement, we calculated the error (as Euclidean distance between observed and reference RGB coordinates) for a sub-set of 12 photos taken at the NHM and five at UTAS (Figure 4). Figure 4a shows the spread of errors for the 24 squares of the reference chart in each image prior to colour correction. Across all 17 images the mean error ± standard deviation (SD) was 0.18 ± 0.06, and individual errors ranged between 0.05 and 0.35. For context, the maximum possible Euclidean distance across the cubic RGB space is √3, or 1.73. After correction, the mean error was reduced to 0.04 ± 0.02 (Figure 4b), a relative reduction of 78%. The results for all 170 images are shown in the Supporting Information (Figure S3). Running the colour correction script took an average of 13 s/photo.
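The error metric here is simply the per-swatch Euclidean distance in RGB space, averaged over the chart. A minimal sketch (the swatch values below are illustrative, not measured):

```python
import numpy as np

def mean_rgb_error(observed, reference):
    """Mean Euclidean distance between observed and reference swatch
    colours; the maximum possible distance in unit RGB space is sqrt(3)."""
    return np.linalg.norm(observed - reference, axis=1).mean()

# Two illustrative swatches, each off by 0.05 in every channel
obs = np.array([[0.30, 0.30, 0.30], [0.85, 0.80, 0.75]])
ref = np.array([[0.25, 0.25, 0.25], [0.90, 0.85, 0.80]])
print(round(mean_rgb_error(obs, ref), 4))   # sqrt(3 * 0.05**2) ~ 0.0866
```

Expressing errors as a fraction of the √3 diagonal makes results comparable across cameras and lighting set-ups.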

| Object detection
CellProfiler generated two visual data products for validation of object detection: the original image with detected objects marked using a false colour overlay, and the original image with sequential ID numbers at the centroid of each detected object (see Figure S5). Table 2 summarises the results of the object detection pipeline for all 170 samples. Detections were validated by a human, recording how many objects were successfully detected (true positives), incorrectly detected (false positives) or not detected when they should have been (false negatives). Individual results for each photo are reported in Table S4. Overall, the false positive rate was 2.2% and the false negative rate was 1.1%, with 97.8% of detections being true and 98.8% of objects successfully detected. Only true positives could be used for subsequent size/shape analysis.
Here, 3742 out of 3793 objects (98.8%) were successfully detected using the algorithm. Each detected object was automatically analysed to derive parameters that describe its size and shape, including approximating the object using an ellipse with the same second central moments. Area was taken directly from the number of pixels above the intensity threshold, multiplied by the projected area per pixel for the image. Object length was measured using two separate parameters: elliptical major axis length, and maximum Feret diameter (i.e. the greatest distance between two parallel planes restricting the object; Fmax), also called the calliper diameter. When the major axis lengths of the 3742 detected objects were plotted against Fmax (Figure 5b), we find that the elliptical major axis is a good approximation of length, appearing highly correlated to Fmax (Pearson correlation coefficient of 0.99). Elliptical minor axis length (Figure 5c) showed a similarly high degree of correlation with the minimum Feret diameter. Using the elliptical approximation, the 3742 detected objects varied between 1.4 and 60 mm in length and 0.7 to 32 mm in width.
No objects <1.4 mm in length were detected, simply because that was the minimum size that was set for object detection to avoid excessive false positives. Apparent area varied between 1.5 and 1150 mm² (Figure 6a–c, Table 3). The distributions of fragment area, length and width are all highly asymmetric with low mean/median values and long tails towards higher values. Length and width distributions appear like log-normal curves but both failed a log-normality test (p < 0.001, see Figure S6). Maximum and minimum Feret diameters provided similar results to major and minor axis lengths (Table 3).
In terms of shape, most fragments were relatively rounded: the median compactness was 0.72 and the median solidity was 0.95, both near the circular end (1) of their ranges, while the overall median eccentricity was 0.74 (Figure 6d–f, Table 3). We found that there was a statistically significant difference in size/shape distributions between the 2021 and 2022 datasets for seven out of eight size/shape parameters (p-values between <0.0001 and 0.0552). Running the entire CellProfiler pipeline took approximately 10 s/photo.

| Measuring colour
When the manually categorised fragments were plotted in RGB space (Figure 7), the observer-driven colour categories were shown to be broad clusters with significant overlap.
Because both original datasets are three-dimensional (RGB), only two principal components were needed, accounting for a total of 99% of variance in RGB space (Figure 8a, Table S5). In both cases, the first component axis corresponded to the dominant trend (white to tan to brown to black), while the second component described secondary variation in chromaticity from blue/cyan to yellow/red. Given the considerable overlap of data points in Figure 8a, we also plotted the density of points using a 2D histogram (shown in Figure 8b), highlighting that most fragments occur along a relatively narrow line in the PCA space, with the blue and red extremes of the spread representing only a very small fraction of the total.

| Practical considerations
Taking photographs for the accurate measurement of size, shape and colour of objects requires some care when setting up. A fixed camera facing directly down at the samples provides a consistent measurement of size across the imaged area, provided that the photos are corrected for lens distortion. A reference chart with a scale bar should be included in every image, so that measured dimensions in pixels can be readily converted to millimetres and any variation in reproduction of known colours due to changes in lighting or camera settings can be readily identified. Once taken, photographs should be saved in a lossless image format (e.g. TIFF) to avoid introducing errors and artefacts from compression.
The object detection method was very reliable, correctly detecting 98.8% of objects across 170 images (a false negative rate of 1.1%) with only 2.2% of all detections being false positives. False positives tended to be bright spots in the background material, parts of the reference chart that were still within frame even after cropping the image, or darker fragments that were incorrectly detected as multiple separate objects. False negatives tended to be very dark fragments that could not be distinguished from the background, but in a few cases were two fragments that were placed too close together and were incorrectly detected as a single object.
Because of the minimum threshold and use of a black background, this method is less able to detect black objects and transparent plastics. For sample sets with a high abundance of black fragments, researchers could use a white background instead. Although CellProfiler output data for all detected objects, both true and false, only those attributed to true positives should be used for subsequent analysis, and it is recommended that a human observer double-checks all detections from automated software to protect data quality by preventing accidental inclusion of erroneous data from false positives, at least in an initial pilot stage before full automation. As the properties of undetected objects were not recorded, they could not be included in subsequent analysis; however, that loss accounted for only 1.1% of all fragments, less than the variation in count between human observers.

| Colour correction
One of the greatest challenges in quantitative photography is ensuring that the observed colour of a photographed object is consistent, regardless of the camera used or the specific lighting conditions at the time. The human brain excels at automatically adjusting perception so that colours appear consistent under different illumination; a digital image records no such compensation. Digital cameras and software do provide several methods for correcting colour and brightness values (either in-camera or during subsequent image processing) but most are concerned with adjusting for differences in colour balance (e.g. the colour temperature of ambient lighting). We originally attempted to use the ColorChecker Camera Calibration software to derive a colour correction profile (CCP) from the reference chart and apply it to the photographs during pre-processing in Adobe LightRoom, but found the correction was only partially successful, tended to be inconsistent amongst images, and could not account for brightness at all (see Figure S4).
Instead, we found it was more effective to develop our own Python script that uses the "colour" package to derive and apply the necessary correction, which substantially reduces the error in RGB coordinates for the reference chart as well as minimising the difference between images taken using different set-ups or lighting conditions.
Based on the average errors shown in Figure 4, we are confident that with this colour correction algorithm we can accurately reproduce most colours to within a Euclidean distance of 0.08 (mean error plus 2 sigma) of their true RGB values, even when using different camera set-ups.
We also attempted to photograph a subset of samples using more ubiquitous smartphones with rear-facing cameras, but found the automatic image pre-processing done by the phone's built-in software made reliable assessment of object size impractical. Size parameters for 43 objects in one sample were consistently underestimated by ~4% despite rescaling using the scale bar within the image (see Figure S10).

| Measuring size and shape
The physical dimensions of plastic fragments in the environment, and of those ingested and retained by wildlife, are important for understanding the mechanisms by which plastics are transported, and what risks they pose to wildlife (Roman et al., 2019). Once ingested, large plastics may cause blockages and perforations in the gastrointestinal tract (Semensatto et al., 2022); small plastics can become embedded in tissues and/or release harmful chemicals into the body (Mattsson et al., 2017; Rivers-Auty et al., 2023; Tanaka et al., 2013).
However, size categories such as micro-, meso- and macro-plastics are not always consistently used across the literature, and 'micro-plastics' has been used to refer to objects anywhere between 1 μm and 20 mm in size (Frias & Nash, 2019).
Overall, size is rarely reported in plastic ingestion studies and is generally only presented as summary statistics (Lavers et al., 2022; Semensatto et al., 2022). To reduce ambiguity and increase data quality, particle size should be reported more explicitly, for example by describing plastics using multiple parameters and presenting data distributions as well as summary statistics (e.g. Table 3 and Figure 6).
Furthermore, the limits of sample and data collection processes must also be adequately described along with their implications, for example, the choice of net/mesh/filter size for sample collection, or the minimum threshold for object detection. In this study, samples of plastic were isolated by hand during sorting of ingested material, and thus items below a certain size (~1 mm) are unlikely to be included in the sample set, while the CellProfiler object detection algorithm had a minimum diameter threshold of ~1.4 mm (20 pixels); this means the absence of objects below this size does not indicate that such objects were absent from the population of ingested plastics.
Object detection in image processing will readily return the apparent area of an object, based on the number of contiguous pixels that were selected by thresholding, but there are several different ways to measure size. Physical measurements include using callipers, which provide the maximum length of an object at a given angle (i.e. the Feret diameter), and sieving, which separates objects based on their width. These are not used systematically: some studies report the longest dimension while others report the shortest, leading to further ambiguity (Metz et al., 2020; Tokai et al., 2021). Further, such manual methods are time consuming, whereas standardised photography produces these basic measurements as well as others automatically, greatly enhancing our study of the effects of plastic shape and size on wildlife.
CellProfiler and other shape analysis packages can provide estimates of both length and width of imaged objects. There are multiple measurements available; the most common are Feret diameters and the elliptical approximation. Feret diameters are equivalent to calliper measurements: the maximum Feret diameter (Fmax) is the longest dimension of the object while the minimum Feret diameter (Fmin) is the shortest. However, these values can be distorted by poor perimeter detection and outliers, and there is no guarantee that the Fmax and Fmin diameters will be orthogonal to one another. The ellipse approximation takes the object's shape and matches it to an ellipse with the same second central moments, but as shown in Figure 5a, this does not necessarily result in the exact length or width of the object as a human would interpret it. It does have some advantages: it is a good approximation of the overall size of the object weighted by the distribution of its area, the major and minor axes will always be orthogonal, and the axes will be less distorted by extreme points or noise along the detected perimeter. In terms of their overall accuracy, elliptical axes and Feret diameters are highly correlated to one another and very similar in magnitude, with the elliptical approximation underestimating length by 2% and width by 1% on average. We conclude that both methods provide similar information on the size of an object, but it is always important to state exactly which definition is employed when measuring plastic size.
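The two length measures can be sketched directly from their standard definitions. These are our own minimal implementations, not CellProfiler's internals:

```python
import numpy as np

def max_feret(points):
    """Maximum Feret (calliper) diameter: the largest pairwise distance
    between boundary points, at any angle."""
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    return d.max()

def ellipse_axes(mask):
    """Major/minor axis lengths of the ellipse with the same second
    central moments as a binary mask (axis length = 4 * sqrt(eigenvalue)
    of the pixel-coordinate covariance matrix)."""
    ys, xs = np.nonzero(mask)
    coords = np.stack([xs - xs.mean(), ys - ys.mean()])
    evals = np.sort(np.linalg.eigvalsh(np.cov(coords)))[::-1]
    return 4 * np.sqrt(evals[0]), 4 * np.sqrt(evals[1])

# For a filled disc the two measures agree: diameter = major = minor axis
yy, xx = np.mgrid[0:101, 0:101]
disc = (xx - 50) ** 2 + (yy - 50) ** 2 <= 30 ** 2
major, minor = ellipse_axes(disc)
print(round(major), round(minor))
```

For non-circular shapes the two diverge: a rectangle's Fmax is its diagonal, while its moment-matched elliptical major axis runs along the long side (and slightly exceeds it), which is one reason the definition used should always be stated.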
In addition to length, width and area, it is possible to derive other parameters that describe the shape of a piece of plastic. Two objects may have similar absolute dimensions but differ in terms of how angular, round, concave or irregular they are (Montero & Bribiesca, 2009). While no single numerical parameter can fully encompass the diversity of fragment shapes shown in Figure 3, parameters should be selected based on their suitability or significance for the analysis. The three parameters we used were elliptical eccentricity (how round/flat the elliptical approximation is), compactness (the ratio of the object's area to that of a circle with the same perimeter) and solidity (how much of the object's convex hull is occupied; Montero & Bribiesca, 2009). These three properties exhibit some degree of non-linear correlation (see Figures S7–S9), but they differ in their sensitivity to different kinds of shape variation: eccentricity increases as the object gets longer and thinner, compactness decreases as the object's perimeter deviates from a circle in any direction, while solidity decreases as the object becomes less convex. This results in the differently shaped distributions shown in Figure 6, which show that most objects are relatively elliptical (high eccentricities), compact (high compactness) and mostly convex (high solidities).
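Two of these descriptors reduce to simple formulas. The sketch below follows the definitions given above (generic formulas, not CellProfiler's exact implementation; solidity is omitted because it requires a convex hull computation):

```python
import numpy as np

def eccentricity(major, minor):
    """Elliptical eccentricity: 0 for a circle, approaching 1 as the
    fitted ellipse elongates."""
    return np.sqrt(1.0 - (minor / major) ** 2)

def compactness(area, perimeter):
    """Ratio of the object's area to that of a circle with the same
    perimeter (4*pi*A / P**2): 1 for a circle, lower when irregular."""
    return 4.0 * np.pi * area / perimeter ** 2

# A square of side 1: area 1, perimeter 4 -> compactness = pi/4 ~ 0.785
print(round(compactness(1.0, 4.0), 3))
# An ellipse twice as long as wide -> eccentricity = sqrt(0.75) ~ 0.866
print(round(eccentricity(2.0, 1.0), 3))
```

The square example illustrates why compactness penalises any deviation from a circular perimeter, even for a perfectly convex, symmetric shape.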
The automatic measurement of length, width, area and other parameters for 98% of objects means that we can report parameters with greater accuracy and precision, allowing for more informative descriptions of size distributions (Figure 6) than simply categorising fragments into inconsistently defined groups such as macro-, meso- or micro-plastics. It also allows us to observe smaller variations in size distribution that may be indicative of different selection or transportation/degradation mechanisms (Gewert et al., 2015). Using the distributions we observe, we found that the 2021 and 2022 sample sets differed from one another to a statistically significant degree in seven out of eight parameters (see Table 3), which may suggest that ingested plastic debris can vary appreciably from year to year in terms of size and shape.
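The year-to-year comparisons in Table 3 use two-sample Kolmogorov-Smirnov tests. The test statistic is simply the largest vertical gap between the two empirical CDFs, which is straightforward to compute directly (a sketch; in practice a statistics library such as scipy.stats.ks_2samp would also supply the p-value):

```python
import numpy as np

def ks_2samp_statistic(x, y):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum absolute
    difference between the empirical CDFs of the two samples."""
    x, y = np.sort(np.asarray(x, float)), np.sort(np.asarray(y, float))
    grid = np.concatenate([x, y])          # evaluate at every data point
    cdf_x = np.searchsorted(x, grid, side="right") / x.size
    cdf_y = np.searchsorted(y, grid, side="right") / y.size
    return float(np.abs(cdf_x - cdf_y).max())

# Identical samples give D = 0; fully separated samples give D = 1.
d_same = ks_2samp_statistic([1, 2, 3], [1, 2, 3])
d_apart = ks_2samp_statistic([0, 1, 2], [10, 11, 12])
```
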
It is worth noting that even under ideal conditions we are still approximating 3D objects using 2D shapes, and as a result, we may get slightly different results for the same objects if they are placed upside-down or at a different angle. However, most plastic fragments included in this study are relatively flat and show a consistent colour across their surface (Figure 3), and so their shapes should be reasonably approximated by the images in this dataset. Because we are measuring 2D properties only, we also cannot estimate or measure object mass, which is a key parameter in monitoring the extent and impact of plastic pollution.

| Measuring colour
By convention, the assessment of plastic colour is done by human observers during cataloguing, placing each fragment in a colour category based on visual inspection (Provencher et al., 2017). The list of categories is not consistent between studies, their limits are not well defined operationally, and assignments are inevitably subjective (Martí et al., 2020; Provencher et al., 2017). Reducing the number of colour categories should improve the overall reliability of visual classification, but even when using only six categories the results can still vary considerably amongst observers (Figure 2). This is because such broad categories still fail to account for the continuous gradient of true colours (Figure 7), which means intermediate shades can be assigned to one of several nearby groups, for example, white versus yellow or blue versus green, depending on lighting, context and subjective colour perception. The use of a standard colour reference has been suggested to overcome inter-observer subjectivity and allow for the use of more categories (Martí et al., 2020; Provencher et al., 2017), but to date few have adopted this approach, likely because it is time-consuming to do manually. As plastics become smaller it also becomes increasingly difficult to visually assess colour, and thus the use of a standard colour reference may be limited to larger plastic fragments (Kotar et al., 2022; Lusher et al., 2020).
Using quantitative photography, we can more accurately and reliably measure the average colour of each detected fragment and identify trends, rather than simply grouping fragments based on subjective assessments. Recording colour in 8-bit sRGB format can theoretically provide over 16 million unique colour coordinates, although these do not cover the full gamut of colours that human eyes can see. The number of colours that can be reliably and reproducibly distinguished from one another will be smaller, due to variances introduced by camera set-up and lighting that result in non-zero errors in the reproduction of true colour (Figures 3 and 4).
Provided that a suitable reference chart is included in all photos, and images are colour-corrected using the method described, those errors are minimised and the number of possible colours that can be reliably distinguished from one another is maximised.
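The correction step can be sketched as a least-squares fit mapping the chart's observed RGB values onto its known reference values (a minimal affine sketch for illustration; the exact algorithm used in the pipeline may differ), with the reproduction error expressed, as in Figure 4, as the Euclidean distance between observed and reference coordinates:

```python
import numpy as np

def fit_colour_correction(observed, reference):
    """Least-squares affine map (3x3 matrix plus offset, stored as a 4x3
    array) taking observed chart RGB values to their reference values."""
    aug = np.hstack([observed, np.ones((len(observed), 1))])
    M, *_ = np.linalg.lstsq(aug, reference, rcond=None)
    return M

def apply_correction(rgb, M):
    """Apply the fitted correction to an (..., 3) array of RGB values."""
    flat = np.asarray(rgb, float).reshape(-1, 3)
    aug = np.hstack([flat, np.ones((len(flat), 1))])
    return np.clip(aug @ M, 0, 255).reshape(np.shape(rgb))

def rgb_error(observed, reference):
    """Colour reproduction error: Euclidean distance in RGB space."""
    return np.linalg.norm(np.asarray(observed) - np.asarray(reference), axis=-1)

# Simulate a chart photographed under a colour cast: channel-wise gains
# and offsets applied to 24 known reference colours.
rng = np.random.default_rng(0)
reference = rng.uniform(0, 255, size=(24, 3))
observed = reference * np.array([0.85, 0.95, 0.75]) + np.array([12.0, 5.0, 20.0])

M = fit_colour_correction(observed, reference)
corrected = apply_correction(observed, M)
```

Because the simulated cast is itself affine, the fit recovers the reference colours almost exactly; with real photographs the residual error is what remains after correction (the right-hand panels of Figure 4).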
Most fragments in our study were assigned to either white, brown or black; however, these groups show significant overlap in RGB space and are part of a continuous gradient of fragment colour that extends from off-white to nearly black (Figure 7). This may be indicative of the original colours of plastic objects entering the ocean, or of a population of originally white material that becomes increasingly discoloured over time. Yellowing of white plastics is known to occur in response to prolonged exposure to solar radiation (Andrady, 2017) and, in the case of seabirds and other wildlife, exposure to the digestive fluids inside the stomach (which contain pigmented oils and squid ink; Provencher et al., 2017). For this reason, the colour (and size) of plastic fragments has been used as an indicator of environmental weathering to infer their age (Martí et al., 2020).

There are still limitations to displaying 3D colour coordinates in a 2D medium such as a static diagram like Figure 7, where points with different coordinates can appear to overlap depending on the viewing angle. PCA is a common method for multivariate analysis and can be used to reduce a multi-dimensional coordinate system to a 2D projection that captures as much of the variance as possible (Quinn & Keough, 2002). The resulting projection will depend on the data inputted to the PCA, but once calculated it can be applied to other datasets to ensure consistency. The projection plots the white-brown-black trend along the 1st component axis and yellow/blue variance along the 2nd component axis. While PCA cannot fully represent the 3D nature of trichromatic colour space, it is useful for simplifying data to assist interpretation while sacrificing the minimum amount of variance (which can be negligible, e.g. <1% for Figure 8a). The curving 3D trend shown in Figure 7 is now associated with a specific coordinate axis in Figure 8a and can be fitted with a simple linear function for a more exact representation.
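The reduction itself is standard: centre the RGB coordinates, take the principal axes from a singular value decomposition, and project. A minimal sketch (any PCA implementation would do) that also reports the fraction of variance captured by each retained component:

```python
import numpy as np

def pca_project(X, n_components=2):
    """Project rows of X (e.g. fragment RGB colours) onto their first
    principal components; also return the fraction of the total variance
    captured by each retained component."""
    Xc = X - X.mean(axis=0)
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = (S ** 2) / (S ** 2).sum()
    return Xc @ Vt[:n_components].T, explained[:n_components]

# Colours lying along a single gradient (e.g. white -> brown -> black)
# collapse onto the first component with ~100% of the variance.
t = np.linspace(0, 1, 50)[:, None]
gradient = np.array([[250.0, 245.0, 235.0]]) * (1 - t)   # off-white fading to black
scores, explained = pca_project(gradient)
```

Once the projection matrix (Vt) has been computed from one survey's data, the same matrix can be reused for later surveys, keeping the component axes consistent between datasets as described above.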
It is worth noting that because most colour spaces are based on human trichromatic vision, they do not necessarily tell us how colours will appear to other species, especially to those with tetrachromatic vision (including birds) that can extend into the ultraviolet.
Even for non-human species with trichromatic vision, the sensitivity ranges of their receptors may not align with those of typical human vision, and as such standard RGB cameras may not be representative of how they see colour (Troscianko & Stevens, 2015). Measuring colour as it would be truly perceived by other species is an ongoing challenge for visual ecology that requires an understanding of the species' sensitivity to different wavelengths, specialised cameras with filters designed to detect light at the appropriate ranges and, for seabirds, an understanding of their perception in different media (e.g. seawater) with any wavelength-dependent refractive and diffusive effects.

Considering the ability to adjust and refine the analytical process without the need to retake images, and the provision of high-resolution photographs that can be used for future reference, we believe that this method is worth the time required.

| CONCLUSIONS
We have demonstrated a new method of quantitative photography for rapid, reliable, repeatable and robust analysis of the size, shape and colour of plastic pollution in the 1-100 mm size range (meso- and macro-plastics), with broad potential application for all studies focusing on the accumulation and impact of plastic pollution in the environment. We tested this method on ~3800 pieces of plastic and pumice that had been ingested by seabirds. The method cannot distinguish between plastics and pumice, but we show that it is an effective tool for quickly acquiring large datasets, capable of detecting and measuring fragments with a 98% success rate and providing several parameters that describe different aspects of object size and shape, as well as colour. We have demonstrated that such automated techniques can measure plastic colour more accurately than conventional methods that involve subjectively assigning fragments to poorly defined colour categories, instead providing colour values as coordinates in a semi-continuous colour space (e.g. RGB). This substantially improves data quality. However, the method is limited to 2D analysis of size/shape and cannot provide information on 3D parameters or object mass, which are themselves important factors in characterising plastic pollution and should be measured separately if needed. But as plastic pollution accumulates in the natural environment and increasingly impacts both wildlife and human health, it is essential that we establish robust, standardised and easily applied methods that ensure comparability and applicability of data into the future and inform the development of robust mitigation and management plans that limit the impact of plastics on vulnerable species.

Table S5. PCA loading values and percentage of variance described by each component for the PCA presented in Figure 8.
Table S1. Range of colour categories reported by a selection of at-sea studies of ocean plastics from 1971 to 2021. Papers were selected to cover multiple decades (1970s, 1990s, 2010s, 2020s) and regions (e.g. Atlantic Ocean, Southern Ocean, Tasman Sea, etc.). 'At-sea' was used when the sampling location was not specified.

Table S2. Range of colour categories reported by plastic ingestion studies from 1980 to 2015. Seabirds were chosen as a representative wildlife group. Papers were selected to cover multiple decades (1980s, 1990s, 2000s, 2010s), regions (e.g. Australia, Alaska, Brazil, etc.) and seabird families. The term 'seabirds' was used when three or more seabird families were included in the study.

Table S3. Range of colour categories reported by plastic ingestion studies in the last 3 years (2020-2022), after Provencher et al. (2017) highlighted the need for standardisation and proposed the use of eight colour categories: off/white-clear, silver-grey, black, red-pink, green, blue-purple, brown-orange and yellow. Seabirds were chosen as a representative wildlife group. Papers were selected to cover multiple regions (e.g. Australia, Alaska, Brazil, etc.) and seabird families. The term 'seabirds' was used when three or more seabird families were included in the study.
Samples were obtained from Flesh-footed Shearwater (Ardenna carneipes) fledglings on Lord Howe Island, New South Wales, Australia in 2021 and 2022 (see Lavers et al., 2021 for precise dates and other details specific to bird data collection). Fieldwork was conducted under appropriate permits and was approved by the Charles Sturt University Animal Ethics Committee (A22382), the New South Wales Office of Environment and Heritage (SL100169) and the Lord Howe Island Board (LHIB 07/18). In brief, material was removed from the stomachs of fledglings via a combination of lavage (flushing the stomach with seawater) of living birds and necropsy of deceased birds (beach-washed, roadkill and others). Fragments of plastic and pumice were counted, sorted following Provencher et al. (2017), and their colours classified using the categories of Lavers et al. (2021). Fragments from 2021 were sent to the Natural History Museum (NHM) in Tring, UK, for analysis; fragments from 2022 were sent to the University of Tasmania (UTAS) in Hobart, Australia.

A trial was conducted to demonstrate variation in visual classification of plastic colour amongst human observers, comprising volunteers at UTAS and/or co-authors, with approval from the University of Tasmania Human Ethics Panel (H0028598). Observers (n = 9) with varying degrees of experience in categorising plastics were given plastics from the same eight birds from 2022 and asked to count and sort the fragments into the following six colour categories: white, blue, green, red, yellow and black. Non-plastic items (e.g. pumice)

We then applied the colour correction algorithm to all 170 photos, determining the necessary correction for each image based on the observed values of the reference chart, which should account for any variations that may result from differences in camera, lighting or colour balance (Figure S2). We assessed the effectiveness of the correction by comparing the error in colour reproduction (the Euclidean distance between observed and reference RGB values).

FIGURE 2 (a) Summary of object counts for the same ~670 plastic fragments that were manually sorted into six colour categories by nine human observers, showing the mean and min-max range of counts between observers for each colour. (b) Pearson correlation of variation (SD) in object counts of different colours.
CellProfiler produces two primary outputs from object detection (an RGB image labelled with sequential ID numbers for each detected object, and a false-colour overlay of the detected object areas; Figure S5): the former shows how accurately the detection threshold finds object boundaries, while the latter is a useful reference for identifying individual fragments from each sample during subsequent laboratory analyses. CellProfiler can also isolate each object and save it as a separate image (three examples from photo #0226 are shown in Figure 5a, overlaid with illustrations showing different measures of size parameters), a useful output for future analysis and data visualisation.
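CellProfiler's detection modules are configurable pipelines rather than fixed code, but the core step — thresholding a greyscale image and labelling connected pixel regions as discrete objects — can be sketched with a minimal 4-connected flood-fill labeller (for illustration only; the actual pipeline's thresholding is more sophisticated):

```python
import numpy as np
from collections import deque

def label_objects(mask):
    """Label 4-connected regions of a binary mask via BFS flood fill.

    Returns an integer label image (0 = background) and the object count."""
    labels = np.zeros(mask.shape, dtype=int)
    rows, cols = mask.shape
    count = 0
    for r0, c0 in zip(*np.nonzero(mask)):
        if labels[r0, c0]:
            continue                          # pixel already claimed
        count += 1
        labels[r0, c0] = count
        queue = deque([(r0, c0)])
        while queue:
            r, c = queue.popleft()
            for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if (0 <= nr < rows and 0 <= nc < cols
                        and mask[nr, nc] and not labels[nr, nc]):
                    labels[nr, nc] = count
                    queue.append((nr, nc))
    return labels, count

# Two separate "fragments" on a dark background, found by a fixed threshold.
image = np.zeros((20, 20))
image[2:6, 2:6] = 0.9       # fragment 1
image[10:15, 12:18] = 0.7   # fragment 2
labels, n = label_objects(image > 0.5)
```

The label image plays the same role as CellProfiler's labelled output: each detected object receives a sequential ID that can then drive per-object measurements of size, shape and colour.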

FIGURE 3 (a) RGB image of the colour reference chart from photo #0226 prior to colour correction, with reference RGB values for each square shown inside circles for comparison. (b) The same image after automatic colour correction. (c) CIELAB L* (lightness) values for each square before (solid circles) and after correction (open circles), versus their reference values (open squares). (d) CIELAB a* and b* values for each square. (e) CIELAB L* values for all squares plotted against their reference values, before and after correction.

FIGURE 4 (a) Boxplots showing the RGB error (expressed as the Euclidean distance between observed and reference colour coordinates) for 24 ColorChecker squares in 17 example photos prior to colour correction; photo capture dates are given in MM-DD format, and all were taken in 2022. Red lines indicate the median error for each photo; circles are outliers beyond the interquartile range. (b) Boxplots showing reduced errors for the same photos after colour correction. (c, d) Histograms of errors before and after colour correction.

FIGURE 5 (a) Three examples of successful fragment detection, showing the image of each fragment with its detected outline (in yellow), the ellipse with the same second moments (dashed white line), and the maximum and minimum Feret diameters (dashed and dotted blue lines). (b) Major axis length of the ellipse versus maximum Feret diameter for 1430 successfully detected fragments from 2021 and 2176 fragments from 2022. (c) Minor axis length versus minimum Feret diameter. (d, e) Normalised histograms for the ratios of major and minor axis lengths versus maximum and minimum Feret diameters.
The average colours of all successfully detected fragments are shown as coloured coordinates (Figure 7a,b); the spread of colours predominantly follows a continuous curve going from off-white (in the upper left corner) through tan and brown to almost black (in the lower right corner). MANOVA analysis of red, green and blue values indicated that the mean colours for 2021 and 2022 were statistically distinct from one another (Wilks' Λ = 0.957, p < 0.0001), though the trends appear similar. We also manually categorised 333 fragments from 2021 into six different colour categories, which were plotted against their RGB coordinates (Figure 7c) for comparison.

FIGURE 6 Normalised histograms for size parameters of 1425 successfully detected objects from 2021 and 2317 detected objects from 2022, showing the distribution of measured values for the following parameters: (a) area, (b) elliptical major axis length, (c) minor axis length, (d) elliptical eccentricity, (e) compactness and (f) solidity. Plastic size categories are taken from Hartmann et al. (2019).

TABLE 3 Comparison of summary values for size and shape parameters of detected fragments from the 2021 (N = 1425) and 2022 (N = 2317) surveys. IQR refers to inter-quartile range. p-values are from two-sample Kolmogorov-Smirnov tests of the 2021 and 2022 distributions. p-values below the significance threshold (p < 0.05) are given in bold.

FIGURE 7 The average colours of 1425 successfully detected fragments from 2021 (a) and 2317 fragments from 2022 (b), plotted against their colour coordinates in sRGB space. (c) A subset of 309 fragments that were manually categorised by colour and plotted against their 'true' RGB colour coordinates. The inset diagram illustrates the full shape of the colour space, and open circles indicate the positions of key idealised colours (e.g. red, yellow, green, cyan, blue, magenta, white, black).

FIGURE 8 Principal component analysis of fragment sRGB colour values (combined 2021 and 2022 surveys). (a) Fragment colours plotted against their coordinates on the first and second component axes, with the percentage of variance described by each component given in the axis labels. Open circles indicate the positions of key idealised colours (red, yellow, green, cyan, blue, magenta, white, black). (b) Two-dimensional histogram of PCA coordinates, showing the number density of data points in the same component space.

Figure 3 (the reference chart) provides a clear example of how much colour can deviate from expected values even under controlled conditions. If these errors are not accounted for, it will be difficult to reliably express the colours of objects such as plastic debris in such a way that a second observer (potentially using a different camera, or different lighting) would get a similar result.
This improved accuracy and precision reveals that most plastics ingested by seabirds occur along a continuous colour gradient extending from white to yellow to brown to black, which may be the result of discolouration from time spent in the open ocean or inside the bird's digestive system. Detailed analysis of the results from this sample set and its implications for plastic ingestion by seabirds will be addressed in future work. The quantitative photography method we have outlined offers a new, standardised process for analysing the 2D size, shape and colour of plastics, with the advantage of being substantially more precise and faster than manual assessment by human observers. The method was specifically designed to be open-source and user-friendly, and the substantial increase in data yield (automatic simultaneous measurement of multiple parameters), data quality (more accurate measurements) and time efficiency (rapid analysis of thousands of fragments per hour) should encourage uptake of the method by researchers, thereby rapidly improving the quality and comparability of the data available on marine plastic pollution.

Figure S2. CIELAB lightness (L*) for three different greyscale squares of the reference chart as measured in different photos over time, before (solid circles) and after (open squares) colour/brightness correction, versus the 'true' lightness value (dashed line).

Figure S3. Colour reproduction errors for all 170 photos, before and after correction. Outliers are indicated by open circles.

Figure S4. First attempt at colour correction using a colour correction profile (CCP) derived from the colour reference chart using ColorChecker Camera Calibration software and applied in Adobe Lightroom, showing minimal correction of error in terms of chromaticity (a*, b*) and no correction of error in terms of lightness (L*).

Figure S5. Examples of the two primary outputs of the CellProfiler object detection algorithm: the original RGB image labelled with sequential ID numbers for detected objects, and a false-colour overlay of the greyscale image depicting the areas of detected objects.

Figure S6. Natural log histograms of size data for 3741 successfully detected objects from 2021 and 2022, showing non-normal distributions of both length and width according to the elliptical approximation.

Figure S7. Linear correlation plots for measured size and shape parameters for all 3741 successfully detected objects.

Figure S8. Natural log correlation plots for measured size and shape parameters for all 3741 successfully detected objects.

Figure S9. Pearson correlation coefficient matrix for (a) raw values and (b) natural logarithm of values of measured size and shape parameters.

Figure S10. Comparison of object size measured using the Canon camera set-up (left), a hand-held Canon camera outdoors (middle) and a hand-held smartphone camera outdoors (right), showing a discrepancy in size when imaged with the smartphone despite scaling from the in-frame scale bar. The smartphone camera was a rear-facing iPhone 12 Mini using automatic settings.

TABLE 1 Summary of manual object counts originally reported during sample collection in the field versus object counts taken during photography by a second observer. Counts are broken down by survey year and sorted into plastics and pumice by visual assessment.
However, this requires precise, calibrated measurements of both size and colour, which are not currently captured by the categorical data reported in most studies. By comparison, quantitative photography can provide such measurements.

Table S4. List of photographs taken and the number of detected fragments in each, versus manual counts during collection and during photography.