Soiling in Solar Energy Systems: The Role of the Thresholding Method in Image Analysis

The use of image analysis has often been suggested as a practical way to monitor the soiling accumulated on the surfaces of solar energy conversion devices. Indeed, the deposited soiling particles can be counted and characterized to calculate the area they cover, and this area can be converted into an energy loss. However, several particle counting methodologies exist and can lead to dissimilar results. This work focuses on the role of thresholding, an essential step where particles are distinguished from a background based on the pixel brightness. Sixteen automatic thresholding methods are assessed using 13 200 micrographs of glass coupons soiled at nine locations globally. In low‐to‐intermediate soiling conditions, the “Triangle” method is found to return the minimum coefficient of variation and a mean deviation closer to zero. On the other hand, methods assuming a bimodal distribution of pixel brightness underestimate the area coverage. In addition, since soiling can be unevenly distributed over a surface, different loss estimations can be returned when the same image analysis process is used on different spots on a sample's surface. For these reasons, image analysis should be repeated at multiple locations on each investigated surface.


Introduction
In 2021, over half of the newly installed energy capacity globally consisted of photovoltaics (PV). [1] This trend is projected to continue, favored by a drop in costs of almost 90% since 2010. Installed PV capacity is anticipated to reach 2 TW by 2025, a figure that would double its 2022 value in just 3 years. [1] Also, the capacity of concentrating solar thermal (CST) has been increasing with time and reaching new markets while experiencing a significant drop in levelized cost of electricity of nearly 70% over the past decade. [2] However, renewable energies have significant land use requirements, [3] meaning that a large amount of land will be converted into solar energy parks in the coming years. This can create competition with other activities, decreasing land for agriculture and posing threats to biodiversity. [4] For this reason, while the research community is looking into new solutions, such as agrivoltaics, [5] floating PV, [6] and building-integrated PV, [7] it is also essential to maximize the performance of operating solar energy systems. Indeed, by increasing the energy yield of an existing installation, not only is the energy output enhanced, but also the utilization of land and materials is minimized.
The energy yield of solar power systems can be enhanced through the use of appropriate operation and maintenance (O&M) routines. The accumulation of dust, dirt, and contaminants is a serious issue for these systems, as it decreases the amount of light reaching a PV cell or a CST receiver, resulting in reduced electricity or heat generation. [8] This phenomenon, known as "soiling", affects systems worldwide and is significantly site specific. [9] The magnitude of soiling varies depending on factors such as geographical position, system configuration, physicochemical dust properties, and weather patterns, which also change with seasons. Hence, effective and continuous monitoring of soiling is essential to develop O&M strategies that mitigate its detrimental effects. [10] Image analysis has often been investigated as a potential low-cost soiling monitoring technology, at least since the first outdoor microscope was installed in Qatar. [11] Indeed, using image analysis of a micrograph, one can count and characterize the size and the shape of particles deposited on a surface. [12] In most soiling-related image analysis studies, [13][14][15][16] the U.S. National Institutes of Health's open-source ImageJ software package has been employed for this purpose. Several authors [17][18][19][20] presented empirical correlations to convert the area covered by soiling on a glass coupon into the electrical loss experienced by adjacent PV modules. However, these correlations can lead to inconsistent results, as they are obtained with different experimental and analysis procedures.
A recent round-robin study [21] showed that part of this variability is due to the image analysis process, which can produce dissimilar measurements for the same micrograph even when the same software is used by different operators. In particular, thresholding was found to be one of the key steps in image analysis affecting particle counting. Thresholding is a process needed to convert a micrograph into a binary black and white image, where each pixel is identified as either background or part of a particle based on its brightness (or intensity). The thresholding value can be set from the analysis of the pixel intensity distribution, either through an automatic threshold method or manually by the user. Typical intensity distributions for low-to-intermediate soiling locations are shown in Figure 1. Several automatic thresholding methodologies exist and a number of these, all integrated with ImageJ [22] and concisely described in Section 4, will be compared in this work. Indeed, when dust particles appear distinctly in micrographs (i.e., they are well separated and have clear edges), different thresholding methods are likely to produce similar measurements. However, sometimes particles and soiling deposits have no clear edges, in which case the thresholding method can greatly affect the measurement. When asked to employ an automatic threshold detection algorithm, 9 out of the 11 operators involved in the previous work [21] utilized ImageJ's Default thresholding method. Therefore, not enough data was available to evaluate the impact of the thresholding method on the variability of image analysis.
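As a minimal illustration of the thresholding step described above (and not of the ImageJ implementation), the following sketch binarizes a synthetic 8-bit dark-field image and computes the fraction of pixels classified as particles. The function name, the synthetic image, and the convention that pixels brighter than the threshold belong to particles are assumptions made only for this example.

```python
import numpy as np

def fractional_area_coverage(image, threshold):
    """Return the fraction of pixels brighter than `threshold`.

    Assumption: in a dark-field micrograph the background is dark,
    so pixels above the threshold are treated as particles.
    """
    image = np.asarray(image)
    particle_mask = image > threshold  # binary image: True = particle
    return float(particle_mask.mean())

# Synthetic 100 x 100 image (hypothetical data): dark, noisy background
rng = np.random.default_rng(0)
image = rng.integers(0, 30, size=(100, 100)).astype(np.uint8)
# One bright rectangular "particle" covering 10 x 20 = 200 pixels
image[40:50, 40:60] = 200

f = fractional_area_coverage(image, threshold=100)
print(f"fractional area coverage f = {f:.3f}")
```

Because the result depends directly on the chosen threshold, any change in how that value is selected propagates to the estimated coverage.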
In light of the significant impact of thresholding and of the lack of information on this matter, the present study aims to extend the previous publication [21] to specifically evaluate the robustness and the reliability of various global thresholding methods available in ImageJ. This is done by increasing the number of analyzed micrographs to 13 200, compared to the 8 from the previous work. These were obtained, as detailed in Section 4, through an experimental campaign during which glass coupons were exposed to the outdoor conditions at nine locations worldwide for different exposure times, ranging from 1 to 32 days. The micrographs, taken at regular intervals in different spots of each coupon, represent a wide variety of soiling conditions. This unprecedented amount of data required the development of an automated ImageJ script that enabled the systematic analysis of each micrograph with the same procedure, producing a different set of results for each available thresholding method.
The article is organized as follows. The results are reported and discussed in Section 2, which also includes an evaluation of the assumptions and the limitations of the current analysis and some recommendations for future works. The conclusions are reported in Section 3. The methodologies are described in Section 4.

Results and Discussion
The results of this work are obtained from the analysis of the micrographs of glass coupons exposed at nine locations worldwide. As described in Section 4, images (micrographs) were regularly taken on various spots of the glass coupons. These were then processed in ImageJ, which returned the area of particles deposited on each micrograph. The results, obtained using different automatic algorithms, were then compared to evaluate the robustness of the various procedures. Finally, a survey was conducted among various soiling experts to visually assess the reliability of the different methodologies in a variety of conditions.

Comparing the Methods
Figure 2 shows the coefficient of variation (i.e., the relative dispersion of data points around the mean) returned by the various thresholding methods, described in Section 4, for three particle-specific parameters, namely the number of counted particles (N), the average projected particle area (A), and the fractional area covered by particles (f). These parameters were defined and explored in the previous publication utilizing ImageJ's Default thresholding method. [21] In the present study, this method is shown in red at the left side of each of the plots in Figures 2 and 3. As can be seen, the lowest variabilities are found for the "Percentile" method. This is not surprising and should not be misinterpreted, as this method assumes that half of the pixels represent particles, while the other half represent background. For this reason, it returns quite consistent results, with an average fractional area coverage of 0.495 ± 0.065, whereas the other methods have an average of 0.109 ± 0.141. Because of the low-to-intermediate soiling conditions in the present dataset, the "Percentile" method is found to overestimate the average coverage. This is significant when compared to other methods (Figure 3). One can expect this method to underestimate area coverages for heavier soiling conditions for which particles cover most of the surface of the coupons. It should be noted that another method, "Mean", is likely to show similar behavior (Figures 2 and 3). This is due to the assumption of the algorithm, which sets the mean of the pixel intensities as the threshold. As one can imagine, in low-soiled micrographs with dark backgrounds, this method would return low threshold values and therefore high numbers of counted particles. The upper left corner of Figure 4 shows the distribution of the minimum thresholds for each method for the whole dataset. As shown, "Percentile" returns the lowest minimum threshold values. As mentioned above, this is not surprising, since for images with dark backgrounds, the pixel intensity distribution is heavily positively skewed (i.e., right tailed), and therefore the median value is low. The same applies to the "Mean" method, which has the third-lowest threshold: as for the median, the mean of a heavily positively skewed distribution is also likely to be low. The "MinError" and the "Triangle" methods are the two other methods with the lowest variability for area coverage (Figure 2). The "MinError" is an iterative implementation of the algorithm proposed by Kittler and Illingworth, [23] where different thresholds are tested and the one minimizing the error between the actual and the processed images is selected. The "Triangle" method sets the threshold as the point of maximum distance between the pixel intensity histogram and the line between the histogram peak and the farthest end of the histogram. Due to its approach, this algorithm is particularly well suited for images with a dominant background, as is the case here and in most soiling studies. As shown in the rightmost plot of Figure 3, among the three aforementioned methods, the "Triangle" method is the one returning the smallest deviation from the average fractional area coverage.
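The behavior of the "Percentile", "Mean", and "Triangle" rules on a right-tailed histogram can be sketched as follows. This is a simplified NumPy illustration of the rules as described in the text, not the code used by ImageJ; the function names and the idealized geometric histogram are assumptions introduced for this example.

```python
import numpy as np

def mean_threshold(hist):
    """Threshold at the mean pixel intensity."""
    hist = np.asarray(hist, dtype=float)
    levels = np.arange(hist.size)
    return float((levels * hist).sum() / hist.sum())

def percentile_threshold(hist, fraction=0.5):
    """Threshold at the intensity splitting the pixels into two halves."""
    hist = np.asarray(hist, dtype=float)
    cdf = np.cumsum(hist) / hist.sum()
    return int(np.searchsorted(cdf, fraction))

def triangle_threshold(hist):
    """Intensity with the largest distance from the line joining the
    histogram peak to the farthest occupied end of the histogram."""
    hist = np.asarray(hist, dtype=float)
    occupied = np.flatnonzero(hist)
    first, last = int(occupied[0]), int(occupied[-1])
    peak = int(np.argmax(hist))
    end = last if (last - peak) >= (peak - first) else first
    if end == peak:
        return peak
    lo, hi = sorted((peak, end))
    x = np.arange(lo, hi + 1)
    x1, y1, x2, y2 = peak, hist[peak], end, hist[end]
    # Perpendicular distance of each histogram point from the line
    dist = np.abs((y2 - y1) * x - (x2 - x1) * hist[x] + x2 * y1 - y2 * x1)
    dist /= np.hypot(y2 - y1, x2 - x1)
    return int(x[np.argmax(dist)])

# Idealized, heavily right-tailed histogram (hypothetical data):
# dominant dark background with a long tail of brighter pixels.
levels = np.arange(256)
hist = 1000.0 * 0.9 ** levels

t_percentile = percentile_threshold(hist)
t_mean = mean_threshold(hist)
t_triangle = triangle_threshold(hist)
print(t_percentile, round(t_mean, 2), t_triangle)
```

On this idealized histogram, the median and the mean both fall close to the background peak, whereas the "Triangle" rule places the threshold further along the tail, in line with the ordering of thresholds discussed above.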
"Intermodes" and "Minimum" are the approaches returning the highest values for the minimum thresholds (upper left of Figure 4).This is not surprising as these methods are specifically recommended for bimodal pixel intensity histograms, which is not the case in the present dataset.The former one indeed sets the threshold to the average of the two modes of the histogram.The latter one sets the threshold to the deepest point in the valley between the two peaks.Given the high thresholds, these models return the minimum number of particles and area coverage values among the investigated methods.On the other hand, the "Default", "IsoData", and "Shanbhag" methods return extremely high variability in the number of particles (their median coefficient of variation in Figure 2 is close to 100%), which also directly impacts the area coverage.This is particularly noticeable in micrographs with low soiling, such as those taken from coupons exposed outdoors only for a few days.According to the ImageJ developers, [24] "Default" is a variation of "IsoData", and therefore their similar results are not surprising.In the latter method, the threshold is set at the end of an iterative process that starts with the calculation of the two averages, one for the pixels identified as background and one for those identified as particles, given an initial threshold.The threshold is incremented until the threshold is larger than the mean of the two averages."Shanbhag" is an entropy-based method, which sets the threshold to the value that minimizes the sum of the so-called fuzzy membership coefficient of each. [25]he discussion above raises the question as to whether the thresholds using ImageJ are dependent on the type of soil (e.g., the optical nature of the particles found at a given location).Dark-field microscopy (DFM) utilized in this study is primarily dependent on light scattering due to the morphology of the surface. 
[26]Due to the operating principle of a light microscopy-based dark field microscope, it follows that the directly transmitted light is not collected, while the scattered light is observed and recorded.Using DFM, the images are therefore not very dependent on the bulk optical properties (i.e., bulk absorption and transmittance) of the particles that are deposited, but are instead primarily dependent on the geometry (shapes and sizes) of the particles.That stated, the scattered light in the case of a soiled surface can be determined by all of these factors.As mentioned in prior work, [21,27] the scattering can be described from Mie scattering theory, [28][29][30] which itself utilizes the real and imaginary index of refraction of the particles as an input.Those optical parameters are dependent on the chemical and optical nature of the particles that are deposited (e.g., the soil type).Different results and conclusions could arise if other types of microscopy (e.g., bright-field microscopy) are used.
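The iterative "IsoData" procedure described earlier in this section (incrementing the threshold until it exceeds the mean of the background and particle averages) can be sketched as below. This is a simplified reading of that description and is not claimed to reproduce ImageJ's "IsoData" or "Default" implementations; the function name and the two-level test histogram are hypothetical.

```python
import numpy as np

def isodata_like_threshold(hist):
    """Increment the threshold until it exceeds the midpoint between the
    mean intensity below it (background) and at or above it (particles)."""
    hist = np.asarray(hist, dtype=float)
    levels = np.arange(hist.size)
    occupied = np.flatnonzero(hist)
    first, last = int(occupied[0]), int(occupied[-1])
    for t in range(first + 1, last + 1):
        weight_below = hist[:t].sum()
        weight_above = hist[t:].sum()
        mean_below = (levels[:t] * hist[:t]).sum() / weight_below
        mean_above = (levels[t:] * hist[t:]).sum() / weight_above
        if t > (mean_below + mean_above) / 2.0:
            return t
    return last

# Hypothetical two-level image: 900 background pixels at intensity 50
# and 100 particle pixels at intensity 200.
hist = np.zeros(256)
hist[50] = 900
hist[200] = 100

t_isodata = isodata_like_threshold(hist)
print(t_isodata)
```

With two well-separated classes the iteration stops just above the midpoint of the two class means. When the histogram is unimodal, as in most micrographs of this dataset, the two class means depend strongly on a few bright pixels, which is one plausible reason for the high variability reported for these methods.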
Figure 4 reports how the minimum threshold distributions vary depending on the investigated method, for the whole dataset and in each individual location. As shown, the ranking found for the whole dataset holds with minor variations for all the countries but Qatar. Qatar is the location where the micrographs have the highest average intensity (twice that registered in Jordan, and more than three and four times those found in Australia and the USA, respectively) and this can possibly explain, at least in part, the results obtained at this location. This also suggests that the effectiveness of the various methods is possibly affected more by the density of the accumulated particles than by their type or shape. However, the lack of additional coupons exposed to intermediate-to-high soiling conditions does not make it possible to draw any final conclusions on this assertion. Previous studies, however, have also suggested that high particle densities can confound the results returned by ImageJ. [31] Even if the low-to-intermediate soiling levels investigated in this work are the most typical ones experienced by PV modules globally, high levels such as those found in the coupons from Qatar should nonetheless be investigated in future studies.
[Figure 4 caption: The top plot on the left ("All coupons") shows the results when all the coupons are considered. The remaining plots show the results specific to each investigated location. Methods are ordered in ascending order according to the minimum threshold in the "All coupons" plot. Outliers are not shown. The maximum threshold is always 255.]

Survey
Observations are characterized in studies as subjective and objective. To assess the former, a survey was conducted, whose details are given in the "Experimental Section". It consisted of a blind test showing nine micrographs, each one having its own particle counting images, processed with the different ImageJ thresholding methodologies investigated in this work. Eighteen experts, at different career stages, were involved from the PV and the CST communities. The nine micrographs represented a variety of conditions, from low to high area coverages, and showed a number of features, such as agglomerations of particles, scratches, and other types of deposits.
The results of the survey, in which experts selected the methods that best matched their particle counting expectations, are shown in Figure 5. The figure considers the percentage of times each method was selected by experts for a given micrograph. If all the experts selected the same method for a micrograph, the result would be 100%. Similarly, if a method was selected by no expert on a micrograph, the result for that micrograph would be 0%. Each boxplot shows the combined results for each method considering nine micrographs, one per country (i.e., it is calculated from nine data points, each showing the results for the indicated thresholding method on one of the micrographs).
Those taking the survey did not know beforehand the name of the ImageJ thresholding method that they were selecting. That said, "MaxEntropy" and "Triangle" are the methods that were selected most frequently. As suggested by its name, the "MaxEntropy" algorithm (method) selects the threshold that maximizes the interclass entropy. On the other hand, methods conceived for bimodal pixel intensity histograms ("Intermodes" and "Minimum") were not often selected, because the micrographs analyzed in this work had mostly unimodal distributions. As expected, "Percentile" was the least selected method (0% in the majority of micrographs), because of the unrealistic assumption for low-to-intermediate soiling conditions of pixels equally split between background and foreground. Interestingly, however, this was the most selected method for the micrograph taken on day 7 in Cape Verde. This is a heavily soiled micrograph, with a number of large bright particles and a pixel intensity histogram that has a smooth peak in the dark and a long right-leaning tail with a sudden mode at the brightest pixel intensity. For this reason, no expert selected "Triangle" for this same micrograph, as it left a large number of uncounted particles.
The wide ranges displayed in Figure 5 also demonstrate how different soiling can be, in terms of image analysis, at the different locations of the study. Overall, the results of the survey suggest that some methods might be more appropriate for image analysis of soiling micrographs. However, it can be seen that a one-size-fits-all solution is still elusive, and this can be attributed to at least two reasons. First, no method was selected by 100% of experts on any micrograph. Therefore, one can conclude that a visual analysis is not a sufficient way to validate the effectiveness of image analysis. Second, even if a successful method were found for one micrograph, it would not necessarily work for other micrographs, even if taken following the same microscopy procedure. In addition, without the use of a suitable soiling-related standard image, there is no way to determine which of the methods that were favored or selected in the survey are correct and accurate. This was also discussed in a previous publication. [21]

Spatial Distribution of Soiling
A previous study suggested that various image analysis parameters have either higher or lower variability. [21] However, this section solely focuses on the analysis of the area covered by particles (f). Indeed, because of its reported strong correlation with optical, electrical, or thermal losses, [17][18][19][20] this is thought to be the most relevant image analysis parameter for soiling studies. Also, building upon the previous findings of this study, the results presented in this section are obtained using the "Triangle" thresholding method.
Soiling is not necessarily uniformly distributed over the glass coupon surfaces. This means that micrographs taken at different spots on the same coupon might return different results. This is shown in Figure 6, where each boxplot represents 300 micrographs (a 10 × 10 array taken in 3 separate areas on each glass coupon) using the methodology described in Section 4. As the size of the boxplot increases, so does the level of soiling nonuniformity.
If the "Triangle" thresholding method is considered, a linear correlation can be observed between the mean and the standard deviation of the area coverages reported in Figure 6. This result, shown in Figure 7, suggests that a single micrograph might not be enough to characterize a full coupon. This is especially the case when the value of f is large. From Figure 7, one can see that the higher the level of soiling, as given by f, the larger the standard deviation and therefore the uncertainty. The ratio of the standard deviation to the mean at any point on the plot is the coefficient of variation. It is therefore appropriate to consider a margin of error and the associated sample size. Specifically, the required sample size can be calculated for different margins of error on the estimation of f, taking into account a 95% confidence interval. The results are shown in Figure 8. On average, it is found that 3-4 micrographs are needed if one can accept a ±0.05 error for the fractional area coverage. At least 20 samples (micrographs) are needed if the error is lowered to ±0.02. Typically, since the accumulation of soiling increases with time, the recommended number of samples grows over time. On average, the number of recommended samples after 28 days is 3-4 times higher compared to the first day.
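The text does not state the formula used to obtain the required sample size. A common choice, assumed here purely for illustration, is the normal-approximation expression n = (z·σ/E)², where σ is the standard deviation of f across micrographs, E is the accepted margin of error, and z is the quantile for the chosen confidence level. The function names and the standard deviation of 0.05 used below are hypothetical and are not taken from the dataset.

```python
import math
from statistics import NormalDist

def coefficient_of_variation(mean, std):
    """Relative dispersion: standard deviation divided by the mean."""
    return std / mean

def required_micrographs(std, margin, confidence=0.95):
    """Sample size from the normal approximation n = (z * std / margin)^2.

    Assumption: micrographs are independent samples of the coupon.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)  # about 1.96 for 95%
    return math.ceil((z * std / margin) ** 2)

# Hypothetical coupon with a standard deviation of 0.05 in f
std_f = 0.05
n_coarse = required_micrographs(std_f, margin=0.05)
n_fine = required_micrographs(std_f, margin=0.02)
print(n_coarse, n_fine)
```

Under this assumption the required number of micrographs grows with the square of σ/E, so tightening the margin from ±0.05 to ±0.02 multiplies the sample size by roughly six, and a more unevenly soiled coupon (larger σ) needs proportionally more images.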
For coupons with low soiling (such as those coming from Spain, USA, and Australia in this case), two samples are typically sufficient. For the coupons with the highest losses (such as those from Cape Verde and Qatar in the present dataset), a higher number of samples is required. Intermediate results are found for the rest of the coupons, with slightly higher numbers for those exposed in Chile and Jordan compared to those from Algeria and Morocco. Since micrographs can be stitched together, one can use a linear relationship to convert the recommended number of micrographs to a minimum area to be photographed.
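The linear conversion mentioned above can be sketched using the micrograph dimensions reported in the Experimental Section (1388 × 1040 pixels at 3.156 pixels per micrometer). The helper names are hypothetical, and the sketch assumes that stitched micrographs do not overlap.

```python
WIDTH_PX = 1388          # micrograph width in pixels
HEIGHT_PX = 1040         # micrograph height in pixels
SCALE_PX_PER_UM = 3.156  # pixels per micrometer

def micrograph_area_mm2():
    """Area imaged by a single micrograph, in square millimeters."""
    width_um = WIDTH_PX / SCALE_PX_PER_UM
    height_um = HEIGHT_PX / SCALE_PX_PER_UM
    return width_um * height_um / 1e6  # 1 mm^2 = 1e6 um^2

def minimum_area_mm2(n_micrographs):
    """Minimum area to photograph, assuming non-overlapping micrographs."""
    return n_micrographs * micrograph_area_mm2()

single_area = micrograph_area_mm2()
area_for_20 = minimum_area_mm2(20)
print(f"{single_area:.4f} mm^2 per micrograph, {area_for_20:.2f} mm^2 for 20")
```

Each micrograph therefore images roughly 0.14 mm², so even 20 micrographs cover only about 3 mm² of a coupon.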

Assumptions, Limitations, and Future Works
The aim of the present work is to highlight both the possibilities and the challenges associated with soiling estimation based on image analysis. This practice is quite common within the research community and has the potential to facilitate low-cost soiling monitoring and therefore its mitigation. However, despite its diffusion in the soiling-related scientific literature, the lack of standardized practices has so far limited the possibility of generalizing the results of the experimental investigations in the PV and CST fields. Through the analysis of 13 200 micrographs, this work was able to assess the applicability of methods for soiling studies, at least for DFM and for locations with low-to-intermediate losses. In addition, the identification of a minimum recommended sample size can help researchers and experts reduce the uncertainty of their investigations. However, this is just an initial contribution toward the production of general guidelines for such activities, as a number of limitations have to be taken into account, despite the unprecedented size of the sample population. These are discussed in this section and should be addressed in future studies.
First, it should be highlighted that the data collection period in this work was limited to approximately 1 month. Therefore, it should be kept in mind that the aforementioned results apply only to the indicated locations during a specific time period represented by the studied datasets. This means that, even if grouped by country, the results of Figure 8 should not be assumed representative for the investigated locations. The collection time is indeed too short to characterize the entire annual soiling profile of a site and does not allow for the evaluation of the local seasonality and interannual variability. Despite that, the results provide an indication of the uncertainty that one can encounter for a given soiling loss.
In addition, the present work provides a comparison of different thresholding methods, but it does not assess their accuracy, as also noted in the previous publication. [21] This indeed would require knowledge of the actual particle count and size distribution. However, this information is not available at the present time. Indeed, in the opinion of the authors of the present study, comparative studies on the accuracy of the image analysis methodologies could be conducted through one of two methodologies, each with its own potential and challenges. The first method could be through the use of a laser diffraction spectrometer, which employs a laser beam scattered from the particles to characterize the distribution of sizes. [32,33] However, this method requires the removal of the particles from the surface of the glass coupon, and therefore the destruction of the surface features and topology. More importantly, this step introduces a large uncertainty in the process, due to the removal or loss of some of the particles while others might adhere and remain on the glass coupon's surface.
The second approach could be the development of reference images, where particles of known number and size are displayed. This is currently being studied, as the images would need to replicate several features of real micrographs in order to avoid biased results. Real micrographs of soiled coupons, like those analyzed in this work, might present issues such as nonuniform background or uneven illumination, out-of-focus particles, translucent particles, surface scratches, and/or agglomerations of particles with and without similar characteristics.
Examples of the aforementioned issues are shown in Figure 9. The top row shows a micrograph with limited soiling and a large, bright, translucent halo. This generates a second peak in the pixel intensity histogram (shown in the middle), which is unusual in the present dataset, and confounds the results obtained using the "Triangle" method. As shown, indeed, the image analysis counts the large halo as a big particle. In cases like this in which there is a large halo, additional features would be required for ImageJ to return a physically meaningful result.
In the bottom row of Figure 9, the large white particles move the mode of the pixel intensity distribution toward the brightest side of the histogram. The mode is at 255, and a high threshold is identified by the "Triangle" method, leading to an erroneously small number of counted particles. For these reasons, wherever possible, it is always important to visually check the results of image analysis, even when the image processing algorithm is considered robust and reliable. Issues like the aforementioned ones would affect the particle counting and, as also discussed within the ImageJ wiki community, [22] might hinder the definition of a unique universally valid procedure. Despite that, it is believed that it is possible to provide the solar energy community with best practices and caveats when image analysis is employed for PV or CST soiling analyses, which has been the focus of this work.
Image analysis can return values for different parameters, such as the particles' perimeters, and its results can be used to calculate additional factors, such as the cleanliness level. [21] These can be more or less affected by variability and should be the subject of additional studies on the role of the threshold. In addition, it should be noted that the present work employs a single simple algorithm for image analysis, with the goal of comparing the role of the thresholding procedure. This approach was chosen because it appeared to be the most common among the experts involved in the previous study. [21] However, in reality, more complex algorithms could be employed, making use of different thresholding methods. Also, as discussed in another study, [21] the use of different menu entries in ImageJ within the same algorithm, such as Image > Auto Threshold instead of Image > Adjust > Threshold…, can produce dissimilar results. Similarly, the use of local thresholding [34] or preimplemented filters in ImageJ (e.g., FFT, IFFT, bandpass filter) as a preprocessing step could have led to different particle counting outcomes.
For these reasons, future works should evaluate the use of more complex algorithms, able to address the challenges faced by micrographs taken of coupons exposed under different soiling conditions. These same issues are currently hindering the development of reference micrographs and, therefore, the identification of accurate universal image analysis procedures. More studies should be conducted under a wider variety of conditions and experimental procedures to allow for the accumulation of more knowledge on this topic and a better quantification of the uncertainty of image analysis. Meanwhile, authors are encouraged to keep sharing their image analysis methodologies to make results more easily replicable.
Furthermore, it should be noted that the present work focuses only on ImageJ, the software that, as mentioned previously, has been predominantly used for image analysis in soiling studies. This is based on traditional image analysis processes. However, it must be acknowledged that other solutions are also available and can find application in soiling studies. These include, for example, artificial neural networks, which have been employed to estimate the losses from aerial pictures of soiled modules. [35] Finally, one should also take note that the present study utilized DFM. This has its strengths as well as limitations, based on the discussion in Section 2.1. Unlike in a PV panel, where both scattered light and the direct light through the particles eventually reach the PV cell, or for a CST mirror, where the scattered light is excluded, DFM relies mainly on the scattered light. Although it may be a useful technique to detect and count particles and estimate their particle size distribution, it is not the appropriate type of microscopy or imaging technique to probe other characteristics, such as the spectrum of the light collected by the solar energy conversion device (e.g., PV panel or CST system).

Conclusions
Image analysis can be an effective and low-cost tool for estimating and characterizing soiling in PV and CST systems. Indeed, if micrographs of soiling accumulated on glass coupons are taken, these can be analyzed using tools like ImageJ to calculate the area covered by soiling. This enables the estimation of losses and the evaluation of potential mitigation strategies. However, prior works have warned about potential inconsistencies of the image analysis results for soiling in solar energy applications.
This work further explores the uncertainty related to image analysis by making use of 13 200 micrographs of soiled coupons exposed at nine locations worldwide. First, the robustness of various thresholding methodologies is analyzed. It is found that, although it has the lowest variability, the "Percentile" approach should not be employed by soiling experts in locations with low-to-intermediate soiling, because it will assume near-to-half area coverage independently of the soiling conditions. This was also confirmed by the results of a survey conducted among experts, which found this method appropriate for the counting of soiling particles in very few cases. Similarly, methods that assume bimodal pixel intensity distributions did not work well in the investigated dataset, returning the minimum number of particles and being rarely selected by the experts. Overall, the "Triangle" method appears to be the best option among the investigated methodologies for the soiling conditions analyzed in this study. Indeed, it returns the minimum coefficient of variation and a mean deviation closer to zero. It was also one of the two methods that were most often selected as best by the participants in the survey. However, additional studies are needed to evaluate the accuracy of the various image analysis methods and to further contribute to identifying best practices for soiling estimation. In particular, the challenges associated with the use of laser diffraction for this purpose and the current lack of reliable reference images should be addressed by the R&D community in future studies.
To identify best practices that minimize the uncertainty in image analysis-based soiling estimation, the present study also investigated the potential nonuniform distribution of soiling on the glass coupons. It was found that, in order to keep the estimation error lower than ±5%, more than one micrograph should be taken per coupon. The recommended number of measurements typically increases with the soiling loss, and therefore it can be expected to be higher: 1) in locations with high soiling levels, and 2) as the number of days of exposure increases. With these aspects in mind, image analysis of particles deposited on the surfaces of PV and CST systems can be better understood, both for its utility and potential limitations. The sharing of more experimental data and results is essential to better understand the most effective practices to conduct such studies and the conditions in which the various methods perform better. However, to ensure the reproducibility of the results, future studies should always include a description of the employed methodologies. The absence of a clear methodology and the lack of a standardized practice hamper the ability to apply the findings of one study to other investigations, limiting the broader impact and utility of the research.

Experimental Section
Outdoor Soiling Collection: Glass coupons were deployed to nine locations worldwide, listed in Table 1, for a maximum period of 32 days. Coupons were mounted on PV modules (Figure 10) and exposed to natural soiling and cleaning events. Different replicates were mounted at each site, so that measurements could be taken at various times. On each measurement day, a coupon was removed from the experimental setup and shipped to Fraunhofer CSP, Germany, where microscopy was performed.
Microscopy: The soiled glass surfaces were examined by light microscopy using a Carl Zeiss Axio Scope A1 (black and white camera) with 20× magnification and dark-field imaging mode. Using a motorized XYZ linear stage and the light microscope's associated software tool, line scans were made for each sample from one edge of the sample to almost the opposite edge. For each scan, 10 images per row and 10 adjacent image rows were acquired, resulting in a total of 100 images for each area of the sample (left plot of Figure 11). Three separate areas were recorded in this way for each glass coupon. Each micrograph was 1388 pixels × 1040 pixels at a scale of 3.156 pixels μm⁻¹.
Image Analysis Process: Image analysis returns the number of particles on each micrograph (right plot of Figure 11). In this case, it also calculated the projected area of each particle in a given micrograph. An automated image analysis process was developed in Python 3.7 to automatically run a macro in ImageJ. [36] The macro, whose script is reported below, consisted of an algorithm programmed to 1) open each micrograph; 2) set the scale to 3.156 pixels μm⁻¹; 3) set the background as black, and the particles as white; 4) automatically determine the global threshold for a given thresholding method, using the Image > Adjust > Threshold… menu entry; 5) produce a file with the projected area of each particle on the image; and 6) save the pixel intensity distribution and the minimum and maximum threshold values.
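Steps 3) to 5) of the process above can be illustrated with a minimal pure-Python sketch. It is not the ImageJ macro used in this work: the function names, the toy image, and the threshold value are hypothetical, and the use of 8-connectivity to group pixels into particles is an assumption. Only the scale (3.156 pixels μm⁻¹) and the upper threshold of 255 are taken from the text.

```python
from collections import deque

PIXELS_PER_UM = 3.156  # scale reported for the micrographs


def particle_areas(image, min_threshold, max_threshold=255):
    """Return the projected area (in um^2) of each particle in a grayscale image.

    Pixels within [min_threshold, max_threshold] are treated as particles
    (bright on a dark background). Grouping by 8-connectivity is an assumption.
    """
    rows, cols = len(image), len(image[0])
    mask = [[min_threshold <= px <= max_threshold for px in row] for row in image]
    seen = [[False] * cols for _ in range(rows)]
    areas = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill over the 8 neighbours of each pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = 0
            while queue:
                y, x = queue.popleft()
                pixels += 1
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            # Convert the pixel count to an area using the image scale.
            areas.append(pixels / PIXELS_PER_UM ** 2)
    return areas


def fractional_area_coverage(image, min_threshold, max_threshold=255):
    """Fraction of the image pixels classified as particles."""
    total = len(image) * len(image[0])
    covered = sum(min_threshold <= px <= max_threshold
                  for row in image for px in row)
    return covered / total


# Hypothetical 4 x 5 toy image with two bright particles of two pixels each.
toy = [
    [0, 0, 0, 0, 0],
    [0, 200, 210, 0, 0],
    [0, 0, 0, 0, 90],
    [0, 0, 0, 0, 95],
]
areas = particle_areas(toy, min_threshold=50)
coverage = fractional_area_coverage(toy, min_threshold=50)
```

In this sketch the threshold is passed in by hand; in the process described above it is determined automatically by the selected thresholding method.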
Macro Script:
#@ String name
#@ String fname

The methods were compared using the coefficient of variation and the deviation from average. The coefficient of variation was employed to evaluate the variability returned by the various methodologies. It was calculated as the ratio between the standard deviation and the mean of each distribution; the higher the coefficient of variation, the more variable the results. Since it allows the variability of different types of parameters to be compared, it was also utilized in a previous ImageJ study. [21] The relative deviation from average was calculated as: $RD\,[\%] = \left(\frac{P}{\bar{P}} - 1\right) \cdot 100\%$, where $P$ is the mean value of a parameter calculated for a given location, day, and method, and $\bar{P}$ is the mean value returned by considering all the methods for the same parameter on the same day and location. A positive RD is returned if the method overestimates a parameter's value compared to the other methods.
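The two comparison metrics can be sketched in a few lines of Python. The method labels and coverage values below are hypothetical placeholders, not results from this study, and the use of the sample standard deviation (rather than the population one) is an assumption, since the text does not specify it.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Ratio between the (sample) standard deviation and the mean."""
    return stdev(values) / mean(values)


def relative_deviation(method_mean, all_methods_mean):
    """Relative deviation from average, in percent: (P / P_bar - 1) * 100."""
    return (method_mean / all_methods_mean - 1.0) * 100.0


# Hypothetical fractional area coverages for one location and day,
# one list of micrograph results per thresholding method.
coverage_by_method = {
    "method_a": [0.10, 0.12, 0.14],
    "method_b": [0.06, 0.08, 0.10],
}

# P: mean per method; P_bar: mean over all the methods.
method_means = {name: mean(vals) for name, vals in coverage_by_method.items()}
p_bar = mean(method_means.values())

cv = {name: coefficient_of_variation(vals)
      for name, vals in coverage_by_method.items()}
rd = {name: relative_deviation(p, p_bar) for name, p in method_means.items()}
```

With these placeholder values, `method_a` has a positive RD (it returns a larger coverage than the average of the methods) and `method_b` a negative one.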
The required number of measurements per location (n) was calculated using a sample size equation, [49] expressed as: $n = z_{\alpha/2}^{2} \cdot \sigma^{2} / TE^{2}$, where $z_{\alpha/2}$ is the z score for a standard normal distribution (where $\alpha$ represents the α-level), $\sigma$ is the standard deviation, and $TE$ is the target (or desired) error. In the present study, a confidence level of 0.95 was considered, which means that $\alpha/2 = 0.025$ and $z_{\alpha/2} = 1.96$.
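As a worked illustration of the sample size equation, the sketch below computes n for a given standard deviation and target error. The function name and the example standard deviation are hypothetical, and rounding n up to the next integer is an assumption not stated in the text. `statistics.NormalDist` requires Python 3.8 or later.

```python
import math
from statistics import NormalDist


def required_samples(sigma, target_error, confidence=0.95):
    """Sample size n = z_{alpha/2}^2 * sigma^2 / TE^2, rounded up (assumption)."""
    alpha = 1.0 - confidence
    # z score leaving alpha/2 in the upper tail (about 1.96 for 0.95).
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return math.ceil((z * sigma / target_error) ** 2)


# Hypothetical standard deviation of the fractional area coverage.
sigma_example = 0.05
n_loose = required_samples(sigma_example, target_error=0.05)
n_tight = required_samples(sigma_example, target_error=0.02)
```

Because n scales with the inverse square of the target error, tightening the target from ±0.05 to ±0.02 increases the required number of micrographs by a factor of about six.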
Survey: An anonymous survey was circulated among various soiling experts to evaluate the effectiveness of the various thresholding methodologies in particle counting. A copy of the survey is available in the Supporting Information.
The survey was developed on the Google Forms platform and was conducted on a group of 18 experts (6 researchers or scientists, 5 professors, 4 PhD students, 3 engineers). The experts who were invited to participate were 1) authors of a previous study, [21] 2) involved in the data collection campaign, or 3) currently affiliated with one of the groups involved in the data collection campaign.
The survey included nine micrographs among those investigated in this work, one per country. These were visually selected by the lead author to represent a variety of soiling conditions, from low to high fractional area coverages. The micrographs were also chosen to represent different exposure times, from 1 to 28 days. Finally, micrographs showing various features, such as agglomerations of particles, scratches, and other types of deposits, were included, in order to allow the experts to assess the performance of the methods in a variety of conditions.
The initial question sought to establish the participants' familiarity with ImageJ, a widely recognized image analysis software. The binary nature of the question allowed respondents to indicate whether they had previous experience with the tool. Sixteen of the 18 experts had performed image analysis, and 14 had used ImageJ before. This preliminary result underscored ImageJ's significant presence in the field of image analysis and its widespread utilization. This foundational understanding set the stage for subsequent sections of the survey, enabling a more comprehensive exploration of the participants' experiences, preferences, and insights related to ImageJ's functionalities and impact.
For each micrograph, 17 processed images were provided, produced with 17 ImageJ thresholding methods (see Table 2). The methods in the survey also included the "IJ_IsoData", which was not analyzed in the rest of the work. In the processed images, counted particles were red colored. Methods were numbered 1-17. The experts were not told the correspondence between method name and number. For each micrograph, the experts were asked to select the method(s) that best matched their expected particle counting. They could select as many methods as they wanted.

Figure 1. Two examples of the image analysis workflow for coupons exposed 1 day in Australia (top row) and 14 days in Chile (bottom row). Each row shows: the original micrograph, the pixel intensity distribution, and the particle counting results (counted particles are red colored).

Figure 3. Relative deviation from average for each ImageJ method when the number of particles (N), the average particles' area (A), and the fractional area coverage (f) are calculated. Positive values indicate that the method overestimates the parameter's value compared to the average of all methods.

Figure 2. Coefficient of variation for each ImageJ method when the number of particles (N), the average particles' area (A), and the fractional area coverage (f) are calculated.

Figure 4. Minimum threshold distributions for each ImageJ methodology. The top plot on the left ("All coupons") shows the results when all the coupons are considered. The remaining plots show the results specific to each investigated location. Methods are ordered in ascending order according to the minimum threshold in the "All coupons" plot. Outliers are not shown. The maximum threshold is always 255.

Figure 5. Results of the survey conducted on 18 experts. The boxplots show the number of times each method was selected as one of those returning the best particle counting for each micrograph. They are constructed using nine data points, one for each micrograph.

Figure 8. On average, it is found that 3-4 micrographs are needed if one can accept a ±0.05 error for the fractional area coverage. At least 20 samples (micrographs) are needed if the error is lowered to ±0.02. Typically, since the accumulation of soiling increases with time, the recommended number of samples grows over time. On average, the number of recommended samples after 28 days is 3-4 times higher compared to the first day. For coupons with low soiling (such as those coming from Spain, USA, and Australia in this case), two samples are typically

Figure 6. Top plot: distribution, as a boxplot, of fractional area coverages, measured with the "Triangle" method of ImageJ, for each country on the discrete data collection days (1, 3, 7, 14, 28) over the study period. Bottom plot: zoom in on the top graph on those days.

Figure 7. Standard deviation of the fractional area coverage versus the mean of the fractional area coverage distribution for each individual coupon. The data are calculated using the Triangle method. The colors indicate the installation location for the soiled glass coupons. The dark line represents the best fit, which follows this equation: $y = 0.209 \cdot x + 0.015$. The linear correlation returns an R² of 63.0% and a p-value < 0.05.

Figure 8. Sample size to achieve a target error at a confidence level of 0.95 in the calculation of the fractional area coverage. The color corresponds to the day on which the measurement was done. Day 2 for Cape Verde is not shown. Day 15 for Jordan and Day 32 for Spain are reported as Day 14 and Day 28, respectively, for conciseness.

Figure 9. Examples of micrographs that mislead the "Triangle" method. Each row shows: the original micrograph, the pixel intensity histogram, and the results of the particle counting (counted particles are red colored). The top row shows a micrograph from a coupon exposed for 1 day in the United States. The bottom row shows a micrograph from a coupon exposed for 7 days in Qatar.

Table 1. List of locations in which the coupons were deployed. Positive latitudes are north of the equator, and positive longitudes are east of the Prime Meridian.