Detecting clear‐sky periods from photovoltaic power measurements

A method for detecting clear‐sky periods from photovoltaic (PV) power measurements is presented and validated. It uses five tests on parameters characterizing the relationship between the measured PV power and the corresponding clear‐sky PV power. To estimate clear‐sky PV power, a PV model has been designed that takes as inputs the downwelling shortwave irradiance and its direct and diffuse components received at ground level under clear‐sky conditions, as well as the reflectivity of the Earth's surface and the extraterrestrial irradiance, all provided by the McClear service. In addition to McClear products, the PV model requires wind speed and temperature as inputs, taken from ECMWF twentieth century reanalysis ERA5 products. The performance of the proposed method has been assessed by visual inspection and by comparison with two well‐known algorithms that identify clear‐sky periods from broadband global and diffuse irradiance measurements on a horizontal surface. The assessment was carried out at two stations located in Finland offering collocated 1‐min PV power and broadband irradiance measurements. Overall, total agreement in discriminating clear‐sky and cloudy periods ranges between 84% and 97% (depending on the season) with respect to the two well‐known algorithms serving as reference. The disagreement, fluctuating between 6% and 15% depending on the season, primarily occurs when the PV module temperature is sufficiently high and/or when the sun is close to the horizon, where many more interactions occur between the radiation, the atmosphere and the ground surface.

When comparing solar irradiance estimates against relevant high-quality ground-based measurements, or when estimating the efficiency of photovoltaic (PV) systems during steady conditions, meteorologists and PV system operators need to select clear-sky instants (CSIs) over a measurement period (Ineichen, 2016; Lefèvre et al., 2013; Wandji Nyamsi et al., 2023; Wandji Nyamsi, Saint-Drenan, Arola, & Wald, 2023). A CSI, generally associated with a given measurement, is defined as a measurement instant at which no clouds are visible over the entire sky vault. A set of consecutive CSIs denotes a clear-sky period.
Numerous methodologies for selecting CSIs from broadband irradiance measurements have been developed and extensively evaluated (Bright et al., 2020; Gueymard et al., 2019; Lefèvre et al., 2013; Long & Ackerman, 2000; Reno & Hansen, 2016; Younes & Muneer, 2007). They typically rely on measurements of global irradiance received on a horizontal surface at ground level, denoted G, its direct component, denoted B, that is, irradiance coming from the direction of the sun, and the diffuse component, denoted D, that is, all irradiance coming from the remaining directions of the sky vault, so that G = B + D. A list of such methodologies can be found in Gueymard et al. (2019) or Bright et al. (2020). The choice of a methodology depends essentially on the type of measurement available in space and time. In that sense, measured G is more commonly available at meteorological stations than measured B and D because of instrument, maintenance and operation costs.
Considering the exponential growth in installed PV capacity and the expansion of PV networks in response to the increasing demand for renewable electricity production during the past decade (IRENA, 2019), PV power measurements have become increasingly available in space and time all over the world. This emerging network of solar PV installations constitutes an interesting new source of cloud information. From a meteorological perspective, there is a connection between solar electricity production (PV output) and cloud conditions, which could be utilized to identify CSIs following the principles of the above-mentioned solar radiation-based methods. PV output data will, however, have its own characteristics related to, for example, inverter performance, slope and inclination angles of the solar panels and their temperature-dependent efficiency. In addition, such detection of clear-sky periods from PV power measurements could be useful for the performance monitoring of PV systems through, for example, identification of degradation and soiling. The goal of this study is to present and test, as pioneering work, a method for detecting CSIs directly from PV power measurements. The algorithm by Reno and Hansen (2016) is chosen as a starting point, as it identifies clear-sky periods based only on broadband global horizontal irradiance measurements. The method is developed and tested using solar PV output data from two PV sites of the Finnish Meteorological Institute (FMI).
The study is organized as follows: Section 2 describes all the measured and modelled data used in this study. Section 3 presents an overview of the Reno and Hansen (2016) algorithm, abbreviated as RH16. Section 4 describes the developed method for detecting CSIs from PV power measurements. Section 5 presents and discusses the assessment of the newly developed method, carried out by visual inspection and by quantitative comparison with two widely used algorithms, together with possible explanations for the discrepancies between methodologies. Finally, the conclusions are given in Section 6.

2 | DATA USED IN THIS STUDY
All data used in this study can be freely accessed through public sources available on the web or are available from the authors upon request. Details on access are specified in this section and in the 'Data Availability' section.

2.1 | Irradiance and PV power measurements
All ground-based measurements used here were previously presented in Böök et al. (2020), alongside a detailed description. Briefly, 1-min high-quality measurements were collected at two meteorological stations of the FMI over the period from 01 January 2019 to 31 December 2021. The stations are located at FMI's offices in Helsinki (60.204° N, 24.961° E) and Kuopio (62.892° N, 27.634° E). At the Helsinki station, G and D are measured with two pyranometers, the one for D being fitted with a shadow ball and a sun tracker. Both are Kipp & Zonen CM11 pyranometers mounted 25 m above ground level (agl). PV power measurements are provided by a PV system split into two identically sized PV module sets with distinct maximum power point trackers (MPPT). Both sets consist of SolarWorld Protect SW 250 modules, with a nominal capacity of 10.5 kWp per array of series-connected PV modules, hereafter referred to as a string (Böök et al., 2020). The module material is polycrystalline silicon. The PV system is mounted on a flat roof at a height of 17 m agl with a fixed orientation, characterized by a PV tilt angle of 15° from the horizontal plane and a PV azimuth angle of 135° clockwise from north. Module temperatures are measured on the back side of each array of PV modules, at opposite corners.
At the Kuopio station, instruments similar to those described for Helsinki are used, with a few differences: G and D are measured at 16 m agl, with D measured by a Delta-T SPN1 pyranometer. The PV modules are SolarWatt Blue 60P, with a nominal capacity of 10.14 kWp per array of PV modules. The PV tilt and azimuth angles are 15° and 217°, respectively, and the modules are at 10 m agl. The FMI ground-based measurements are available from the authors upon request.

2.2 | ECMWF wind speed components and temperature
Among other data, the European Centre for Medium-Range Weather Forecasts (ECMWF) provides reanalysis variables of the eastward and northward components of the 10 m wind speed, denoted u and v respectively, as well as the air temperature at 2 m agl, denoted $T_{\mathrm{air}}$. These variables are obtained by data assimilation, combining model data and observations from around the world. They are considered to represent the atmospheric state fairly well.
In this study, u, v and $T_{\mathrm{air}}$ from ECMWF ERA5 data are selected. The data have a temporal resolution of 1 h, covering the period from 1940 onwards, and a spatial resolution of 0.25° latitude × 0.25° longitude (approximately 30 km) for the globe (Hersbach et al., 2023). For each of the two FMI stations, the hourly data of the closest pixel are resampled in time by linear interpolation to derive 1-min data. The resulting data serve as inputs to the PV model described in Section 4 for estimating PV power under clear-sky conditions. The ERA5 products have been downloaded from https://cds.climate.copernicus.eu/cdsapp#!/dataset/reanalysis-era5-single-levels?tab=form (last access: 01 March 2023).
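The temporal resampling described above can be sketched as follows. This is a minimal illustration, not the authors' processing chain: the function names and the sample temperatures are hypothetical, and whether interpolation is applied to u and v separately or to the resulting wind speed is an assumption left open here.

```python
import math

def wind_speed(u, v):
    """Horizontal wind speed from eastward (u) and northward (v) components."""
    return math.hypot(u, v)

def interpolate_hourly_to_minutes(hourly_values):
    """Linearly interpolate hourly samples to 1-min resolution.

    hourly_values[k] is the value at hour k. The result holds one value
    per minute, i.e. 60 * (len(hourly_values) - 1) + 1 entries.
    """
    minutes = []
    for k in range(len(hourly_values) - 1):
        start, end = hourly_values[k], hourly_values[k + 1]
        for m in range(60):
            minutes.append(start + (end - start) * m / 60.0)
    minutes.append(hourly_values[-1])
    return minutes

# Made-up hourly 2 m air temperatures (degrees Celsius), for illustration only.
t_air_hourly = [10.0, 13.0, 12.0]
t_air_1min = interpolate_hourly_to_minutes(t_air_hourly)
```

Linear interpolation keeps the hourly values unchanged at the full hours and cannot add sub-hourly variability, which is consistent with the limitation discussed later for the module temperature.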

2.3 | McClear products
The McClear model (Gschwind et al., 2019; Lefèvre et al., 2013) is based on a look-up-table approach built with the radiative transfer model libRadtran (Emde et al., 2016; Mayer & Kylling, 2005) and the improved correlated-k parametrization of Kato et al. (1999), named kato2andwandji in libRadtran (Wandji Nyamsi et al., 2014; Wandji Nyamsi, Arola, et al., 2015). This clear-sky model has been chosen because several users, including researchers and academics in various domains, have reported the high accuracy of its outputs when compared to appropriate ground-based measurements. For an exhaustive list of articles performing McClear validations, please refer to Wandji Nyamsi, Saint-Drenan, Arola, and Wald (2023). Given the geographic coordinates of any place in the world, it provides within a few seconds time series of G, D, B and the direct component at normal incidence, $B_N$, at ground level under clear-sky conditions, as well as the extraterrestrial irradiance on a horizontal plane, denoted $E_0$, for any period from 2004 until 2 days ago with different temporal summarizations (1, 15 and 60 min, 1 day and 1 month). The McClear v3 service exploits datasets from various databases, such as the Copernicus Atmosphere Monitoring Service (CAMS) and NASA's Moderate Resolution Imaging Spectroradiometer (MODIS) observations, characterizing the atmospheric state and surface type. The 1-min summarization is the one of interest here.
McClear products are freely accessible by machine-to-machine calls to the McClear Web service on the SoDa Service (Gschwind et al., 2006; www.soda-pro.com, last access: 01 March 2023) or manually through a web interface. In verbose mode, the data returned by the service contain 1-min values of CAMS data interpolated in space and time, namely, the optical depth of aerosols at 500 nm and the total column contents of water vapor and ozone. They also contain 1-min values of the solar zenith angle, denoted $\theta_S$, computed with the SG2 algorithm (Blanc & Wald, 2012), of the irradiance at the top of the atmosphere and at ground level, and of the ground albedo, denoted $\rho_g$. This mode was used to collect McClear products for the entire measurement period. Among McClear products, G, D, $B_N$, $\theta_S$ and $\rho_g$ are selected as inputs to the PV model; they were downloaded after registration from http://www.soda-pro.com/web-services/radiation/cams-mcclear (last access: 01 March 2023).

3 | OVERVIEW OF THE RENO AND HANSEN (2016) ALGORITHM IDENTIFYING CLEAR-SKY PERIODS WITH BROADBAND GLOBAL HORIZONTAL IRRADIANCE MEASUREMENTS
Reno and Hansen (2016) designed their algorithm by examining the diurnal variation of G under clear-sky conditions, which is represented by a smoothly varying curve. The RH16 algorithm uses measured G and the corresponding clear-sky estimate, hereafter named clear-sky G. For a given time period, the RH16 algorithm computes five parameters characterizing the steadiness, profile and amount of measured G over time and compares these parameters to those obtained from clear-sky G. These are (1) the mean value, (2) the maximum value, (3) the total length of the sequence of line segments connecting the points of temporal variation, (4) the standard deviation of the slope between sequential points in the time series, normalized by the mean value of measured G, and (5) the maximum difference between changes in measured and clear-sky G. This leads to five criteria with specific thresholds for classifying the given time period as clear or not.
The first criterion is that the mean values of measured and clear-sky G should be close enough, meaning that the difference between the mean values of measured and clear-sky G should be less than a threshold value. The second criterion is like the first one but uses the maximum values instead of the mean values. The third criterion is also like the first one but uses the total lengths of the sequence of line segments connecting the points of the time series instead of the mean values. The fourth criterion relies on the standard deviation of the slope between sequential points in the temporal variation of measured G, normalized by the mean value of measured G, which should be less than a threshold value. Finally, the fifth criterion is that the maximum difference between changes in measured G and changes in clear-sky G over the given time period should be less than a threshold value.
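The five criteria can be sketched for a single time window as below. This is an illustrative reading of the description above, not the reference RH16 implementation: the function names are hypothetical, the threshold values are placeholders chosen for the example, the use of the sample standard deviation is an assumption, and all differences are treated symmetrically through absolute values.

```python
import math
import statistics

def window_quantities(g, dt=1.0):
    """Return (mean, maximum, line length, slopes) for one series g in W m-2.

    dt is the sampling step in minutes.
    """
    slopes = [(g[i + 1] - g[i]) / dt for i in range(len(g) - 1)]
    length = sum(math.hypot(g[i + 1] - g[i], dt) for i in range(len(g) - 1))
    return statistics.fmean(g), max(g), length, slopes

def window_is_clear(meas, clear, thr):
    """Apply the five criteria to one window of measured and clear-sky G."""
    m_mean, m_max, m_len, m_slopes = window_quantities(meas)
    c_mean, c_max, c_len, c_slopes = window_quantities(clear)
    # Criterion 4: slope variability normalized by the mean of measured G.
    sigma = statistics.stdev(m_slopes) / m_mean
    # Criterion 5: largest gap between measured and clear-sky changes.
    slope_gap = max(abs(a - b) for a, b in zip(m_slopes, c_slopes))
    return (abs(m_mean - c_mean) < thr["mean"]
            and abs(m_max - c_max) < thr["max"]
            and abs(m_len - c_len) < thr["length"]
            and sigma < thr["sigma"]
            and slope_gap < thr["slope"])

# Placeholder thresholds for a 10-min window of 1-min data (assumed values).
thresholds = {"mean": 75.0, "max": 75.0, "length": 10.0, "sigma": 0.005, "slope": 8.0}

clear_sky_g = [500.0 + 2.0 * i for i in range(10)]
steady_g = [x - 5.0 for x in clear_sky_g]          # follows the clear-sky curve
disturbed_g = list(steady_g)
disturbed_g[5] = 300.0                             # one sharp dip

steady_is_clear = window_is_clear(steady_g, clear_sky_g, thresholds)
disturbed_is_clear = window_is_clear(disturbed_g, clear_sky_g, thresholds)
```

A single sharp dip is enough to fail the criteria on slope and line length even when the window mean stays close to the clear-sky mean, which is the behaviour the steadiness criteria are meant to capture.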
The authors acknowledged that the threshold value for each criterion must be established empirically and adjusted depending on the number of measurements. A given time period that passes all five tests is declared clear, and each individual measurement within that time period is declared clear as well. The RH16 algorithm may be applied to an entire day by moving the time window over the individual measurements. In their case study, the authors established threshold values and assessed their algorithm for G measurements of 1-min temporal resolution with a 10-min moving window. In the validation procedure, the RH16 algorithm performed well when compared to the Long and Ackerman (2000) algorithm (abbreviated as LA00), which additionally requires measurements of D and $B_N$. Therefore, one clear advantage of the RH16 algorithm is that it is solely based on measured G.

4 | METHOD FOR DETECTING CLEAR-SKY PERIODS FROM PV POWER MEASUREMENTS
The method for detecting clear-sky periods proposed in this research is inspired by the RH16 algorithm, with important adjustments and modifications summarized as follows. The first is that our method uses PV power measurements, with their own characteristics related to, for example, inverter performance, slope and inclination angles of the solar panels and their temperature-dependent efficiency, instead of the broadband irradiance measurements of G originally used by the RH16 algorithm. The second is the use and calibration of a clear-sky PV model applicable to any place in the world, instead of the clear-sky model for broadband irradiance estimates originally used by the RH16 algorithm. The third is the modified mathematical formulation of each criterion, including newly established threshold values. The method is divided into three steps: (1) estimating PV power under clear-sky conditions, primarily following an approach described, elaborated upon, tested and validated by Böök et al. (2020); (2) calibrating the PV model with an initial set of CSIs detected from PV power measurements; and (3) optimizing the detection of CSIs.

4.1 | Estimated PV power under clear-sky conditions
This first step of the method is split into four sub-steps described hereafter in more detail and primarily inspired by Böök et al. (2020) [...] and (3) the diffuse irradiance on a tilted PV plane from the sky dome, denoted $D^{T}_{\mathrm{clear}}$, so that $G^{T}_{\mathrm{clear}}$ is mathematically computed as follows: [...] where $\theta_i$ is the angle of incidence between the Sun's rays and the tilted PV plane and $\theta_T$ is the PV tilt angle.
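As an illustration of how a tilted-plane irradiance can be assembled from a direct, a ground-reflected and a sky-diffuse component, the sketch below uses the textbook isotropic transposition. The isotropic sky and ground assumptions, the function name and the input values are all assumptions made for this example; they are not taken from the study, whose own transposition formulas are not restated here.

```python
import math

def tilted_clear_sky_irradiance(b_n, d, g, rho_g, theta_i_deg, theta_t_deg):
    """Return (direct, ground-reflected, sky-diffuse, total) in W m-2.

    b_n: direct irradiance at normal incidence, d: diffuse horizontal,
    g: global horizontal, rho_g: ground albedo,
    theta_i_deg: angle of incidence on the tilted plane,
    theta_t_deg: tilt angle of the plane.
    Assumes an isotropic sky and isotropic ground reflection.
    """
    theta_i = math.radians(theta_i_deg)
    theta_t = math.radians(theta_t_deg)
    # No direct contribution when the sun is behind the plane.
    direct = b_n * max(math.cos(theta_i), 0.0)
    ground = rho_g * g * (1.0 - math.cos(theta_t)) / 2.0
    sky = d * (1.0 + math.cos(theta_t)) / 2.0
    return direct, ground, sky, direct + ground + sky

# Made-up clear-sky inputs for a plane tilted by 15 degrees.
components = tilted_clear_sky_irradiance(
    b_n=800.0, d=100.0, g=650.0, rho_g=0.2, theta_i_deg=40.0, theta_t_deg=15.0
)
```

For a small tilt such as 15°, the ground-reflected term is only a few W m⁻², so the total is dominated by the direct and sky-diffuse components.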
4.1.2 | Calculating the total effective irradiance of the PV panel
The irradiance $G^{T}_{\mathrm{clear}}$ hitting the PV surface is partly absorbed, and the remaining part is reflected away from the PV surface. The absorbed part, which is effectively converted into PV power, is called the total effective irradiance of the PV panel, denoted $G^{T}_{\mathrm{eff\_clear}}$, and can be estimated by subtracting the angular reflection losses from $G^{T}_{\mathrm{clear}}$. Other losses due to the spectral variation of irradiance, the accumulation of snow, dirt or dust and any shadowing effects are neglected. The Martin and Ruiz (2001) modelling approach is used and provides factors accounting for these angular reflection losses on $B^{T}_{N\_\mathrm{clear}}$, $D^{T}_{g\_\mathrm{clear}}$ and $D^{T}_{\mathrm{clear}}$, denoted $\alpha_{BN}$, $\alpha_{dg}$ and $\alpha_{d}$, respectively, and computed as follows: [...] where $a_r = 0.159$ is the empirical angular losses coefficient, representative of a traditional polycrystalline silicon module, while $c_1 = 4/(3\pi)$ and $c_2 = -0.074$ correspond to fitting parameters for approximate analytical solutions. Finally, $G^{T}_{\mathrm{eff\_clear}}$ is computed as follows: [...]

4.1.3 | Deriving PV module temperature
The Sandia PV Array Performance Model (SAPM; King et al., 2004) is used to derive the PV module temperature, denoted $T_{\mathrm{module}}$, as follows: [...] where $a = -3.47$ and $b = -0.0594$ are empirical coefficients used to estimate the PV module back-surface temperature. Following the SAPM model, the cell temperature $T_{\mathrm{cell}}$ is obtained by adding to $T_{\mathrm{module}}$ the irradiance on the module, divided by 1000 W m$^{-2}$ and multiplied by $\Delta T$, where $\Delta T = 3$ °C is the temperature difference between the cell and the module back surface at an irradiance of 1000 W m$^{-2}$. ws is the wind speed computed from the two wind components using the wind profile power law relationship for neutral stability conditions (Chen et al., 1998), and $h_{\mathrm{mod}}$ is the module's height expressed in meters. $T_{\mathrm{amb}}$ is assumed to be equal to $T_{\mathrm{air}}$.
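The two sub-steps above can be sketched with the commonly published forms of the Martin and Ruiz angular-loss model and of the SAPM temperature model. The coefficient values come from the text. Everything else is an assumption made for illustration: the function names, the reading of the factors as fractional losses (so that each component is multiplied by one minus its factor), and the choice of the plane-of-array irradiance as input to the temperature model.

```python
import math

A_R = 0.159                   # angular losses coefficient (from the text)
C1 = 4.0 / (3.0 * math.pi)    # fitting parameter c1 (from the text)
C2 = -0.074                   # fitting parameter c2 (from the text)

def beam_loss(theta_i_deg, a_r=A_R):
    """Fractional angular loss on the direct component."""
    cos_i = max(math.cos(math.radians(theta_i_deg)), 0.0)
    return 1.0 - (1.0 - math.exp(-cos_i / a_r)) / (1.0 - math.exp(-1.0 / a_r))

def diffuse_losses(tilt_deg, a_r=A_R):
    """Return (sky-diffuse loss, ground-reflected loss); needs tilt > 0."""
    beta = math.radians(tilt_deg)
    sky_term = math.sin(beta) + (math.pi - beta - math.sin(beta)) / (1.0 + math.cos(beta))
    gnd_term = math.sin(beta) + (beta - math.sin(beta)) / (1.0 - math.cos(beta))
    sky = math.exp(-(C1 * sky_term + C2 * sky_term ** 2) / a_r)
    gnd = math.exp(-(C1 * gnd_term + C2 * gnd_term ** 2) / a_r)
    return sky, gnd

def effective_irradiance(direct, ground, sky, theta_i_deg, tilt_deg):
    """Tilted irradiance remaining after angular reflection losses."""
    loss_sky, loss_gnd = diffuse_losses(tilt_deg)
    return ((1.0 - beam_loss(theta_i_deg)) * direct
            + (1.0 - loss_gnd) * ground
            + (1.0 - loss_sky) * sky)

def sapm_temperatures(g_poa, t_amb, ws, a=-3.47, b=-0.0594, delta_t=3.0):
    """Return (module back-surface temperature, cell temperature) in deg C."""
    t_module = g_poa * math.exp(a + b * ws) + t_amb
    t_cell = t_module + g_poa / 1000.0 * delta_t
    return t_module, t_cell

# Example: 1000 W m-2 on the plane, 20 deg C ambient, calm conditions.
t_module_example, t_cell_example = sapm_temperatures(1000.0, 20.0, 0.0)
```

With these coefficients, a tilt of 15° gives a sky-diffuse loss of a few percent and a much larger loss on the ground-reflected component, which matters little in practice because that component is small at low tilt.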

4.1.4 | Estimated PV power
Finally, instantaneous PV power estimates under clear-sky conditions, denoted $PV^{e}_{\mathrm{clear}}$, with the superscript 'e' denoting estimated, are derived from the relative efficiency, denoted $\eta_{\mathrm{rel}}$, following the formulation of Huld et al. (2010) as follows: [...] where $PV_{\mathrm{STC}}$ is the total nominal capacity of the PV system, also called PV power at Standard Test Conditions (STC), and $k_{1,2,3,4,5,6}$ are the standard PV performance coefficients given in Table 1 (Huld et al., 2011).
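A sketch of this last sub-step is given below, using the relative-efficiency expression as it is commonly published for the Huld et al. model and the six coefficients listed in Table 1. The function names, the normalization constants (1000 W m⁻² and 25 °C) and the handling of non-positive irradiance are assumptions of this example.

```python
import math

# Coefficients k1..k6 as listed in Table 1 of the text.
K = (-0.017162, -0.040289, -0.004681, 0.000148, 0.000169, 0.000005)

def relative_efficiency(g_eff, t_cell, k=K):
    """Relative efficiency for effective irradiance (W m-2) and cell temperature (deg C)."""
    g = g_eff / 1000.0       # irradiance normalized to 1000 W m-2
    t = t_cell - 25.0        # deviation from 25 deg C
    ln_g = math.log(g)
    return (1.0 + k[0] * ln_g + k[1] * ln_g ** 2
            + t * (k[2] + k[3] * ln_g + k[4] * ln_g ** 2)
            + k[5] * t ** 2)

def pv_power_clear(g_eff, t_cell, pv_stc):
    """Estimated clear-sky PV power, in the same unit as pv_stc."""
    if g_eff <= 0.0:
        return 0.0           # no production without irradiance
    return pv_stc * (g_eff / 1000.0) * relative_efficiency(g_eff, t_cell)

# Example: a 10.5 kWp string at 800 W m-2 and a cell temperature of 45 deg C.
power_kw = pv_power_clear(800.0, 45.0, 10.5)
```

At 1000 W m⁻² and 25 °C the relative efficiency equals one by construction, so the estimated power equals the nominal capacity.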

4.2 | Calibrating PV power estimates under clear-sky conditions
Uncertainties of instantaneous $PV^{e}_{\mathrm{clear}}$ originate from various sources and lead to biased estimates (Urraca et al., 2018). The global irradiance and its direct and diffuse components, used as inputs, are subject to uncertainty, as are the other inputs and the PV model itself. Following the ISO Guide to the Expression of Uncertainty in Measurement (ISO/IEC, 2008), the relative uncertainty, as the range associated with a 95% confidence interval, can be characterized by the relative standard deviation, denoted $\sigma$. Uncertainties in PV power estimates are introduced by (1) the clear-sky model that estimates irradiances on a horizontal plane, often reaching up to $\sigma_1 = 8\%$ (Ineichen, 2016; Wandji Nyamsi, Saint-Drenan, Arola, & Wald, 2023); (2) the transposition model, with an uncertainty of approximately $\sigma_2 = 5\%$ (Loutzenhiser et al., 2007); (3) the power rating at STC related to $PV_{\mathrm{STC}}$, which depends on module manufacturers and reaches up to $\sigma_3 = 3\%$, for example, for a thin-film module (Dirnberger & Kräling, 2013); (4) the degradation of PV modules, which combines a very rapid reduction of PV efficiency within the first few days of exposure and a gradual reduction over years, approximated by $\sigma_4 = 6\%$, for instance for UV effects (Kaaya et al., 2021); and (5) other modelling errors such as spectral effects, set as $\sigma_5 = 5\%$. Finally, the total combined uncertainty of $PV^{e}_{\mathrm{clear}}$, denoted $\sigma_0$, can be calculated by the rule of squares as (Thevenard & Pelland, 2013):
$\sigma_0 = \sqrt{\sigma_1^2 + \sigma_2^2 + \sigma_3^2 + \sigma_4^2 + \sigma_5^2}$
Hereafter, $\sigma_0$ is assumed to be the total margin of error associated with $PV^{e}_{\mathrm{clear}}$.
An initial set of CSIs is required for calibration. For a given day of the measurement period, from sunrise to sunset, five tests are applied to every N-minute series of $PV^{e}_{\mathrm{clear}}$ and of measured instantaneous PV power, denoted $PV^{m}$ (superscript 'm' for measured), obtained with a moving N-minute time window:
• Test #1 relies on the fact that, under clear-sky conditions, the mean values of the N-minute series of $PV^{m}$ and $PV^{e}_{\mathrm{clear}}$ should be close enough. This is expressed by requiring the absolute value of their relative difference, taken with respect to the mean value of $PV^{e}_{\mathrm{clear}}$, to be lower than the threshold set as $\sigma_0$: $\left| \overline{PV^{m}} - \overline{PV^{e}_{\mathrm{clear}}} \right| / \overline{PV^{e}_{\mathrm{clear}}} < \sigma_0$, where $\overline{Y} = \frac{1}{N} \sum_{i=1}^{N} Y_i$ and $Y_i$ is $PV^{e}_{\mathrm{clear}}$ or $PV^{m}$ at the i-th minute. This test differs from the original RH16 algorithm in that the relative difference is used instead of the difference.
• Similarly to test #1, and instead of the mean value, test #2 relies on the closeness of both maximum values: $\left| \max(PV^{m}) - \max(PV^{e}_{\mathrm{clear}}) \right| / \max(PV^{e}_{\mathrm{clear}}) < \sigma_0$, where $\max(Y)$ is the maximum value over the $PV^{e}_{\mathrm{clear}}$ or the $PV^{m}$ time series. This test differs from the original RH16 algorithm in that the relative difference is also used instead of the difference.
• Similarly to test #1, and instead of the mean value, test #3 relies on the closeness of both total line length values, mathematically formulated as follows: [...] where $L = \sum_{i=1}^{N-1} \sqrt{(Y_{i+1} - Y_i)^2 + (t_{i+1} - t_i)^2}$ and $t_i$ is the i-th minute. $L^{e}_{\mathrm{clear}}$ and $L^{m}$ are the total lengths of the sequence of line segments connecting the points of temporal variation for the $PV^{e}_{\mathrm{clear}}$ and the $PV^{m}$ time series, respectively. This test differs from the original RH16 algorithm in that the relative difference is also used instead of the difference.
• Test #4 relies on the standard deviation of the rate of change in PV power measurements, normalized by the mean value of the PV power measurements, which, under clear-sky conditions, should be less than [...]
• Test #5 relies on the maximum difference between changes in $PV^{e}_{\mathrm{clear}}$ and $PV^{m}$, normalized by the absolute difference between consecutive $PV^{e}_{\mathrm{clear}}$ values. The latter should be less than $\sigma_0$ and is mathematically formulated as follows: [...] This test differs from the original RH16 algorithm because of the introduction of the normalization term.
For this study, thresholds $L_0$ and $\tau_0$ were set empirically to 100 and 0.005, respectively, for 10-min series of PV power and a moving 10-min time window. These thresholds were established by visually investigating clear-sky and cloudy periods over one full year with both calibrated $PV^{e}_{\mathrm{clear}}$ and $PV^{m}$. For different lengths of time series and of the sliding time window, the thresholds $L_0$ and $\tau_0$ should be adjusted. Only N-minute series of $PV^{m}$ that simultaneously pass the five tests are retained as N-minute periods under clear-sky conditions, and their corresponding minutes are classified as clear.

TABLE 1 Standard PV performance coefficients for polycrystalline silicon module (Huld et al., 2011).
k1          k2          k3          k4         k5         k6
−0.017162   −0.040289   −0.004681   0.000148   0.000169   0.000005
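The combined uncertainty and the first two tests lend themselves to a short worked example. The sketch below combines the five relative uncertainties quoted in the text by root-sum-of-squares, which gives a value close to 12.6%, and then applies tests #1 and #2 in the relative-difference form described above. The function names and the sample series are hypothetical. Tests #3 to #5 are left out because they depend on the thresholds $L_0$ and $\tau_0$ in forms that are not restated here.

```python
import math
import statistics

SIGMAS = (0.08, 0.05, 0.03, 0.06, 0.05)   # sigma_1 .. sigma_5 from the text

def combined_uncertainty(sigmas=SIGMAS):
    """Root-sum-of-squares combination of independent relative uncertainties."""
    return math.sqrt(sum(s * s for s in sigmas))

def relative_difference(measured, estimated):
    """Absolute relative difference with respect to the estimated value."""
    return abs(measured - estimated) / estimated

def passes_tests_1_and_2(pv_m, pv_e_clear, sigma_0):
    """Tests #1 (means) and #2 (maxima) on one N-minute window."""
    test_1 = relative_difference(statistics.fmean(pv_m),
                                 statistics.fmean(pv_e_clear)) < sigma_0
    test_2 = relative_difference(max(pv_m), max(pv_e_clear)) < sigma_0
    return test_1 and test_2

sigma_0 = combined_uncertainty()

# Made-up 10-min windows of PV power in W.
pv_e_clear = [5000.0 + 10.0 * i for i in range(10)]
pv_m_close = [0.95 * x for x in pv_e_clear]    # 5% below the estimate
pv_m_far = [0.80 * x for x in pv_e_clear]      # 20% below the estimate

close_passes = passes_tests_1_and_2(pv_m_close, pv_e_clear, sigma_0)
far_passes = passes_tests_1_and_2(pv_m_far, pv_e_clear, sigma_0)
```

Using a relative rather than an absolute difference makes the same threshold usable for PV systems of different nominal capacity.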
This process is repeated from sunrise to sunset to retrieve a set of CSIs for the given day. Then, relevant CSIs are kept and gathered with the clear-sky detections of other days to cover the month, and so on. Both instantaneous $PV^{e}_{\mathrm{clear}}$ and $PV^{m}$ at those retrieved CSIs are then used to determine the calibration coefficient, denoted $\beta_0$, in order to minimize the relative bias and relative root mean square error as much as possible. Note that $\beta_0$ can be determined, for instance, on a monthly or annual basis, depending on the needs of the users.

4.3 | Detecting CSIs
For each day, the previously described process for the initial detection of CSIs is applied as is in an iterative scheme with calibrated $\{PV^{e}_{\mathrm{clear}}\}_j$ for the j-th iteration so that: [...] where $\beta_{j-1}$ is now a monthly calibration coefficient for the j-th iteration considering $PV^{e}_{\mathrm{clear}}$. This allows a reasonable number of CSIs to be retrieved. Iterating in this way also takes into account the possible variability of atmospheric conditions from one day to another. The iteration stops either when j = 20 or when $\beta_j - \beta_{j-1} \leq 0.00001$.
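The iterative scheme can be sketched as a loop that alternates between detection and recalibration. Several elements below are assumptions made for illustration and are not taken from the study: the calibration coefficient is computed as a least-squares scale factor between measured and estimated power at the detected CSIs, the calibrated estimate is taken as the coefficient times the uncalibrated estimate, the convergence check uses the absolute change of the coefficient, and `detect_by_relative_difference` is a simplified per-minute stand-in for the five window tests.

```python
def least_squares_scale(pv_m, pv_e_clear, indices):
    """Scale factor minimizing the squared error at the given indices."""
    num = sum(pv_m[i] * pv_e_clear[i] for i in indices)
    den = sum(pv_e_clear[i] ** 2 for i in indices)
    return num / den

def iterate_calibration(pv_m, pv_e_clear, detect, max_iter=20, tol=1e-5):
    """Alternate clear-sky detection and recalibration until convergence.

    detect(pv_m, calibrated) must return the indices classified as clear.
    Returns the final coefficient and the last set of clear indices.
    """
    beta = 1.0
    clear_idx = []
    for _ in range(max_iter):
        calibrated = [beta * e for e in pv_e_clear]
        clear_idx = detect(pv_m, calibrated)
        if not clear_idx:
            break                      # nothing detected, keep current beta
        new_beta = least_squares_scale(pv_m, pv_e_clear, clear_idx)
        converged = abs(new_beta - beta) <= tol
        beta = new_beta
        if converged:
            break
    return beta, clear_idx

def detect_by_relative_difference(pv_m, calibrated, threshold=0.126):
    """Hypothetical stand-in detector: per-minute relative difference test."""
    return [i for i, (m, c) in enumerate(zip(pv_m, calibrated))
            if c > 0.0 and abs(m - c) / c < threshold]

# Made-up data: the model overestimates by 10%; the last two minutes are cloudy.
pv_e_clear = [100.0] * 10
pv_m = [90.0] * 8 + [40.0, 30.0]
beta, clear_minutes = iterate_calibration(pv_m, pv_e_clear, detect_by_relative_difference)
```

In this example the coefficient settles at 0.9 after two iterations and the two cloudy minutes stay excluded throughout.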

5 | VALIDATION
The performance of the developed method was assessed by visual inspection and quantitatively by comparison with the two well-known RH16 and LA00 algorithms. Unlike the former, the latter requires both measured G and D, which are available at both stations. The assessment is carried out in all seasons except winter, given the frequently snowy conditions at that time of year. For illustration purposes, the focus is on March, June and September 2020 at both stations. A sliding 10-min window is applied for clear-sky detection. The selected measurement day is 12 June 2020. All data are normalized, that is, divided by the nominal peak capacity for PV power or by the extraterrestrial irradiance at normal incidence for irradiance. Measured PV power and irradiance data are plotted against the black left and magenta right y-axes, respectively. With respect to the black left y-axis, the green dot is the measured PV power $PV^{m}$ from one of the PV strings (string #2), labelled 'Meas DC2'; the blue cross is the estimated clear-sky PV power $PV^{e}_{\mathrm{clear}}$, labelled 'Esti DC', obtained by combining McClear products with the PV model and applying the monthly calibration factor; the black plus sign is the measured PV power from string #2 at CSIs obtained from the RH16 algorithm using measured G, labelled 'CSI-RH16'; and the red dot is the measured PV power from string #2 at CSIs obtained from the new method, labelled 'CSI-WL23'. With respect to the magenta right y-axis, measured G is plotted as a magenta dot labelled 'Meas G', and its values at CSIs obtained from the LA00 algorithm are plotted as cyan asterisks labelled 'CSI-LA00'.

5.1 | Visual inspection
Based on both the RH16 and LA00 algorithms applied to measured G, this specific day is almost fully clear, with a smooth parabola shape, except for some periods close to sunrise and sunset or, more generally, when the sun is close to the horizon. It is worth mentioning, as reported by its authors, that the LA00 algorithm tends to perform less accurately for $\theta_S \geq 80°$. Visually, a clear-sky day is also detected by the WL23 method with PV power measurements. Therefore, one can conclude that the methodologies agree overall, irrespective of the type of measurements used. Nevertheless, discrepancies are seen, for instance, between 08:00 and 10:00 UTC, delimited by the light-yellow area, and these require further investigation, as described in the following paragraph.
Between 08:00 and 10:00 UTC, one notices that CSIs are detected by the RH16 algorithm using measured G as well as by the LA00 algorithm with measured G and D whereas several minutes are classified as cloudy ones by the WL23 method using PV m .It is worth mentioning that two independent measurement instruments are involved here so that, for instance, something may occur on the PV sensor but not affect the irradiance sensors.
A zoom in on the PV power from 08:00 to 10:00 UTC, shown as an inset in Figure 1, indicates noticeable deviations between measured and estimated PV power, leading to a failure of several tests so that a large portion of the time window is classified as not being under clear-sky conditions. Also shown in the inset is the module temperature, denoted $T_{\mathrm{module}}$ (plotted on the dark orange right y-axis), which peaks at 47 °C at 09:00 UTC and reaches a local minimum of ca. 42 °C close to 09:45 UTC. It is known that PV efficiency is temperature-dependent, such that the relative efficiency typically decreases by 0.5% with each 1 °C increase in cell temperature (Dubey et al., 2013). Thus, the temperature-dependent PV efficiency together with the variations in $T_{\mathrm{module}}$ explains the ripples in $PV^{m}$, which eventually cause the method to categorize a large part of this time window as cloudy. In this context, it is also worth noting that the original ERA5 wind speed and ambient temperature (see Section 2.2) have a relatively coarse resolution (1 h, 30 km). Although they are resampled in time to 1-min resolution by linear interpolation at the closest pixel to the selected location, such 1-min resampled data may not always capture atmospheric features occurring at very short space scales and timescales. In addition, the temperature model employed cannot capture the dynamic behaviour of module temperature, that is, the heat capacity of the module is not taken into account. In real life, changes in irradiance and wind speed lead to changes in module temperature, but with a slight delay, with a time constant of roughly 10 min (Barry et al., 2020). This behaviour is not considered with a static temperature model.

FIGURE 1 Diurnal variation of PV power as well as global horizontal irradiance. Measured PV power is the green dot labelled 'Meas DC2'. Estimated clear-sky PV power after monthly calibration is the blue cross labelled 'Esti DC'. Measured PV power for clear-sky minutes from RH16 (our own) algorithm is the black plus sign (red dot) labelled 'CSI-RH16' ('CSI-WL23'). Measured global irradiance is the magenta dot labelled 'Meas G'. The CSI from LA00 algorithm labelled 'CSI-LA00' is a cyan asterisk. The sub-plot shows the temporal variation of PV power and module temperature with black left and dark orange right y-axis, respectively, between 08:00 and 10:00 UTC.

Figures 2 and 3 display the clear-sky detection of the presented method over the first and second half of June 2020 at the Helsinki station. Similar graphs for March and September 2020 are displayed in Appendix A. The green dot is the measured PV power from string #2 labelled 'Meas DC2'. The blue dot is the estimated PV power labelled 'Esti DC', derived from the clear-sky PV model after monthly calibration is applied. The red dot is the measured PV power from string #2 at CSIs from the presented method, labelled 'CSI-WL23'. For an almost completely clear-sky day, that is, when there is apparently no visible cloud in the entire sky vault over the whole day, each day looks similar to 12 June 2020 shown in Figure 1. By contrast, the morning and entire afternoon of 10 June 2020 experienced significant variability, leading to low and high $PV^{m}$, apparently due to clouds between the sun and the observer. Nevertheless, the same date shows a few clear-sky periods from 06:00 to 08:00 UTC. Such variability is observed throughout both 16 and 29 June 2020, so that the entire day can be classified as being under cloudy-sky conditions.

5.2 | Comparison with other algorithms
A quantitative assessment of the WL23 method is carried out over June 2020 by comparing it with the RH16 and LA00 algorithms. All three methodologies, including the WL23 method, were applied to the appropriate data for each day between sunrise and sunset, covering the full month. Then, each minute was classified as being under clear-sky or cloudy-sky conditions. Table 2 reports the assessment results as the percentage of minutes between sunrise and sunset for which the clear-sky or cloudy-sky classification matches between methodologies, for both stations. Statistics for Kuopio are in brackets. Similar results are given in Tables B1 and B2 in Appendix B for March and September 2020. The comparison shows an overall agreement under clear-sky and cloudy-sky conditions for more than 84% of minutes regardless of the station, with a better agreement with the RH16 algorithm. More cloudy periods are observed in Kuopio than in Helsinki.
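The agreement statistics described above amount to a two-by-two cross-tabulation of per-minute classifications. The sketch below shows one way to compute it; the function name, the dictionary keys and the sample flags are hypothetical.

```python
def agreement_percentages(flags_a, flags_b):
    """Cross-tabulate two per-minute classifications (True means clear).

    Returns percentages of minutes for each combination, keyed as
    'method A / method B', plus the total agreement.
    """
    if len(flags_a) != len(flags_b) or not flags_a:
        raise ValueError("inputs must be non-empty and of equal length")
    counts = {"clear/clear": 0, "cloudy/cloudy": 0,
              "clear/cloudy": 0, "cloudy/clear": 0}
    for a, b in zip(flags_a, flags_b):
        key = ("clear" if a else "cloudy") + "/" + ("clear" if b else "cloudy")
        counts[key] += 1
    n = len(flags_a)
    result = {key: 100.0 * value / n for key, value in counts.items()}
    result["agreement"] = result["clear/clear"] + result["cloudy/cloudy"]
    return result

# Made-up classifications for five minutes from two methods.
method_a = [True, True, False, False, True]
method_b = [True, False, False, False, True]
stats = agreement_percentages(method_a, method_b)
```

Reporting the two disagreement cells separately, rather than only the total agreement, shows which method tends to be the stricter one.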
The total discrepancy is evaluated to 13% and 16% in Helsinki and Kuopio, respectively, when, for instance, the WL23 method classifies the minute cloudy but the RH16 algorithm classified it as clear.This disagreement is partially explained as previously mentioned mainly due to the use of distinct measurement instruments and for situations when the PV module temperature is high enough.In addition, the agreement difference when comparing with RH16 and LA00 algorithms is partially explained as follows: during a given period with visible clouds close to sun, measured G may be slightly affected, making RH16 algorithm classify it as a clear-sky period while measured D is significantly sensitive making LA00 algorithm classify it as a cloudy-sky period.
Further investigations are carried out by displaying the histogram (Figure 4) of the θ S ranges of this total discrepancy for Helsinki during June 2020. More than half of the cases occur when the sun is close to the horizon and a quarter of the cases when the PV module temperature is high enough. For the [70°, 90°] bin, that is, when the sun is close to the horizon, multiple interactions, that is, reflection and scattering, between sun rays and air molecules, clouds and the ground surface occur along the long optical paths of sun rays in the atmosphere. At very large θ S, PV power measurements are low and even close to the usual uncertainties affecting models and measurements, so that the application of the developed method with a noticeable confidence level could be restricted to θ S ≤ 80°.
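A histogram of this kind can be sketched as follows, assuming per-minute solar zenith angles in degrees and boolean clear-sky flags from two methodologies. The bin edges, the function name `discrepancy_histogram` and the sample values are hypothetical; only the [70°, 90°] bin is taken from the text.

```python
import numpy as np

def discrepancy_histogram(theta_s_deg, clear_a, clear_b, bin_edges=(0, 30, 50, 70, 90)):
    """Count disagreeing minutes per solar zenith angle bin.

    Returns the bin edges, the number of discrepancy cases per bin and
    the share of each bin in the total discrepancy (in percent).
    """
    theta = np.asarray(theta_s_deg, dtype=float)
    a = np.asarray(clear_a, dtype=bool)
    b = np.asarray(clear_b, dtype=bool)
    disagree = a != b  # minutes labelled differently by the two methodologies
    counts, edges = np.histogram(theta[disagree], bins=bin_edges)
    total = counts.sum()
    share = 100.0 * counts / total if total > 0 else np.zeros(len(counts))
    return edges, counts, share

# Synthetic example (hypothetical values, not station data)
theta_s = [35, 45, 60, 72, 75, 80, 85, 88]
flags_a = [True, True, False, True, False, False, True, False]
flags_b = [True, False, False, False, True, True, True, True]
edges, counts, share = discrepancy_histogram(theta_s, flags_a, flags_b)
print(counts, share)
```

In this synthetic case, four of the five disagreeing minutes fall in the last bin, which mirrors the qualitative finding that most discrepancies occur at large θ S.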

| CONCLUSIONS
In this study, a novel method for detecting clear-sky periods from PV power measurements has been developed and assessed. It is inspired by the Reno and Hansen (2016) algorithm, which uses broadband global irradiance measurements on a horizontal plane. The proposed method uses a combination of McClear products such as estimated clear-sky global, diffuse and direct irradiances on a horizontal plane as well as ground albedo. In addition, the ECMWF 20th century reanalysis ERA5 wind speed components and air temperature are used; altogether, these are the inputs to the PV model designed for estimating PV power at ground level under clear-sky conditions. The principle is to compare measured and estimated clear-sky PV power in terms of steadiness, profile and absolute magnitude throughout the day by means of five parameters that lead to five tests with established threshold values. The proposed method has been validated at two stations located in Finland offering collocated 1-min PV power and broadband irradiance measurements.
Regardless of the location, visual inspection shows that the method performs well on days with a smooth parabola shape, except for some periods close to sunrise/sunset or when the module temperature is sufficiently high. The quantitative assessment was also carried out with two well-known algorithms identifying clear-sky periods with broadband global and diffuse irradiances on a horizontal plane. Because two distinct measurement instruments are involved, a specific disturbance may affect one instrument but not the other. Considering the universality of this study, the proposed method has the potential to be applied at any place in the world where PV power measurements are available, based on the global reach of both the McClear and ERA5 products. It is emphasized, however, that local conditions such as seasonal snow cover and soiling will need to be considered. Such a method could be easily extended for detecting CSIs from UV irradiance and photosynthetically active radiation measurements combined with their respective clear-sky models (Wandji Nyamsi, Espinar, et al., 2015; Wandji Nyamsi et al., 2017; Wandji Nyamsi et al., 2019; Wandji Nyamsi et al., 2021).

RH16
approach. To carry out this approach, initially, 1-min values of G, D, B N, E O, θ S and ρ g were collected from the McClear service as previously described in Subsection 2.3. Hereafter, G, D, B N and E O from McClear are referred to as G clear, D clear, B N_clear and E O_clear, respectively, for the sake of simplicity.

4.1.1 | Transposing irradiances on a horizontal plane onto a tilted PV plane

Each instantaneous G clear, D clear and B N_clear were then transposed into irradiances on a tilted PV plane resulting in three components: (1) direct component of G T (with superscript T denoting irradiance transposed onto a tilted PV plane) at normal incidence, denoted B T N_clear, with

Figure 1
Figure 1 displays an example of the temporal variation of PV power and irradiance at Helsinki station. Similar observations were seen in the data from Kuopio station. The selected measurement day is 12 June 2020. All data are normalized, that is, divided by the nominal peak capacity for PV power or by the extra-terrestrial irradiance at normal incidence for irradiance. Measured PV power and irradiance data are plotted against the black left and magenta right y-axes, respectively. With respect to the black left y-axis, the green dot is the measured PV power PV m from one of the PV strings, string #2, labelled 'Meas DC2'; the blue cross is the estimated clear-sky PV power PV e clear, labelled 'Esti DC', obtained by combining McClear products with the PV model and applying the monthly calibration factor; the black plus sign is the measured PV power from string #2 plotted for CSIs obtained from the RH16 algorithm using measured G, labelled 'CSI-RH16'; and the red dot is the measured PV power from string #2 plotted for CSIs obtained from the new method, labelled 'CSI-WL23'. With respect to the magenta right y-axis, measured G is plotted as a magenta dot labelled 'Meas G', and the corresponding values for CSIs obtained from the LA00 algorithm, labelled 'CSI-LA00', are plotted as cyan asterisks. Based on both the RH16 and LA00 algorithms applied to the irradiance measurements, this specific day is almost a fully clear-sky day with a smooth parabola shape, except for some periods close to sunrise and sunset or, more generally, when the sun is close to the horizon. It is worth mentioning, as reported by the authors, that the LA00 algorithm tends to perform less accurately for θ S ≥ 80°. Visually, a clear-sky day is also detected by the WL23 method with PV power measurements. Therefore, one can conclude that both methodologies agree overall, irrespective of the type of measurements used. Nevertheless, discrepancies are seen, for instance, between 08:00 and 10:00 UTC, delimited by the light-yellow area, and these require further investigations as indicated in the following paragraph.

Between 08:00 and 10:00 UTC, one notices that CSIs are detected by the RH16 algorithm using measured G as well as by the LA00 algorithm with measured G and D, whereas several minutes are classified as cloudy by the WL23 method using PV m. It is worth mentioning that two independent measurement instruments are involved here so that, for instance, something may occur on the PV sensor but not affect the irradiance sensors. A zoom in on the PV power from 08:00 to 10:00 UTC, shown as an inset in Figure 1, indicates noticeable deviations between measured and estimated PV power, leading to a failure of several tests so that a large portion of the time window is classified as being not under clear-sky conditions.

FIGURE 2 Visual illustration of clear-sky detection for Helsinki from 1 June 2020 to 15 June 2020. Measured PV power is represented by green dots and red dots indicate minutes that are detected as clear. Estimated calibrated PV power is represented by blue dots.

FIGURE 3 Same as Figure 2 but from 16 June 2020 to 30 June 2020.

TABLE 2 Percentage of minutes between sunrise and sunset classified as being under clear-sky and cloudy-sky conditions for each algorithm during June 2020.

FIGURE 4 Histogram of discrepancy cases as a function of θ S range for Helsinki during June 2020.

FIGURE A4 Same as Figure A1, but for the second half of September 2020.

TABLE B1 Percentage of instants between sunrise and sunset classified as being under clear-sky and cloudy-sky conditions for each algorithm for March 2020. Statistics for Kuopio are in brackets.
Perez et al. (1990) and depends on G clear, D clear, B N_clear, θ S, Φ S the solar azimuth angle computed with the SG2 algorithm (Blanc & Wald, 2012), θ T, Φ T the PV azimuth angle, the air mass m(θ S) computed from the Kasten and Young (1989) formulation and the extra-terrestrial irradiance at normal incidence, denoted E ON_clear, which, as an input of the Perez et al. (1990) model, was obtained by dividing E O_clear by cos(θ S). Mathematical expressions of θ i and m(θ S) are given as follows:

D T clear is computed with the Perez et al. (1990) model recently updated by
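As a hedged illustration of the angle of incidence θ i and the air mass m(θ S) mentioned above, the sketch below uses the standard geometric expression for the incidence angle on a tilted plane and the widely quoted form of the Kasten and Young (1989) air mass formula. The function names are illustrative assumptions, θ T is assumed here to be the PV tilt angle, and all angles are in degrees.

```python
import numpy as np

def cos_incidence(theta_s_deg, phi_s_deg, theta_t_deg, phi_t_deg):
    """Cosine of the incidence angle theta_i on a tilted plane.

    theta_s: solar zenith angle, phi_s: solar azimuth angle,
    theta_t: plane tilt angle (assumed meaning), phi_t: plane azimuth angle.
    """
    ts = np.radians(theta_s_deg)
    tt = np.radians(theta_t_deg)
    dphi = np.radians(np.asarray(phi_s_deg, dtype=float) - np.asarray(phi_t_deg, dtype=float))
    return np.cos(ts) * np.cos(tt) + np.sin(ts) * np.sin(tt) * np.cos(dphi)

def air_mass_kasten_young(theta_s_deg):
    """Relative optical air mass after Kasten and Young (1989).

    Valid for zenith angles below about 96 degrees; larger angles give NaN.
    """
    theta = np.asarray(theta_s_deg, dtype=float)
    return 1.0 / (np.cos(np.radians(theta)) + 0.50572 * (96.07995 - theta) ** -1.6364)

# Example: sun at 60 degrees zenith, due south; hypothetical plane tilted 30 degrees, facing south
cos_i = cos_incidence(60.0, 180.0, 30.0, 180.0)   # incidence angle of 30 degrees
m_60 = air_mass_kasten_young(60.0)                # slightly below 2
print(cos_i, m_60)
```

For a horizontal plane (tilt of 0°), the first function reduces to cos(θ S), which is consistent with dividing E O_clear by cos(θ S) to obtain the value at normal incidence.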

Meteorological Applications Science and Technology for Weather and Climate
Statistics for Kuopio are in brackets.
algorithm, which uses broadband global irradiance measurements on a horizontal plane. The proposed method uses a combination of McClear products such as estimated clear-sky global, diffuse and direct irradiances

APPENDIX B: Tables reporting the percentage of instants between sunrise and sunset classified as being under clear-sky and cloudy-sky conditions matching with methodologies for both stations. Statistics for Kuopio are in brackets.

TABLE B2 Percentage of instants between sunrise and sunset classified as being under clear-sky and cloudy-sky conditions for each algorithm for September 2020. Statistics for Kuopio are in brackets.