Improving the Estimation of the Diffuse Component of Photosynthetically Active Radiation (PAR)

Most weather forecasting models are not able to accurately reproduce the great variability existing in the measurements of the diffuse component of photosynthetically active radiation (PAR; 400–700 nm) under all sky conditions. Based on the well‐known relationship between the diffuse fraction (k) and the clearness index (kt), this study addresses improvements in estimations by proposing adaptations of previous models, which were previously applied only to the total solar irradiance (TSI; 280–3,000 nm). In order to reproduce this variability, additional parameters were introduced. The models were tested employing a multisite database gathered at the Mediterranean basin. Since Artificial Neural Network (ANN) models are not limited to fixed coefficients to predict the diffuse fraction of PAR (kPAR), these types of models are more accurate than empirical ones, reaching determination coefficients (r2) up to 0.998. However, the simpler linear model proposed by Foyo‐Moreno et al. (2018), https://doi.org/10.1016/j.atmosres.2017.12.012 shows a similar performance to the ANN models, directly predicting the diffuse component of PAR (PARDiffuse) from TSIDiffuse, with a r2 up to 0.997. Results obtained here also determine that the most important variables for estimating PARDiffuse are kt or kt,PAR, and the apparent solar time (AST). Therefore, PARDiffuse can be modeled using TSI measured in most radiometric stations, reaching r2 up to 0.858 for empirical models and 0.970 for ANN models. This modified approach will allow for the very accurate construction of long‐term data series of PARDiffuse in regions where continuous measurements of PAR are not available.


10.1029/2023JD039256 2 of 16
As in the spectral range of total solar irradiance (TSI; 280-3,000 nm), the global PAR irradiance (PAR Global) reaching the Earth's surface can be divided into a direct and a diffuse component. The diffuse component of PAR (PAR Diffuse) is especially important since plant photosynthesis tends to increase under diffuse light conditions (Gui et al., 2021; Mercado et al., 2009). Mercado et al. (2009) evaluated the impact of changes in diffuse radiation of PAR on the global land carbon sink. Other studies also reported that an increase in PAR Diffuse causes an increased light use efficiency by plants (Gu et al., 2002; Kanniah et al., 2012; Zhou et al., 2021), since the diffuse radiation tends to produce less canopy photosynthetic saturation (Gu et al., 2002). Because this diffuse component can promote vegetation photosynthesis, this is known as the diffuse fertilisation effect (DFE; Gui et al., 2021).
However, despite the interest of the applications that derive from deepening the knowledge of PAR, several authors have shown that there is a scarcity of experimental measurements of PAR radiation at the surface, and especially of its diffuse component (Ferrera-Cobos et al., 2020a; Niu et al., 2019; Wang et al., 2016). Surface PAR is not a common observation at meteorological stations. This is particularly true in the Mediterranean area (Di Biagio et al., 2009). Therefore, considering the scarcity of PAR radiation measurements, different models have been proposed in the literature for its estimation. Notable among these models are those that estimate PAR from TSI from surface measurements, from satellite data averages (e.g., Foyo-Moreno et al., 2017; Hao et al., 2019; Peng et al., 2015; Vindel et al., 2018), or from radiation spectral measurements (Trisolino et al., 2016). García-Rodríguez et al. (2022) also proposed two different ways to model PAR from meteorological indices: firstly, by a multilinear regression model, and secondly, by using artificial neural networks (ANN). Another very common approach is to consider the PAR-TSI fraction as a constant when estimating PAR radiation (e.g., Janjai et al., 2015; Yu et al., 2015). However, this diversity of models to obtain PAR values has no equivalent for the diffuse component of PAR, for which a much more modest number of models have been proposed (e.g., Foyo-Moreno et al., 2018; Jacovides et al., 2010; Lozano et al., 2022).
In this context, the aim of this study is to carry out a comparative analysis of several empirical and ANN models used to estimate the diffuse component of PAR, with the goal of improving its estimation, which can be applied to optimize crop management practices, increase prediction accuracy, and optimize solar panel performance, among other uses. Some of the models are based on the k-k t relationship for PAR, while others estimate PAR Diffuse directly. The article is structured as follows. The experimental setup and data quality are described in Section 2. Section 3 includes the description of the empirical and ANN models together with the statistical analysis, while Section 4 presents the results and discussion, in terms of a comparative analysis between ANN and empirical models. Finally, the main conclusions are given in Section 5. The Glossary is a list of abbreviations to assist the reader.

Experimental Setup and Data Control Quality
An experimental dataset with surface measurements of global and diffuse PAR and TSI in Granada (Spain, 37.16°N 3.61°W, 680 m a.s.l., years 1994 and 1995), Almería (Spain, 36.83°N 2.41°W, 21 m a.s.l., years 1993-1995), and Renon (Italy, 46.42°N 11.28°E, 1,735 m a.s.l., years 2014 and 2015) has been used in this work. The data were acquired at 1-min intervals for each station and hourly values were generated from them. Therefore, data were stored at 1-hr average intervals. Previous studies using 1-min data have shown that the functional forms analyzed in this study can also be applied at higher temporal resolution despite differences in dispersion; for example, Engerer (2015) and Yang and Gueymard (2020) employ the k-k t relationship in TSI with a temporal resolution of 1 min. The period of study of each station (2 or 3 years) includes a wide range of seasonal conditions and solar zenith angles.
In Granada and Almería, PAR was measured with two Licor Li-190SA quantum sensors (Lincoln, NE, USA) and TSI with two CM-11 radiometers (Kipp & Zonen), with one instrument of each kind mounted on a polar axis shadowband in order to measure PAR Diffuse and TSI Diffuse. The Li-190SA has a relative error lower than 5%, while the CM-11 directional error is lower than 10 Wm −2 for solar zenith angles (θ z) up to 80° (Kipp & Zonen, 2000). In Renon, PAR was measured by a BF2 sunshine sensor (Delta-T Devices, Burwell, United Kingdom). This device uses an array of silicon photodiodes and a shading pattern on the radiometer dome to determine the diffuse component. The BF2 has a relative error lower than 15%; more details can be found in Foyo-Moreno et al. (2018). The diffuse measurements in both spectral ranges were corrected following the Batlles et al. (1995) method, and a conversion factor of 4.57 μmol m −2 s −1 /Wm −2 (McCree, 1972) was used to convert the PAR measurements to energy units. Further details on the instrumentation can be found in Foyo-Moreno et al. (2018) and Lozano et al. (2021, 2022).
An in-depth quality control analysis was performed to detect and remove low-accuracy and anomalous data. Firstly, only those measurements recorded at θ z < 80° have been used in order to avoid the cosine response error in solar radiation measurements. Secondly, we employed two tests based on the clearness index (k t), defined as the ratio between global irradiance and extraterrestrial global irradiance, both on a horizontal surface, and on the diffuse fraction (k), defined as the ratio between diffuse and global irradiance. The expressions to estimate the clearness index (k t,PAR) and the diffuse fraction (k PAR) in the PAR range are:

$$k_{t,\mathrm{PAR}} = \frac{\mathrm{PAR}_{\mathrm{Global}}}{I_{\mathrm{TOA,PAR}}}, \qquad k_{\mathrm{PAR}} = \frac{\mathrm{PAR}_{\mathrm{Diffuse}}}{\mathrm{PAR}_{\mathrm{Global}}}$$

where I TOA,PAR is the PAR irradiance at the top of the atmosphere (TOA), which can be computed as the product of the eccentricity correction factor of the Earth's orbit (E 0), the cosine of θ z, and the solar constant for the PAR range (I SC,PAR = 531.8 Wm −2; Gueymard, 2018). PAR Global is the sum of the components of PAR, that is, the direct and diffuse components. Therefore, the constraints 0 < k PAR < 1 and 0 < k t,PAR < 1 were applied. Thirdly, based on the dependence between the solar radiation and θ z, it is possible to parametrize two envelopes (upper and lower) of the data using a linear equation. The upper envelope corresponds to the maximum values (clear skies for global PAR and overcast skies for diffuse PAR), while the lower envelope corresponds to the minimum values (Foyo-Moreno et al., 2017, 2018). In order to avoid outliers and extreme values, these envelopes have been computed according to the 1st and 99th percentiles. Only the data that remained between these two envelopes, both for PAR Global and PAR Diffuse, were retained. Finally, a visual inspection was performed after the above tests to detect any remaining anomalous data (mainly due to voltage malfunctioning and outliers). The final dataset comprised 2,578 records for Granada, 4,471 for Almería and 4,714 for Renon.
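The first two quality-control steps can be illustrated with a minimal Python sketch. It computes k t,PAR and k PAR for a single hourly record and applies the θ z < 80° limit and the (0, 1) range constraints. The function names and argument layout are assumptions made for this example; the eccentricity factor E 0 is passed in rather than computed, and the percentile-envelope test and the visual inspection are not reproduced.

```python
import math

I_SC_PAR = 531.8  # solar constant for the PAR range, W m^-2 (Gueymard, 2018)


def par_indices(par_global, par_diffuse, zenith_deg, e0):
    """Return (kt_par, k_par) for one record, irradiances in W m^-2.

    e0 is the eccentricity correction factor of the Earth's orbit,
    supplied by the caller (hypothetical interface for this sketch).
    """
    i_toa_par = I_SC_PAR * e0 * math.cos(math.radians(zenith_deg))
    kt_par = par_global / i_toa_par
    k_par = par_diffuse / par_global
    return kt_par, k_par


def passes_basic_qc(par_global, par_diffuse, zenith_deg, e0):
    """Apply the zenith-angle limit and the open-interval (0, 1) constraints."""
    if zenith_deg >= 80.0 or par_global <= 0.0:
        return False
    kt_par, k_par = par_indices(par_global, par_diffuse, zenith_deg, e0)
    return 0.0 < kt_par < 1.0 and 0.0 < k_par < 1.0
```

A record with a diffuse value larger than the global value, or one taken at θ z = 85°, would be rejected by `passes_basic_qc`.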

Methodology
To estimate PAR Diffuse, two types of approaches have been widely applied until now. The first one uses the well-known dependence of k on k t. This relationship has been widely studied in the scientific literature in the TSI spectrum. It is used to estimate the diffuse component of the solar radiation because it carries much less uncertainty than the relation between the absolute values of diffuse and global irradiances (e.g., Badarinath et al., 2007a, 2007b; Meloni et al., 2006). In fact, this relationship is also employed in several studies to make estimations in other solar spectral ranges such as the TSI or ultraviolet (UV) spectral ranges (e.g., Ridley et al., 2010; Sánchez et al., 2017). Figure 1 shows the relationship between those non-dimensional indexes for PAR, that is, k PAR and k t,PAR. Subsequently, PAR Diffuse is obtained by multiplying the estimated ratio (k PAR) by PAR Global. The second approach for estimating PAR Diffuse uses the absolute values of the irradiances involved, so that the estimation of PAR Diffuse is direct.
To carry out the analysis of both approaches, the dataset described in the previous section was employed. This dataset has been averaged into hourly timesteps and randomly divided into two subsets: (a) a subset containing 75% of the data, to fit the models and obtain the empirical coefficients as well as to train the ANN, and (b) a subset composed of the remaining 25% of the data, for the validation and comparison of the empirical models and the ANN models.
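A random 75%/25% partition of this kind can be sketched in a few lines of Python. The function name, the fixed seed, and the use of `random.Random` are assumptions for illustration only; the source does not state how the random division was implemented.

```python
import random


def split_fit_validation(records, fit_fraction=0.75, seed=0):
    """Randomly split records into a fitting subset and a validation subset."""
    shuffled = list(records)
    # A seeded generator keeps the split reproducible (seed value is arbitrary).
    random.Random(seed).shuffle(shuffled)
    n_fit = round(fit_fraction * len(shuffled))
    return shuffled[:n_fit], shuffled[n_fit:]
```

Every record ends up in exactly one of the two subsets, so the validation statistics are computed on data not used for fitting.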

Empirical Models
Most of the models analyzed in this study are based on relationships proposed in the literature to estimate the diffuse component of TSI. All of them present functional forms that are easy to fit in order to obtain their empirical coefficients. In addition, most of them involve independent variables that only require PAR Global data and solar geometry factors, calculated from the date and time at which each measurement is recorded, which favors their application worldwide to generate long-term time series.

Figure 1. Relationship between k PAR and k t,PAR for the experimental measurements at Granada for the 2-year period analyzed (1994 and 1995).
The first model (M1) analyzed in this work is based on the model proposed by Reindl et al. (1990), who analyzed a multilinear relationship between k in the shortwave range and several variables, including k t and solar elevation. Their two main findings are: (a) k t is the most important predictor of the diffuse fraction in the TSI range for cloudy skies (medium and low values of k t), and (b) under clear skies (high values of k t) the importance of k t dramatically decreases and the solar position becomes more relevant. Accordingly, the model has been applied to the PAR range, predicting k PAR using k t,PAR and the cosine of θ z as input variables, which form the expression:

$$k_{\mathrm{PAR}} = a_1 + a_2\, k_{t,\mathrm{PAR}} + a_3 \cos\theta_z$$

where a i are the fitting coefficients.
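Fitting a multilinear model of this kind reduces to ordinary least squares. The sketch below assumes the form k PAR = a1 + a2·k t,PAR + a3·cos θ z (the ordering and indexing of the coefficients are assumptions) and solves the normal equations with a small Gaussian-elimination helper. Both function names are hypothetical, and the source does not state which fitting routine was used.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Assumes a is square and non-singular (no check is performed).
    """
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_m1(kt_par, cos_zenith, k_par):
    """Least-squares fit of k_par = a1 + a2*kt_par + a3*cos_zenith."""
    rows = [[1.0, kt, cz] for kt, cz in zip(kt_par, cos_zenith)]
    # Normal equations: (X^T X) a = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, k_par)) for i in range(3)]
    return solve_linear(xtx, xty)
```

The same design-matrix approach extends to the polynomial models by adding columns for the higher-order terms and the extra predictors.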
The second model (M2) is an adaptation to the PAR range of the model proposed by Ridley et al. (2010), based on a logistic function that tries to reproduce the differences in the behavior of k PAR for different intervals of k t,PAR. Jacovides et al. (2010) and Kathilankal et al. (2014) have analyzed different versions of this logistic model in the PAR spectral range including different independent variables. In particular, in our work the original and most complete version of the model is analyzed in the PAR range, as follows:

$$k_{\mathrm{PAR}} = \frac{1}{1 + \exp\left(b_1 + b_2\, k_{t,\mathrm{PAR}} + b_3\, \mathrm{AST} + b_4\, \alpha + b_5\, k'_{t,\mathrm{PAR}} + b_6\, \Psi_{\mathrm{PAR}}\right)}$$

where b i are the fitting coefficients, AST is the apparent solar time (in hours), computed following the Iqbal (1983) equation, α is the solar elevation in degrees, k' t,PAR is the daily clearness index and Ψ PAR is the so-called persistence index. All these variables are introduced in order to reproduce the diffuse fraction variability. The solar elevation, α, accounts for the increase in Rayleigh scattering as α decreases, while AST considers the differences in the atmosphere between morning and afternoon. k' t,PAR is a measure of the PAR daily variability, primarily associated with clouds. Finally, Ψ PAR is a measure of the atmospheric stability in terms of k t: the closer Ψ PAR is to k t, the clearer the atmosphere, and the opposite for overcast skies. This variable is evaluated with respect to the previous and next time (time−1 and time+1, respectively). These last two variables in the PAR range are defined as:

$$k'_{t,\mathrm{PAR}} = \frac{\sum_{j} \mathrm{PAR}_{\mathrm{Global},j}}{\sum_{j} I_{\mathrm{TOA,PAR},j}} \tag{4a}$$

$$\Psi_{\mathrm{PAR}} = \frac{k_{t,\mathrm{PAR}}^{\,\mathrm{time}-1} + k_{t,\mathrm{PAR}}^{\,\mathrm{time}+1}}{2} \tag{4b}$$

where the sums in Equation 4a run over the hourly values of each day.

The third model (M3) is the adaptation of Ridley's model suggested by Lozano et al. (2022), who analyzed the relationship between k PAR and k t and concluded in their preliminary analysis that k PAR can be predicted directly from TSI. Therefore, the resulting expression is the following:

$$k_{\mathrm{PAR}} = \frac{1}{1 + \exp\left(c_1 + c_2\, k_t + c_3\, \mathrm{AST} + c_4\, \alpha + c_5\, k'_t + c_6\, \Psi\right)}$$

where c i are the fitting coefficients, k' t is the daily clearness index and Ψ is the persistence index, which are defined as:

$$k'_t = \frac{\sum_{j} \mathrm{TSI}_{\mathrm{Global},j}}{\sum_{j} I_{\mathrm{TOA},j}}, \qquad \Psi = \frac{k_t^{\,\mathrm{time}-1} + k_t^{\,\mathrm{time}+1}}{2}$$

where I TOA is the TSI irradiance at TOA, computed as described above for the PAR range:

$$I_{\mathrm{TOA}} = I_{\mathrm{SC}}\, E_0 \cos\theta_z$$

where I SC is the solar constant (I SC = 1,361.1 Wm −2; Gueymard, 2018).
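The persistence index and the logistic form shared by M2 and M3 can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the logistic expression follows the general form of Ridley et al. (2010), the persistence index is taken as the mean of the neighbouring clearness-index values, and the handling of the first and last time step of the day (using the single available neighbour) is assumed. The function names and the coefficients used in the usage note are hypothetical.

```python
import math


def persistence_index(kt_series):
    """Mean of the previous and next clearness index for each time step.

    At the first and last step of a day only one neighbour exists, so that
    neighbour is used alone (edge handling is an assumption of this sketch).
    """
    n = len(kt_series)
    if n < 2:
        raise ValueError("need at least two time steps")
    psi = []
    for i in range(n):
        if i == 0:
            psi.append(kt_series[1])
        elif i == n - 1:
            psi.append(kt_series[n - 2])
        else:
            psi.append(0.5 * (kt_series[i - 1] + kt_series[i + 1]))
    return psi


def logistic_diffuse_fraction(coeffs, kt, ast_hours, elevation_deg, kt_daily, psi):
    """Diffuse fraction from a six-coefficient logistic expression."""
    b1, b2, b3, b4, b5, b6 = coeffs
    z = (b1 + b2 * kt + b3 * ast_hours + b4 * elevation_deg
         + b5 * kt_daily + b6 * psi)
    return 1.0 / (1.0 + math.exp(z))
```

With a positive coefficient on the clearness index, for instance the placeholder set `(-5.0, 6.0, 0.0, 0.0, 0.0, 0.0)`, the predicted diffuse fraction decreases as the clearness index increases, which is the expected qualitative behavior.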
The fourth model (M4) is inspired by the model of de Miguel et al. (2001), who proposed a third-order polynomial expression with k t as the only independent variable. As a result, they obtained a curve reproducing the behavior of k PAR over the whole k t,PAR range. However, a model that uses a single variable to predict the behavior of k PAR is unable to reproduce all its variability. For this reason, as a novelty, this study proposes, in addition to applying the model to the PAR range, adding two independent variables to reproduce the k PAR variability. Thus, the variables included in this new model are the solar position (in terms of α) and the persistence index (Equation 4b). The resulting expression is:

$$k_{\mathrm{PAR}} = d_1 + d_2\, k_{t,\mathrm{PAR}} + d_3\, k_{t,\mathrm{PAR}}^{2} + d_4\, k_{t,\mathrm{PAR}}^{3} + d_5\, \alpha + d_6\, \Psi_{\mathrm{PAR}}$$

where d i are the fitting coefficients.
The fifth model (M5) is the same model as M4, but in this case using k t as the predictor of k PAR instead of k t,PAR:

$$k_{\mathrm{PAR}} = e_1 + e_2\, k_t + e_3\, k_t^{2} + e_4\, k_t^{3} + e_5\, \alpha + e_6\, \Psi$$

where e i are the fitting coefficients.
Finally, the sixth model (M6), proposed by Foyo-Moreno et al. (2018), has been analyzed. Unlike the previous models, M6 directly estimates PAR Diffuse instead of k PAR. To this aim, M6 is based on the decomposition of TSI Global into its direct and diffuse components, from the following expression:

$$\mathrm{PAR}_{\mathrm{Diffuse}} = f_1\, \mathrm{TSI}_{\mathrm{Diffuse}}$$

where f 1 is the fitting coefficient, and TSI Diffuse is the diffuse component of TSI. This empirical model is based on the previous model proposed by Foyo-Moreno et al. (2017) and developed from experimental measurements recorded at Granada.

Ziółkowski et al., 2021) and have taken a relevant role in solar radiation modeling in recent years (e.g., Ağbulut et al., 2021; Elsheikh et al., 2019; Kamadinata et al., 2019). In the present work, a multilayer perceptron (MLP) has been used to model k PAR and PAR Diffuse. An MLP is a feedforward ANN organized in at least three layers of fully connected neurons which employ a non-linear activation function. As an ANN, an MLP is characterized by three fundamental elements: the training algorithm, the activation function and its architecture, which determines the connections between neurons. This particular type of ANN is composed of a set of input neurons (input layer), equal in number to the variables used to estimate k PAR or PAR Diffuse, and a set of one or more hidden layers of neurons (hidden layers). More details on solar irradiance predictions using an MLP model can be found in Alados et al. (2004, 2007). In our study, an automatic architecture selection was used in order to select the best architecture and to build a network with one hidden layer. The automatic architecture selection helped to specify the best number of units (neurons) for the hidden layer, ultimately setting it as the minimum number of hidden neurons with the maximum possible performance. An ANN with one hidden layer was employed in several previous works to estimate solar radiation (e.g., Alsina et al., 2016; Hasni et al., 2012; Kamadinata et al., 2019). Finally, a layer of output neurons (output layer), corresponding to the modeled variable (k PAR or PAR Diffuse), was used.
The activation function used is the hyperbolic tangent, expressed as:

$$f(x) = \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$$

where x is the input vector. The weights were initialized with random values and, in order to minimize the mismatch between the measured and the computed values, the learning algorithm modified the weights in the so-called forward-propagation phase. The error in each neuron, per iteration, is calculated as follows:

$$\mathrm{SE} = \sum_{i} \left(z_i - z_i^{*}\right)^{2}$$

where SE is the sum of the square error, z i represents the real value and z i * is the estimated value. A more detailed description of this algorithm can be found in Alados et al. (2004, 2007), where the same type of MLP was employed for modeling radiometric UV erythemal irradiance.
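The forward pass of a one-hidden-layer perceptron with hyperbolic tangent hidden units, together with the sum of squared errors, can be sketched as below. The linear output unit, the argument layout, and the function names are assumptions for illustration; training (the weight updates) is not shown, and this is not the software implementation used in the study.

```python
import math


def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    """One-hidden-layer perceptron: tanh hidden units, linear output (assumed).

    x        : list of input values (e.g. clearness index and AST)
    w_hidden : one list of input weights per hidden neuron
    b_hidden : one bias per hidden neuron
    w_out    : one output weight per hidden neuron
    b_out    : output bias
    """
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(weights, x)) + bias)
        for weights, bias in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


def sum_squared_error(measured, estimated):
    """SE: sum over samples of (z_i - z_i*)^2."""
    return sum((z - z_est) ** 2 for z, z_est in zip(measured, estimated))
```

During training, a learning algorithm would repeatedly evaluate `mlp_forward` on the fitting subset and adjust the weights to reduce `sum_squared_error`.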
The ANNs have been built using the same input variables as employed in empirical models described in Section 3.1, selecting the input variables by two predictor importance tests, for each location, the first in order to predict k PAR , and the second one to predict PAR Diffuse directly (Table 2).
A predictor importance test is a statistical analysis of the relative importance of the predictors on the studied variable, and it is a crucial part of the pre-processing analysis to build an ANN, since it allows a reduction in the number of inputs into the model, restricting them to those that have proven to have a greater influence when performing the predictions. In our study, the predictor importance test was performed with the IBM SPSS software, in which this test is part of the MLP application. The first predictor importance test indicates that k t,PAR is the most important factor when predicting k PAR for all the sites, with a relative importance of at least 0.5, followed by AST (0.24-0.26).
The other variables have less relative importance, varying between 0.04 and 0.09 for the three sites, except for Renon, where k' t,PAR reaches a relative importance of approximately 0.16. The high relative importance of k t,PAR as a predictor is not surprising given the relationship between k PAR and k t,PAR shown in Figure 1, which is one of the bases of most models in the scientific literature. This is because knowledge of the atmospheric transparency is essential, but not sufficient, to reproduce the k PAR variability.
Taking into account the results of this importance analysis, we have evaluated two ANNs to predict k PAR in each of the three study locations (Granada, Almería and Renon): (a) ANN1, which uses k t,PAR and AST, and (b) ANN2, which uses k t instead of k t,PAR, together with AST, in order to be able to compute k PAR even when global PAR measurements are lacking.
From the second predictor importance test, performed with PAR Diffuse as the output variable, we obtain three variables with a high and similar relative importance: k t,PAR, AST and cos θ z. These results allow us to build three ANNs: (a) ANND1, using k t,PAR, AST and cos θ z, (b) ANND2, similar to ANND1 but using k t instead of k t,PAR, and (c) ANND3, built by using TSI Diffuse as the only predictor. The advantage of using this parameter is that TSI is measured more frequently than PAR at radiometric stations and, at those stations where the diffuse component of TSI is measured, this variable allows the interactions of solar radiation with the atmosphere to be reproduced.

Fitting and Validation Statistics
In order to analyze and compare the models, a statistical analysis has been performed including the mean bias error (MBE), the coefficient of determination (r 2) and the relative root mean square error (rRMSE). The MBE indicates whether the model overestimates (positive values) or underestimates (negative values) with respect to the experimental values. The lower the values of both rRMSE and MBE, the better the behavior of the model (Ma & Iqbal, 1984). r 2 is an estimate of the total variance explained by the model, while the rRMSE allows a term-to-term comparison between the experimental and estimated values of the diffuse fraction, and quantifies the differences between them. These statistics are computed from the pairs (k PAR,i, k PAR,i *), where k PAR,i is the experimental diffuse fraction and k PAR,i * is the estimated diffuse fraction.
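The three statistics can be sketched in Python under commonly used definitions, which are assumptions here because the exact expressions are not reproduced in this text: MBE as the mean of estimated minus measured values (so that positive values indicate overestimation), rRMSE as the root mean square error divided by the mean measured value and expressed in percent, and r 2 as the squared Pearson correlation between measured and estimated values. The function names are hypothetical.

```python
import math


def mbe(measured, estimated):
    """Mean bias error: positive when the model overestimates."""
    n = len(measured)
    return sum(e - m for m, e in zip(measured, estimated)) / n


def rrmse_percent(measured, estimated):
    """Root mean square error relative to the mean measured value, in %."""
    n = len(measured)
    rmse = math.sqrt(sum((e - m) ** 2 for m, e in zip(measured, estimated)) / n)
    return 100.0 * rmse / (sum(measured) / n)


def r_squared(measured, estimated):
    """Squared Pearson correlation between measured and estimated values."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_e = sum(estimated) / n
    cov = sum((m - mean_m) * (e - mean_e) for m, e in zip(measured, estimated))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    return cov * cov / (var_m * var_e)
```

Note that a model with a constant offset can have r 2 equal to 1 while MBE and rRMSE are clearly non-zero, which is why the three statistics are reported together.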

Modeling k PAR
Table 3 shows the fitting coefficients obtained for the analyzed models. p-values have been obtained for all the coefficients in order to determine their statistical significance. A coefficient is considered to be significant when its p-value is <0.05 or <0.01, marked with * or **, respectively. For Renon, all the coefficients for the five models are significant, except the coefficient e 4 in M5 (for k 3 t,PAR). M1 coefficients are significant for Granada and Almería, and M2 has significant coefficients in Almería as well. For Granada, k' t,PAR and k' t are not significant for any model, while for Almería k' t is not significant. On the other hand, most of the coefficients for the polynomial expression in models M4 and M5 are not significant for either Granada or Almería. Finally, α is not significant in the M2 and M3 models for Granada. These findings show that, for different models and locations, k t and the solar position could be irrelevant, in contrast with the result obtained by Reindl et al. (1990) for TSI, who found that the solar position and/or k t are the main factors needed to model k, depending on the atmospheric conditions in terms of k t.
Table 4 shows the values of the statistics defined in Section 3.3 for all the models analyzed. The values are very similar for both subsets of data (fitting and validation datasets), which confirms the good results of the models and allows us to focus on analyzing the validation dataset. The MBE values are close to 0, which indicates the absence of overestimation or underestimation by these models. In general, all the models work remarkably well, with M1 showing the lowest performance, with r 2 ranging from 0.550 in Almería to 0.679 in Granada. All the other models have a similar statistical behavior in each location. It is remarkable that the best performance of all the models (except for M1) is obtained in Renon. In this location the best performance was observed, counterintuitively, for model M5, which employs k t to model k PAR instead of k t,PAR, with an r 2 of 0.858; this model also has the lowest rRMSE value, 18.3%, although M2 to M5 show quite similar performance. In Granada and Almería, all the models have very similar performance, with Almería having the lowest performance of the three locations. In Almería, the minimum was obtained with M4 (r 2 = 0.752) and the maximum with M3 (r 2 = 0.771). It is important to note that models M2 to M5 all employ the same variables to obtain k PAR, although there are two differences between them. Firstly, the functional form of M2 and M3 is based on the logistic equation, while M4 and M5 are based on a polynomial equation. Although the behavior of the models is quite similar regardless of the functional form analyzed, in Granada and Almería the logistic models behave slightly better, while in Renon the polynomial model does. Secondly, models M2 and M4 use measurements in the PAR range of solar radiation, while M3 and M5 employ the same variables but from measurements in the TSI range. Lozano et al. (2022) suggested the possibility of modeling PAR directly from TSI Global measurements, and these findings confirm that there are no statistical differences when modeling PAR Diffuse from TSI or PAR measurements.

Table 3
Fitting Coefficients for the Analyzed Empirical Models (M1 to M5) for Granada (GR), Almería (AL), and Renon (RE)
Figure 2 shows the values of k PAR versus k t,PAR for models M1, M2 and M4 and versus k t for models M3 and M5, evaluated in Granada. Similar results are observed for Almería and Renon (not shown here). As expected, despite the relatively high values of r 2 and the good statistical results (low MBE and relatively low rRMSE), M1 is unable to reproduce the variability in k PAR. Its apparent good performance is related to the use of the two main factors to model solar radiation (clearness index and solar position). However, a linear model needs additional variables in the equation to improve the results. In that sense, the rest of the models perform better because, instead of using a linear equation, the first two (M2 and M3) employ a logistic equation, while the latter two (M4 and M5) use a third-order polynomial expression with the same variables as in the previous two models.
When comparing models, the logistic and the polynomial models behave quite similarly, as shown in the statistics summarized in Table 4. Focusing on models that include the clearness index in the PAR range (M2 and M4), their r 2 differs by at most 2% in all the locations (validation dataset), and the difference in rRMSE ranges from 0.6% in Granada to 1.5% in Renon. For models including the clearness index in the TSI range (M3 and M5), the differences in their r 2 are 1% in all the locations, and the difference in their rRMSE ranges from 0.2% in Renon to 0.7% in Almería. Finally, focusing on the differences between modeling k PAR from PAR measurements or from TSI range measurements (models M2 or M4 vs. M3 or M5), the differences in r 2 and rRMSE are also low, reaching at most 4% and 2.8%, respectively, both found in Renon. On the other hand, there are only slight differences between modeling k PAR from logistic or polynomial models when considering all the locations. The site-dependent model that best fits the experimental measurements was the polynomial using TSI range measurements (M5), reaching r 2 = 0.858 and rRMSE = 18.3% in Renon.
It is important to point out that there are very few comparative studies of models for k PAR like the one carried out in this work. For example, in the study carried out by Jacovides et al. (2010) in Greece, only different functional forms are analyzed, with k t,PAR as the only independent variable and without considering the variability of the diffuse fraction.

Table 5 shows the values of the statistics in the validation of the ANN models for the diffuse fraction, k PAR, following the same procedure used for the empirical models in the above analysis. All the ANN models perform well, with r 2 ranging between 0.967 (ANN1) in Granada and 0.980 (ANN3) in Renon. Again, the percentage of explained variance and the statistics for these ANN models are within the range reported by other authors, even when comparing the metrics obtained in our work for the diffuse component or k PAR with other studies for PAR Global. For example, Ferrera-Cobos et al. (2020b) obtained r 2 ranging between 0.992 and 0.998 and rRMSE between 1.86% and 9.97%, while García-Rodríguez et al. (2022) found values of r 2 from 0.994 to 0.997 and rRMSE values from 4.62% to 6.87%. In addition, López et al. (2001) evaluated a wide range of ANN models with different weather and solar radiation measurements as inputs, as well as PAR Global, obtaining a wide range of variation in r 2 (0.337-0.999) and rRMSE (2.0%-44.3%). The fact that our statistics in estimating the diffuse component are within the range of variation reported by other authors predicting global PAR highlights the good performance of these models. MBE is close to zero, implying that, on average, there is no overestimation or underestimation of the data. The performance of the three models is very similar in all locations, with very slight differences among them: r 2 differs by up to 0.01 between the three models in Renon, and the maximum difference in rRMSE, 3.5%, is also found in Renon.
ANN1 and ANN2 have quite similar results, which has very important implications: in those situations in which there are no measurements in the PAR range, TSI, through k t, can be used to predict diffuse PAR, as can be seen from ANN2. These findings from the ANN models are therefore in agreement with the previous findings for the empirical models. A similar conclusion was reached by Lozano et al. (2022) with a preliminary evaluation of the Ridley et al. (2010) empirical model to obtain k PAR from k t. Figure 3 shows the values of k PAR versus k t,PAR for model ANN1 and k PAR versus k t for model ANN2, evaluated in Granada. Similar results are observed for Almería and Renon (not shown here). Despite the fact that the ANN models have better statistical performance, it is interesting to see that, for high values of k PAR and medium values of k t,PAR or k t, the empirical logistic and polynomial models better represent the scatter of the point cloud.

Modeling PAR Diffuse
An analysis similar to that developed above for k PAR has been carried out to estimate PAR Diffuse. In this section, we analyze one empirical model that estimates PAR Diffuse directly (M6), proposed in the literature by Foyo-Moreno et al. (2018). Table 6 shows the statistics of model M6 for both the fitting and the validation datasets, as well as the fitting coefficient for each location. The good performance of this model is reflected in the similar values of the statistics for both subsets of the data. The low values of MBE suggest a slight underestimation at all the sites. However, this is the empirical model with the highest r 2 for the validation data, ranging from 0.981 in Renon to 0.997 in Granada. The error of this model (rRMSE) is also the lowest among the empirical models, ranging from 6.1% (Granada) to 15.6% (Renon). These rRMSE values are therefore of the same order as those for the above ANN models, highlighting the behavior of this model. In addition, it is important to emphasize that this model employs only TSI range measurements to predict the diffuse component of PAR, with the only disadvantage of needing the diffuse component of solar radiation.
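A single-coefficient model of this kind can be fitted with a least-squares line through the origin. The sketch below assumes that M6 has the form PAR Diffuse = f1·TSI Diffuse with no intercept, which is an assumption based on the description of one fitting coefficient and TSI Diffuse as the only input. The function name is hypothetical and no fitted values from the study are reproduced.

```python
def fit_m6(tsi_diffuse, par_diffuse):
    """Least-squares slope f1 for par_diffuse = f1 * tsi_diffuse (no intercept)."""
    num = sum(x * y for x, y in zip(tsi_diffuse, par_diffuse))
    den = sum(x * x for x in tsi_diffuse)
    return num / den


def predict_m6(f1, tsi_diffuse):
    """Estimate PAR Diffuse from TSI Diffuse with the fitted slope."""
    return [f1 * x for x in tsi_diffuse]
```

Both series must be in the same energy units (W m^-2) for the slope to be interpretable as a PAR-to-TSI ratio of the diffuse component.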
Three ANN models have been evaluated for directly estimating PAR Diffuse, and their statistics are summarized in Table 7. As expected, ANND3 has the best performance, since this model uses TSI Diffuse as an input, with r 2 values ranging from 0.989 in Renon to 0.998 in Granada, and rRMSE between 5.1% (Granada) and 11.7% (Renon). However, it is remarkable that, despite the better general performance of ANNs compared to empirical models, the ANND3 model and the Foyo-Moreno et al. (2018) model (M6) behave similarly overall, as can be seen from their statistics. On the other hand, the additional uncertainty expected when modeling PAR Diffuse directly instead of using ratios is observed neither in the ANN models nor in M6: ANND1 and ANND2 are only slightly worse than the results obtained when modeling k PAR. Furthermore, as seen in the previous section for k PAR modeling, no improvements were found when modeling PAR Diffuse from k t,PAR instead of k t.
Figure 4 shows the modeled PAR Diffuse from the empirical model M6 and from ANND1, ANND2 and ANND3 at Granada, including the 1:1 line as a reference. The PAR Diffuse values estimated by M6 show very low dispersion with respect to the experimental values, and a high r² of 0.997 in Granada. M6 shows less dispersion than ANND1 or ANND2; only ANND3 performs similarly well. This is a consequence of the fact that its main input variable is diffuse radiation in the TSI range, which is fundamentally affected by the same scattering processes as the PAR interval. This result shows that very precise PAR Diffuse values can be obtained from the values of this component in the TSI spectrum, in a similar way to how PAR Global radiation is obtained (Alados & Alados Arboledas, 1999; Foyo-Moreno et al., 2017).

Conclusions
Most of the empirical models that estimate the diffuse component of the photosynthetically active radiation (PAR Diffuse) are based on the relationship between the diffuse fraction (k) and the clearness index (k t), and these can be adapted to the PAR range. Thus, this work presents an exhaustive evaluation of different empirical and ANN models which estimate PAR Diffuse either through the relationship between k PAR and k t,PAR or directly. Our analysis is therefore applicable to other parts of the world after calibrating/fitting the models with local inputs. However, the majority of the models are not able to reproduce the great variability that exists in the relationship between k PAR and k t,PAR. This work proposes improvements to reproduce this variability and estimate PAR Diffuse, which can be applied, among other uses, to optimize crop management practices, increase prediction accuracy, and optimize solar panel performance. The following main conclusions are derived from this study.
• Empirical models estimated k PAR accurately when they were based on functional forms that reproduce the great variability in the k t − k relationship, which can be a logistic (M2 and M3 models) or a polynomial function (M4 and M5 models), with r² ranging from 0.77 to 0.84 for logistic models, and from 0.76 to 0.86 for polynomial ones, at our three study sites.
• This work proposes improvements by adapting existing models for the TSI range to the PAR range and by adding more variables to reproduce the variability in the k t − k relationship. The proposal is based on different functional forms with additional variables (among them AST and k t) in both ranges (PAR and TSI).
• However, only the empirical models can completely reproduce the variability of experimental measurements if the diffuse component is considered in another range (e.g., in the TSI range), since it is an easy way to reproduce the atmospheric attenuation process for solar radiation. Therefore, the M6 model reached an r² of up to 0.996 and 0.997 when evaluated on the fitting and validation datasets, respectively.
• ANN models were the best models for estimating k PAR, with r² ranging from 0.90 to 0.98. However, the polynomial model M5 reached an r² quite close to that of these ANN models, 0.86 in Renon, which suggests that the polynomial model might be an alternative to ANN models.
• From our PAR Diffuse modeling analysis, it can be concluded that there is no advantage in applying ANN instead of the Foyo-Moreno model, especially when considering the simplicity of this empirical linear model.
• Finally, besides this comparative study of empirical and ANN models, there were two main findings: (a) contrary to the case of empirical models, using many variables is not related to an increase in model performance; (b) although modeling k PAR or PAR Diffuse from measurements in the PAR range gives accurate results, estimations from the TSI range perform even better, reaching r² values of up to 0.86 when modeling k PAR with empirical models, and up to 0.998 or 0.997 for ANN or empirical models, respectively, when modeling PAR Diffuse directly. These two findings allow us to reproduce diffuse PAR from the clearness index and AST, even if PAR measurements are not available, with very precise results.
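The logistic functional form mentioned in the conclusions above can be sketched as follows. This is a minimal illustration, assuming a single-predictor logistic curve k = 1 / (1 + exp(b0 + b1 · k t)); the coefficients b0 and b1 used below are hypothetical placeholders, not the fitted values of M2 or M3, and the additional predictors used by the models in this work (e.g., AST) are omitted. The second function recovers the diffuse irradiance from the diffuse fraction through its definition as the ratio of diffuse to global irradiance.

```python
import math


def logistic_diffuse_fraction(kt, b0, b1):
    """Diffuse fraction from a one-predictor logistic curve (illustrative form)."""
    return 1.0 / (1.0 + math.exp(b0 + b1 * kt))


def diffuse_from_fraction(k, global_irradiance):
    """Diffuse irradiance from the definition k = diffuse / global."""
    return k * global_irradiance


# Hypothetical coefficients, chosen only to give a plausible sigmoid shape.
B0, B1 = -5.0, 8.0

for kt in (0.2, 0.4, 0.6, 0.8):
    k = logistic_diffuse_fraction(kt, B0, B1)
    print(f"kt = {kt:.1f} -> k = {k:.3f}")
```

With b1 > 0 the curve decreases monotonically from nearly 1 under overcast skies (low k t) to nearly 0 under clear skies (high k t), which is the general shape of the k t − k relationship; the scatter around this curve is what the additional variables are meant to capture.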

Figure 1. Relationship between k PAR and k t,PAR for the experimental measurements at Granada for the 2-year period analyzed (1994 and 1995).

Figure 2. Hourly PAR diffuse fraction (k PAR ) modeled (red) and experimental (black) versus hourly clearness index for PAR range (k t,PAR ), for the empirical models (a) M1, (b) M2, (d) M4, or versus hourly clearness index (k t ), for the empirical models (c) M3, and (e) M5.The validation dataset for Granada was used to build the figures.
Jacovides et al. (2010) analyzed versions of the models by Reindl et al. (1990) and Ridley et al. (2010) applied to the PAR range, and obtained an r² of 0.87 and 0.90 and an rRMSE of 32.5% and 27.1%, respectively. Kathilankal et al. (2014) also evaluated the same simplified version of the Ridley et al. (2010) model, in the United States, obtaining an r² of 0.76 and an rRMSE of 30.6%. Both studies obtain r² values close to those obtained in this work, which confirms the high percentage of variance explained by the functional forms analyzed. However, their rRMSE values are higher than those obtained for the models in this study, which shows that it is necessary to introduce additional variables to reproduce the variability of the diffuse fraction derived from different atmospheric conditions.

Figure 3. Hourly PAR diffuse fraction (k PAR ) modeled (red) and experimental (black) versus (a) hourly clearness index for PAR range (k t,PAR ; ANN1), or versus (b) hourly clearness index (k t ; ANN2), with the validation dataset in Granada.

Figure 4. Modeled diffuse photosynthetically active radiation (PAR Diffuse), for (a) the empirical model M6, and the ANN models (b) ANND1, (c) ANND2 and (d) ANND3 in Granada, represented against the experimental PAR for the validation dataset. The 1:1 line is also shown (red line).

It should be highlighted that models M3, M5 and M6 allow for estimating the diffuse component in the PAR range even when global PAR irradiance measurements are not available, because those models directly employ global or diffuse TSI.

3.2. Neural Network Models
In addition to the empirical models, several ANN models have been used to estimate k PAR and PAR Diffuse directly. ANN have been widely used in many research fields (e.g., Desai & Shah, 2021; Ghritlahre & Prasad, 2018;

Table 1
Summary of the Empirical Models Employed to Estimate k PAR (M1 to M5) or PAR Diffuse (M6) Directly

Table 2
Relative Importance of the Variables Evaluated Modeling k PAR and PAR Diffuse With ANN, for Granada (GR), Almería (AL), and Renon (RE)

Table 4
Statistical Performance of the Models for Estimating the Diffuse Fraction of the PAR Range Corresponding to the Fit Data Set, on the Left, and the Validation Data Set, on the Right, for Granada (GR), Almería (AL), and Renon (RE)

Table 5
Statistical Performance for Validation of the ANN Models for Estimating the Diffuse Fraction of the PAR Range, for Granada (GR), Almería (AL), and Renon (RE)

Table 6
Statistical Performance for M6 Model Estimating the Diffuse Component of PAR Corresponding to the Fit Data Set, on the Left, and the Validation Data Set, on the Right, for Granada (GR), Almería (AL), and Renon (RE)

Table 7
Statistical Performance for Validation of the ANND Models for Estimating the Diffuse PAR, for Granada (GR), Almería (AL), and Renon (RE)