Open‐source photovoltaic model pipeline validation against well‐characterized system data

All freely available plane-of-array (POA) transposition models and photovoltaic (PV) temperature and performance models in pvlib-python and pvpltools-python were examined against multiyear field data from Albuquerque, New Mexico. The data include different PV systems composed of crystalline silicon modules that vary in cell type, module construction, and materials. These systems were characterized via IEC 61853-1 and 61853-2 testing, and the input data for each model were sourced from these system-specific test results rather than from generic input data (e.g., manufacturer's specification [spec] sheets or generic Panneau Solaire [PAN] files). Six POA transposition models, 7 temperature models, and 12 performance models are included in this comparative analysis. These freely available models proved effective across many different module technologies. The POA transposition models exhibited average normalized mean bias errors (NMBEs) within ±3%. Most PV temperature models underestimated temperature, exhibiting mean and median residuals ranging from −6.5°C to 2.7°C; all temperature models saw a reduction in root mean square error when using transient assumptions over steady state. The performance models demonstrated similar behavior, with first and third quartile NMBEs within ±4.2% and an overall average NMBE within ±2.3%. Although differences among models were observed at different times of the day and year, this study shows that the availability of system-specific input data is more important than model selection. For example, using spec sheet or generic PAN file data with a complex PV performance model does not guarantee better accuracy than a simpler PV performance model that uses system-specific data.

Modeling can serve as both a simulation and optimization tool and can be used at various stages of development of a PV system, for example, site assessment, design evaluation, technology comparisons, and proving bankability of a project. Models vary according to the performance factors they consider, the number of inputs, the complexity of calculations, financial considerations, and the scale of application [1].
The simplest models relate maximum power output to incident irradiance and operating temperature using a multilinear function. Other models treat the PV system as an equivalent circuit with one or more diodes and resistors in series and in parallel [2]. Others are semiempirical and require extensive module measurements to be made in outdoor conditions. The simplest models require inputs that are readily available from commercial specification (spec) sheets, while others require testing to be conducted on modules under controlled conditions. Many comparisons of simple, freely available models exist (e.g., Marion et al. [3]), but these comparisons usually consider only two to three models at a time or models that require the exact same inputs [4] and/or are benchmarked on a limited number of systems [5].
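As a concrete illustration of the simplest class of model described above, a PVWatts-style relation scales the rated power by irradiance and applies a linear temperature correction. This is a sketch only; the module values in the example (300 W rating, −0.40 %/°C coefficient) are illustrative and not taken from this study:

```python
def pvwatts_dc(poa_eff, t_cell, p_stc, gamma_pmp, t_ref=25.0, g_ref=1000.0):
    """Simplest-class PV power model: linear in effective irradiance with a
    linear cell-temperature correction (PVWatts-style form)."""
    return p_stc * (poa_eff / g_ref) * (1.0 + gamma_pmp * (t_cell - t_ref))

# Example: a 300 W module at 800 W/m2 effective irradiance and 45 C cell
# temperature, with a typical c-Si power temperature coefficient of -0.40 %/C.
p = pvwatts_dc(800.0, 45.0, p_stc=300.0, gamma_pmp=-0.0040)  # ~220.8 W
```

Despite its simplicity, this form needs only two module-specific inputs, which is relevant to the comparison drawn later in the study.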
These comparisons rarely use module-specific measured data, meaning that instead of fully characterizing the modules prior to modeling, manufacturer-supplied spec sheet or Panneau Solaire (PAN) data are used. The latter approach assumes that all PV modules with the same model number perform identically according to the spec sheets. This is not true, because modules can vary in nameplate and operating performance and even in the rate at which they degrade over time (see, e.g., Theristis et al. [6]). In such cases, a PV performance modeling comparison would be biased by uncertainties in the environmental and module-specific characterization data rather than focusing on the ability of the models to predict a system's behavior. An international blind PV performance modeling comparison was recently published by the Sandia-led PV Performance Modeling Collaborative (PVPMC), involving participants from 32 institutions [7]. The results demonstrated improved precision among models, but accuracy still depends on the modeler's skill and derate assumptions. These findings create the need for a comprehensive comparison of PV performance models against multiyear field data from well-characterized systems consisting of different types of PV modules.
This study compares all freely available photovoltaic performance models from pvlib-python and pvpltools-python. The models were tested against well-characterized crystalline silicon (c-Si) systems in Albuquerque, New Mexico (NM). These c-Si systems include technologies established more recently, that is, not solely aluminum back surface field (Al-BSF) modules, which was the dominant technology when these models were originally defined. The models used in this study vary in their methods of calculation but do not use data-driven approaches, such as machine learning [8,9]. The systems included in this study are limited to fixed-tilt, monofacial, c-Si modules [10].
Furthermore, six plane-of-array (POA) transposition models and seven temperature models are compared against data measured on-site.
Seven PV systems located at Sandia's Photovoltaic Systems Evaluation Laboratory are considered. All available data for each system, which range from 2 to 4 years, are used. These systems have been characterized by both Sandia National Laboratories and an external laboratory using various methods to obtain PAN files, IEC 61853-1, 61853-2, and Sandia Array Performance Model (SAPM) data. Using these test data, the systems' power and efficiency are analyzed against PV performance model predictions. An overview of the POA transposition, temperature, and performance models is given, and the PV systems are described. The results and error calculations of each irradiance, temperature, and performance model are presented and discussed.

| OVERVIEW OF MODELS
For all models considered in this study, a more in-depth description and all defining equations can be found in the original paper establishing the model; citations to the original paper are given in the respective subsections.
These models' inputs and outputs are described in Table 2. To examine the influence of transient-state assumptions on the temperature predictions, the same models were rerun by incorporating the Fuentes [24] and additive Prilliman [25] transient temperature models.
Many of these models use empirical coefficients to describe module temperature. SAPM uses a, a coefficient establishing the upper limit of module temperature during periods of low wind speed and high solar irradiance, and b, a coefficient establishing the rate at which module temperature decreases as wind speed increases. SAPM also considers a parameter known as ΔT, the temperature difference between the cell and the back of the module surface at an irradiance of 1000 W/m². Faiman uses two heat loss coefficients, U0 and U1; U1 considers the influence of wind, while U0 does not. PVSyst similarly considers two heat loss coefficients, Uc and Uv, where Uc does not consider wind and Uv does. The Prilliman model uses four coefficients, a0 through a3, which come from a bilinear interpolation matrix using minimum and maximum wind speed values.
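As a sketch of how these coefficients enter the steady-state equations, the SAPM and Faiman forms can be written directly. The coefficient values in the example (SAPM a = −3.47, b = −0.0594 for open-rack glass/glass; Faiman U0 = 25, U1 = 6.84) are commonly quoted defaults used here for illustration only, not the values fitted in this study:

```python
import math

def sapm_module_temp(poa, t_amb, wind, a, b):
    """SAPM back-of-module temperature: T_mod = POA * exp(a + b*WS) + T_amb."""
    return poa * math.exp(a + b * wind) + t_amb

def sapm_cell_temp(t_mod, poa, delta_t, poa_ref=1000.0):
    """SAPM cell temperature from module temperature via the dT parameter."""
    return t_mod + (poa / poa_ref) * delta_t

def faiman_module_temp(poa, t_amb, wind, u0=25.0, u1=6.84):
    """Faiman model: U0 is the wind-independent heat loss, U1 scales with wind."""
    return t_amb + poa / (u0 + u1 * wind)

# Example conditions: 800 W/m2 POA, 20 C ambient, 2 m/s wind.
t_mod = sapm_module_temp(800.0, 20.0, 2.0, a=-3.47, b=-0.0594)   # ~42.1 C
t_cell = sapm_cell_temp(t_mod, 800.0, delta_t=3.0)               # ~44.5 C
t_faiman = faiman_module_temp(800.0, 20.0, 2.0)                  # ~40.7 C
```

Note how the two models respond differently to wind: in SAPM, wind enters through an exponential term, while in Faiman it enters through the denominator of a heat loss sum.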

| FIELD DATA
Data for this study consist of measured irradiance, weather, and system output data along with module characterization data.
TABLE 1 Necessary inputs into the POA transposition models considered in this study.
TABLE 2 Necessary inputs for the cell and module temperature models considered in this study.
[Table 2 fragment: the transient models additionally require module emissivity, module unit mass, and the steady-state modeled temperature. (a) The Fuentes model uses an installed NOCT, which is determined following the methods defined by Fuentes [24].]

DEVILLE ET AL.

| System and instrument data
Seven PV systems from the Systems Long-Term Exposure (SLTE) project (previously known as the PV Lifetime Project [6]) were used in this study for benchmarking the models. The PV systems were installed from 2017 to 2019, and their details are given in Table 4. These models use cell temperature, which is calculated from module temperature using Equation (2).
TABLE 4 System information of the seven SLTE systems used in this study.

| Using generic specification sheet versus module-specific characterization data
When module-specific data (e.g., IEC 61853 and 61215 data from selected modules retrieved from the system under evaluation) are not available, using the spec sheet can introduce a bias due to overrating or underrating. The power measurements taken from the modules used in this study varied by up to approximately 5% from the spec sheet [6]. Such differences in power will bias the model predictions, and this should be attributed to input accuracy, not modeling accuracy.
To quantify this, Figure 1 compares three models using spec sheet (nameplate) and measured (IEC 61853-1) data for the most overrated system in the SLTE project. This system (Mission300) was selected because the modules' power was up to approximately 5% (or roughly 15 W; see Theristis et al. [6] for more information) lower than the spec sheet rating. As can be seen, this overrating is directly reflected in the error of the models, with the nameplate Mission300 NMBE consistently overestimating power by 4-4.5% relative to its measured-data counterpart.
To put this into perspective, assuming a 500 MW power plant with a specific yield of 1500 kWh/kWp/year and an electricity price of $0.05/kWh, this 4.5% overprediction could introduce a bias of approximately $1.68M/year in estimated revenues. Therefore, this simplified comparison shows that, given accurate module data, the models are able to perform similarly and accurately, with a potentially reduced financial risk.
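The revenue figure above follows from straightforward arithmetic, sketched here with the same assumed plant parameters:

```python
plant_mw = 500.0          # plant capacity, MWp
specific_yield = 1500.0   # kWh/kWp/year
price = 0.05              # electricity price, $/kWh
bias = 0.045              # 4.5% energy overprediction

annual_energy_kwh = plant_mw * 1000.0 * specific_yield  # kWp times kWh/kWp
revenue_bias = annual_energy_kwh * price * bias
# revenue_bias = 1,687,500 dollars/year, i.e., the ~$1.68M/year figure above.
```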

| Module data collection
To ensure an apples-to-apples comparison between all models, all module-specific input data were sourced from the same testing procedure (i.e., IEC 61853). No spec sheet data were used, to allow a fair comparison among the models without external biases caused by possibly inaccurate input data. Matrix data were obtained from IEC 61853-1 testing performed at CFV Labs [35]. This testing took place between November 4, 2019, and December 13, 2019, on a single control module of each type. The control modules were placed outside for light soaking prior to being sent for testing and are not part of the systems being used to evaluate the models. These test data were then used to produce SAPM coefficients, generate the PAN files, and provide the inputs necessary for the PVWatts, CEC, and Desoto models.
The original calibration method for the SAPM relies on a piecewise solution of each primary equation, using data sets tightly constrained to specific outdoor environmental conditions. Separate thermal tests were required when using outdoor data to determine temperature coefficients prior to calibrating the primary equations. In this study, however, the primary equations were solved simultaneously via multivariate regression analysis and did not require a separate thermal test [36]. In this method, all coefficients of each primary equation were solved without constraint, allowing the translation of the IEC 61853-1 matrix into SAPM coefficients with no additional inputs.

| METHODOLOGY
Preliminary calculations were necessary since performance models require effective irradiance and cell temperature, while field measurements only provided POA irradiance (i.e., by means of pyranometer measurements) and module temperature.

| Effective irradiance calculation and module temperature conversion
Effective irradiance was calculated using the SAPM model [19] by translating the direct and diffuse POA to the irradiance "seen" by solar cells and by also considering angle of incidence (AOI) losses; no spectral losses were considered since no spectral loss coefficients for the modules were available from the measured data.
POA ground-reflected diffuse irradiance was calculated with a constant albedo of 0.189, which was both the mean and the median of the measured albedo data for the reporting period. The albedo is assumed to be the same for all systems, since all systems share a ground covering of crushed gravel. The AOI losses were calculated using reference data for each module from IEC 61853-2 testing conducted at CFV Labs [37]. All IEC 61853-1 [34] and 61853-2 [38] power rating and AOI data are publicly available at the PVPMC website [39]. Equation (1) describes the method of calculating effective irradiance:

E_e = POA_direct × IAM + f_d × POA_diffuse, (1)

where IAM is the incidence angle modifier, interpolated linearly from the IEC 61853-2 data, and f_d is the fraction of diffuse irradiance on the plane of array that is not reflected away, which is set to 1.
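A minimal sketch of Equation (1), with a linear IAM interpolation standing in for the IEC 61853-2 lookup. The IAM table below is a hypothetical placeholder, not the measured data from this study:

```python
def interp_iam(aoi, aoi_grid, iam_grid):
    """Linearly interpolate the incidence angle modifier from tabulated data."""
    if aoi <= aoi_grid[0]:
        return iam_grid[0]
    if aoi >= aoi_grid[-1]:
        return iam_grid[-1]
    for (a0, a1), (m0, m1) in zip(zip(aoi_grid, aoi_grid[1:]),
                                  zip(iam_grid, iam_grid[1:])):
        if a0 <= aoi <= a1:
            return m0 + (m1 - m0) * (aoi - a0) / (a1 - a0)

def effective_irradiance(poa_direct, poa_diffuse, iam, f_d=1.0):
    """Equation (1): E_e = POA_direct * IAM + f_d * POA_diffuse."""
    return poa_direct * iam + f_d * poa_diffuse

# Hypothetical IAM table (stand-in for the IEC 61853-2 measurements):
aoi_grid = [0.0, 40.0, 60.0, 75.0, 90.0]
iam_grid = [1.0, 0.99, 0.95, 0.80, 0.0]
e_eff = effective_irradiance(600.0, 150.0,
                             interp_iam(50.0, aoi_grid, iam_grid))
```

With f_d = 1 as stated above, all plane-of-array diffuse irradiance contributes to the effective irradiance and only the beam component is attenuated by the IAM.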
The measured module temperature data come from RTDs on a sample module in each string. When using the models described previously, some outputs needed to be translated from cell temperature to module temperature.
To fairly compare the cell temperature models with measured module temperature, cell temperatures were converted to module temperature using the equation defined in King et al. [19]:

T_mod = T_cell − (POA / POA_0) × ΔT, (2)

where T_mod is module temperature, T_cell is cell temperature, POA is the plane-of-array irradiance, POA_0 is irradiance at STC (1000 W/m²), and ΔT is a parameter that depends on module mounting and front/rear materials (i.e., glass/glass or glass/polymer) and is set to 3°C in this case.
These modeled module temperature values were then compared with the average RTD measurement for a given system.
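Equation (2) is a one-line conversion; a sketch using the ΔT = 3°C value stated above:

```python
def cell_to_module_temp(t_cell, poa, delta_t=3.0, poa_ref=1000.0):
    """Equation (2): T_mod = T_cell - (POA / POA_0) * dT."""
    return t_cell - (poa / poa_ref) * delta_t

# A 50 C cell temperature at 800 W/m2 maps to a ~47.6 C module temperature.
t_mod = cell_to_module_temp(50.0, 800.0)
```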

| Data filtering
Weather, irradiance, and operational data were filtered based on the criteria listed in Table 5. As an example, Figure 2 shows the average amount of filtered data after each filter is applied. All systems have some initial data unavailability from periods of testing or system outages; on average, 2% of the total data were initially missing or unavailable. The system with the highest data availability is the Mission300 system, with 34% of the data remaining.
The system with the least data availability is the CSpoly270 system, with 31% of the data remaining. This difference could be due to the differences in the length of data collection for these systems: the Mission300 system was deployed in 2019, while the CSpoly270 system was deployed two years earlier, in 2017.
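Table 5 lists the actual criteria; the sketch below only illustrates the general sequential-filtering pattern. The thresholds and field names are hypothetical placeholders, not the values used in this study:

```python
def passes_filters(rec, poa_min=50.0, t_amb_range=(-30.0, 50.0)):
    """Return True if a 1 min record passes the (hypothetical) QC filters."""
    if rec["poa"] is None or rec["power"] is None:
        return False                                  # missing/outage data
    if rec["poa"] < poa_min:
        return False                                  # low-light periods
    if not t_amb_range[0] <= rec["t_amb"] <= t_amb_range[1]:
        return False                                  # implausible air temp
    return True

records = [
    {"poa": 800.0, "power": 250.0, "t_amb": 22.0},   # kept
    {"poa": 20.0, "power": 3.0, "t_amb": 21.0},      # removed: low irradiance
    {"poa": None, "power": None, "t_amb": 20.0},     # removed: outage
]
kept = [r for r in records if passes_filters(r)]
```

Applying each criterion in sequence, as in Figure 2, makes it possible to attribute the amount of removed data to each individual filter.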
The filtered effective irradiance and module temperature data were used as inputs into the performance models. A flowchart is shown in Figure 3 describing the process for finding and calculating all parameters necessary for each model. All models are used to calculate string power and use the same weather and irradiance inputs.

| Performance model evaluation metrics
The results of each model were compared against the measured data. All analysis was completed at the same 1 min timestep as the measured data; any resampling shown in the analysis was done after the results were generated. The normalized mean bias error (NMBE) and mean bias error (MBE) were calculated using (3) and (4) to reflect each model's prediction bias. Root mean square error (RMSE) was also calculated, using (5). RMSE is the standard deviation of the residuals and shows how far the model's predictions are spread from the measured values. To obtain the normalized RMSE (NRMSE), the RMSE is divided by the mean of the measured values:

NMBE = [Σ_i (P_M,i − P_O,i) / Σ_i P_O,i] × 100%, (3)

MBE = (1/N) Σ_i (P_M,i − P_O,i), (4)

RMSE = √[(1/N) Σ_i (P_M,i − P_O,i)²], (5)
where P_M is the modeled parameter and P_O is the observed parameter; i represents the string number; and N is the number of observations. The errors are calculated for each string, and the average is taken. These calculations are applied to modeled temperature and power. For POA irradiance, since there is only one measured value, it is compared directly to the modeled value.
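A direct implementation of the metrics described above (normalizing NMBE by the mean observed value, one common convention):

```python
import math

def mbe(modeled, observed):
    """Mean bias error: average of (modeled - observed) residuals."""
    return sum(m - o for m, o in zip(modeled, observed)) / len(observed)

def nmbe(modeled, observed):
    """Normalized mean bias error in percent: MBE over the mean observed value."""
    return 100.0 * mbe(modeled, observed) / (sum(observed) / len(observed))

def rmse(modeled, observed):
    """Root mean square error: standard deviation of the residuals about zero."""
    return math.sqrt(sum((m - o) ** 2 for m, o in zip(modeled, observed))
                     / len(observed))

p_mod = [100.0, 210.0, 290.0]
p_obs = [100.0, 200.0, 300.0]
# The +10 and -10 residuals cancel in MBE/NMBE but not in RMSE, which is why
# both bias and spread metrics are reported in this study.
```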

| PLANE OF ARRAY IRRADIANCE TRANSPOSITION MODEL COMPARISON
The Perez model has several submodels, and the first investigation was to run all submodels to determine the best performing one to compare against the other transposition models tested in this study. The submodels with the best performance were phoenix1988 and albuquerque1988, which had the lowest RMSE and MBE, respectively.
The MBE and RMSE for these models are plotted in Figure 4. Since these locations (i.e., Phoenix and Albuquerque) are close to the systems being investigated, this result was expected. Therefore, the submodel selected for use in the main comparison was albuquerque1988.

| Transposition model comparison
The MBE and RMSE values for all models are shown in Figure 5.

| CELL AND MODULE TEMPERATURE MODEL COMPARISON
When comparing the steady-state temperature models, all cell temperature model outputs were converted to module temperature. The models' performance under both steady-state and transient assumptions was compared.

| Steady-state modeling
For all cell and module temperature models, the mean and median residuals ranged from approximately −6.5°C to 2.7°C when all systems were considered. Not all models performed similarly on a given system; this shows that model performance was more dependent on the model parameters than on the specific PV technology. Figure 7 shows the model performance per system, in which most models underestimate temperature, except Ross. In the boxplots, the triangles represent the mean, the lines within the boxes show the median, the boxes extend to the 25th and 75th percentiles, and the whiskers show the furthest outliers that are still within 1.5 times the interquartile range. The most accurate model was the PVSyst (cell temperature converted to module temperature) model, which had the lowest mean residual of −1.4°C when all systems were considered. Figure 8 shows the models' average residuals at different irradiance intervals. Similar to the performance shown in Figure 7, Ross was the only model to have a consistent positive bias. All other models had increasingly negative bias at higher irradiance intervals. This indicates that the models had a tendency to overpredict at lower irradiance levels and underpredict at higher irradiance levels.

FIGURE 4 RMSE versus MBE of the 11 Perez POA submodels, showing the similarity in performance between the albuquerque1988 and phoenix1988 models.

FIGURE 5 RMSE versus MBE of the six POA transposition models, showing that Perez-abq1988 had the lowest MBE and Klucher had the lowest RMSE.
For models requiring nominal operating cell temperature (NOCT), this value was tested against nominal module operating temperature (NMOT). Two cases were tested: (a) cell temperature models using NMOT and (b) cell temperature models using NOCT and then converting these values to module temperature using Equation (2). Case (b) resulted in lower errors and was the method used for this comparison. In the PVSyst cell temperature model, efficiency is a required input. The default efficiency value assumed in pvlib-python v0.9.3 is 0.1, but the model yielded higher accuracy when using the manufacturer's efficiency at STC. The highest accuracy was observed when the efficiency was calculated based on measured system performance and weather conditions.
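For reference, the standard NOCT-based cell temperature relation takes the form below. This is a sketch: the 45°C NOCT value is illustrative, and this generic form is not necessarily the exact implementation used in the study:

```python
def noct_cell_temp(poa, t_amb, noct, g_noct=800.0, t_noct_amb=20.0):
    """Standard NOCT-based cell temperature estimate:
    T_cell = T_amb + POA * (NOCT - 20) / 800,
    where NOCT is defined at 800 W/m2 and 20 C ambient."""
    return t_amb + poa * (noct - t_noct_amb) / g_noct

# At exactly the NOCT reference conditions, the estimate returns NOCT itself.
t_cell = noct_cell_temp(800.0, 20.0, noct=45.0)
```

Substituting NMOT for NOCT in this relation corresponds to case (a) above; case (b) instead keeps NOCT and converts the result to module temperature via Equation (2).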

| Moving beyond steady-state modeling
The temperature models were all then considered with transient assumptions. Figure 9 shows the Faiman model with both steady-

| PV PERFORMANCE MODEL COMPARISON
Although the 12 PV performance models varied widely in their inputs and calculations, the performance of the models for a given system was very similar. Figure 11 shows the NMBE for all models and systems

| CONCLUSIONS AND LIMITATIONS
This study compared 6 POA transposition models, 7 PV temperature models, and 12 PV performance models against multiyear field data from well-characterized systems in Albuquerque, NM. Overall, the models performed similarly, but differences can be seen at various times of day and irradiance conditions. As expected, using a location-specific Perez submodel improved the model's performance.
The temperature modeling comparison indicated that using a transient temperature model improves accuracy even in Albuquerque, where conditions are relatively steady. It is hypothesized that further accuracy improvements would be observed when using these models in locations with more dynamic conditions, for example, passing clouds. The PV performance modeling comparison demonstrated that using a more complex model does not guarantee greater accuracy. Using module-specific inputs (i.e., data that correspond to the modules under investigation and PAN files that have been generated for the specific installation) was more critical to getting accurate results than using a more complex model with generic spec sheet or PAN file data. When comparing the models on systems with an observable difference between measured and nameplate values, the models' error was directly correlated with the difference in module performance. This indicates that when modeling a system, a large effort should be placed on characterizing the system under investigation (e.g., via IEC 61853 and/or 61215 testing) rather than focusing only on model selection.
All temperature and performance models in this study were compared against seven c-Si systems; no thin-film modules were available. It should also be noted that the SLTE systems considered are small-scale laboratory systems that are monitored closely, and therefore common derate assumptions do not apply. For example, the shorter strings and wiring runs of typical laboratory systems differ from those seen in large-scale systems. Due to the size of large-scale systems, the potential for nonuniformities across strings and arrays would influence the modeling accuracy of both temperature and performance modeling. Furthermore, laboratory systems are continuously monitored, equipped with state-of-the-art sensors for research purposes, and cleaned and calibrated periodically. Although some of these aspects also hold true in commercial installations, others can be cost-prohibitive.
All systems have the same tilt and orientation of 35° and 180°, respectively. The systems consist only of varying types of c-Si technologies; no thin-film modules were included in the study. Furthermore, these systems are small-scale laboratory systems that are monitored closely and use periodically calibrated sensors, which means more accurate data, fewer data outages, and lower losses than typical commercial-scale systems may exhibit. The reporting period for all systems begins at their start date and ends on December 31, 2021.

TABLE 3 Necessary inputs and outputs for the PV performance models considered in this study.

Voltage
and current were measured at the string level for all systems using shunts and voltage dividers, with a combined system measurement accuracy of 99.83%. The inverter used varied among the systems, being either the SMA Sunny TriPower 15000TL-US or 20000TL-US model. Only DC current and voltage measurements were used in this study. Meteorological data were collected on site at 1 min average intervals. GHI was measured using a Kipp & Zonen CMP-21 pyranometer. Kipp & Zonen CH1 and Eppley normal incidence pyrheliometers (NIP) were used to measure DNI. To measure DHI, two Eppley Precision Spectral Pyranometers (PSP) were used, one with a shade disk and the other with a shade band. The POA irradiance was measured using a Kipp & Zonen CMP-11 pyranometer. Wind speed was measured at 10 m above ground level using a Climatronics Wind Mark III wind sensor. Pressure was measured using a MetOne BX-597A sensor. Air temperature was measured using two Climatronics aspirated shield temperature sensors. Module temperature was measured using back-of-module resistance temperature detectors (RTDs) on one module of each string.
To generate the PAN files, PanOPT®, a proprietary software developed by CFV Labs, was used. This process involved taking measured cardinal point values (Isc, Voc, Vmp, and Imp) over a temperature and irradiance matrix as described in IEC 61853-1 and optimizing the PVSyst single-diode model parameters to fit those data. The parameter fit was bootstrapped using a proprietary CFV process, similar to others described in the literature, which considers the single-diode model in various conditions (open circuit, short circuit, and maximum power point).

FIGURE 1 NMBE of three models for the Mission300 system with input data coming from the module-specific measured data and the manufacturer-supplied nameplate data. The mean values, represented by the green triangles, are shown in each box. The amount of error in the nameplate values correlates directly to the increase seen in NMBE of the Mission300 system between data sources.

FIGURE 2 Pie chart displaying the average amount of data removed by each filtering criterion and the remaining available data.

FIGURE 3 Describing the different pipelines (module-specific, weather, and irradiance data) for the PV performance modeling comparison. The dashed lines and boxes show at what step in the pipeline the comparisons described in this study are conducted.

| Choosing a Perez model

All eleven Perez submodels were tested to determine the best one to use in the overall transposition model comparison. The submodels differ in the geographical location at which their coefficients were determined.

Overall, most of the transposition models performed similarly, with an MBE within ±10 W/m², King being the only exception. The Isotropic model underestimated irradiance, whereas the King model overestimated it. The RMSE values of Perez and Klucher were lower (<40 W/m²) than those of all other models. The transposition models' performance varied at different irradiance levels, as shown in Figure 6. Isotropic, Klucher, Reindl, and Haydavies were the best performing models at very low irradiance (<150 W/m²). The King model performed much worse than the other models at low irradiance, and it consistently overestimated irradiance until it reached similar levels of NMBE at around >650 W/m². The Perez-abq1988 model exhibited better performance at the irradiance levels containing the highest proportion of the data and the most consistent performance across all irradiance levels. For these reasons, it was the transposition model chosen for use in the remainder of the study.
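As a point of reference for the simplest of these models, the isotropic-sky transposition can be sketched with the ground-reflected term using the fixed 0.189 albedo described earlier. The input irradiances and angles below are illustrative, not measured values:

```python
import math

def isotropic_poa(dni, dhi, ghi, tilt_deg, aoi_deg, albedo=0.189):
    """Isotropic-sky POA transposition: beam + isotropic sky diffuse
    + ground-reflected diffuse."""
    tilt = math.radians(tilt_deg)
    beam = dni * max(math.cos(math.radians(aoi_deg)), 0.0)
    sky_diffuse = dhi * (1.0 + math.cos(tilt)) / 2.0
    ground = ghi * albedo * (1.0 - math.cos(tilt)) / 2.0
    return beam + sky_diffuse + ground

# Illustrative clear-sky conditions at the systems' 35 degree tilt:
poa = isotropic_poa(dni=700.0, dhi=100.0, ghi=600.0,
                    tilt_deg=35.0, aoi_deg=20.0)
```

The more sophisticated models compared above (Haydavies, Reindl, Klucher, King, Perez) differ mainly in how they distribute the sky-diffuse term, adding circumsolar and horizon-brightening components that the isotropic model ignores.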

FIGURE 6 NMBE of six irradiance transposition models plotted at various irradiance intervals. Overall, the NMBE values at all irradiance levels for most models were within ±3% of the measurements when irradiance is greater than 350 W/m².

FIGURE 7 Residuals of modeled module temperature and average measured module temperature. All models but Ross had the tendency to underpredict.

FIGURE 8 Residuals of six cell/module temperature models plotted at various irradiance intervals. All models overpredicted at low irradiances, and all models, except Ross, had the opposite behavior at high irradiance.

state and transient assumptions (i.e., by applying the additive Prilliman model) for the Qpoly280 system during clear and cloudy days in January and August 2018. The steady-state model, shown in blue, is more susceptible to instantaneous irradiance changes, for example, those caused by passing clouds. The transient models, the Faiman model with additive Prilliman in red and Fuentes in green, more closely follow the shape and consistency of the measured mean RTD values in black. Figure 10 shows the changes in RMSE before (i.e., steady-state assumptions) and after applying the transient temperature model. These results indicate that considering transient behavior reduces the spread, even in Albuquerque, NM, where sky conditions are relatively consistent all year round. It is speculated that locations with more dynamic conditions would show larger improvements when applying the transient temperature model. In this case, the model with the greatest reduction in RMSE was the Faiman model. Of all models, Fuentes had the lowest RMSE, of 3.6°C.

FIGURE 9 Diurnal variation of module temperature during August clear-sky (a) and cloudy (b) days as well as January clear-sky (c) and cloudy (d) days. The mean measured module temperature of the Qpoly280 system is shown in black; the Faiman model with steady-state assumptions in blue; the Faiman model with the additive Prilliman transient model in red; and the Fuentes model in green. As expected, the importance of incorporating transient temperature modeling is more significant during dynamic weather conditions.

FIGURE 10 RMSE of steady-state and transient temperature models, in which the transient assumptions improved all models, the best being Faiman. The Prilliman model was not applied to Fuentes because it already includes transient assumptions in its default version. Fuentes exhibited the lowest overall RMSE of 3.6°C.

after a flat 2% derate was applied to account for degradation, soiling, wiring losses, and so forth. All models exhibited first and third quartile NMBEs within ±4.2%. The average NMBE for all models was within ±2.3% of the measured values. The simplest model, PVWatts, considered only two module-specific inputs: the STC power and the temperature coefficient of power. Even so, this model performed on par with, and sometimes exceeded, the performance of more detailed models, like PVSyst. This is another indication that sourcing the module input data from the specific system (and not a generic spec sheet or PAN file) may be more important than the model itself. If a module's performance closely matches the specification sheet, this may not be the case. Figure 12 shows the models' average NRMSE versus NMBE. The NRMSE spread is also tight, varying from 4.4% to 4.6%. While this NRMSE range was low, it is still clear that models of similar type tended to cluster together, like the matrix models or CEC and Desoto.

Figure 13 shows the NMBE for all models at various irradiance levels for the Panasonic325 system. The amount of data