Modelling and forecasting PV production in the absence of behind‐the‐meter measurements

This paper deals with hourly day‐ahead prediction of the net electricity load at nine photovoltaic installations in a Swedish regional electricity network. The objective of the study was to develop, test, and evaluate a set of methods to predict the contribution of PV power to the grid without knowing the production and consumption “behind‐the‐meter.” An indirect and a direct approach for prediction of the net load were evaluated. For the indirect approach, a model of the gross production was first estimated based on the open‐source software PVLIB. The model was then used to predict the net load given a forecast of the gross consumption. Since we lacked a model of the latter, we used a “perfect forecast,” in terms of measured gross consumption, to estimate the performance of this approach. In the direct approach, a model of the net load was estimated using either linear regression or an artificial neural network. Here, the model was used for prediction of the net load without any information about the gross consumption. Both approaches rely on information from a numerical weather prediction model together with net load measurements from the previous day. Forecasts using the indirect approach with perfect information about the gross consumption resulted in a normalized (with installed nominal power) RMSEn of 11%. The direct approach with the artificial neural network also resulted in an RMSEn of 11%, even though it did not have any information from behind the meter. Linear regression had an RMSEn of 12%.

is needed along with the NWP data. Many behind-the-meter (BTM) systems produce electricity primarily for on-site use before delivering excess energy to the grid, and often, only data about installed nominal power and the address is recorded.
Moreover, measurements of BTM PV generation are scarce. What is available is the measured net load, which does not equal actual electricity consumption/production since some portion of it may come from BTM solar PV generation. In Sweden, for example, it is only mandatory for the distribution system operator to make hourly measurements of the feed-in (negative net load). This causes a disconnect between the measured load and the actual electricity demand as predicted by models based on load measurements before the advent of noticeable amounts of solar PV. As an example, none of the California Independent System Operator load forecast models include the impact of BTM solar PV. 1 Re-estimation of these traditional load models without added explanatory variables will not solve the problem since they were not developed with BTM consumption in mind. This problem will be worsened given a growing portion of weather-dependent BTM battery storage and vehicle charging in combination with time-dependent electricity rates.
The aim of the study presented in the paper was to develop, test, and evaluate a set of methods for predicting day-ahead net load when BTM PV measurements are unavailable. It is a first step towards the goal of predicting net load at different scales in a regional electricity network, eg, in terms of total production and production per entry point or per secondary substation. Previous work has tackled the general lack of BTM measurements of PV production either by trying to disaggregate the net load into gross production and gross consumption [3][4][5] or by upscaling model output from a few representative sites where BTM measurements are available (eg, previous studies [6][7][8]). Not much work has been done on the former approach, as confirmed by others such as van der Meer et al 4 and Wang et al. 5 Finding representative sites, as required by the latter approach, is difficult in Sweden, where there are no requirements on measuring the gross production and the number of sites owned and supervised by the regional grid operators is small.
Here, we describe two ways to forecast the net load. The first one is an indirect method where net load is partitioned into gross production and gross consumption. The gross production was modelled with PVLIB Python, 9 developed at Sandia National Laboratories to simulate photovoltaic energy systems.
It is open-source software and can be downloaded from http://pvpmc.sandia.gov/applications/pv_lib-toolbox. The gross consumption can either be modelled with an existing load model, based on data prior to the PV installation, or with a new load model whose parameters are estimated together with the parameters of the PVLIB model. The latter method makes it possible to take into account possible changes in the consumption pattern following a PV installation.
The second approach is a direct approach where statistical models in terms of linear regression and an artificial neural network were used to estimate forecast models of the day-ahead net load. Neither information about gross production nor gross consumption was used in this approach. As a result, this model can be adapted to any changes in the consumption pattern following a PV installation. A possible drawback is that this approach does not provide information about the gross consumption, ie, the total energy demand at the site.
In Section 2, we present the data used for this study, and in Section 3, we go into detail concerning the estimation and use of the PV models for the two approaches mentioned above. The approaches were evaluated over an 8-month period, March to October, during which solar PV production is most relevant in Sweden. Results from this evaluation are presented in Section 4. The paper ends with Section 5 on discussions and conclusions.

DATA
In this section, we describe the data used for the study. This was done in order to end up with similar probability distributions for the time of year in the training and evaluation data sets.

Measured production and consumption
The electricity measurements were provided by Tekniska verken, who are responsible for the electrical grid in the Municipality of Linköping, parts of the Municipality of Mjölby, and large parts of the Municipality of Katrineholm in Sweden. The data came from 220 PV installations connected to the grid; see Figure 1. Hourly measurements were made of the electricity transported to (feed-in) or from (net load) the grid.
In this study, we made use of nine out of these installations that, in addition, have BTM measurements of hourly gross consumption and gross production. This allowed us to develop and evaluate methods for predicting the BTM PV power production as well as its contribution to the grid.
In this paper, we denote the gross consumption with $c_g$ and the gross PV production with $p_g$. The feed-in is denoted with $f$, and here it can be positive or negative (the latter indicating a positive net load). With this notation, we have, for any given hour $t$, that the feed-in is given by the difference between the gross production and the gross consumption:

$$f(t) = p_g(t) - c_g(t) \qquad (1)$$

The measurements represent the mean values for the intervals 00 − 01, … , 23 − 24 UTC. The installations are located on the roofs of five households, four apartment complexes, and one office building. The consumption pattern is different for weekdays and weekends.
Here, we restricted ourselves to looking at weekdays. For the general case, another model is needed to cover weekends and holidays.

LOAD FORECAST MODELS
The statistical load forecast models that are used in Sweden today are intended to describe the customers' electricity consumption. With the introduction of more BTM solar PV, these models need to be extended to include PV electricity production and hence provide an estimate of the net electricity load.
Here, we studied two approaches towards this goal. The indirect approach complements an existing load model with a model for the BTM gross PV production. The net load is then given by subtracting the PV production from the original load model. The direct approach replaces the existing load model with a new one that also takes into account the weather variables that affect the consumption and PV production. As a reference, we also included a persistence model, ie, that tomorrow's hourly gross consumption pattern will equal today's.
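The persistence reference can be illustrated with a minimal sketch. The function name and the synthetic data are hypothetical, and the sketch assumes the series has already been split into consecutive days of 24 hourly values (in a weekday-only data set, "the previous day" would then mean the previous weekday).

```python
def persistence_forecasts(daily_series):
    """Pair each day's observations with a forecast equal to the previous day.

    daily_series: list of days, each a list of 24 hourly values.
    Returns a list of (forecast, observed) tuples, one per day from day 2 on.
    """
    return [
        (list(previous), list(current))
        for previous, current in zip(daily_series, daily_series[1:])
    ]


# Three synthetic days with constant hourly values (illustration only).
days = [[0.1 * d] * 24 for d in range(3)]
pairs = persistence_forecasts(days)
print(len(pairs))  # number of days for which a persistence forecast exists
```

Such pairs can be passed to any error measure to obtain the baseline against which the other models are compared.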

Indirect approach
The indirect approach is divided into two steps. First, a model of the gross production is estimated, and then this model is used together with a forecast of the gross consumption in order to end up with a forecast of the net load. A number of different ways to proceed with these two steps are described below.

Estimation
For the indirect approach, PVLIB Python was used to model the BTM PV power production. This is an open-source community-supported tool for simulating the performance of PV energy systems. It was originally based on a toolbox developed at Sandia National Laboratories.
For our purposes, we chose a module and an inverter from the Sandia library in PVLIB that should match common installations in Sweden: SunPower_SPR_220 and ABB__MICRO_0_25, respectively. Detailed data about these and other modules and inverters provided by Sandia can be found at https://sam.nrel.gov/libraries.
The gross production at each of the nine sites was modelled with a scaling factor times the PVLIB output from one module. This is an approximation since an installation may consist of modules in different orientations and may be affected by shading. We denote the PVLIB gross production model for a single site with $p_g^m(w, t)$, where the parameter vector $w$ consists of the scaling factor together with the tilt and azimuth angles describing the orientation of the PV module.
Besides these three estimation parameters, the PVLIB model also needs inputs in terms of global and direct normal irradiances, surface pressure, air temperature, wind speed, longitude, latitude, and time of day. All the weather-related information was obtained from archived NWP forecasts.
In order to estimate the parameter vector $w$, we need information about the gross production. Here, we propose the use of a cost function based on the daily load curve. This curve, with one mean value of the gross consumption for each hour of the day, should be easier to model than the full hourly time series covering all 8 months. To our knowledge, this is a novel way to estimate a model for the gross production. Three estimation methods were studied:

1. The first method assumes that there is a daily load curve available from an existing load model or load measurements prior to the PV installation. This daily load curve is denoted

$$\bar{c}_g(h) \qquad (2)$$

where the bar indicates a mean value. The corresponding mean daily gross production curve from PVLIB is given by

$$\bar{p}_g^m(w, h) = \frac{1}{n_D} \sum_{d \in D} p_g^m(w, t_{d,h}) \qquad (3)$$

where $n_D$ is the number of days in the training data set $D$, $t_{d,h}$ denotes hour $h$ of day $d$, and the hourly intervals refer to $h = 00-01, \ldots, 23-24$ UTC. The parameter vector $w$ is then estimated by minimizing the summed squared difference between the modelled and observed daily cycle of the mean hourly feed-in using a Nelder-Mead simplex algorithm (Python function scipy.optimize.fmin):

$$\hat{w} = \arg\min_{w} \sum_{h} \left[ \bar{p}_g^m(w, h) - \bar{c}_g(h) - \bar{f}_o(h) \right]^2 \qquad (4)$$

Here, the subscript $o$ is used to indicate the corresponding daily curve for the mean observed feed-in:

$$\bar{f}_o(h) = \frac{1}{n_D} \sum_{d \in D} f_o(t_{d,h}) \qquad (5)$$

Note that in our case, we used observations of the gross consumption from the period March to October to calculate the daily load curve, ie, from the same time period as we have information about the net load. Hence, the results we present here based on this method will be better than what will be possible to achieve in the general case when information about gross consumption and feed-in will come from different time periods.
2. The second procedure is to look for the parameter vector that results in the smoothest daily load curve. This is a version of one of the methods from the paper by Sossan et al 3 where the smoothness is measured using a differentiated time series.
Here, we differentiate the daily load curve: We then want to minimize the sum in Equation 6 given our model of the daily load curve: Hence, estimation of the parameter vector w means solving the following optimization problem: For this minimization, we need to employ an algorithm for constrained optimization since we want to restrict the parameter vector to regions where the resulting daily load curve is nonnegative. Another alternative is to add a penalty term to Equation 8 that grows large whenever the load curve becomes negative.
In our case, we added a penalty term (a cost of 1000 for each negative parameter) and reapplied the Nelder-Mead simplex algorithm.

A third method is
The vector u is then concatenated to the vector w (save the scaling parameter) describing the PVLIB model to solve the combined estimation problem as given by The reason for removing the scaling factor from the PVLIB parameter vector is to avoid redundancy since another scaling factor is introduced via the principal component coefficients in u. Again, constrained optimization or an additional penalty term is called for to ensure that the parameter vector u results in a model of the daily load curve that is nonnegative. In our case, we added the same penalty term and used the same minimization algorithm as described before.
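The first two estimation methods can be sketched on synthetic data as follows. This is a minimal illustration under several assumptions, not the implementation used in the study: `toy_pv` is a hypothetical half-cosine stand-in for the PVLIB output with only two parameters (a scale and a peak hour), the load curve is deliberately flat, the smoothness measure is taken to be the sum of squared hour-to-hour differences, the penalty of 1000 is applied per negative hourly load value, and a coarse grid search replaces the Nelder-Mead simplex algorithm so that the sketch needs only the standard library.

```python
import math

HOURS = range(24)


def toy_pv(scale, peak_hour, hour):
    """Hypothetical stand-in for PVLIB: a half-cosine bump, zero at night."""
    return scale * max(0.0, math.cos(math.pi * (hour - peak_hour) / 12.0))


def cost_method1(w, mean_load, mean_feed_in_obs):
    """Summed squared difference between modelled and observed mean feed-in."""
    scale, peak = w
    return sum(
        ((toy_pv(scale, peak, h) - mean_load[h]) - mean_feed_in_obs[h]) ** 2
        for h in HOURS
    )


def cost_method2(w, mean_feed_in_obs, penalty=1000.0):
    """Roughness of the implied load curve plus a penalty per negative hour."""
    scale, peak = w
    # Implied load curve: modelled production minus observed feed-in.
    load = [toy_pv(scale, peak, h) - mean_feed_in_obs[h] for h in HOURS]
    roughness = sum((load[h] - load[h - 1]) ** 2 for h in range(1, 24))
    return roughness + penalty * sum(1 for c in load if c < 0.0)


def grid_search(cost, *args):
    """Exhaustive search over a coarse (scale, peak hour) grid."""
    grid = [(0.5 * k, float(p)) for k in range(1, 11) for p in range(9, 16)]
    return min(grid, key=lambda w: cost(w, *args))


# Synthetic daily curves: flat 1 kW load, PV with scale 3.0 peaking at hour 12.
mean_load = [1.0] * 24
mean_feed_in_obs = [toy_pv(3.0, 12.0, h) - mean_load[h] for h in HOURS]

w1 = grid_search(cost_method1, mean_load, mean_feed_in_obs)
w2 = grid_search(cost_method2, mean_feed_in_obs)
print(w1, w2)
```

With a flat synthetic load, both cost functions are minimized at the parameters used to generate the data. With a real load curve, the smoothness criterion is only as good as the assumption that the true consumption is smooth.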

Forecasting
Once the optimal parameters of the gross production model have been estimated using one of the above methods, we can proceed and predict the feed-in given a forecast of the day-ahead gross consumption:

$$\hat{f}(t) = p_g^m(\hat{w}, t) - \hat{c}_g(t) \qquad (11)$$

where $\hat{w}$ denotes the estimated parameter vector and $\hat{c}_g$ the forecast of the gross consumption. We then face the problem of finding a forecast model for the gross consumption. A natural choice is to use an existing load model for this purpose. However, the drawback of such a procedure is that an existing model will not take into account possible changes in the consumption pattern after the PV installation.
Another way forward is to use the model for the gross production in order to find the corresponding gross consumption using Equation 11.
Then, a new load model for the gross consumption can be estimated, including the effect of BTM PV generation. Such a procedure is however beyond the scope of this article.
For the present study, we instead used a perfect forecast of the gross consumption. This should provide us with a bound on the performance that is possible to achieve with the indirect approach.
The perfect forecast is given by the measured gross consumption at the time when the forecast is valid.
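The forecasting step, and the way the same relation can be turned around to recover the gross consumption, can be sketched as follows. The function names and the numbers are hypothetical; the only relation used is that feed-in equals gross production minus gross consumption.

```python
def forecast_feed_in(pv_model_forecast, consumption_forecast):
    """Feed-in forecast: modelled gross production minus gross consumption."""
    return [p - c for p, c in zip(pv_model_forecast, consumption_forecast)]


def implied_consumption(pv_model, feed_in_obs):
    """Gross consumption implied by the PV model and the observed feed-in."""
    return [p - f for p, f in zip(pv_model, feed_in_obs)]


# Illustrative values in kW for three hours of one day.
pv_model = [0.0, 1.5, 3.0]
measured_consumption = [0.5, 1.0, 1.0]  # the "perfect forecast"

feed_in_hat = forecast_feed_in(pv_model, measured_consumption)
print(feed_in_hat)
print(implied_consumption(pv_model, feed_in_hat))
```

In the perfect-forecast setting, the consumption term carries no error, so any remaining forecast error comes from the production model alone.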

Direct approach
As an alternative to the indirect approach, we modelled the net load from the PV installation directly. This means replacing an existing, non-PV-aware, load model with one that also takes into account the weather variables that affect the PV production. Here, the idea is to fit a parametric model to hourly data of the net load in order to get as many degrees of freedom as possible for the regression. We considered two regression alternatives: linear and nonlinear regression with an artificial neural network (ANN).

Linear regression
The linear and the ANN model (to be described next) have different parameters but share the same input. The input vector, $x(t)$, for these two models consists of NWP forecast data for a given time tomorrow together with measured feed-in from the given time today. The idea behind this set-up is that the model should be able to make a connection between today's weather and the feed-in. For the linear model, we also added a constant to the input vector in order for the model to be able to add a bias. The linear forecast model is then a weighted sum of the elements of this extended input vector. We assume that the residual error is described by a normal distribution and employ a linear regression method to estimate the model parameters (Python function numpy.linalg.lstsq).
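A least-squares fit with a bias term can be sketched as below. This is an illustration under assumptions rather than the set-up of the study: it uses two hypothetical inputs (a forecast irradiance and today's feed-in) instead of the full input vector, noise-free synthetic targets, and the normal equations solved by Gauss-Jordan elimination so that it runs without NumPy, whereas the study used numpy.linalg.lstsq.

```python
def fit_linear(inputs, targets):
    """Least-squares weights for targets ~ [inputs, 1] via the normal equations."""
    rows = [list(x) + [1.0] for x in inputs]  # append the constant (bias) input
    n = len(rows[0])
    # Augmented system (A^T A | A^T y).
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * y for r, y in zip(rows, targets))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(aug[k][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for k in range(n):
            if k != col:
                factor = aug[k][col] / aug[col][col]
                aug[k] = [a - factor * b for a, b in zip(aug[k], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def predict(theta, x):
    """Linear forecast: weighted sum of the inputs plus the bias weight."""
    return sum(w * v for w, v in zip(theta, list(x) + [1.0]))


# Synthetic data: feed-in tomorrow = 0.004 * irradiance + 0.5 * feed-in today - 0.3.
samples = [(float(irr), float(f)) for irr in (0, 200, 400, 600, 800) for f in (-1, 0, 1, 2)]
targets = [0.004 * irr + 0.5 * f - 0.3 for irr, f in samples]

theta = fit_linear(samples, targets)
print([round(t, 6) for t in theta])
```

For larger or poorly conditioned input sets, an SVD-based solver such as numpy.linalg.lstsq is the more robust choice than forming the normal equations explicitly.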

Artificial neural network
Using machine learning to train nonlinear models has a long history within the area of energy forecasting. 16, 17 Here, we used an off-the-shelf ANN from TensorFlow 18 to see if it could perform better than the linear model.

Using the standard feed-forward ANN estimator called DNNRegressor 19 from TensorFlow, we set up a network with a three-layer feed-forward topology with one input, one hidden, and one output layer. In our set-up, it had 11 inputs, 32, 64, or 96 nodes in the hidden layer, and one node in the output layer.
Determining the number of neurons in the hidden layer(s) is a trade-off between the network's ability to generalize from the training data (not too many neurons) and its representative power (not too few). Here, we were guided by the empirical relation for the number of hidden layer neurons, $n_h$, proposed by Kalogirou 17 :

$$n_h = \frac{n_{in} + n_{out}}{2} + \sqrt{n_{train}} \qquad (15)$$

Here, $n_{in}$, $n_{out}$, and $n_{train}$ denote the number of inputs, the number of outputs, and the size of the training data set, respectively. In our case, the number of inputs equaled 11, and we had one single output (the feed-in).
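The number of cases in the training data set was 3360. Hence, the relation suggests about 64 neurons in the hidden layer. The network weights were estimated by minimizing the cost function (17) using an iterative procedure. The vectors $w_h$ and $w_o$ contain the weights for the hidden and output layers, respectively.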
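As a worked check of the hidden-layer sizing, the sketch below evaluates the relation usually attributed to Kalogirou, assumed here to be half the sum of inputs and outputs plus the square root of the number of training cases. The function name is hypothetical.

```python
import math


def hidden_neurons(n_in, n_out, n_train):
    """Empirical hidden-layer size: (n_in + n_out) / 2 + sqrt(n_train)."""
    return 0.5 * (n_in + n_out) + math.sqrt(n_train)


n_h = hidden_neurons(n_in=11, n_out=1, n_train=3360)
print(round(n_h, 2))
```

With 11 inputs, one output, and 3360 training cases, the relation gives a value just below 64, which is consistent with testing 32, 64, and 96 hidden nodes around that estimate.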
We ran the minimization for 10 000 iterations (saving the result at each 100th iteration) at which point the error for the evaluation data sets had started to increase for all sites. When this happens, the generalization capability of the ANN starts to deteriorate. For the optimal prediction network, we selected the parameters from the iteration for which the evaluation error had a minimum, ie, just before the ANN starts to perform worse on the independent data.
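The selection of the network parameters from the saved iterations can be sketched as follows. The function name and the error values are hypothetical, and the sketch assumes the first saved result corresponds to iteration 100.

```python
def select_checkpoint(eval_errors, save_every=100):
    """Return (iteration, error) of the saved result with the lowest evaluation error."""
    best = min(range(len(eval_errors)), key=lambda i: eval_errors[i])
    return (best + 1) * save_every, eval_errors[best]


# Illustrative evaluation errors at iterations 100, 200, ..., 600.
errors = [0.20, 0.15, 0.12, 0.11, 0.115, 0.13]
best_iteration, best_error = select_checkpoint(errors)
print(best_iteration, best_error)
```

Because the checkpoint is chosen on the evaluation data, the reported evaluation error for the ANN is the minimum over the saved iterations.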

RESULTS
In order to evaluate the performance of the different models, we computed a set of error measures. We compared the models by looking at the root mean squared error (RMSE) normalized by the nominal installed power (RMSEn), which is a common performance measure within the PV forecasting community. 6 We also calculated the square of the Pearson correlation coefficient ($r^2$) between the modelled and observed load. Only values when the sun was above the horizon were included in the calculations.

The assumption made using method 2 was that the load curve should be smooth and free from discontinuities. This was not the case for

The daily cycle of the mean RMSEn for the gross production forecasts is illustrated in Figure 3. Again, the PVLIB model for the gross production was estimated based on the gross production from the three different methods described in Section 3.1.1. The error corresponding to a persistence model is included to show the improvement over a naive approach. The error is shown as a function of forecast length. All forecasts were initialized at 00 UTC, and the PVLIB model outperformed persistence for all forecast lengths.
In the following, we used the model for the gross production

For the linear regression and ANN models, the error was somewhat larger in the afternoon than in the morning. This can be explained by the NWP forecast deteriorating with the length of the forecast.
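The error measures used in this section can be sketched as follows. The function names, the daylight mask, and the numbers are hypothetical; the sketch only assumes that RMSEn is the RMSE divided by the installed nominal power and that $r^2$ is the squared Pearson correlation, both computed over daylight hours.

```python
import math


def daylight_only(values, sun_up):
    """Keep only the values for hours when the sun is above the horizon."""
    return [v for v, up in zip(values, sun_up) if up]


def rmse_n(forecast, observed, nominal_power):
    """Root mean squared error normalized by the installed nominal power."""
    mse = sum((f - o) ** 2 for f, o in zip(forecast, observed)) / len(observed)
    return math.sqrt(mse) / nominal_power


def r_squared(forecast, observed):
    """Square of the Pearson correlation coefficient."""
    n = len(observed)
    mean_f = sum(forecast) / n
    mean_o = sum(observed) / n
    cov = sum((f - mean_f) * (o - mean_o) for f, o in zip(forecast, observed))
    var_f = sum((f - mean_f) ** 2 for f in forecast)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov ** 2 / (var_f * var_o)


# Illustrative hourly values in kW; the first hour is before sunrise.
forecast = [0.0, 1.0, 2.0, 3.0, 4.0]
observed = [0.0, 1.0, 2.0, 3.0, 6.0]
sun_up = [False, True, True, True, True]

f_day = daylight_only(forecast, sun_up)
o_day = daylight_only(observed, sun_up)
print(rmse_n(f_day, o_day, nominal_power=10.0), r_squared(f_day, o_day))
```

Excluding night hours matters for the comparison: production is zero at night, so including those hours would lower the RMSEn without reflecting any forecasting skill for PV.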

DISCUSSION AND CONCLUSION
In this paper, we compared how a set of indirect and direct approaches performed on the task of predicting the net load for the coming day at nine individual sites in a Swedish regional electricity network. The input to the presented forecast models consisted of information about measured feed-in from the previous day along with an NWP forecast for the next 24 hours. The fact that the study was done using data from Sweden should not severely limit the generality of the results. However, the results are probably of limited interest for countries or regions where BTM measurements are readily available. Moreover, in a real situation, the forecast has to be available well before the electricity market closes at about midday, and hence the forecast horizon needs to be stretched to at least 42 hours starting from 06 UTC. Such considerations will be the subject of further studies along with upscaling of the forecast to an area of a regional electricity network.
The indirect approach relies on a model of the gross PV production.
Three methods for estimating such a model were described, all based on an auxiliary model of the daily load (gross consumption) curve.
This is a novel approach as far as we know, and it allows us to come up with simpler models for the gross consumption than if all hourly values need to be described. The first method uses measurements of the gross consumption prior to the installation of the PV system. A drawback is that it will not capture any changes to the consumption pattern once the PV installation is in place. A second method is based on the assumption that the daily load curve should be as smooth as possible, as suggested by Sossan et al. 3 This works well for the consumption pattern associated with households and apartments, but not for installations servicing offices with steep changes in the consumption connected to the office hours. The best performing method in the previous study 3 was based on separating production and consumption in the frequency domain. However, the introduction of batteries and time-dependent electricity tariffs will strive to make the consumption fit the production as closely as possible, rendering such a strategy less promising for the future. We instead propose a novel third method.
Here, van der Meer et al 4 are restricted to predictions for the next time step (with arbitrary resolution) based on production data at the current and previous time steps. Also, they do not use any exogenous input, like NWP forecasts, to explicitly take weather variability into account. Their methods are therefore not applicable to day-ahead power prediction where forecasts with hourly resolution can be required for the next 42 hours. In Wang et al, 5 the hourly net load is decomposed using an empirical model for the gross production and an ANN for the gross consumption. However, we question whether this approach can be robust.
In principle, an ANN can represent any function. Hence, it should be possible to estimate ANN parameters so that the gross consumption matches any gross production pattern associated with a given installed capacity and model orientation in the empirical model.
Any indirect approach needs to be complemented with a day-ahead forecast of the gross consumption in order to forecast the net load.
In this study, we used a perfect forecast of the gross consumption.
This means that both the model of the gross production and the gross consumption itself were based on perfect information. Hence, the performance of the indirect approach presented here should be seen as representing an upper bound. However, once a model of the gross production is in place, it can be used to obtain information about the gross consumption. For future work, this can in turn be used to estimate a model for the gross consumption.
For the direct approach, we used either linear regression or an off-the-shelf ANN from TensorFlow. The latter was included to see what kind of improvements a nonlinear model could offer. The direct approach estimates the model parameters based only on information about the hourly net load. It turned out that the performance of the linear regression model was similar to that of the indirect method based on perfect information about the gross consumption during most of the day. The direct approach based on the ANN even performed better than this indirect approach when the PV production peaks between 08 and 16 UTC. Averaged over all hours when the sun was above the horizon, the RMSEn for the best indirect approach using PVLIB and the direct approaches with linear regression and an ANN were 11%, 12%, and 11%, respectively. This is a substantial improvement over the RMSEn of 20% resulting from the baseline approach represented by a persistence forecast of the net load.
An ANN can theoretically represent any function, and hence it should be able to perform as well as or better than any indirect approach given that it is fed with sufficient amounts of data describing the problem. This line of reasoning seems to be in contrast with the results in Wang et al. 5 They compared an indirect and a direct approach, both based on ANNs with the same complexity, and got better results for the indirect method. A possible explanation could be that they in fact used a sequence of two ANNs for their indirect approach but only one for the direct.
Earlier studies have shown that the choice between different nonlinear models (eg, the ANN) is not critical. 20 However, better performance could perhaps be achieved if a recurrent structure is tried. Time correlations could then be exploited by allowing forecasted output up until time t to be used as input for the forecast ahead of time t. What to include in the input vector in general is another question for further investigations. Here, we picked information we thought was reasonable. No evaluation was made regarding how useful different input parameters were for the prediction. Future work should also look at using input from probabilistic NWP forecasts. This should be a way to describe and account for uncertainties in the solar radiation forecasts.
To conclude, we have presented a novel way to indirectly estimate a model for BTM gross PV production. We have also shown that forecasting the net load directly works as well as, or during most of the day better than, forecasting the gross production and gross consumption separately. This is the case even if the forecast of the gross consumption is replaced with actual measurements.