A data‐driven modeling and analysis approach to test the resilience of green buildings to uncertainty in operation patterns

Green building design is a promising approach to reduce the energy intensity of the building sector. However, green buildings often show important discrepancies between their predicted and actual energy use levels, in part due to varying operation patterns that are difficult to predict during design. This paper presents a data‐driven modeling and analysis approach to test the resilience of green‐certified buildings to uncertainty in the operation of building systems. Using building energy modeling coupled with an extensive empirical Monte Carlo analysis scheme, the framework quantifies and compares the response of a building to uncertainty in key technical and operational features before and after the adoption of green building certification specifications. The framework is illustrated and validated through a case study of an archetype commercial building located in the extreme hot climate of Abu Dhabi, UAE. Results show that adopting the green building features of the local “Estidama” building code reduces energy demand by an average of 17%. More importantly, the variability in demand is reduced (P < .05), confirming the increase in building resilience to uncertainty in design and operation factors. Finally, the techno‐economic potential for solar photovoltaic (PV) adoption is also assessed, showing an estimated 16% reduction in capital costs.


| Problem statement
Despite advancements in green building design and certifications, there is a growing body of evidence that buildings are showing actual energy levels that differ from the ones that were predicted during the design phase. The mismatch can be significant, with buildings having been shown to consume up to 2.5 times more energy than their designed values. 17 More importantly, this mismatch is not only occurring in regular buildings but also in green-certified buildings with low-energy designs and technologies. [18][19][20] Among various factors, the performance gap can be attributed to the lack of information and knowledge of engineers and modelers during the design phase about parameters related to the actual performance of various building systems. [21][22][23] The parameters cover design-related features of the building (eg, the performance of building envelope or chiller) and operation-related patterns (eg, HVAC settings and lighting usage). 4,12,17,19,[24][25][26][27][28][29][30] The above-mentioned studies present essential steps toward better understanding the relationships between uncertainty in operation patterns, building performance, and robust design. However, little emphasis is put on the impact of uncertainty in building operation parameters, as well as design parameters, on the performance of green-certified buildings. Put differently, the link between the studies observing a performance gap in green-certified buildings 19,20 and those exploring building performance and robust design under uncertainty (eg, Ouf et al 31 and Karjalainen 32 ) remains unclear.

| Objectives and research questions
This research proposes a modeling framework to quantify the impact of uncertainty in technical and operational parameters on green-certified building performance. Uncertainty Analysis (UA), in this context, refers to quantifying the variability in outputs (eg, building energy performance) due to uncertainty or lack of knowledge in inputs (eg, design and operation parameters). 33 The framework is comparative in nature, where a building is studied before and after the adoption of the green building labeling and corresponding building characteristics. Such an approach helps evaluate and compare building performance and resilience pre- and post-green building certification, answering four main research questions:
1. How much energy savings are expected after adopting the green building certification specifications?
2. Are the savings maintained after subjecting the building to uncertainty in its technical and operational parameters?
3. Overall, is the building postcertification less sensitive to uncertainty in its parameters compared to the precertification stage (ie, more robust and resilient)?
4. Is the techno-economic potential of using solar PV panels higher after adopting the green building certification specifications?
The framework is illustrated and validated through a case study of an office building in Abu Dhabi, UAE, simulated with and without the characteristics of the local green building rating, the Pearl Rating System (PRS) of Estidama, a green building certification tool used in Abu Dhabi, UAE. Finally, insights are provided on the effectiveness of the studied green building certification to reduce energy consumption and mitigate performance risk, while acknowledging and discussing the limitations of current building energy modeling tools.

| Performance gap and its drivers
Many researchers have examined the performance of buildings in terms of energy use, with most studies finding a performance gap in buildings, whether green-certified or not. In the case of green buildings, many studies question whether they are saving energy or not. Menassa et al 19 conducted an energy consumption analysis for 11 United States Navy (USN) buildings that have LEED-certified ratings, comparing them with conventional buildings from the USN and the United States Marine Corps (USMC). The study showed that 9 out of 11 of the LEED-certified buildings have energy savings lower than the expected percentage (ie, 30%) for the electricity consumption sector. The analysis also found that the energy consumption of most of the LEED-rated buildings was higher than the average in the country. In a similar study, Newsham et al 20 found that "28-35% of Leadership in Energy and Environmental Design (LEED) buildings used more energy than their conventional counterparts." Such findings confirm that the performance gap is occurring in both traditional and green buildings, 34 which can have significant consequences on the choice and sizing of building systems, such as HVAC, or renewable energy systems, such as photovoltaic (PV) solar panels. 35 The drivers of the performance gap are further explored in the following studies.
van Dronkelaar et al 27 reviewed the predicted and measured energy consumption of 62 nondomestic buildings in the UK, observing a gap between real energy use and predicted use exceeding 34%. More importantly, the authors highlighted the gap drivers for those buildings, which were uncertainty in modeling, occupant behavior, and lack of practice in the operation phase. Their direct impact on the building's energy consumption can reach 20%-60%, 10%-80%, and 15%-80%, respectively. Factors, such as early design decisions, were found to affect the gap. The authors emphasized the need for more research efforts on the performance gap drivers by understanding the causes and working to mitigate their impacts on energy use by relying on new tools rather than calibration techniques.
An analysis of 15 schools in the UK was conducted by Demanuele et al, 34 where a sensitivity analysis was conducted to track energy performance and to identify the factors affecting energy use. The results showed that the operational issues and occupant behavior were the main factors contributing to high energy use, creating a significant impact on the performance gap. The paper concluded the work with a suggestion to expand the sensitivity analysis to cover a range of inputs instead of specific values, especially for occupant behavior, which is difficult to predict. It further highlighted the importance of energy education in improving the building's energy performance.
Menezes (2012) 17 analyzed the energy performance of an existing office building to identify the drivers of the gap between predicted and actual energy performance. The study found that the unrealistic occupant parameters' inputs, the inaccurate data about the building's main component, and the lack of communication between the owner and the building's designer were the main drivers of the energy performance's discrepancies. The paper highlighted the importance of a postoccupancy evaluation (POE) in improving the accuracy of the energy prediction models, which can be increased by 3% when combining both the energy modeling and monitoring data from the buildings. The results also highlighted the need to further investigate occupant behavior inside the buildings, in addition to connecting these data to the energy modeling programs to have more realistic predictions. Furthermore, the paper highlighted the need to apply POE to improve the communication between the designer and the occupants, thereby reducing the performance gap.

| Parametric variation and uncertainty analysis in buildings
In parallel to the study of the performance gap, other researchers have studied the impact of building design or operation parameters on building energy performance. Al Amoodi and Azar 9 conducted a comprehensive study to measure the energy performance of educational buildings in Abu Dhabi and to quantify the occupant behaviors' impact. They applied parametric variation methods, including Monte Carlo (MC) analysis, differential analysis, and fractional factorial analysis. The results showed the importance of considering the uncertainty of the building's parameters, especially the occupant parameters, which led to energy consumption variations up to ±25% from the average value.
Breesch and Janssen's 36 sensitivity and uncertainty analysis examined the performance of natural ventilation in the case of summer conditions by considering the variation in the building parameters in the simulation. The uncertainty analysis showed that thermal comfort was the most uncertain parameter with sensitivity to the single-sided night ventilation. However, the results identified a number of parameters affecting the uncertainty of building comfort, such as the solar heat gain coefficient, internal heat gains, setpoint temperatures, and wind pressure coefficients. The study showed that combining the differential measure with natural ventilation reduces uncertainty, especially in warm weather. Moreover, the study examined the impact of a hot climate on the uncertainty in achieving building comfort, which was determined to be significant. The authors further suggested improving the accuracy of the energy results by using a standard weather data set and considering warm weather data in the simulation.
Gunay et al 37 conducted a sensitivity analysis of eight operational parameters of office buildings in Ottawa, Canada. The results revealed that the air handling units' start and stop times and the ventilation rate have significant impacts on energy consumption and thermal comfort. The authors then applied a mixed-integer genetic algorithm to identify the optimal operational configurations for different (i) heating-dominated climate zones, (ii) occupancy patterns, and (iii) building envelope scenarios. An uncertainty-based simulation analysis of an office building was conducted by Wang et al 38 to study the uncertainty in the building's annual energy that was caused by two factors: weather conditions and the building's parameters. As a result, a variability of −28.7% to 79.2% was recorded in the annual energy report due to parametric variation, whereas annual energy uncertainty due to weather variations was between −4% and 6%. One of the study's key findings was that a 49% to 79% increase in energy consumption could occur as a result of poor practice during the operational phase; however, good practices can contribute 15% to 29% to reducing energy consumption. The findings of the above studies confirm that operational parameters are critical factors in building performance, motivating the need for related research efforts on topics such as occupant-centric building controls, 39 occupant-related features in common building simulation tools, 40 and occupant-centric design practices and applications. 41 In the latter, research on robust design strategies is particularly relevant to the current paper and is further discussed next.

| Robust design
Recent efforts in the literature cover robust design strategies that aim to mitigate the performance risk under uncertainty in different parameters. For instance, Ouf et al 31 combined stochastic occupant behavior modeling with building performance optimization of a single-story office building in Toronto, Canada. Comparing two occupant modeling approaches (ie, deterministic and stochastic), the authors observed differences in the optimal design strategies reached, especially when design robustness was included in the optimization objectives. Also, using optimization, Bucking 42 presented an approach to quantify the economic risks of construction for a Net Zero Energy (NZE) commercial building while accounting for variability in the assumptions used in the energy modeling process. The study found that an NZE design could not be achieved given the current market conditions. It then presented the optimal design that achieves NZE performance with the best net present value.
Karjalainen 32 studied the impact of variability in occupant behavior on building energy performance using dynamic thermal simulation models. The author considered three types of behaviors (ie, "careless," "normal," and "conscious") and two types of designs (ie, "ordinary" and "robust"). The results show that robust design solutions can make buildings less sensitive to occupant behavior, reducing its impact on energy performance. Along the same lines, Buso et al 43 explored how alternating occupant behavior patterns impact the energy performance of different envelope design strategies. A total of 15 building envelope designs were tested for an office building under three weather climates. It was found that increasing the thermal mass and reducing the transparent areas of the envelope increase the robustness of the building toward uncertainty in occupant behavior. Finally, Kotireddy et al 44 highlight that "studies on robustness assessment using scenarios in the building performance context are limited." To help address this gap, the authors compared different design robustness assessment methods, namely the "Max-min," "Best-case and worst-case," and the "Minimax regret" methods. The comparison aimed to guide decision-makers to select cost-optimal robust designs or to assess trade-offs with other performance indicators.
In summary, the review of the literature highlights important gaps in understanding the performance of green-certified buildings and their sensitivity to uncertainty in design and operation patterns. The following section details the methodology used to address the stated gaps.

| METHODOLOGY
The proposed methodology framework is shown in Figure 1 with three distinct stages. Stage 1 consists of developing a "base case" building energy model of a typical office building in Abu Dhabi, UAE, and another model of the same building but with Estidama's Pearl Rating criteria, representing the green building scenario. The latter is hereafter referred to as the "green-certified" building energy model. Stage 2 includes the parametric variation (Monte Carlo Analysis) conducted to study the effect of uncertainty in building design and operation parameters on the energy performance of the two buildings. Finally, Stage 3 covers the data analysis performed on the results of the previous stage, which includes data visualization, statistical tests, and the evaluation of solar PV potential. The following subsections detail each of the three stages.

| Building energy modeling
Two building energy models are developed in this study using the DesignBuilder software 45 as an interface of the EnergyPlus simulation engine 46 . The base case model is representative of an archetype (ie, average) office building in Abu Dhabi, UAE. Its specifications are shown in Table 1 and are based on data from a benchmarking project led by Abu Dhabi's Urban Planning Council as part of its efforts to develop the Estidama rating system. 47 A building energy model representative of the mentioned archetype office building was built and introduced in an earlier paper by Afshari et al. 24 The model was replicated in the current work, matching the energy estimates described in the previous study. 24 In order to further confirm the validity of the base case model, its estimates are benchmarked against data collected from multiple buildings as part of a previous research effort by the authors. 48 As detailed in Lin et al 48 , the primary source of data used is a database compiling the results of a representative survey of Abu Dhabi's building stock. A total of 13 buildings matched the main characteristics of the base case model, mainly its type (ie, office building) and size (ie, medium to high rise). Figure 2 shows the Energy Use Intensity (EUI) of the 13 buildings compared to the base case model. The difference between the two averages is 3%, reconfirming the representativeness of the model and its validity for further analysis.
Next, an alternative version of the model was developed, representing the building after the adoption of features from Estidama's PRS. The proposed changes in the building cover three main systems: the building envelope, the HVAC system, and the lighting system. The focus on these systems and their corresponding specifications is motivated by their anticipated impact on building energy performance. Other parameters that are not necessarily connected to building operation performance (eg, use of recycled materials) were not considered in this study since the scope is mainly on building energy performance. It is also important to note that the chosen systems/parameters and their suggested values in the building PRS documentation are independent of the level of certification (eg, Pearl-1, Pearl-2), which is a points-based system that depends on the number of PRS features that are adopted rather than their values. The base model building's performance and the target performance of Estidama's rating system for each of the three building elements are also added in Table 1.
For the envelope elements, the wall and the roof's U-values are categorized under the opaque fabric U-values, insulation roof, and mass wall. Furthermore, the glazing U-value is under the vertical fenestration U-values and metal framing glass. The Solar Heat Gain Coefficient (SHGC) depends on the projection factor; in our case, the projection factor is less than 0.25. Moving to the second element, the air conditioning's performance accounts for "air cooling" only and depends on the system's type and size; the latter in our case is between 70 kW and 223 kW based on the simulated peak cooling load. The internal lighting density is obtained from the guidance document of the Pearl Building Rating System. 49

| Experimental design and parametric variation
The choice of parameters to include in the uncertainty analysis followed two main criteria. The first criterion is that the parameters exhibit uncertainty or potential changes in their characteristics during operation, while the second is that such changes are expected to impact energy performance. A thorough review of the literature guided the shortlisting of the parameters shown in Table 2, which cover key technical and operational features of the buildings. 9,38,50,51 Given the focus of this study on building operation, only technical parameters that are affected by building operation are selected. For instance, while COP is typically considered as a design feature, a previous study conducted on buildings in the UAE showed that rusty or dirty AC equipment could cause significant decreases in their energy performance. 48 Similarly, infiltration rates can be affected by occupants (eg, window and door opening), while design airflow and cooling supply air temperatures are typically set by facility managers. As for the operational parameters listed in Table 2, their values were used to create simplified diversity profiles with specific values for occupied and unoccupied hours. It is important to note that variations within each period were not considered due to the increased modeling complexity that would result. Such simplification, however, was deemed acceptable given the comparative nature of the analysis between the base case and green-certified models, which cancels out effects (errors) that are present in both models. Similarly, additional parameters, such as the start/stop times of the air handling units or the zone-level temperature setpoint settings, were not considered and can be included in future expansions of the current work.
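The simplified diversity profiles described above can be pictured as two-level daily schedules. The following is a minimal sketch only: the function name, the occupied window (08:00 to 18:00), and the lighting fractions are illustrative assumptions and are not taken from Table 2.

```python
def diversity_profile(occupied_value, unoccupied_value, start_hour=8, end_hour=18):
    """Build a 24-value hourly profile with one level for occupied hours
    and one level for unoccupied hours (no variation within each period).

    start_hour and end_hour are hypothetical occupancy bounds.
    """
    return [
        occupied_value if start_hour <= hour < end_hour else unoccupied_value
        for hour in range(24)
    ]


# Hypothetical example: lighting runs at 90% when occupied, 10% otherwise.
lighting_profile = diversity_profile(occupied_value=0.9, unoccupied_value=0.1)
occupied_hours = sum(1 for v in lighting_profile if v == 0.9)
```

In a parametric study, only the two levels would be varied from run to run, which keeps the number of uncertain inputs per schedule at two.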
After selecting the target parameters, ranges and intervals are assigned to each parameter based on the best available knowledge from the literature (Table 2). The ranges aim to reflect the real performance of the parameters in practice, covering extreme scenarios (both the best and the worst cases) that can occur in the buildings. Furthermore, probability density functions (PDFs) are assigned to ensure a realistic selection of parameters based on their probability of occurrence in the buildings. It is important to highlight that all the content of Table 2 was obtained from the references cited in the last column (ie, without making assumptions that are not supported by data). In the few cases where a distribution is unavailable for a certain parameter, a distribution of a similar parameter is used instead. For instance, given the lack of data on the uncertainty in pump efficiency, this parameter is assigned the same triangular distribution as fan efficiency, with a ±0.16 variation from the mode. The final list of parameters and their distributions (shown in Table 2) is used in the next section to perform the uncertainty analysis for the base case model and the alternative green-certified model.

A Monte Carlo (MC) variation is conducted on both building energy models developed in this research, namely the base case and the green-certified models. In general, MC methods are computational techniques that are commonly used to estimate expectations of an output function through repeated sampling of its inputs from probability distributions. 54 In the context of this study, MC is applied by repeatedly running the building energy models for different combinations of the input parameters, with the aim of studying the resulting impacts on the energy estimates of the models. A total of 10 000 combinations of input values are generated. 
For each run, values are drawn from the probability distributions of the parameters, which are supported by previous studies in the literature (refer to Table 2).
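The sampling step can be sketched as follows. This is an illustration under stated assumptions, not the jEPlus workflow used in the study: the parameter names and the mode values are hypothetical, and only the ±0.16 spread around the mode for fan and pump efficiency comes from the text.

```python
import random

# Hypothetical parameter table: name -> (low, mode, high).
# Only the +/-0.16 spread for fan and pump efficiency is from the text;
# the modes and the setpoint range are placeholders.
FAN_MODE = 0.70
PARAMETERS = {
    "fan_efficiency": (FAN_MODE - 0.16, FAN_MODE, FAN_MODE + 0.16),
    "pump_efficiency": (FAN_MODE - 0.16, FAN_MODE, FAN_MODE + 0.16),
    "cooling_setpoint_occupied_C": (20.0, 23.0, 26.0),
}


def sample_inputs(n_runs, parameters=PARAMETERS, seed=0):
    """Draw n_runs combinations of input values from triangular distributions."""
    rng = random.Random(seed)
    return [
        {
            # random.triangular takes (low, high, mode) in that order.
            name: rng.triangular(low, high, mode)
            for name, (low, mode, high) in parameters.items()
        }
        for _ in range(n_runs)
    ]


samples = sample_inputs(10_000)
```

Feeding the same list of sampled combinations to both the base case and the green-certified model is one way to expose the two buildings to identical input uncertainty.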
The choice of the number of iterations to test (ie, 10 000) was guided by the work of Sun et al, 50 who also used 10 000 runs in a study of similar nature to the current one. However, acknowledging that the ideal number of iterations is case-specific, this study followed an incremental simulation approach starting with a batch of 1000 runs, followed by an additional 4000 runs and then 5000 runs (ie, a total of 10 000 runs). For each batch, the results (described in the upcoming sections) were analyzed to determine whether additional runs would be beneficial. Interested readers can also refer to the Appendix, where a comparison between the results of the different batches of simulations is presented. The parametric variation described in Table 2 was simultaneously applied to the base case and green-certified models. Such a parallel approach helps study and compare the effects of the same uncertainty in the input parameters on the energy performance of both buildings. For each run, the following outputs of the model are recorded:

The parametric variation is implemented using the jEPlus software, 55 which allows automating such parametric variations using EnergyPlus as the simulation engine. The inputs to the jEPlus software are (a) the EnergyPlus files of the two considered models, (b) the weather file for Abu Dhabi, (c) the distributions of the parameters to vary, (d) the parameters of the simulation, such as the number of iterations needed (ie, 10 000), (e) the outputs needed from the energy models (eg, total energy consumption and by end-use), and (f) how to summarize the output results (eg, Excel file). The total number of runs conducted is 20 000 (10 000 for each model), which took about 400 hours of computation using the workstation in our laboratory; each run took 4 minutes, and multiple processors were used in parallel.
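The incremental approach (1000, then 5000, then 10 000 cumulative runs) amounts to checking whether summary statistics stabilize as runs are added. The sketch below illustrates that check on synthetic stand-in data; the normal distribution and its parameters are placeholders, not results from the study. It also works through the runtime figures quoted above, assuming the 4 minutes refers to a single run on a single processor.

```python
import random
import statistics


def cumulative_summary(outputs, checkpoints=(1000, 5000, 10000)):
    """Mean and sample standard deviation of the first n outputs, per checkpoint."""
    return {
        n: (statistics.fmean(outputs[:n]), statistics.stdev(outputs[:n]))
        for n in checkpoints
    }


# Synthetic stand-in for simulated energy outputs (placeholder values).
rng = random.Random(1)
synthetic_outputs = [rng.gauss(7000.0, 450.0) for _ in range(10_000)]
summary = cumulative_summary(synthetic_outputs)

# Relative shift in the mean between the last two checkpoints.
mean_shift = abs(summary[10000][0] - summary[5000][0]) / summary[5000][0]

# Runtime arithmetic from the text: 20 000 runs at 4 minutes each.
total_runs = 2 * 10_000
serial_hours = total_runs * 4 / 60          # about 1333 processor-hours
implied_parallelism = serial_hours / 400    # about 3.3 concurrent runs
```

A small shift in the mean and standard deviation between consecutive checkpoints suggests that additional runs would add little information.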

| Data analysis
Prior to analyzing the results of the parametric variation, the first step is to evaluate the energy estimates of the two developed models, namely the base case model and the green-certified model. A basic comparison of the energy estimates of the two models is presented in terms of total energy consumption as well as per end-use (ie, cooling, lighting, plug-loads, fans, and pumps). Such results help estimate the expected energy savings when implementing Estidama design features when compared to the base case. They also help determine whether the distribution of energy among the different end-uses varies between the two models.
The next stage is the uncertainty analysis, which presents the results of the parametric variations applied to the models. In this study, the energy use outputs generated from the 10 000 runs for each energy model are compared and contrasted to provide insights on how the two modeled buildings perform under uncertainty in technical and operational parameters. Histograms and box-whisker plots are firstly developed using the Tableau software 56 to compare the variability in the observed energy estimates. 57 The results of the 10 000 runs for each model are visually presented side-by-side for the different output metrics that were generated by the models (the list of outputs was provided in the previous section). Such comparison helps provide initial insights on potential differences in the models' response to uncertainty in their inputs.
The visual analysis is followed by hypothesis testing using the RStudio software 58 to confirm, statistically, whether the observed differences are significant or not. Two tests are applied for each of the outputs generated by the model. The first is a "two-sample hypothesis T-test" used to compare the two averages of the two samples (obtained from the two models). More specifically, after the differences in the means are computed, one-tailed T-tests are conducted to confirm that the observed differences (either positive or negative) are statistically significant, with a 95% confidence. The second test is a "two-sample hypothesis F-test" used to compare the variances of the samples, which is a measurement of the variability or spread in the energy outputs of the models. Here again, once the differences in the variances are computed, one-tailed F-tests are conducted to confirm the statistical significance of the differences at 95% confidence level. The tested hypotheses and evaluation criteria 59 are shown in Table 3.

The last stage of the analysis consists of comparing the two buildings in terms of their techno-economic potential for solar PV generation. A techno-economic analysis is conducted using the System Advisor Model (SAM) from the US Department of Energy's National Renewable Energy Laboratory. 60 SAM is a computer model developed to facilitate decision-making for people involved in the renewable energy industry. 61 SAM simulates the performance of different renewable energy systems, including photovoltaic, wind, geothermal, and biomass power systems. SAM is equipped with financial analysis and simulation tools that facilitate financial, parametric, and sensitivity analyses for different scenarios, such as when the user wants to study a standalone system or a system that is connected to the grid.
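The two hypothesis tests described above (comparison of means, then comparison of variances) can be sketched in plain Python. Several assumptions apply: the study ran its tests in R, whereas this sketch replaces the exact t and F distributions with large-sample normal approximations, which is reasonable only for sample sizes in the thousands; the test directions (base case larger than green-certified) are assumed for illustration; and the data are synthetic placeholders.

```python
import math
import random
from statistics import NormalDist, fmean, variance


def one_tailed_mean_test(base, green):
    """Test H1: mean(base) > mean(green) using an unpooled standard error.

    The p-value uses a normal approximation to the t distribution.
    """
    std_error = math.sqrt(variance(base) / len(base) + variance(green) / len(green))
    t_stat = (fmean(base) - fmean(green)) / std_error
    return t_stat, 1.0 - NormalDist().cdf(t_stat)


def one_tailed_variance_test(base, green):
    """Test H1: var(base) > var(green) from the variance ratio F.

    The p-value uses the large-sample result that ln(F) is approximately
    normal with variance 2/(n1 - 1) + 2/(n2 - 1) for normal data.
    """
    f_stat = variance(base) / variance(green)
    std_error = math.sqrt(2.0 / (len(base) - 1) + 2.0 / (len(green) - 1))
    z = math.log(f_stat) / std_error
    return f_stat, 1.0 - NormalDist().cdf(z)


# Synthetic placeholder outputs for the two models.
rng = random.Random(42)
base_runs = [rng.gauss(7000.0, 450.0) for _ in range(2000)]
green_runs = [rng.gauss(5800.0, 300.0) for _ in range(2000)]

t_stat, p_mean = one_tailed_mean_test(base_runs, green_runs)
f_stat, p_var = one_tailed_variance_test(base_runs, green_runs)
```

A p-value below 0.05 in each test corresponds to rejecting the null hypothesis at the 95% confidence level used in the study.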
A standalone photovoltaic/battery system was assumed for both cases (base case and green-certified buildings). The PV system was sized based on the average demand profile so that the PV can generate enough electricity to meet the demand during the day and to store excess electricity in the battery to meet the building's demand at night. The battery of choice was an NMC Li-ion battery, which is made from a stack of cells that have nickel-manganese-cobalt in the cathode layer, graphite in the anode layer, and solid lithium salts in the electrolyte layer. The PV size in kW was determined such that the total annual electricity generated matches the total annual demand of the building.
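The sizing rule above (annual generation equal to annual demand) reduces to a single division once a specific yield is assumed. The sketch below is a back-of-the-envelope illustration, not a substitute for the SAM simulation: the function name, the demand figures, and the specific yield are hypothetical, and only the 17% demand reduction is taken from the reported results.

```python
def pv_size_kw(annual_demand_kwh, specific_yield_kwh_per_kw):
    """PV capacity (kW) whose annual generation equals the annual demand.

    specific_yield_kwh_per_kw is the annual energy produced per installed kW;
    its value depends on location and system losses and is assumed here.
    """
    return annual_demand_kwh / specific_yield_kwh_per_kw


ASSUMED_YIELD = 1700.0            # kWh per kW per year, placeholder value
base_demand = 1_000_000.0         # kWh per year, placeholder value
green_demand = base_demand * (1 - 0.17)   # 17% lower demand, as reported

base_pv_kw = pv_size_kw(base_demand, ASSUMED_YIELD)
green_pv_kw = pv_size_kw(green_demand, ASSUMED_YIELD)
size_ratio = green_pv_kw / base_pv_kw
```

Under this simple rule the required PV capacity scales linearly with annual demand; battery sizing and cost calculations are outside the scope of the sketch.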

| Energy consumption before the parametric variation
The energy estimates of the two developed models are presented in Table 4, including the values for each energy end-use and the differences between the two models. The results indicate that the green-certified building performs better in terms of energy consumption than the conventional building, with a 17% reduction in "Total end-use" and reductions in most of the building's components, especially the cooling load, which is reduced by 19%. It is also noted that fans and pumps have the lowest energy consumption in both models. In addition, plug-loads and comfort metrics have approximately the same levels for both models, indicating no notable changes after implementing Estidama's specifications.
Moreover, Figure 3 illustrates the breakdown of energy by end-use, showing that the cooling load has the highest share of energy consumption among the different end-uses for both the conventional and the green buildings with a percentage around half of the building's total energy consumption. Overall, the distribution is similar for the two buildings except for lighting, which shows a reduced share of the load in the green-certified building due to the low-energy specifications that are adopted in the building (Refer to Table 1 for more information about the differences in the characteristics of the two buildings). In contrast, there were no specifications for low-energy plug-loads installations, hence, the increase in the share of this energy end-use compared to the others.  Figure 4 presents the energy results of the 10 000 runs of each of the models. The x-axis represents the total energy consumption in kWh/year, and the y-axis shows the frequency of occurrence for each bin. The first observation from the figure is that the green-certified model estimates mostly appeared on the left side of the graph, implying lower energy use levels from those of the base case. Therefore, under the simulated uncertainty in model parameters, the energy estimates of the building postcertification were lower than those of the base case model. This observation is consistent with the energy levels of the models discussed in the previous section. The second observation from Figure 4 is that while the results of the two models followed a bell-shaped (normal) distribution, the curve of the green-certified model is narrower, indicating a lower sensitivity to uncertainty in input parameters. Put differently, while the two models were exposed to the same parametric variation in their inputs, the green-certified one showed a higher resiliency to such change. 
This can be attributed to the higher efficiency of various systems of the green-certified building, which exhibit lower variations in their performance when the inputs to those systems are varied.

FIGURE 3 Breakdown of energy consumption by end-use

FIGURE 4 Probability distribution of total end-use (kWh) for both models

| Uncertainty analysis
The final observation from the figure is that the two curves overlap in a small area in the middle. This finding means that for a few cases of the 10 000 runs for each model, the energy estimates of the green-certified model are higher than those of the base case. Given the stochastic approach followed in the parametric variation, some combinations of parameters might lead to extreme instances in which the green-certified model is underperforming while the base case model is over-performing, causing the latter to be more efficient than the former. Figure 5 sheds light on such extreme cases by showing the input parameters' values of the 10 worst-performing green-certified scenarios and 10 best-performing base case scenarios. The goal of the figure is to determine which varied input parameters contributed to the observed performance gap, where the green-certified model underperformed the base case model. Whenever clear patterns are observed in the values taken by each group of scenarios, "partitional clusters" are manually added to the figure to visually discern such grouping effects. 62 Five of the 11 varied input parameters exhibited clustering in the values of the two models. Starting with the lighting and plug-load input parameters, the green-certified scenarios had consistently higher values than the base scenarios, as highlighted in Clusters A-D in Figure 5. An even stronger clustering effect is observed for the cooling setpoint during occupied hours (Cluster E), where all green-certified scenarios had temperatures at the extreme low end of the spectrum. It can be concluded that the 10 worst-performing green-certified scenarios were characterized by low cooling setpoints during occupancy hours and high percentage usage of lighting and plug-loads. In contrast, the other parameters in Figure 5, which are mostly design-focused, did not exhibit discernible trends in their data. 
Such parameters appear to have a lower influence on the performance gap of the green-certified model.
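The screening of extreme cases described above can be illustrated with a minimal Python sketch that ranks Monte Carlo runs by energy use, keeps the 10 worst green-certified and 10 best base case scenarios, and compares their average inputs. The energy function `toy_energy_kwh`, its coefficients, the sampled ranges, and the way the 17% efficiency factor is applied are all hypothetical stand-ins chosen for illustration; they are not the building energy model or the parameter distributions used in the study.

```python
import random

def toy_energy_kwh(lighting_frac, plug_frac, cooling_setpoint_c, efficiency_gain):
    """Hypothetical stand-in for the building energy model (assumed coefficients)."""
    cooling = 4000.0 * (1.0 + 0.06 * (24.0 - cooling_setpoint_c))
    lighting = 1500.0 * lighting_frac
    plugs = 1500.0 * plug_frac
    # Assumption: green features improve cooling and lighting, not plug-loads.
    return (cooling + lighting) * (1.0 - efficiency_gain) + plugs

def sample_runs(n_runs, efficiency_gain, rng):
    """Draw random operation inputs (assumed ranges) and evaluate the toy model."""
    runs = []
    for _ in range(n_runs):
        inputs = {
            "lighting_frac": rng.uniform(0.5, 1.0),
            "plug_frac": rng.uniform(0.5, 1.0),
            "cooling_setpoint_c": rng.uniform(20.0, 26.0),
        }
        energy = toy_energy_kwh(efficiency_gain=efficiency_gain, **inputs)
        runs.append({**inputs, "energy_kwh": energy})
    return runs

def mean_inputs(runs):
    """Average each input parameter over a group of scenarios."""
    keys = [k for k in runs[0] if k != "energy_kwh"]
    return {k: sum(r[k] for r in runs) / len(runs) for k in keys}

rng = random.Random(42)
base_runs = sample_runs(10_000, efficiency_gain=0.0, rng=rng)
green_runs = sample_runs(10_000, efficiency_gain=0.17, rng=rng)

# 10 highest-consuming green-certified runs and 10 lowest-consuming base runs
worst_green = sorted(green_runs, key=lambda r: r["energy_kwh"], reverse=True)[:10]
best_base = sorted(base_runs, key=lambda r: r["energy_kwh"])[:10]

worst_green_means = mean_inputs(worst_green)
best_base_means = mean_inputs(best_base)
for name in worst_green_means:
    print(f"{name}: worst green = {worst_green_means[name]:.2f}, "
          f"best base = {best_base_means[name]:.2f}")
```

In this toy setting, the worst green-certified runs gather at low cooling setpoints and high lighting and plug-load usage, which is the kind of grouping that the clusters in Figure 5 are meant to expose.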
The results for the peak total end-use are shown in Figure 6 and follow similar patterns to those of the total energy consumption in Figure 4. Namely, the green-certified model shows lower peak loads than the base model. In addition, the curves for the two models follow normal distributions with approximately similar bell shapes, indicating no apparent difference in the sensitivity of the two buildings to the uncertainty in the building's parameters. However, the same extreme cases were observed for the peak loads, where the two curves overlap in a small area in the middle in which the green-certified model underperforms while the base model overperforms; this is the result of the same parameter combinations discussed earlier.

Figure 7 presents the results of the runs by end-use using box-whisker plots, each of which represents the outputs of the 10 000 runs. Such a representation provides a way to compare the spread of each end-use between the two models, providing further insights on how sensitive the model outputs are to variations in their inputs. Each box-whisker plot consists of five values: the upper whisker, lower whisker, upper quartile (75th percentile), lower quartile (25th percentile), and median (50th percentile). For example, for the total end-use values of the base model, the median (50th percentile) is 7063 kWh, which is the middle value of the 10 000 points represented in the figure. The 75th percentile is 7368 kWh, about 4% above the median. The 25th percentile is 6766 kWh, about 4% below the median. The interquartile range (IQR), which is the difference between the 75th and 25th percentiles and therefore includes 50% of the data, is also shown in the figure. This range is used to obtain the upper- and lower-whisker values that are at a ±1.5 IQR distance from the median.

FIGURE 5 Input parameters for the 10 worst-performing green-certified scenarios and 10 best-performing base case scenarios
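As a worked illustration of the box-whisker quantities, the short Python sketch below recomputes the IQR, the quartile offsets, and the whisker limits from the three base model values quoted in the text. The function name `box_summary` is an assumption made for this example. The whisker limits follow the description given here (1.5 IQR from the median); many plotting tools instead measure the 1.5 IQR distance from the quartiles, so the result may differ from what a given library draws.

```python
def box_summary(q1, median, q3, whisker_factor=1.5):
    """Summarize a box-whisker plot from its quartiles (illustrative helper)."""
    iqr = q3 - q1
    return {
        "iqr": iqr,
        # Relative offsets of the quartiles from the median, in percent
        "q3_vs_median_pct": 100.0 * (q3 - median) / median,
        "q1_vs_median_pct": 100.0 * (median - q1) / median,
        # Whisker limits measured from the median, as described in the text
        "upper_whisker": median + whisker_factor * iqr,
        "lower_whisker": median - whisker_factor * iqr,
    }

# Base model, total end-use (kWh), values quoted in the text
summary = box_summary(q1=6766.0, median=7063.0, q3=7368.0)
for key, value in summary.items():
    print(f"{key}: {value:.1f}")
```

With the rounded percentiles, the IQR comes out at 602 kWh, and both quartiles sit a little over 4% away from the median.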
In general, the total end-use (kWh), peak total end-use (W), cooling load (kWh), peak cooling load (W), lighting load (kWh), and pumps load (kWh) of the green-certified model are significantly lower than those of the base case model, as shown by the lower levels of their box-whisker plots. Other end-uses, such as plug-loads (kWh) and fans load (kWh), do not show statistically significant differences in their values. That is mainly due to the lack of specifications in Estidama for systems more energy-efficient than the ones already adopted in the base case model (refer to Table 1). The above results confirm that the postcertification model outperforms the base case model in most of the end-uses, especially the ones with high contributions to total energy use, such as cooling and lighting.
In parallel, it can also be seen from Figure 7 that the ranges of uncertainty for various end-uses differ between the models. Overall, the boxes and whiskers of the postcertification graphs are smaller than those of the base case model, which is confirmed by their smaller IQR values. For example, for the total end-use, the IQR for the base model is 601.6 kWh, while it is only 452.8 kWh for the green-certified model. This indicates that the systems of the latter experienced a smaller variation in energy consumption when subjected to the parametric variation simulated in the 10 000 runs of the model. The findings confirm that the resilience of the green-certified model is consistent across most of its energy end-uses.

Figure 8 presents the results of two metrics related to the indoor environmental conditions of the buildings and occupants' thermal comfort levels. The first is the "Time Setpoint Not Met During Occupied Hours," which represents the number of hours per year where the HVAC system was unable to maintain the indoor temperature at the specified setpoint setting (ie, 22°C during occupied hours and 24°C during unoccupied hours). The second metric is the "Time Not Comfortable Based on Simple ASHRAE 55-2004," which is the sum of the number of hours where at least one building zone fails to satisfy ASHRAE's comfort criteria. 63 Overall, the results of both models are quite comparable, and no definite conclusions can be drawn on which model performs best. On the one hand, the green-certified model shows, on average, a lower amount of time where setpoints are not met during occupied hours (left side of Figure 8). On the other hand, it shows a slightly higher variability than the base case model in terms of the time that occupants are expected to be uncomfortable (right side of Figure 8).
An important trend to notice in both models is the high number of hours for the "Time Not Comfortable Based on Simple ASHRAE 55-2004," which exceeds 2000 hours per year for all scenarios (right side of Figure 8). This is in large part due to the cooling setpoints used in the buildings, particularly 22°C for occupied hours (refer to Table 1), while ASHRAE 55's thermal-neutrality zone is mostly centered around the 24-25°C range. 63 Although a cooling setpoint of 22°C is often considered low in most countries, it is common practice in the UAE, as documented by multiple studies. 9,24,48,64 Nonetheless, the comparative nature of the analysis conducted on the results of Figure 8 helped mitigate the above-mentioned limitation, mainly by focusing on the relative differences between the two models rather than their absolute values.

FIGURE 6 Probability distribution of Peak total end-use (W) for both models
To confirm the above observations and patterns statistically, Table 5 summarizes the results of the t-tests and F-tests conducted on the differences in the means and variances observed between the models for the different outputs (ie, energy-related and comfort-related metrics). As shown in the table, all differences showed P-values lower than .05, hence rejecting the null hypothesis and accepting the alternative that the differences are statistically significant at the 95% confidence level. As for the directions of the differences, the analysis of the means confirms substantial improvements for all measured metrics except equipment and discomfort according to ASHRAE 55-2004. The differences in the variances, which are also all statistically significant, confirm the reduction in variability (ie, spread) for the total energy consumed by the building as well as the total peak loads. This result confirms, at the 95% confidence level, that the green-certified model shows a higher resilience than the base case model when subjected to uncertainty in input parameters. The end-uses that were not targeted by Estidama specifications (eg, plug-loads, fans, and pumps) showed higher variability. While such patterns are not desired, they do not compromise the importance of the overall results, as the postcertification model showed lower averages and variances in total (and peak) building energy consumption levels. It can, therefore, be concluded that the green-certified model outperforms the base case model by showing lower energy use levels, which are also less sensitive to changes in the models' inputs.
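The two kinds of tests summarized in Table 5 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure used in the study: the two samples are synthetic normal draws with hypothetical means and standard deviations, the t-test is written in its unequal-variance (Welch) form, and its P-value uses a normal approximation that is reasonable only because each sample has 10 000 points. The F-test is reduced to its statistic, the ratio of sample variances; its P-value would come from an F distribution with 9999 and 9999 degrees of freedom, which is not in the Python standard library.

```python
import math
import random
from statistics import NormalDist, mean, variance

def welch_t(sample_a, sample_b):
    """Two-sided test on the difference in means (normal approximation for large n)."""
    n_a, n_b = len(sample_a), len(sample_b)
    std_err = math.sqrt(variance(sample_a) / n_a + variance(sample_b) / n_b)
    t_stat = (mean(sample_a) - mean(sample_b)) / std_err
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(t_stat)))
    return t_stat, p_value

def variance_ratio(sample_a, sample_b):
    """F statistic: ratio of the two sample variances."""
    return variance(sample_a) / variance(sample_b)

# Hypothetical stand-ins for the 10 000 total end-use results of each model
rng = random.Random(0)
base = [rng.gauss(7000.0, 450.0) for _ in range(10_000)]
green = [rng.gauss(5800.0, 340.0) for _ in range(10_000)]

t_stat, p_value = welch_t(base, green)
f_stat = variance_ratio(base, green)
print(f"t = {t_stat:.1f}, p = {p_value:.3g}")
print(f"F = {f_stat:.2f} (values above 1 mean the base case is more variable)")
```

A difference in means speaks to energy savings, while a variance ratio above 1 speaks to the lower spread that the text interprets as higher resilience.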

| Solar PV techno-economic analysis
The estimated PV sizes for the base case and green-certified buildings are 945 kW and 800 kW, respectively, as shown in Table 6. These sizes account for about 4.4% losses in the conversion from DC to AC. Battery size in kWh and discharge rate in kWh per hour are two important design parameters that determine the duration of discharge and the allowable power that can be drawn at any given time. Overall, the values in Table 6 highlight the smaller size of the PV-battery system that is needed for the building postcertification due to its lower overall energy demand. Table 7 provides more details on the techno-economic potential of the PV-battery system, generated by the SAM models. Both systems show promising performance levels, as highlighted by the important savings on electricity bills, negative net present values (NPVs), and good battery efficiency rates. The latter is reflected in the utilization and performance values of the storage system, with less than 10% losses between charging/discharging modes. 65 The levelized cost of electricity (LCOE) is lower for the green-certified building but not significantly different, indicating a rather comparable performance to the base case building. Finally, a clear advantage of the green-certified building is the net capital cost of the system, which is 16% lower than for the base case building. The observed difference is a direct consequence of the smaller PV-battery system size for the building postcertification due to its reduced demand for energy.
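The sizing relationships mentioned above can be made concrete with a small Python sketch. The PV sizes are the ones quoted in the text; the battery capacity and discharge rate are hypothetical placeholders, since the values of Table 6 are not reproduced here, and the function names are assumptions made for this example.

```python
def relative_reduction_pct(base_value, new_value):
    """Percentage reduction of new_value relative to base_value."""
    return 100.0 * (base_value - new_value) / base_value

def discharge_duration_h(capacity_kwh, discharge_rate_kw):
    """Hours a full battery can sustain a constant discharge rate."""
    return capacity_kwh / discharge_rate_kw

# PV sizes quoted in the text (kW)
pv_reduction = relative_reduction_pct(base_value=945.0, new_value=800.0)

# Hypothetical battery values, for illustration only (not from Table 6)
duration = discharge_duration_h(capacity_kwh=2000.0, discharge_rate_kw=500.0)

print(f"PV size reduction: {pv_reduction:.1f}%")
print(f"Discharge duration: {duration:.1f} h")
```

The PV size reduction of about 15% is of a similar magnitude to the 16% reduction in net capital cost reported for the green-certified building.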

| DISCUSSION
This section reflects on and answers the three research questions presented in the "Introduction" using the results observed in the previous section.

| Research question #1: How much energy savings are expected after adopting the green building certification specifications?
As observed in the comparative results obtained before and after parametric variation, the green-certified model consumed 17% less than the base case model in terms of total energy and peak loads, including a 35% reduction in lighting loads. The observed savings are within the range of those of previous studies summarized in MacNaughton; 66 the savings, however, vary significantly between studies depending on building types, locations, and certification mechanisms. It is important to note that the answer to the above-stated question holds as long as the design specifications of both buildings are maintained, which is an assumption that engineers and modelers often make when simulating a [...]

FIGURE 8 Variability in thermal comfort metrics under uncertainty

[...] few instances, a low-performing rated building can perform worse than a good-performing nonrated one. While such extreme cases are not necessarily frequent, the findings confirm that a poorly operated green building can exhibit energy consumption levels that are comparable to or even worse than those of similar noncertified buildings. Such cases were also observed and documented in previous studies. 19,20 This motivates the need for continual monitoring of green buildings to ensure that their various systems are performing up to their design specifications. This step is essential to ensure that the investments in green buildings achieve the expected returns in terms of both energy and monetary savings.

| Research question #3: Overall, is the building postcertification less sensitive to uncertainty in its parameters compared to the precertification stage (ie, more robust and resilient)?
The answer to this research question is "yes." As shown in the histograms and box plots of the uncertainty analysis results, while both the base case and green-certified buildings were subjected to the same parametric variation experiment, the energy estimates of the latter showed lower variations than those of the former. The differences were all significant at the 95% confidence level. The findings indicate increased robustness and resilience of the building postcertification when faced with uncertainty in its technical and operational parameters. This observation confirms that, in addition to its expected energy-saving benefits, adopting Estidama specifications reduces the potential variability in building energy demand. The results are consistent with those of previous studies by Karjalainen 32 and Buso et al. 43 While not explicitly investigating the performance of green-certified buildings, the authors observed similar instances of robust design strategies effectively reducing the sensitivity of buildings to uncertainty in operation patterns.

| Research question #4: Is the techno-economic potential of using solar PV panels higher after adopting the green building certification specifications?
The answer to this question is "no," except for capital costs. As shown in the simulation results from the SAM model, a comparative cost analysis between standalone PV/battery systems for the base case and green-certified buildings does not show significant differences in terms of the LCOE values. However, the reduced energy demand of the green-certified building reduced the needed PV and battery sizes. Consequently, another important financial parameter, the estimated capital investment, yielded important savings (16%) for the green-certified building.

| CONCLUSION
Green building rating systems have gained significant interest in recent years as potential solutions to the high and growing demands for energy in the building sector. In parallel, a growing number of studies highlight the presence of a performance gap between the predicted and actual energy use levels of buildings, including ones with the highest levels of green building certifications. Acknowledging this gap, this study proposed a modeling framework to assess the performance of any green building in terms of its energy-saving potential as well as the robustness of its performance under uncertainty.
The contributions of this work are significant both in terms of the proposed methods and the results of the case study. To our knowledge, the proposed framework is unique in its focus on green-certified buildings. It helps bridge the gap between research efforts on green building certification and their performance gap (on the one hand) and uncertainty analysis and robust design (on the other). The framework was also developed and presented in a generic manner to ease its applicability to other buildings and certification schemes.
The framework was illustrated through a case study of a typical office building in Abu Dhabi, which was simulated pre- and post-adoption of the local green building code, Estidama. The findings confirm that Estidama can play a crucial role in reducing the energy intensity of commercial buildings in the UAE, as documented by the lower energy consumption values observed when Estidama features were adopted. More importantly, the green-certified model showed a lower variability in its energy use levels when subjected to uncertainty in input parameters. This confirms that Estidama not only reduces the energy consumed in buildings but also increases their resilience and robustness to uncertainty in technical and operational parameters. Moreover, the techno-economic assessment of the potential for solar PV adoption indicates that the base case and green-certified buildings are comparable to each other in terms of the levelized cost of electricity. However, the latter has a clear advantage when it comes to the initial investment costs.
Overall, the findings of this study illustrate the role that green building design can play in reducing the energy intensity of the building sector and making it more resilient, hence less prone, to the performance gap commonly observed in buildings. 18,19,20 However, the results also show that green building design practices are not enough to ensure low-energy performance; efficient, or at least not wasteful, operation patterns are needed to guarantee performance. This was confirmed in the analysis of extreme cases where the benefits of Estidama green design features were negated in some instances by inefficient operation patterns of building systems. Such observations reconfirm that the performance gap risk is real and needs to be further studied and mitigated in buildings, including green-certified ones.
To conclude, green building rating systems often follow prescriptive paths to green building certification, overlooking sources of uncertainty that can compromise building performance. Additional research is needed to integrate specifications that directly aim to increase the resilience of buildings to uncertainty and mitigate performance risks. In the case of Estidama, simple climate-adapted features could include (i) strict ranges of thermostat setpoints to avoid the overcooling of spaces, or (ii) mandatory continual maintenance and commissioning of energy-intensive building systems (eg, chillers). Along the same lines, it is important to gather data and analyze the current stock of Estidama-rated buildings to benchmark their performance, reward overperforming buildings, and help improve the performance of underperforming ones. Such efforts can be integrated into the certification process through a postoccupancy certification step that complements and validates the "promised" savings made during design.