Software‐in‐the‐loop simulation for developing and testing carbon‐aware applications

The growing electricity demand of IT infrastructure has raised significant concerns about its carbon footprint. To mitigate the associated emissions of computing systems, current efforts therefore increasingly focus on aligning the power usage of software with the availability of clean energy. To operate, such carbon‐aware applications require visibility and control over relevant metrics and configurations of the energy system. However, research and development of novel energy system abstraction layers and interfaces remain difficult due to the scarcity of available testing environments: Real testbeds are expensive to build and maintain, while existing simulation testbeds are unable to interact with real computing systems. To provide a widely applicable approach for developing and testing carbon‐aware software, we propose a method for integrating real applications into a simulated energy system through software‐in‐the‐loop simulation. The integration offers an API for accessing the energy system, while continuously modeling the computing system's power demand within the simulation. Our system allows for the integration of physical as well as virtual compute nodes, and can help accelerate research on carbon‐aware computing systems in the future.


INTRODUCTION
The rapidly growing demand for computing power and the resulting increase in energy consumption have led to major discussions about the current and future impact of IT infrastructure on the environment. 1,2 In fact, a recent study 3 by the International Energy Agency estimates the global energy usage of data centers (excluding crypto mining) to be at 220-320 TWh in 2021, an increase of 10%-60% compared to 2015. Considering the ongoing expansion of cloud computing and the emergence of new computing paradigms, such as fog and edge computing, there is a pressing need for the industry to prioritize sustainable practices.
As further improvements in energy efficiency become increasingly challenging, 4 it is key to actively encourage the adoption of low-carbon energy to reduce the emissions of computing systems. However, renewable energy sources such as solar and wind are highly variable by nature, which requires computing infrastructure to adaptively adjust its resource usage to the availability of clean energy. While this new paradigm, often called carbon-aware computing, [5][6][7] is a promising idea in theory, today's energy systems are not designed to expose their underlying complexities to consumers: The unreliability of on-site renewable energy or the state of energy storage is hidden behind the simple abstraction of an unlimited and reliable power supply. 8 This means that carbon-aware applications, which by design require signals from the energy system, usually cannot access this information.
To enable carbon-aware applications and systems, we need to design new abstractions and interfaces that provide visibility and control over energy-related and carbon-related metrics. A prominent example is the recently proposed ecovisor, 9 which aims to virtualize the energy system: Similar to computer hardware virtualization, each application (which can be anything from a simple process to a set of virtual machines) gets assigned virtual resources like a carbon budget, a share of on-site produced energy, or a virtual battery. The application must then operate within these constraints, for example, by determining when virtual batteries should be charged using grid energy, or by setting power caps on servers and containers that limit their resource usage. Such virtual energy systems would allow applications to define their own abstractions for managing energy and carbon emissions based on their own requirements. 8
However, research and development on novel interfaces and abstraction layers, as well as carbon-aware applications themselves, remain complex due to the scarcity of available testing environments. For example, Souza et al. 9 implemented a hardware testbed to demonstrate the feasibility of their proposed ecovisor, which extends to a couple of microservers, each consuming up to 10 W under full CPU and GPU load. Despite this small scale, the testbed is intricately designed and features several expensive components, like a solar array emulator for approximately $10,000. While hardware testbeds are essential for testing applications under real-world conditions, purchasing, building, and operating such systems is only feasible for a few institutions. Furthermore, especially during early stages of research and development, hardware testbeds can impede rapid ideation and experimentation due to their high maintenance costs in terms of both time and money.

Contributions
To provide a widely applicable approach for developing and testing carbon-aware applications and systems, we are working on a versatile co-simulation testbed, called Vessim. 10 Vessim enables cheap, configurable, and reproducible experiments by integrating arbitrary simulators related to energy systems, such as for power production, energy storage, or power flow analysis. In this paper, we focus on the challenge of integrating real applications and compute infrastructure into such an energy system simulation through software-in-the-loop (SiL). Hence, the main contributions of this work are:
• We present an approach that enables real applications to exercise visibility and control over an energy system simulation. This is achieved through a message broker that (i) provides realistically behaving APIs for applications while (ii) always ensuring that the underlying real-time simulation executes reliably.
• We present an approach that periodically measures or models the power demand of the computing system under test, which can consist of physical as well as virtual nodes, and communicates this information to the energy system simulation.
• We implemented these approaches in Vessim and demonstrate their usefulness by executing and analyzing an exemplary scenario. The scenario consists of a physical node (Raspberry Pi) and a virtual node (cloud instance) that are managed in a carbon-aware manner based on on-site solar power, battery state of charge, and grid carbon intensity.*
* Carbon intensity (gCO2/kWh) describes the grams of CO2-equivalent greenhouse gases emitted per kilowatt-hour of consumed energy. There are different methods for calculating the carbon intensity of electric grids (such as average vs. marginal), which should be chosen carefully by the testbed user depending on the desired type of carbon reporting/accounting. From a simulation perspective, it makes no difference which metric is in use.

Outline
Section 2 provides background and related work on SiL simulation in the context of computing and energy systems. Section 3 defines the assumptions, goals, and challenges of this work. Section 4 proposes an approach for enabling visibility and control over the energy system simulation to applications. Section 5 explains how we measure and model the computing system's power demand within the simulation. Section 6 presents our prototype. Section 7 defines an exemplary scenario and analyzes the execution of two different experiments, where one involves carbon-aware decisions. Section 8 concludes the paper.

STATE OF THE ART
2][13][14] However, in many cases the installation and operation of physical testbeds is impractical, unsafe, or simply too expensive. In these cases, SiL simulation can be an effective alternative, where real software is executed in a partially simulated domain environment. SiL simulation can offer a high degree of realism while significantly reducing costs, speeding up development cycles, and improving the configurability and reproducibility of experiments. 15
In this section, we present an overview of the current state of the art in SiL simulation for computing and energy systems. We highlight the need for an integrated simulation testbed that bridges these two domains, and demonstrate how and where our approach can effectively address this gap.

SiL simulation for computing systems
In computing systems research, co-simulation and emulation became especially popular in the context of novel computing domains such as the Internet of Things (IoT), edge, and fog computing, where the behavior of applications often depends on external factors like mobility. 16 By testing real software in simulated domain environments, researchers and developers can reproducibly test edge cases at a low cost 17 and detect system-level defects early on, significantly reducing the costs of troubleshooting in later stages. For example, Mockfog 18 is a framework for testing and benchmarking fog applications in the cloud by emulating the behavior of the fog infrastructure. Other tools like Fogify 19 and Héctor 20 even enable the combined testing of physical and emulated devices.
Although there exist various energy-aware simulators, [21][22][23][24] almost no tools with SiL capabilities cover the modeling of power usage. 11 Furthermore, to the best of our knowledge, none of them can accurately model the energy system of computing environments and provide software-defined access to it. A main reason for this is that there are few tools for the testing of computing systems that are easily extended to integrate with custom domain environments, like an energy system. Beilharz et al. 17 identified the need for a more comprehensive framework, especially for testing IoT applications. Their prototype, called Marvis, supports the co-simulation of arbitrary domain environments by letting users integrate domain-specific simulators. For instance, the current version of Marvis allows the execution of real applications interacting with traffic and network simulations. A similar philosophy is found in tools like Eclipse Mosaic, formerly VSimRTI, 16 which started off as a simulator for vehicular communication systems but evolved into a flexible co-simulation framework with SiL possibilities. Using our approach, tools like Marvis or Eclipse Mosaic can directly integrate energy system simulations to facilitate research on carbon-aware applications.

SiL simulation for energy systems
SiL and hardware-in-the-loop (HiL) testing are highly common in energy systems development. 25 For example, companies like Typhoon HIL 26 or dSPACE 27 offer dedicated solutions for testing virtual controllers and software components in partially simulated environments. However, these solutions are targeted at testing power electronics and other energy technologies, and not the interaction with the powered target system, such as a data center. Similarly, existing solutions for the testing of grid-connected equipment in real-time grid simulations, for example by the National Renewable Energy Laboratory (NREL) 28 in the United States, prioritize addressing electrical engineering questions.
FIGURE 1 Required interfaces between the real computing system and the energy system simulation.
Hagenmeyer et al. 29 investigate the interplay of different forms of energy on various value chains in Energy Lab 2.0. The focus is on finding novel concepts to stabilize the volatile energy supply of renewables through the use of storage systems and the application of information and communication technology. The smart energy system simulation and control center is a key element of Energy Lab 2.0 and consists of three parts: a power-HiL experimental field, an energy grid simulation and analysis laboratory, and a control, monitoring, and visualization center. While this smart energy system simulation is similar to the one in our approach, their control center is the only entity with software-based control over the energy system. Our integration, however, opens the possibility to define entirely new abstractions and interfaces to energy systems and enables the testing of arbitrary carbon-aware applications.
Mosaik 30 is a co-simulation framework for large-scale smart grid scenarios that allows for the integration of arbitrary simulation models. Based on SimPy, it allows executing simulations in real time, thereby enabling SiL or HiL testing. Mosaik is therefore a great candidate for implementing energy system simulations that must interact with real computing systems. It is a core component of Vessim, 10 the energy system simulator on which we implemented the SiL approaches presented in this paper.

INTEGRATING COMPUTING SYSTEMS WITH AN ENERGY SYSTEM SIMULATION
We want to enable the interaction of real computing systems with an energy system simulation. To make the approach applicable to a wide range of scenarios, computing systems can consist of physical as well as virtual nodes. Furthermore, in this work we adopt a very broad definition of what an application can be, ranging from simple processes to containers to virtual machines. We leave this decision to the testbed user.
Since our focus lies on the SiL integration and not the actual modeling of the energy system, we assume the availability of an energy system simulator like Vessim 10: a discrete-event simulation that replicates all relevant aspects of a real energy system. Discrete-event simulation is a method for modeling and analyzing the behavior of complex systems over time by describing them as a series of sequential events. The simulation progresses independently of the simulated time, processing event by event. Each event can change the state of the system and also trigger new events in the future. The underlying complexity of the simulation and, thus, the degree of realism can be determined by the user and depends on the purpose of the intended experiments. For example, our assumed energy system simulation can describe anything from a simple module that walks over solar production CSV files to compare power production and consumption, up to very sophisticated co-simulation frameworks 30 consisting of state-of-the-art power flow solvers like MATPOWER 31 and realistic battery simulators like PyBaMM. 32 For simplicity, in this paper we assume that the energy system is fully simulated. Applying our approach to a simulation with HiL is possible, but outside the scope of this paper.
Figure 1 describes the two required interfaces that must be established to allow full integration of a real computing system into an energy system simulation. First, we want to provide visibility and control over the simulation to applications. Visibility means that authorized applications should be able to access some of the energy system's internal state, such as the amount of on-site produced renewable energy, battery state of charge, or the current carbon intensity of power from the public grid. Control means that authorized applications should be able to actively configure properties of the (virtual) energy system, for example by instructing it to charge batteries through grid power, as well as of the computing system, for example by applying power caps to certain nodes or containers.
Second, the energy system simulation has to be informed about the power demand of the computing system, which is especially challenging if the computing system under test is fully virtualized. Besides reporting the demand of the computing system as a whole, which is important to correctly determine power flows, the energy system should also have access to more granular information such as the power usage of individual nodes or applications. This information can be exposed by an ecovisor to enable carbon-aware scaling or shifting of workloads.
We cover three main challenges within this paper:
Realistic interfaces: We need to provide a realistically behaving API for visibility and control of the energy system. While the exact modeling and level of realism of the energy system simulation depends on the requirements of the testbed user and is not covered in this work, the realism of interfaces is within our scope. This means, for example, that any changes made to the energy system configuration should be accurately and promptly reflected in the simulation.
Real-time simulation: Real-time simulation describes the execution of a discrete-event simulation synchronous to the wall-clock time, which is an obvious requirement when interacting with real-world applications. However, when performing real-time simulations, we need to ensure that every simulated step takes less time to compute than the wall-clock time it represents. As we want to provide programmatic access to the energy system, we need to guarantee fast response times without overloading the simulation, which can fall behind schedule when processing large numbers of requests.

Measuring and modeling power demand: To correctly calculate power flows within the energy system and to provide detailed power usage information of nodes and applications within the computing system, we need to properly measure and model the power demand of these components. Depending on the type of system under test and the requirements of the user, this can encompass real power measurements using external hardware, estimates of hardware power usage, or purely mathematical power modeling based on metrics like CPU usage.
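The real-time requirement named among the challenges above can be illustrated with a minimal pacing loop. This is only a sketch under the assumption of fixed-length steps; the function name `run_real_time` and its parameters are hypothetical and not part of Vessim or Mosaik. Each step is given a wall-clock deadline, and steps that finish late are recorded so that a testbed user can detect when the simulation falls behind schedule.

```python
import time


def run_real_time(step_fn, step_seconds, num_steps,
                  clock=time.monotonic, sleep=time.sleep):
    """Run step_fn once per step_seconds of wall-clock time.

    Returns the indices of all steps that finished after their deadline.
    clock and sleep are injectable so the loop can be tested without waiting.
    """
    start = clock()
    overruns = []
    for i in range(num_steps):
        step_fn(i)  # compute one simulation step
        deadline = start + (i + 1) * step_seconds
        remaining = deadline - clock()
        if remaining < 0:
            overruns.append(i)  # step took longer than its time slot
        else:
            sleep(remaining)  # wait until the wall clock catches up
    return overruns
```

An empty return value indicates that every step was computed within its time slot, which is the condition required for a reliable real-time simulation.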

PROVIDING VISIBILITY AND CONTROL OVER THE ENERGY SYSTEM SIMULATION
For applications to interact with the energy system simulation in a realistic but safe manner, we introduce a message broker which manages the communication flow between the real and simulated systems. Figure 2 describes how the message broker serves as the interface between the energy system simulation and applications. It exposes two REST APIs: The application-facing API (left side of Figure 2) can be freely configured by users according to their needs. Any application that wants to interact with the energy system has to implement this interface. The simulation-facing API (right side of Figure 2) is used to pass requests originating from applications to the energy system simulation. This API is fixed and has to be called periodically by the energy system simulation. This section describes the two APIs and information flow in detail.

Application-facing API
The message broker exposes a REST API to grant applications access to energy system-related information and configuration options. Users can freely configure the application-facing API, like the names and types of endpoints as well as their underlying logic. Users can furthermore configure authentication and authorization in the API to specify which applications can access certain endpoints. Hence, applications that interact with the API can be distributed across different computing systems and operated by different users as long as they are granted the necessary access rights. 9][20] This can be achieved through the use of tools such as the Linux traffic control (tc) or more advanced network emulators like NetEm.
Souza et al. 9 describe a possible API for ecovisors in their paper, which exposes various endpoints for getting information like solar power production, battery state of charge, grid carbon intensity, or the current power usage of containers. On the other hand, the API also exposes endpoints for setting configuration options like the charge rate of an application's virtual battery or power caps for specific containers. Table 1 shows a reduced version of this API that we implemented in our current prototype.
Notes (Table 1): PUT is the equivalent HTTP request method for the SET requests described in this paper. Depending on the possibilities of the underlying computing system and energy system simulation, a full API may consist of more or different endpoints, see the original Ecovisor 9 paper.
TABLE 2 Configurations, mean power consumption, and mean events/s (sysbench 52) of the two nodes at different power modes. Note: The physical node's power usage is controlled via DVFS, while the virtual node uses rate limiting on the executed process.
The goal of our SiL integration is to make interactions with this user-defined API as realistic as possible while ensuring correct execution of the energy system simulation. The main objective of the message broker is to protect the discrete-event simulation from request congestion, which can (i) cause it to fall behind schedule in the real-time simulation and (ii) lead to unpredictable response times. Hence, to prevent GET or SET requests by applications from interfering with the simulation execution, the message broker contains an internal database. This database comprises a key-value store that always contains an up-to-date version of relevant state within the simulation, such as the current state of charge of the virtualized battery or the latest forecast for renewable energy availability. Additionally, it can be equipped with a time-series database for time-based queries, such as "how much energy did application X consume in the last 20 minutes." In particular, the database
• acts as a cache for GET requests: For example, in-memory databases like Redis are extremely fast and scalable, and commonly used for request caching.As long as the energy system simulation takes care of keeping them up-to-date, they can reply to all GET requests without ever disrupting the simulation.
• acts as a buffer for SET requests: Instead of forwarding and executing SET requests directly, they can first be stored in the key-value store.The simulation periodically collects pending SET requests and can therefore, for example, reduce the number of energy system operations by aggregating or discarding obsolete requests that have already been overridden.
Moreover, the message broker can offer read/write atomicity and horizontal scalability, and it helps decouple the simulation model from the concrete requirements of applications under test by providing a unified interface. Example endpoints are listed in Section 6, where we present our current prototype.
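The cache and buffer roles of the database can be sketched in a few lines. The following is an illustrative in-memory stand-in, not the prototype's implementation: the class `BrokerStore`, its method names, and the key names used in the test are hypothetical, and a lock-protected dictionary takes the place of a real key-value store such as Redis. GET requests are answered from the cached state, SET requests are only buffered, and the simulation later collects them with obsolete (overridden) requests already discarded.

```python
import threading


class BrokerStore:
    """In-memory stand-in for the broker's key-value store (assumption)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._state = {}    # cached simulation state, served to GET requests
        self._pending = []  # buffered SET requests in arrival order

    # --- application-facing side ---
    def get(self, key):
        """Answer a GET request from the cache without touching the simulation."""
        with self._lock:
            return self._state.get(key)

    def set(self, key, value):
        """Buffer a SET request instead of executing it directly."""
        with self._lock:
            self._pending.append((key, value))

    # --- simulation-facing side ---
    def collect_pending(self):
        """Return and clear buffered SET requests; later values override earlier ones."""
        with self._lock:
            pending, self._pending = self._pending, []
        latest = {}
        for key, value in pending:
            latest[key] = value
        return latest

    def push_state(self, state):
        """Update the cache with the current simulation state."""
        with self._lock:
            self._state.update(state)
```

In this sketch, the simulation is the only party that ever calls `collect_pending` and `push_state`, so the rate at which applications issue requests has no influence on how often the simulation is interrupted.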

Simulation-facing API
While applications can send GET and SET requests at any time, the energy system simulation only interacts with the message broker at discrete simulation steps. Depending on the concrete implementation of the energy system simulation, these steps can occur either at fixed or variable intervals. To accommodate this, the message broker exposes a second REST API, which is only accessible by the simulation. This API exposes two simple endpoints: (a) an endpoint for collecting any open SET requests and (b) an endpoint for pushing the current simulation state to the message broker's database. The underlying energy system simulator has to implement this interface by executing the following steps on each simulation step:
1. Collect all pending SET requests from the message broker.
2. Execute energy system-related SET requests directly within the simulation.
3. Execute computing system-related SET requests by translating and forwarding them to the hypervisor/container orchestrator in use.
4. Measure the current simulation state, such as state of batteries, energy production and demand, as well as forecasts, and update this information on the message broker.
SET requests related to the energy system typically target the operation mode of virtualized batteries, as most other factors such as on-site power production are determined by the environment. Common battery settings include configuring a minimum state of charge to maintain a backup reserve or configuring a specific charge rate to charge the battery from the public grid during periods of clean energy. Modern battery management systems, like Tesla Powerwalls, 33 can also directly expose high-level APIs for configuring their behavior. To avoid unnecessary load on the energy system simulation, if there are multiple requests regarding the same resource, users should define a logic for aggregating these requests before handing them to the responsible simulation module. For example, multiple requests for updating the (dis)charge rate of virtual batteries should be aggregated into a single (dis)charge rate before informing the battery simulation module.
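One possible aggregation logic for the (dis)charge rate example is sketched below. It rests on two assumptions that are not prescribed by the approach: each application owns one virtual battery, so only its most recent request counts, and the physical battery simulation receives the sum over all virtual batteries. The function name and the sign convention (positive for charging, negative for discharging) are hypothetical.

```python
def aggregate_charge_rate(requests):
    """Aggregate (application_id, charge_rate_watts) requests into one rate.

    requests must be given in arrival order. For each application only the
    latest request is kept; the kept rates are then summed.
    """
    latest = {}
    for app_id, rate in requests:
        latest[app_id] = rate  # newer request overrides the older one
    return sum(latest.values())
```

Other policies, such as clipping the sum to the battery's maximum (dis)charge rate, can be added in the same place.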
Examples of SET requests related to the computing system include specifying power limits for particular nodes or applications. For instance, users could manually configure the dynamic voltage and frequency scaling (DVFS) mode on a device to throttle its CPU power usage during times of high carbon intensity. As shown in recent works, 34,35 power capping can also be performed on an application or container level. However, if the underlying computing infrastructure is not managed by the user but rented from a public cloud provider, capping CPU utilization reduces neither costs nor the emissions reported by cloud providers like Google Cloud Platform, 36 Microsoft Azure, 37 or AWS. 38 Therefore, power capping is currently not a sensible strategy for carbon-aware operation of public cloud infrastructure.
Any energy system simulator that implements this procedure can make use of our proposed SiL integration.
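The four-step procedure of this section can be sketched as a single function that a simulator calls once per simulation step. All names are hypothetical, and the convention that computing system-related keys start with `compute.` is an assumption made only for this illustration; how requests are classified is left to the testbed user.

```python
def simulation_step(collect_pending, push_state, energy_system, compute_control):
    """Execute the simulator-side procedure for one simulation step.

    collect_pending: callable returning a dict of pending SET requests
    push_state:      callable accepting the current simulation state (dict)
    energy_system:   object with apply(key, value) and state() methods
    compute_control: callable(key, value) forwarding a request to the
                     hypervisor/container orchestrator in use
    """
    pending = collect_pending()                  # step 1: collect SET requests
    for key, value in pending.items():
        if key.startswith("compute."):           # assumed naming convention
            compute_control(key, value)          # step 3: forward to computing system
        else:
            energy_system.apply(key, value)      # step 2: execute in the simulation
    push_state(energy_system.state())            # step 4: publish current state
```

Because the function only depends on two callables for the broker, it works equally with HTTP calls against the simulation-facing API or with an in-process store during testing.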

SIMULATING THE COMPUTING SYSTEM'S POWER DEMAND
The second part of the SiL integration is concerned with correctly modeling the computing system's power usage within the energy system simulation. In our approach, a computing system can consist of one or more nodes, each of which is attached to an individual power meter. Power meters are responsible for periodically measuring and/or modeling the node's power usage and are explained in detail in Section 5.2. Nodes can either be real devices or virtual infrastructure like containers, virtual machines, or cloud instances. This enables the integration of actual production systems as well as testing frameworks such as fog computing emulators like Mockfog, 18 Fogify 19 or Héctor. 20
Figure 3 describes the interaction of power meters and their corresponding computing system: To model a computing system's power demand within the simulation, computing system adapters are responsible for periodically collecting and aggregating the power usage measurements of all power meters. For example, in power system simulators like PyPSA 39 or MATPOWER 31 a computing system adapter would represent a certain load. If the scenario under test requires the modeling of multiple computing systems, each must be represented by a separate adapter. Examples of this are geo-distributed scenarios that require the explicit modeling of energy availability at different sites.
Power meters, which are usually installed directly on the physical or virtual host, can either expose their measurements to the computing system adapter directly (e.g., through a webserver or MQTT) or make use of proper monitoring systems such as Prometheus to store and query power measurements. The different types of power meters and their operation are described in more detail in the following subsections. Although the power demand of the entire computing system (Section 5.1) is the only information required to correctly model power flows within the simulation, we are additionally interested in reporting the power usage of individual nodes (Section 5.2) and applications (Section 5.3). This information can be exposed by the energy system to create transparency and allow for energy-aware and carbon-aware scheduling decisions.
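The relationship between power meters and a computing system adapter can be sketched as follows. The class and method names are hypothetical, and each power meter is reduced to a callable that returns the node's current power demand in watts; in practice, such a callable could wrap an HTTP request, an MQTT subscription, or a Prometheus query.

```python
class ComputingSystemAdapter:
    """Collects and aggregates the measurements of node-specific power meters."""

    def __init__(self, power_meters):
        # mapping: node name -> callable returning the node's power in watts
        self.power_meters = power_meters

    def node_power(self):
        """Power demand per node, e.g. for reporting through an ecovisor-like API."""
        return {name: meter() for name, meter in self.power_meters.items()}

    def total_power(self):
        """Power demand of the whole computing system, i.e. one load in the simulation."""
        return sum(self.node_power().values())
```

A geo-distributed scenario would instantiate one such adapter per site, each with its own set of meters.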

FIGURE 3 The computing system adapter collects and aggregates the individual measurements of all node-specific power meters.

Power metering for the computing system
The most straightforward way to determine the power demand of the computing system is to simply aggregate the current power demand of all its nodes. Yet, users can also decide to additionally model overheads such as cooling systems, uninterruptible power supply, or lighting. For example, a commonly used metric for describing the energy efficiency of data centers is called power usage effectiveness (PUE), which is defined as
$$\mathrm{PUE} = \frac{\text{energy consumed by the entire data center}}{\text{energy consumed only by IT equipment}}.$$
Consequently, if a data center has a PUE of 1.55, the average reported value in 2022, 40 it uses an additional 55% of the energy required for operating the IT equipment. Hence, a simple way to include these inefficiencies in the model is by multiplying the sum over all node power demands with the computing system's PUE. However, by simplifying the underlying dynamics of the entire facility including its cooling system to a single scalar, the PUE has frequently drawn critique as it "only conveys an understanding of the minimum possible energy use." 41 As it is usually measured when infrastructure is at close to 100% utilization, it is a nonideal simplification for research on carbon-aware computing systems, where we often assume significant variability in utilization, which is required for load shifting. If possible, users should therefore try to model (or even better measure) the actual power demand of their computing system depending on its utilization as well as external factors such as temperature and operating time. As the energy system simulation can constitute a co-simulation framework, this can also be done by integrating domain-specific simulators that, for instance, model the cooling system in high detail.
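The simple PUE-based overhead model amounts to one multiplication. The sketch below uses a hypothetical function name and treats the PUE as a constant, which is exactly the simplification criticized above.

```python
def facility_power(node_powers_watts, pue=1.0):
    """Scale the summed node power demand by a constant PUE.

    A PUE below 1.0 is physically meaningless and therefore rejected.
    """
    if pue < 1.0:
        raise ValueError("PUE must be at least 1.0")
    return pue * sum(node_powers_watts)
```

For two nodes drawing 100 W each and a PUE of 1.55, the modeled facility demand is approximately 310 W, of which 110 W are overhead.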
If the entire computing system is available in real hardware, users can also directly measure the overall demand instead of aggregating the demand of individual components. As with all parts of the system, the exact functionality to be implemented depends on the needs of the testbed user.

Power metering for individual nodes
For each node in the computing system, users have to configure a power meter that measures and communicates the node's current power demand, as well as other relevant metrics like resource utilization of applications, to the computing system adapter on request. The implementation of the power meter highly depends on the type of the node.
Physical nodes: Although many devices provide functionality for estimating their current power usage, like Intel's Running Average Power Limit (RAPL) technology, most devices cannot accurately measure power consumption using only software. Therefore, if applications under test are deployed on physical nodes like single-board computers, it is recommended to use dedicated hardware for measuring the power usage of these devices. For example, in our experimental testbed we monitor the current and voltage of a Raspberry Pi 4B with a USB-to-USB measuring device equipped with an INA219 DC current sensor.

Virtual nodes: To adequately test applications running in virtualized environments like virtual machines, containers, or cloud instances, it is necessary to (i) establish one or more metrics that are considered significant indicators of the node's power consumption (such as CPU usage), and (ii) develop a power model that can convert these metrics into power usage values. These power models can either be based on real benchmarks like SPECpower 42 or mathematical models. For example, in our prototype we implemented a power meter for AWS cloud instances which periodically queries the instance's CPU utilization and converts it to power usage estimates using a linear power model. There also exist more comprehensive software-based power models that additionally take other consumers like GPU or memory into account. 43,44
If measuring/modeling a node's power demand takes considerable time, for example due to network latencies, the power meter must provide for asynchronous polling and caching of this information to ensure that the real-time simulation is not delayed.
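A linear power model combined with asynchronous polling and caching can be sketched as follows. This is an illustration under stated assumptions rather than the prototype's code: the idle and maximum power values are parameters that a user would have to obtain from benchmarks or vendor data, `fetch_utilization` is a hypothetical callable standing in for a possibly slow query (e.g., to a cloud monitoring service), and all class and method names are invented for this example.

```python
import threading


def linear_power(cpu_utilization, p_idle_watts, p_max_watts):
    """Linear model: idle power plus utilization-proportional dynamic power."""
    u = min(max(cpu_utilization, 0.0), 1.0)  # clamp utilization to [0, 1]
    return p_idle_watts + u * (p_max_watts - p_idle_watts)


class CachedPowerMeter:
    """Polls utilization in a background thread; read() never blocks on the query."""

    def __init__(self, fetch_utilization, p_idle_watts, p_max_watts,
                 interval_seconds=1.0):
        self._fetch = fetch_utilization
        self._p_idle = p_idle_watts
        self._p_max = p_max_watts
        self._interval = interval_seconds
        self._latest = p_idle_watts  # assume idle until the first poll succeeds
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def poll_once(self):
        """Perform one (possibly slow) query and cache the resulting estimate."""
        power = linear_power(self._fetch(), self._p_idle, self._p_max)
        with self._lock:
            self._latest = power

    def _run(self):
        while not self._stop.is_set():
            self.poll_once()
            self._stop.wait(self._interval)  # sleep, but wake up early on stop()

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

    def read(self):
        """Return the most recently cached estimate in watts."""
        with self._lock:
            return self._latest
```

The simulation only ever calls `read()`, so a slow or temporarily unreachable metrics source delays the freshness of the estimate but not the simulation step itself.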

Power metering for individual applications
There exist a variety of techniques for attributing hardware power consumption to applications using power models, like SmartWatts, 45 BitWatts, 46 or WattsKit. 47 The field receives a lot of attention to this day, as detailed information on application power usage is essential when performing energy-aware scheduling and placement decisions, for example when optimizing for efficient VM consolidation. The use of power models is unavoidable, as there is no way to attach a physical power meter to software. Since we adopt a very broad definition of what an application can be, the method used for application power metering also has to be implemented by the user. It depends on various factors, like the type of application, its resource usage characteristics (e.g., whether it uses GPUs) as well as the host's operating system and its capabilities. While most applications can be viewed as black-box systems for scheduling decisions, very fine-grained scheduling on, for example, heterogeneous hardware might require more complex modeling and benchmarking.
Lastly, not all use cases require the explicit modeling of application power usage.This is especially the case when not testing virtualized applications on powerful hardware but embedded systems that often only run only one energy-hungry process at a time.In these systems, load shaping is likely rather performed on a device level, for example, through DVFS.

PROTOTYPE IMPLEMENTATION
We implemented a first prototype of the proposed system using Vessim, 10 including power meters for physical and virtual nodes.
Energy system simulation: Since the energy system simulation itself is not the main focus of this work, we chose a relatively simple model: Our energy system consists of (i) a solar power simulation that iterates over solar irradiance measurements and converts these values to solar power production, (ii) a simulator that exposes the current grid carbon intensity, (iii) a simple battery simulator, and (iv) a computing system adapter as described in Section 5. We furthermore extended Vessim to implement the required interface to the message broker, to periodically push the internal simulation state and to query pending SET requests. All components within Vessim interact through the Mosaik 30 co-simulation framework.
Message broker: Our message broker uses FastAPI 48 to implement the API described in Table 1 and utilizes a Uvicorn 49 web server to handle access by applications and the energy system simulation. The API server is connected to a Redis 50 database, which is a high-speed, in-memory, key-value data store with additional support for time series data structures. However, Redis could be replaced with another data store such as MongoDB or etcd, which can be useful when integrating the simulation into a production-grade cloud environment like Kubernetes or OpenStack. The API server and database run in separate Docker containers that can be duplicated for redundancy and horizontal scalability.

Physical power meter: We implemented an exemplary power meter for physical nodes using a USB-to-USB measuring device equipped with an INA219 DC current sensor. The sensor is attached to a Raspberry Pi 4B, which runs an agent that locally performs current and voltage measurements. The agent exposes these measurements through a simple REST API to the power meter, which runs within the energy system simulation as part of the computing system adapter.

Virtual power meter: We implemented an exemplary power meter for virtual nodes that works for AWS EC2 instances. The power meter periodically queries the cloud instance's CPU utilization $u$ via AWS CloudWatch and applies a linear power model $P = P_{\text{static}} + u \cdot (P_{\text{max}} - P_{\text{static}})$, where $P_{\text{max}}$ describes the node's power usage when fully utilized and $P_{\text{static}}$ the power usage in idle state.
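As a minimal sketch, the linear power model can be written as a small function. The function name and the wattage values below are illustrative assumptions and not the parameters used in the prototype; the utilization is assumed to be given as a fraction between 0 and 1.

```python
def linear_power(u: float, p_static: float, p_max: float) -> float:
    """Linear power model: P(u) = P_static + u * (P_max - P_static).

    u is the CPU utilization as a fraction in [0, 1];
    p_static and p_max are the idle and full-load power in watts.
    """
    if not 0.0 <= u <= 1.0:
        raise ValueError("utilization must be a fraction in [0, 1]")
    return p_static + u * (p_max - p_static)


# Hypothetical node parameters, chosen only for illustration.
P_STATIC_W = 4.0
P_MAX_W = 8.0

for utilization in (0.0, 0.5, 1.0):
    power = linear_power(utilization, P_STATIC_W, P_MAX_W)
    print(f"u = {utilization:.1f} -> {power:.1f} W")
```

With these assumed parameters, an idle node is modeled at 4 W, a half-utilized node at 6 W, and a fully utilized node at 8 W.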

EXAMPLE SCENARIO
To demonstrate how our integration enables real applications and devices to integrate with an energy system simulation, we conducted two experiments on an exemplary scenario. This scenario is based on real data and a realistic system configuration, but was deliberately cherry-picked to showcase impactful carbon-aware decisions and their effect on the experiment.

Setup
Our exemplary computing system comprises a physical node (a Raspberry Pi 4B with 4 cores) and a virtual node (an AWS EC2 t2.micro instance with a single core) connected to the power meters presented above. The virtual node's power model is based on AWS EC2 power usage estimates. 51 For simplicity, when determining the computing system's power demand, we do not define a PUE or complex cooling system model, but only aggregate its nodes' individual power usage. Both nodes are running a sysbench 52 stress test on all available cores over the course of two days.
The energy system simulation entails a battery of 32,000 mAh capacity which is initially charged to 60% and has a C-rate of 1/5 (meaning that a full charge takes 5 h). A simulated solar panel of 0.2 m² produces power based on a dataset of solar irradiance measurements conducted in June 2020 in Berlin. Lastly, we simulate the average grid carbon intensity in Germany during the same time period using a public dataset. 7 In this scenario, our forecasting API provides perfect predictions for future solar production and grid carbon intensity.
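To make the battery parameters concrete, the following is a minimal sketch of a battery whose charge and discharge power is limited by its C-rate. It is not the battery simulator of the prototype: the class name `SimpleBattery` is hypothetical, losses are ignored, and the nominal voltage of 5 V used to convert 32,000 mAh into watt-hours is an assumption, since the text does not state a voltage.

```python
from dataclasses import dataclass


@dataclass
class SimpleBattery:
    """Idealized battery without losses (illustrative sketch only)."""

    capacity_wh: float     # energy capacity in Wh
    soc: float = 0.6       # state of charge as a fraction in [0, 1]
    c_rate: float = 1 / 5  # max. (dis)charge power per hour, relative to capacity

    def update(self, power_w: float, duration_s: float) -> float:
        """Charge (power_w > 0) or discharge (power_w < 0) for duration_s seconds.

        Returns the part of power_w in watts that the battery could not
        absorb or deliver, either due to the C-rate or because it is full/empty.
        """
        max_power_w = self.c_rate * self.capacity_wh
        limited_w = max(-max_power_w, min(max_power_w, power_w))
        hours = duration_s / 3600
        stored_wh = self.soc * self.capacity_wh
        new_wh = max(0.0, min(self.capacity_wh, stored_wh + limited_w * hours))
        self.soc = new_wh / self.capacity_wh
        accepted_w = (new_wh - stored_wh) / hours
        return power_w - accepted_w


CAPACITY_AH = 32.0        # 32,000 mAh as stated in the setup
NOMINAL_VOLTAGE_V = 5.0   # ASSUMPTION: not given in the text
battery = SimpleBattery(capacity_wh=CAPACITY_AH * NOMINAL_VOLTAGE_V)

# A C-rate of 1/5 limits the charge current to 32 Ah / 5 h = 6.4 A.
print(f"max. charge current: {CAPACITY_AH * battery.c_rate:.1f} A")
```

Independent of the assumed voltage, charging an empty battery of this kind at the maximum rate takes five hours, which matches the C-rate of 1/5.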
We conducted two experiments: The baseline experiment does not perform any requests on the energy system's API, but is otherwise fully integrated into the energy system simulation. If there is not sufficient solar power generation to meet the demand, the energy system will first try to draw energy from the battery. However, it is not allowed to discharge the battery below 60%, which serves as a safety buffer. If the battery reaches this threshold, the energy system will draw energy from the public grid. Whenever more energy is produced than consumed, the energy system will charge the battery. Once the battery is full, any additional energy will be curtailed or fed back to the public grid. The carbon-aware experiment operates under the exact same rules and constraints, but additionally entails a carbon-aware control unit, which periodically performs GET requests on the energy system and, under certain conditions, sends SET requests:
• SET requests related to the energy system: The control unit adaptively adjusts the battery's minimum state of charge and grid charge rate over time. In particular, in case of promising forecasts for solar power production or low carbon intensity, it is able to temporarily deplete the battery to 30%. If the battery state of charge is below 60% and the carbon intensity is below 250 gCO2/kWh, the battery charges at maximum speed from the public grid. To allow for a fair comparison, both experiments start and end with the same state of charge.
• SET requests related to the computing system: The control unit performs temporal workload shifting by adapting the physical node's CPU frequency via DVFS and by limiting CPU usage on the virtual node. Power modes are explained in Table 2: While the first experiment always uses the normal mode, the second experiment switches to the power-saving mode if the battery is below 70% state of charge and the carbon intensity is above 250 gCO2/kWh. Similarly, it switches to high performance mode if the battery is charged above 80% or if the carbon intensity falls below 200 gCO2/kWh. In all other cases, it also uses the normal mode. We manually chose these values so that both experiments perform roughly the same amount of work (i.e., sysbench 52 events) throughout the 2 days.
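The rules of both experiments can be summarized in a short sketch. The thresholds are taken from the description above; all function and type names are hypothetical, and the dispatch function is a simplification that ignores the battery's C-rate, conversion losses, and the forecast-based adjustment of the minimum state of charge.

```python
from enum import Enum
from typing import Tuple


class PowerMode(Enum):
    POWER_SAVING = "power-saving"
    NORMAL = "normal"
    HIGH_PERFORMANCE = "high performance"


def select_power_mode(soc: float, carbon_intensity: float) -> PowerMode:
    """Mode selection of the carbon-aware control unit.

    soc is the battery state of charge as a fraction in [0, 1],
    carbon_intensity the grid carbon intensity in gCO2/kWh.
    """
    if soc < 0.70 and carbon_intensity > 250:
        return PowerMode.POWER_SAVING
    if soc > 0.80 or carbon_intensity < 200:
        return PowerMode.HIGH_PERFORMANCE
    return PowerMode.NORMAL


def charge_from_grid(soc: float, carbon_intensity: float) -> bool:
    """Whether the battery should be charged from the public grid."""
    return soc < 0.60 and carbon_intensity < 250


def dispatch(solar_w: float, demand_w: float, stored_wh: float,
             capacity_wh: float, min_soc: float,
             hours: float) -> Tuple[float, float, float]:
    """One step of the dispatch order: solar first, then battery, then grid.

    Returns (new stored energy in Wh, energy drawn from the grid in Wh,
    excess energy in Wh that is curtailed or fed back to the grid).
    """
    delta_wh = (solar_w - demand_w) * hours
    if delta_wh >= 0:
        # Surplus: charge the battery until it is full, the rest is excess.
        charged_wh = min(delta_wh, capacity_wh - stored_wh)
        return stored_wh + charged_wh, 0.0, delta_wh - charged_wh
    # Deficit: discharge down to the minimum state of charge, then use the grid.
    available_wh = max(0.0, stored_wh - min_soc * capacity_wh)
    discharged_wh = min(-delta_wh, available_wh)
    return stored_wh - discharged_wh, -delta_wh - discharged_wh, 0.0
```

In this sketch, the baseline corresponds to calling `dispatch` with a fixed `min_soc` of 0.6 and never changing the power mode, while the carbon-aware experiment would additionally lower `min_soc` to 0.3 when forecasts are favorable and apply `select_power_mode` and `charge_from_grid` in every control interval.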

Experiment analysis
The results of both experiments are presented in Figure 4. Although the carbon-aware experiment uses 2.4% more energy than the baseline (because not all power modes have the same energy efficiency), its associated carbon emissions through grid power consumption are 35.9% lower. In the following, we briefly analyze the two experiments to demonstrate how our integration enables research and development of carbon-aware applications.
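One plausible way to account for emissions through grid power consumption is to multiply the energy drawn from the grid in each time step by the grid carbon intensity at that time and sum the results. The sketch below assumes this accounting and a fixed step length; the function name and the sample values are purely illustrative and are not measurements from the experiments.

```python
from typing import Sequence


def grid_emissions_g(grid_power_w: Sequence[float],
                     carbon_intensity_g_per_kwh: Sequence[float],
                     step_s: float) -> float:
    """Accumulated emissions in gCO2 for a series of equally long time steps."""
    if len(grid_power_w) != len(carbon_intensity_g_per_kwh):
        raise ValueError("both series must have the same length")
    hours = step_s / 3600
    return sum(
        (power_w / 1000) * hours * intensity  # kWh * gCO2/kWh = gCO2
        for power_w, intensity in zip(grid_power_w, carbon_intensity_g_per_kwh)
    )


# Illustrative values only: three hourly steps.
power = [1000.0, 0.0, 500.0]          # grid power draw in W
intensity = [300.0, 400.0, 200.0]     # grid carbon intensity in gCO2/kWh
print(grid_emissions_g(power, intensity, step_s=3600))
```

Under this accounting, an experiment can consume more total energy and still cause lower emissions, provided that its grid draw is concentrated in periods of low carbon intensity.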

Battery management
As stated above, both experiments start at the initially configured minimum state of charge of 60%. The baseline experiment immediately has to draw power from the grid to meet its demand at night. The carbon-aware experiment, however, first queries the latest solar production forecasts from the energy system. Since there is a very high probability of abundant solar power throughout the next day, the control unit permits the battery to temporarily discharge to 30%. This allows most of the power demand at night to be covered by the battery, which reduces the grid power usage until sunrise from 64 Wh to only 7 Wh. Furthermore, by anticipating a surplus of locally produced renewable energy, the battery in the carbon-aware experiment never gets charged to full capacity, which avoids curtailment. During the second night, the carbon-aware control unit again queries forecasts for the next day once the battery reaches 60% charge. Although solar forecasts are not very promising this time, it again permits the battery to discharge to 30%, as the carbon intensity during the next day is expected to be especially low. Instead of drawing carbon-intensive grid energy at night, the demand is thereby shifted to the next morning, when the battery is charged to 60% using cleaner energy.

Temporal load shifting
Besides improving the utilization of available energy storage, the carbon-aware control unit also aims to better align the power demand of the computing system with the availability of clean energy. By switching the power modes of the physical and virtual node based on carbon intensity and battery state of charge, we can observe that the nodes' power usage is usually reduced in the evenings or at night. Yet, the effect of load shifting in our experiment is comparably small, as the different power modes introduce inefficiencies. For example, the physical node requires 6% more energy per event when in high performance mode and almost 12% more when in power-saving mode. The virtual node, however, consumes 19% less energy per event in high performance mode and even 25% less in power-saving mode, where it is only utilized to 50%. This shows that the impact of load shaping heavily depends on the power proportionality 53 of the underlying hardware, and that it is not a reasonable measure per se.
In summary, our prototype enables the co-simulation of a real computing system and an energy system simulation. It offers applications the possibility to actively manipulate the energy system, as well as the power demand of other nodes or applications, and realistically reflects these changes within the simulation.

FIGURE 2 Through the message broker, applications obtain visibility and control over the energy system simulation.

FIGURE 4 Visualization of energy system-related metrics of both experiments over time: (i) power consumption and production within the energy system; (ii) required and surplus power; (iii) battery state of charge; (iv) grid power usage (after charging/discharging the battery) next to carbon intensity; (v) accumulated carbon emissions over the course of the experiment.