Exploratory Multicriteria Decision Analysis of Utility-Scale Battery Storage Technologies for Multiple Grid Services Based on Life-Cycle Approaches

Herein, a multicriteria decision-making analysis (MCDA) of eight different utility-scale battery storage technologies for four different application areas, involving 72 relevant stakeholders from industry and academia for criteria selection and weighting, is presented. The assessment is conducted for economic, environmental, technological, and social criteria using a combination of the analytic hierarchy process and the technique for order preference by similarity to ideal solution. It includes a full life-cycle costing and life-cycle assessment using current data. Indicative rankings show that most lithium-ion batteries can be recommended for all application areas. Lead-acid batteries achieve rather low scores depending on the application considered, but including a recycling scenario for this technology might lead to significant changes in final scores and rankings. This is also true for the redox flow battery. Furthermore, the weights provided by the stakeholders are widely dispersed, indicating a low consensus about the relevance of the criteria used. In particular, social criteria are not well differentiated in the current state and do not add significant distinguishing features between different battery technologies.


Introduction
The energy-transition process is characterized by increasingly fluctuating renewable energy system (RES) capacities, leading to a higher demand for flexibility options. [1][2][3][4] Energy storage systems, and among them especially battery energy storage systems (BESSs), are the most frequently installed flexibility options; they allow higher shares of RES to be integrated on multiple grid levels and thus enable a stable decarbonized electricity system. [5][6][7][8] Consequently, the global demand for stationary BESSs is predicted to increase from 2 GWh in 2015 to 47 858 GWh by 2050. [9] However, the choice of a certain storage technology depends on different and often competing requirements, which in turn depend on the targeted application field (short to midterm storage services) and the individual expectations related to technical, environmental, economic, social, and other aspects (e.g., high power, long life vs low cost, excellent safety, abuse resistance, environmental friendliness, etc.). As in most complex problems, there is no silver bullet available that meets all these goals. [10] The selection of a suitable utility-scale BESS is a complex decision problem wherein tradeoffs, different application areas, and multiple stakeholder interests have to be considered. Multicriteria decision-making analysis (MCDA) represents a way of integrating all these aspects, considering stakeholder preferences, and resolving possible tradeoffs between different aspects (e.g., technoeconomic vs environmental impacts) of a technology.
A few studies are available that aim at providing decision-making aid regarding BESS choice under sustainability aspects via MCDA methods. [11][12][13][14][15] These studies consider multiple assessment dimensions such as social and environmental impacts or technoeconomic performance, but often rely heavily on existing literature without considering the influence of varying performance requirements in different applications, e.g., different energy-to-power ratios (E/P). [5,16] In some cases, no application field requirements are defined at all, leading to context-free rankings. [15,17] In contrast, more technology-oriented studies that assess BESSs within different applications have a narrower scope and focus on the technoeconomic and/or environmental performance [5,16][18][19][20][21][22][23][24] but disregard wider aspects such as social acceptance, availability of regulations, or varying stakeholder interests. However, these "wider" aspects often also have a certain impact on the technoeconomics of energy technologies, e.g., missing acceptance or regulations might lead to delays that add to overall project cost. [25] At the same time, the environmental impact of a technology also strongly affects its acceptability. [26] Despite this, most existing works include only stakeholders with a strong technological perspective, and in only two cases is a more diversified picture of stakeholder preferences provided. [27] This is problematic as a high number of stakeholder interests have to be considered when it comes to the implementation of a utility-scale storage project. [28] In consequence, there is a lack of comprehensive assessments that consider not only different BESS technologies but also the individual target application for the BESSs, a wider set of sustainability aspects, a balanced set of stakeholders, and a consistent technology and MCDA model. [27]
This study aims at tackling these gaps by providing an explorative, highly interdisciplinary MCDA assessment of eight different electrochemical BESS technologies considering their economic, environmental, technological, and social performances. A life-cycle assessment (LCA) and life-cycle costing (LCC) are conducted for determining economic and environmental performance parameters, and a wide set of stakeholders is actively involved in the assessment, providing input for modeling and weighting of criteria.

MCDA Method
MCDA is widely applied in the field of energy planning. [29][30][31] Different methods are available for this purpose, all of which have their individual limits and advantages and are more or less adequate for different decision problems. [32][33][34][35][36] The MCDA approach used here (see Figure 1) is based on four steps, which are not followed strictly sequentially but are rather conducted in a parallel and iterative way [27]: 1) involvement of stakeholders through interviews and surveys for the problem definition, selection of criteria, and attribution of weights to the considered criteria; 2) use of the analytic hierarchy process (AHP) [37] for weighting based on the survey, including the measurement of consensus on the derived weights; 3) performance aggregation via the technique for order preference by similarity to ideal solution (TOPSIS) [38] using the AHP weights to calculate rankings; 4) performance measurement of the considered criteria for the different technologies within four different application fields using LCC and LCA.
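The aggregation step via TOPSIS can be illustrated with a minimal sketch. It assumes the common textbook variant (vector normalization, Euclidean distances); the exact calculation steps used in the study are given in its Supporting Information and may differ. The function name `topsis`, the example matrix, the weights, and the benefit/cost flags are all hypothetical and serve illustration only.

```python
import numpy as np

def topsis(matrix, weights, benefit):
    """Relative closeness of each alternative to the ideal solution.

    matrix  : alternatives x criteria performance values
    weights : criteria weights (e.g., from the AHP); rescaled to sum to 1
    benefit : True where a higher value is better, False where lower is better
    """
    X = np.asarray(matrix, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    benefit = np.asarray(benefit, dtype=bool)

    # 1) Vector-normalize each criterion column, then apply the weights.
    V = X / np.linalg.norm(X, axis=0) * w

    # 2) Ideal and anti-ideal solution per criterion.
    ideal = np.where(benefit, V.max(axis=0), V.min(axis=0))
    anti = np.where(benefit, V.min(axis=0), V.max(axis=0))

    # 3) Euclidean distance of every alternative to both reference points.
    d_pos = np.linalg.norm(V - ideal, axis=1)
    d_neg = np.linalg.norm(V - anti, axis=1)

    # 4) Closeness in [0, 1]; 1 means identical to the ideal solution.
    #    (Undefined if all alternatives are identical; not handled here.)
    return d_neg / (d_pos + d_neg)

# Hypothetical example: three alternatives, criteria = (cost per kWh,
# investment cost, efficiency). Cost criteria are minimized.
scores = topsis(
    matrix=[[0.20, 300.0, 0.90],
            [0.35, 250.0, 0.70],
            [0.25, 400.0, 0.80]],
    weights=[0.5, 0.3, 0.2],
    benefit=[False, False, True],
)
ranking = np.argsort(-scores)  # indices of alternatives, best first
```

An alternative that is best on every criterion obtains a closeness of 1, and one that is worst on every criterion obtains 0, which makes the scores easy to compare across application cases.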
The MCDA is based on a hybrid multiattribute decision-making model using a combination of AHP and TOPSIS. Some examples for the combination of these two methods can be found in the studies by Goh et al. and Zaidan et al. [39,40] Combining these two methods allows overcoming a major limitation of TOPSIS, namely the lack of a procedure for determining the importance (weights) of the considered criteria. The AHP provides such a weighting procedure but is less efficient in dealing with tangible attributes and with a large number of alternatives. [41] Consequently, weights from stakeholders are obtained here by the use of the AHP, while performance measurement (quantification of selected criteria) and weight aggregation are conducted with TOPSIS. Within the AHP, stakeholders attribute an individual preference to each criterion by pairwise comparisons on a scale from one (equal importance) to nine (extremely more important), which is seen as an intuitive way of elicitation. The pairwise comparisons are checked for consistency [32,42,43] using the geometric consistency index (GCI). More information about TOPSIS and related calculation steps can be found in previous studies [38][44][45][46] and the Supporting Information. Criteria selection and corresponding weighting are based on a broad set of relevant stakeholders from industry and academia active in this field in Germany and Austria (72 participants), who are classified into groups to obtain a more precise picture of weighting preferences among stakeholder groups. The robustness or power of group decision making depends on the degree to which attributed priorities are shared among actors within a certain group. [47] The degree of "sharedness" for AHP weights mirrors whether priorities expressed by individual group members are aligned with the group priorities or not. [48]
Here, the concept of diversity in biology and ecology is used, which allows deriving a consensus indicator S* that expresses the degree to which group preferences are shared on a continuum between 0% and 100%. [49] A consensus of 100% (totally equal preferences) indicates absolute agreement on priorities, and >75% a moderate one. Around 50% represents modest consensus, and <30% can be considered as very low. [49] A modest-to-low consensus serves as an indicator that an in-depth discussion regarding diverging preferences is required (see Supporting Information for details). The performance measurement (quantification) of the considered criteria is realized by a combination of different methods such as LCA, [50,51] LCC, [52] expert judgment, and literature review. LCA represents a standardized approach [44,45] that documents a product's or system's environmental impact over the complete life cycle (considering direct emissions and also upstream processes such as electricity production). In contrast, LCC allows comparing total expenditures related to the entire economic lifetime of a product (e.g., initial investment, replacement, and energy cost). Part of the LCA and LCC methodology builds on previously published work about the carbon footprint and life-cycle cost of different BESSs. [5] However, LCA and LCC models and related inputs are updated using the most recent data available. Uncertainties regarding input data for LCA and LCC are considered through a Monte Carlo simulation with n = 1000 in which key parameters are varied [5] (see Supporting Information). Details on the approach are provided in the following sections.
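The AHP weighting and the GCI consistency check described above can be sketched as follows. This is a minimal illustration that assumes the row geometric mean method for deriving priorities, with which the GCI is commonly paired; the text does not state which prioritization method the study uses. The function names and the example matrices are hypothetical.

```python
import numpy as np

def ahp_weights(A):
    """Priority vector from a pairwise comparison matrix (row geometric mean)."""
    A = np.asarray(A, dtype=float)
    g = np.exp(np.log(A).mean(axis=1))  # geometric mean of each row
    return g / g.sum()

def geometric_consistency_index(A):
    """GCI = 2 / ((n-1)(n-2)) * sum over i<j of ln^2(a_ij * w_j / w_i).

    Requires n >= 3. A perfectly consistent matrix yields 0; larger values
    indicate less consistent judgments.
    """
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    w = ahp_weights(A)
    # Log of the error term e_ij = a_ij * w_j / w_i for every pair.
    log_e = np.log(A) - np.log(w[:, None] / w[None, :])
    upper = np.triu_indices(n, k=1)  # each pair counted once
    return 2.0 / ((n - 1) * (n - 2)) * np.sum(log_e[upper] ** 2)

# Hypothetical judgments of one stakeholder on three criteria
# (1 = equal importance, 9 = extremely more important).
judgments = [[1.0, 3.0, 5.0],
             [1.0 / 3.0, 1.0, 3.0],
             [1.0 / 5.0, 1.0 / 3.0, 1.0]]
weights = ahp_weights(judgments)
gci = geometric_consistency_index(judgments)
```

In practice the resulting GCI would be compared against a threshold that depends on the matrix size, and datasets exceeding it would be excluded from the weighting.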

Stakeholder Input
Utility-scale BESSs offer various services related to generation, network, and demand at all voltage levels. [4,19,53] In consequence, they affect a high number of potential users and business areas distributed within the entire electricity system. Figure 2 shows an overview of involved stakeholder categories within the power system for utility-scale BESSs. [19] Identified stakeholders were involved through an online survey and semistructured interviews.
First, a pretest phase was conducted with the objective of including at least one representative of all stakeholder groups (34 experts contacted via individual e-mails). A further condition was to focus on principal investigators, higher management, and project leaders. In total, ten of these pretesters took part in semistructured interviews lasting between 20 and 120 min to provide in-depth feedback on the online survey. After some alterations, 106 persons were contacted directly and 30 e-mails were sent to relevant organizations (utility companies, non-governmental organizations [NGOs], battery storage manufacturers, etc.). In total, 72 experts finished the survey as shown in Figure 2, which also provides a brief description of the groups (69 of these are valid samples, of which 64 remain valid for the AHP after the GCI consistency check). More details about the process are provided in the Supporting Information and the study by Baumann. [54]

Analyzed Battery Storage Alternatives and Model Inputs
Four principal battery chemistries are investigated in the frame of the MCDA: Li-ion batteries (LiBs), lead-acid batteries (PbA), high-temperature batteries (sodium-nickel chloride [NaNiCl]), and vanadium redox flow batteries (VRFBs). LiBs are nowadays the most widely used battery type in a multitude of stationary applications. [5] Consequently, five different types of LiBs are included in the study: lithium-iron-phosphate (LFP), lithium-iron-phosphate/lithium titanate (LFP-LTO), lithium-manganese oxide (LMO), nickel-cobalt-aluminum oxide (NCA), and nickel-cobalt-manganese oxide (NMC). [5] The PbA battery is a valve-regulated lead-acid battery (VRLA) deep cycling system optimized for stationary applications and considered as the reference system. [55] This type of lead-acid BESS is the most mature electrochemical storage technology, used for a large number of power system applications such as local power quality, grid extension, and frequency stabilization. [56,57] For the VRFB, electrolytes are stored in external tanks, whereas a stack contains the electrodes where the electrochemical reaction takes place. [58,59] In contrast to other BESSs, this allows scaling the power and energy capacity of the VRFB independently. All the technoeconomic inputs use the most recent data from a comprehensive and continuously updated database for energy storage (the BattDB [60,61]). Older technoeconomic data (before 2015, going back no further than 2010) are used only where no newer data points are available for a technology. More information on the different BESS technologies and an overview of relevant input data is provided in the Supporting Information.

Considered Application Fields
Battery storage systems can provide a wide set of different grid services on a continuum from short-term (e.g., ancillary services and power quality management) to mid-term storage (e.g., renewable energy support, load leveling, and self-consumption) on every voltage level. [47,62] A proper definition of the application is a crucial factor for the choice and design of a suitable BESS. [27] The following representative utility-scale application fields are selected for the evaluation by summarizing data from refs. [19,24,57]: 1) Primary regulation (PR): a set of measures for the short-term reconciliation of supply and demand. 2) Energy time shift (ETS): also referred to as "arbitrage." Energy is stored during periods of low electricity market prices and discharged during times of high prices. [16] 3) Wind energy support (WES): energy is stored by wind park operators when producing excess electricity and dispatched during high demand times. [63] 4) Decentralized grid (DC): energy storage is used to increase the degree of self-consumption in a small town with 1000 inhabitants. The model is based on previously published work and was rescaled for this study [6] (see Supporting Information). An overview of the different application cases with different requirements on power, capacity, and cycles per day is shown in Table 1. It has to be mentioned that no taxes or grid fees are considered in the assessment as these aspects are highly dependent on the specific electricity market. The work of Schmidt et al. [24] provides an overview of the impact of different wholesale market prices on the LCC of BESSs.

Choice of Criteria for Performance Measurement
In general, the success of an MCDA depends strongly on how effectively the criteria used correspond to the problem and to the fulfillment of the decision objective. [70] The literature provides a large number of indicators for energy storage, which can be adapted and combined with regard to specific objectives. [27] The choice of proper criteria is based on a literature review and was an integral part of the survey and the semistructured interviews. As a result, the four dimensions of environment, economy, social aspects, and technology [27] are integrated into the assessment with 11 subcriteria. A summary of selected main and subcriteria and a brief overview of the performance measurement methods are shown in Table 2.
All selected criteria, their relevance, and the corresponding quantification methods are described in the following. Within these, the LCA and LCC approaches for the evaluation of environmental and economic aspects are described in detail due to the higher modeling complexity. Major changes in the initial criteria are also shown in Table 2. The criterion "socioeconomic performance" is removed from the final set of criteria based on stakeholder input (colored in gray).

LCC and Investment Cost Calculation
The LCC model is based on a previous publication [5] and is not explained in detail here. The annuity method is used to calculate the cost associated with every kilowatt hour converted within a BESS. The desired operation period for the entire BESS is assumed to be around 20 years for all applications. [16,71] A depreciation rate of 8% is considered for utilities for three applications (ETS, PR, and WES). The DC case is assumed to be conducted on a municipal utility level with a lower depreciation rate of 6%. [16] There is a crucial difference from the previous assessment, [5] where the system boundary included the entire business case (e.g., the entire energy flow including losses of the BESSs). Here, only the losses of the storage system are considered for the evaluation of the different BESSs. In addition, the decay of storage capacity is considered by oversizing the initial storage system by 30% (equivalent to an end-of-life capacity of 70%). Only the VRFB is assumed not to suffer from degradation. [72] A fixed minimum state of charge (SoC) is set for BESSs with lower cycle life to minimize cell exchange rates (increasing cyclic lifetime due to a lower depth of discharge). [5] A decreasing battery module price is considered for the case of replacements by applying different learning curves for each considered technology. Similarly, cyclic lifetime is considered to increase over time due to technological progress (see Supporting Information for details). The end-of-life phase is not directly considered, but a linear accounting depreciation is used instead, which is then credited to each BESS. [5] A BESS requires further infrastructure and auxiliaries such as a power conversion system (PCS) and balance of plant (BoP). The latter includes hardware components (thermal management and energy management systems) and soft components (construction cost, taxes, overhead, development cost, etc.) that can account for up to 50% of overall system cost. [73] Inverters, heating, ventilation, and air conditioning (HVAC), construction, and overhead cost are considered in the BoP cost calculations, with the detailed methodology provided in the Supporting Information.

Table 1. Overview of the use cases for the assessment. For abbreviations of the application cases, see earlier sections. [5] a) RWP = Random Walk Price model, see Annex I; b) Adapted to German market conditions, 34 small cycles with an average depth of discharge of 5%, equivalent to 1.7 full cycles per day; c) Levelized cost of energy for an onshore wind turbine with operation times of about 2000 h a⁻¹; d) Own optimization model, see Supporting Information for detailed information.

Table 2. Summary of used criteria for technology evaluation and related performance measurement with corresponding software; the criterion "socioeconomic performance" was removed from the final set of criteria as it was not possible to gather robust results (colored in gray).
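The annuity calculation and the Monte Carlo treatment of uncertain inputs can be sketched as follows. This is a strongly simplified illustration, not the LCC model of the study: it treats the stated 8% and 6% rates as the interest rate of a standard capital recovery factor, ignores replacements, learning curves, oversizing, and the credited residual value, and uses made-up input distributions. All function names and numerical inputs other than the rates, the 20 year period, and n = 1000 are hypothetical.

```python
import numpy as np

def annuity_factor(rate, years):
    """Capital recovery factor: spreads an investment over `years` at `rate`."""
    q = (1.0 + rate) ** years
    return rate * q / (q - 1.0)

def cost_per_kwh(invest, rate, years, annual_om, annual_energy_kwh):
    """Annualized investment plus yearly O&M, per kWh delivered per year."""
    annual_cost = invest * annuity_factor(rate, years) + annual_om
    return annual_cost / annual_energy_kwh

# Rates and operation period as stated in the text.
af_utility = annuity_factor(0.08, 20)    # approx. 0.1019
af_municipal = annuity_factor(0.06, 20)  # approx. 0.0872

# Monte Carlo with n = 1000 draws; distributions are hypothetical.
rng = np.random.default_rng(seed=0)
n = 1000
invest = rng.triangular(400_000.0, 500_000.0, 650_000.0, size=n)  # currency units
cycles_per_year = rng.uniform(250.0, 365.0, size=n)
annual_energy = 1000.0 * cycles_per_year  # assumes 1 MWh usable per full cycle

samples = cost_per_kwh(invest, 0.08, 20,
                       annual_om=10_000.0,
                       annual_energy_kwh=annual_energy)

# Median with 5% and 95% percentiles, as used for the whiskers in the results.
p5, median, p95 = np.percentile(samples, [5, 50, 95])
```

Reporting the median together with the 5% and 95% percentiles gives a compact view of how strongly uncertain inputs such as throughput affect the cost per kilowatt hour.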

LCA of Battery Storage Technologies
Providing environmental impact categories that can be understood and eventually be prioritized by all involved stakeholders and that allow a comprehensive assessment of potential environmental impacts is a challenging task. ReCiPe [74] is used as the life-cycle impact assessment method as it provides endpoint indicators (endpoints and single score) that consider many relevant potential impacts and allow the results to be communicated comprehensibly, also to nonexpert stakeholders. Detailed information about ReCiPe is provided in the Supporting Information.
The ReCiPe endpoints are summarized as follows [74,75]: damage to ecosystems (DE), damage to human health (DHH), and damage to resource availability (DRA).
The system boundary of the LCA is equivalent to that of the LCC (no end-of-life considered); the functional unit is 1 kWh stored and withdrawn from the grid or RES (depending on the application) and converted in the BESS. Only energy that is consumed based on the alternating current (AC)-AC round trip efficiency is attributed to the BESS. The life-cycle inventory (LCI) is based on the most recent literature, [76,77] adapted to the considered BESS. An "all in one" container solution is considered where all relevant components are situated within an adapted 40 ft International Organization for Standardization (ISO) shipping container (steel reinforcement, insulation, steps, lights, etc.). All LiB-based BESSs are rescaled to 26 kWh modules built out of standardized 18650 cells with corresponding rack sizes. [22] All components are then rescaled based on the gravimetric energy density of the reference system using a scaling factor (details are given in the Supporting Information). The VRFB system is also considered to be situated in a corresponding container containing tanks and stacks as modeled in a previous publication. [8] In line with that work, the power stack and the V2O5 electrolyte stored in tanks can be scaled against each other in a (simplified) linear way. The VRLA is based on inputs from literature, [55,78,79] wherein the module housing and racks are based on information taken from the study by Alotto et al. [58] When battery modules have to be exchanged due to insufficient lifetime (as described in the previous section), it is assumed that all modules are exchanged, again considering their increased performance (cycle lifetime). More information about the different LCIs used is provided in the Supporting Information.
An overview of all major components of the BESSs (mass balance) is shown in Figure 3. It is worth mentioning that the mass shares can vary strongly depending on the system design. For example, using 40 ft ISO container systems with around 4 MWh with the PCS being placed in an additional container system (also available on the market) would lead to considerably different component shares. [76] The detailed overview and description of the different component shares and the LCIs used for every application scenario is provided in the Supporting Information. The electricity withdrawn from the grid (ETS and PR) is modeled based on an LCI for the years 2015 and 2030, [80] using an average to cover the considered project lifetime of 20 years. The LCIs for the WES and DC cases are provided in the Supporting Information.

Technology Aspects
Numerous technological criteria are available for energy storage evaluation in the literature. [81][82][83] Here, the main criterion "technology aspects" is based on the subcriteria 1) performance, 2) maturity, and 3) flexibility. The first subcriterion "performance" includes the cycle and calendric life, charge/discharge efficiency, and energy and power density derived from the BattDB. [5,60,61] "Maturity" includes the globally installed capacity, [84] maturity (based on literature data [83,85,86]), and patent life-cycle stage. [34,87] The relevance of this subcriterion stems from the assumption that investment decisions are often in favor of established technologies. [34,88] The third subcriterion "flexibility" represents the ability of a technology to be built up without geographical and infrastructure-related restrictions, to provide a large number of different services, and to adapt to new market situations through modularity. The ability to provide different services is determined by a technology's response time, [89,90] which can vary depending on the technology from milliseconds up to several minutes or hours. Details on the performance measurement of the different criteria can be found in the study by Baumann [54] and the Supporting Information.

Figure 3. Overview of estimated shares in kilogram using an NMC stationary container-based system provided by Westlake et al. [77]

www.advancedsciencenews.com www.entechnol.de

Social Aspects
Social aspects represent a crucial factor for the success or failure of a distinctive technology. [91] The quantitative impact assessment and identification or measurement of these is difficult [92] and only a few studies regarding social aspects and their operationalization exist. [34] Here, two social criteria are included: 1) "technology acceptance," and 2) "regulation and policy aspects." Both factors have a highly qualitative character [32] and are based on expert judgments realized within the stakeholder surveys and interviews. The results presented here should thus be seen as purely indicative and are combined with results from the literature [93] for increasing their robustness. The indicator 1) "technology acceptance" represents the opinions of the local population related to energy systems (community acceptance). This criterion is relevant as the opinion of the population and of interest groups may profoundly influence the time needed to go ahead with and complete an energy-related project. [25] In general, the field of acceptance is highly complex, [27] and it would surpass the scope of this work to discuss it here. The indicator 2) "regulation and policy aspects" describes possible rules, specifications, policies, or laws affecting a particular actor group related to technology development, diffusion, and investment. The interviewed stakeholders rate the degree of available regulation (i.e., energy regulation/legislation and construction, environmental and immission laws) on a Likert scale.
These include frameworks related to recycling, water protection, and fire safety regulations. [93] More information about the different criteria and their definition is given in the Supporting Information and the study by Baumann. [54]

Results

Stakeholder Weights and Related Consensus
The AHP weights derived from all stakeholder inputs and the corresponding consistency values are shown in Figure 4 (the criterion socioeconomic value is shown here for the sake of completeness). Total weights for further assessment are calculated as median values (geometric and arithmetic mean values are also given for completeness). In total, 64 out of 72 datasets are consistent. A high relevance is attributed to economic and environmental aspects. Technology performance is ranked third and the least importance is attributed to social aspects (corresponding to previous findings [27]). Within the considered environmental criteria, human health shows a clear dominance, followed by damage to ecosystems and finally resource use. Social acceptance and socioeconomic value are perceived as equally important within the field of social aspects, whereas regulatory frames received the lowest priority in this main criteria category. Regarding technology aspects, the criteria maturity and technology flexibility are seen as highly relevant, whereas technology performance receives a slightly lower weight. Results for economic criteria clearly show a higher weight of LCC in relation to investment costs. The calculated consensus (0% none, 50% low, and 100% total consensus) can be considered as low for all main and subcriteria. This indicates a need for further research on how to achieve better alignment among the considered stakeholder groups to increase the robustness of the inquiry.
To better understand the low degree of consensus among the stakeholder groups, the different priorities/weights related to the main criteria of all considered groups are discussed in detail in the following. The weights are rescaled to fit into a 4-field matrix (see Figure 5).
This is realized by adding up preferences as vectors based on normalized values (e.g., equal importance is 0.5) for environmental versus economic preferences (y-axis) and social versus technology performance (x-axis).
Consequently, the position of the bubbles indicates the preference of a specific group (starting from 0.5 or "equal importance" based on the AHP scale), whereas the bubble size indicates the number of participants. The color of the bubbles indicates the consensus in percentage in a particular group. In addition, the degree of consensus, as well as the number of participants and type of groups, is indicated for each particular bubble. The preferences of all groups are highly dispersed, and the consensus is low in all cases. Exceptions to be named are the energy storage business, regulation, and municipal utilities. Naturally, the group public body and policy making has a consensus of 100% as only one participant took part. Here, a higher number of participants would provide a more representative picture.
Based on the visualization, two intermediate results can be derived for the weighting process via the AHP in the survey: 1) Very different group trends can be identified that share either a stronger environmental preference or stronger economic interests (environmental and social preference vs technological and economic performance) and that are not visible in the total weights shown in Figure 4. These distinct group weights are also characterized in most cases by a low consensus that might stem from a very different understanding of BESSs or of the criteria themselves. However, there are some exceptions with high consensus such as the energy storage business, regulation, and municipal utilities (the size of the latter two is very small). 2) Carrying out the survey online in an unguided way (i.e., without a discussion about ranking and possible open questions) bears the risk that stakeholders provide unreflected weights. It is hardly traceable whether all survey participants understood the given criteria and whether they are perceived in the same way by each of them. A different interpretation of the criteria might lead to very distinct weights.
Here more research effort is required to understand the implications for weighting. A comparison of a survey and workshop format to gather weights for the same MCDA project would be highly interesting for future research.

Economic Performance
The two quantified subcriteria LCC and investment cost, which serve as input for the MCDA, are shown in Figure 6 for the different BESSs under the four considered application cases. The energy-to-power ratios (E/P) are also indicated in the figure as these strongly affect initial investment costs (Figure 6, right) and consequently LCC (Figure 6, left). The investment costs contribute the highest share to the LCC for all analyzed BESSs. The operation and maintenance costs are characterized by the round trip efficiency of the BESS and have the highest share in the case of the VRFB and the VRLA. Replacement costs are highly dependent on the particular BESS technology and the application case due to the relevance of the cycle and calendric lifetime of each particular BESS. Especially the VRLA and the LiB-LMO are characterized by a comparably high exchange rate of battery modules in relation to other technologies. Uncertainties for LCC are comparable for WES, DC, and ETS. Only in the case of PR can a significantly higher uncertainty be observed. For this (power-related) application, the energy throughput is comparably low, leading to a higher impact of varying parameters related to the battery design itself (battery size, life-cycle time, and energy throughput).
Regarding the investment costs, the BoP and battery module cost contribute the highest share in all cases. In general, the BoP contributes a considerable share to the total costs and even surpasses cell costs. Especially the VRFB is very sensitive to changing E/P ratios, as shown in the case of PR with an E/P ratio of 1. In all cases, the LiB-LTO (a battery designed for high-power applications) has the highest investment cost among LiBs. The uncertainties related to the investment cost are considerably lower than those related to the LCC.

Figure 6. Left: LCC results for all considered technologies and application cases including major cost shares for replacement, initial investment; Right: total cost per kilowatt hour (includes oversizing due to efficiency losses and capacity decay over time). The indicated whiskers represent the 5% and 95% percentiles.

For ETS and WES, LiB, NaNiCl, and VRFB provide comparable LCC results. VRLA and the LiB-LTO share the last two ranks. Results for PR differ slightly from the other cases; here, the VRFB scores lowest. LiBs, except the LiB-LTO, again dominate the DC application, followed by NaNiCl. The case of DC is calculated for an E/P of 3.33, which is unsuitable for the VRFB (results can be very different for a higher E/P ratio in this application). The VRLA and the VRFB both have comparably low efficiencies, which in combination with high electricity cost (e.g., from photovoltaic (PV) and small wind turbines) makes them lose ground to other BESSs for WES and DC.

Environmental Impacts
The three ReCiPe endpoint indicators 1) DE, 2) DHH, and 3) DRA for all BESSs and the different application cases are shown in Figure 7. As in the case of the LCC, median results are provided including positive and negative whiskers (depicted in red) for the 5% and 95% percentiles. The outliers for the PR use case are not depicted for graphical reasons (the bars for the other applications would otherwise be very small) and are available in numeric form in the Supporting Information. The graphs include the shares for all system components as broken down in Figure 3.
Some components have been merged (e.g., the two HVAC units, the piping system, and coolant) for graphical reasons (they would otherwise not be visible because of their low environmental impact). In general, the battery modules, the container housing including reinforcement measures, and the HVAC contribute the highest share to the impacts from the BESSs themselves.
Regarding the use phase, a clear distinction can be made between two types of applications: 1) systems that use renewable electricity (DC and WES) and 2) systems based on grid electricity (ETS and PR). [5] This indicates the importance of the use phase (characterized by energy consumption during operation due to internal losses) for the final LCA results. DC and WES show very similar profiles. Electricity from both wind turbines and photovoltaics has a low environmental burden, and the contribution of internal energy consumption due to inefficiencies of the different BESSs is therefore almost negligible. ETS and PR have different impacts even though the charged electricity is assumed to be based on the same German electricity mix (year 2015-2030), which has a considerably higher environmental burden than the renewable-generated energy for DC and WES (see Supporting Information for details).
As in the LCC, the differences between PR and ETS stem from the varying number of operation hours per year and the corresponding amount of annually delivered energy. In ETS, a considerably higher throughput is assumed, leading to lower overall impacts in relation to PR. The same applies to DC and WES, where electricity is generated exclusively by renewables. Here, the impact per kilowatt hour would be significantly lower with a higher number of operation hours. [5] It also becomes clear that the analyzed technologies have very different impacts in the three ReCiPe endpoint indicators, making it difficult to identify the most favorable technology. However, it can be noticed that LCA results are comparable with those obtained for economic criteria. LiBs, except LiB-LTO, perform well within all application areas, which is in sharp contrast to previous results. [5] This can be explained simply by the updated cycle lifetime, which leads to fewer battery module exchanges, and by the consideration of other impact categories. VRLA has the highest environmental impact for WES and DC. The VRFB has the highest impact for PR and ETS over the use phase because of its comparably low efficiency.
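The throughput effect described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the life-cycle impact splits into a fixed part (manufacturing and replacements) that is spread over the delivered energy, and a use-phase part that is proportional to the electricity lost in the battery. The function name, its parameters, and all numbers are hypothetical and are not taken from the study.

```python
def impact_per_kwh(fixed_impact, delivered_kwh, efficiency, source_impact_per_kwh):
    """Impact per delivered kWh: fixed share spread over throughput plus losses.

    All inputs are hypothetical placeholders, not values from the study.
    """
    if delivered_kwh <= 0 or not 0 < efficiency <= 1:
        raise ValueError("delivered_kwh must be > 0 and efficiency in (0, 1]")
    # Electricity lost per delivered kWh due to round-trip inefficiency.
    losses_per_kwh = 1.0 / efficiency - 1.0
    return fixed_impact / delivered_kwh + losses_per_kwh * source_impact_per_kwh


# Same hypothetical battery with low (PR-like) and high (ETS-like) throughput.
low = impact_per_kwh(1000.0, 10_000.0, 0.9, 0.5)
high = impact_per_kwh(1000.0, 40_000.0, 0.9, 0.5)
print(low > high)  # a higher throughput dilutes the fixed impacts
```

In this simplified form, the first term shrinks as the delivered energy grows, whereas the second term depends only on efficiency and on the burden of the charged electricity, which is why low-burden renewable electricity makes the loss term almost negligible.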

Technology Aspects
The results for the three subcriteria 1) technology performance, 2) maturity, and 3) flexibility are shown in Table 3. These criteria are evaluated generically, i.e., independently of the considered application cases. A detailed explanation of the procedure for determining the performance factors is given in the Supporting Information and the underlying literature. [54] The parameters for (1) technology performance were weighted by all stakeholders within the survey, as shown in Figure 8.
The results show that calendric and cycle lifetimes are considered the most important technical parameters for stationary storage in general. Efficiency and power density are also seen as relatively important, whereas energy density is perceived as less important for stationary BESSs. Here, the LiB-LTO scores best because of its favorable lifetime properties, which is in contrast to the results from the LCC and LCA. The VRLA (low lifetimes) and NaNiCl (low power density of about 200 W L⁻¹) have the lowest scores in this category.
The maturity (2) of all LiB chemistries is assumed to be identical because of missing data regarding the particular electrode chemistries. The VRLA receives the highest score here, as it has a long track record in the field of stationary BESSs. The lowest maturity degree is attributed to the VRFB, which is considered to be at demonstration level. The flexibility (3) of all BESSs is, in general, very high in comparison with other storage technologies, e.g., compressed air energy storage. [54] There are only minor differences in the case of NaNiCl and VRFB, whereas the other technologies have identical scores (12 points). The E/P ratio of the VRFB is freely scalable, making the technology suitable for a multitude of different applications (12.5 points). The NaNiCl has to be heated up to about 300 °C before it can be used, which results in a lower score (11.5 points).

Social Aspects
The results of the considered social indicators 1) technology acceptance and 2) available regulatory frames are shown in Table 4, based on the expert survey and the results from the study by Elsner and Sauer. [93] The scores for both social indicators are evaluated generically, i.e., independently of the different application fields. All battery types receive the same score of 8.9 for the subcriterion of regulatory frame because of missing data. The one exception is the VRLA battery, for which a well-organized recycling framework exists (score of 9.9). [93] In general, there is a lack of regulations for battery storage in the countries of origin of the interviewed stakeholders (e.g., service stacking and billing), [95] which was also confirmed by interview partners.
Results for technology acceptance are also shown in Table 4. Because of the limited data available for this category, literature values [93] are included in the assessment, supporting the characterization sheets regarding "societal acceptance" (see also Supporting Information). Furthermore, stakeholders reported that it is hardly possible to estimate technology acceptance issues of, e.g., high-temperature batteries versus VRFBs. Thus, every BESS option receives the same score for technology acceptance, which is why this criterion adds little value for BESS evaluation in this context. However, it should be retained for future research, especially when comparing batteries with other (nonbattery) storage technologies such as hydrogen, pumped hydro storage, or adiabatic compressed air storage. The work of Emmerich et al. [96] provides a first insight into the potential technology acceptance of stationary LiBs.
Table 3. Scores for all considered aspects for technology performance based on previous studies [89,90,94] (results from TOPSIS are normalized).

Indicative Scores and Rankings
Indicative results for all analyzed BESSs with the final scores and rankings for the four application areas (WES, DC, PR, and ETS) are shown in Figure 9. Aggregation is conducted via TOPSIS based on the weights for the four main criteria and all subcriteria from AHP and the corresponding performance measurement values presented in the previous sections. The LCA and LCC inputs are based on the median values from the Monte Carlo simulation. For readability, only median values are displayed in the following. The bandwidths for all application fields are given in the Supporting Information. The top row in Figure 9 shows the most suitable application field for each considered battery technology (the higher the bar, the better). Below this, a comparison among the technologies is provided for each application field, including indicative rankings. The dependency of the results on the considered application case can be seen clearly. This indicates the importance of always evaluating a storage technology under explicit consideration of the foreseen application, especially when comparing conceptually very different technologies (e.g., batteries and compressed air). LiB-LFP, LiB-NMC, and LiB-NCA dominate all application cases with very similar scores, closely followed by the NaNiCl battery. A clear recommendation among these battery technologies can therefore not be given because of the closeness of the scores (rankings are thus indicated in gray). In contrast, VRLA, the reference technology, obtains notably lower ranks for almost all application cases considered. Unlike those of the other LiBs, the VRFB and LiB-LTO scores are highly dependent on the application field.
For the VRFB, rather promising results can be obtained, especially in applications with high E/P ratios such as ETS or WES, whereas its scores drop significantly for high-power applications such as PR. Future research should thus include use cases with very high and very low E/P ratios (e.g., E/P ratio of 20 vs 0.5) to analyze potential changes of LiB-LTO and VRFB scores and rankings. The performance of the other considered LiBs and the NaNiCl is less sensitive to the application field, and they can thus be considered "allrounders." In general, all technologies achieve their highest results for ETS. This can be explained simply by the relatively high number of two long-lasting cycles per day (each 4 h), which results in a high number of operation hours of ≈730 to a maximum of ≈2920 h per year (the latter would be very optimistic). However, reality shows a different picture, with a significant share of BESSs being installed for PR. [97] Here, especially the economic criteria would need to be extended under consideration of the specific stakeholders to better capture the drivers of their investment decisions for a particular application such as PR (e.g., by including the net profits stemming from the application). So far, a direct comparison of the considered technologies does not provide a suitable basis for decision support. It also has to be considered that an MCDA is highly dependent on the assumptions made, the storage alternatives analyzed, and the weights used. This will become more obvious in the sensitivity analyses, where varying weights and the potential impact of recycling on final rankings are evaluated (see Section 3.7).
Figure 8. Relative weights for technical parameters for the subcriterion "technology performance" with n = 69.
Table 4. Resulting evaluation of sociopolitical aspects of different storage technologies based on own survey with n = 69 and results from the study by Elsner and Sauer. [93] *Results from [93]. Columns: LIB-LFP, LIB-LTO, LIB-NMC, LIB-NCA, LIB-LMO, NaNiCl, VRLA, VRFB.
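The TOPSIS aggregation used for these scores can be sketched as follows. This is a minimal textbook variant (vector normalization, Euclidean distances to the ideal and anti-ideal points); whether the study uses exactly this variant is an assumption. The function name, the three example criteria, the weights, and all performance values are hypothetical placeholders and not data from the study.

```python
import math


def topsis(matrix, weights, benefit):
    """Return TOPSIS closeness scores (higher is better) for each alternative.

    matrix:  rows = alternatives, columns = criteria (raw performance values)
    weights: one weight per criterion (e.g., from AHP), summing to 1
    benefit: True if higher is better for that criterion, False for costs
    Assumes no criterion column is all zeros and not all rows are identical.
    """
    n_crit = len(weights)
    # Vector normalization of each criterion column, then weighting.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix]
    # Ideal and anti-ideal value per criterion.
    cols = list(zip(*v))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    scores = []
    for row in v:
        d_pos = math.dist(row, ideal)  # distance to the ideal point
        d_neg = math.dist(row, anti)   # distance to the anti-ideal point
        scores.append(d_neg / (d_pos + d_neg))
    return scores


# Hypothetical alternatives A, B, C; criteria: LCC (cost),
# environmental damage (cost), technology performance (benefit).
performance = [[100.0, 5.0, 8.0], [120.0, 4.0, 9.0], [200.0, 9.0, 6.0]]
scores = topsis(performance, [0.4, 0.4, 0.2], [False, False, True])
print(scores)
```

A criterion on which all alternatives score identically, as is the case for technology acceptance here, contributes nothing to either distance in this formulation and therefore cannot change the ranking.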

Sensitivity Analysis
A sensitivity analysis is conducted regarding two key aspects of any MCDA: [27] 1) the variation of weights and 2) the potential impact of varying the performance measurement data by considering recycling.

Variation of Weights
The sensitivity analysis regarding the criteria weights is done for the environmental and economic main criteria, which are the ones weighted the highest by stakeholders. All pairwise comparisons are set to equal, and only the weight for the criterion of interest is changed in relation to the other, i.e., environmental versus economic aspects. The results are shown in Figure 10, where rankings are plotted as a function of the varying AHP scale, from the highest possible order of affirmation (9) (extremely more important) to 1 (equal importance). Depending on the application case, the ranks can change notably with varying weightings. For example, the LiB-LTO in the ETS case changes its rank from four to seven when economic aspects are given higher weights. The opposite can be observed for the NaNiCl BESS, which changes its rank from four to six in the case of a stronger environmental perspective. Nevertheless, the two top-ranked BESSs (LiB-LFP and LiB-NMC), while switching between first and second rank, remain the two best-scoring technology options in all four applications, regardless of how economic aspects are weighed against environmental ones. Further sensitivity analyses for subcriteria are provided in the Supporting Information.
Figure 9. The top row "application" represents the analysis of each considered technology within the four application fields (E/P ratios change from WES to PR in decreasing order). The higher the value, the better the overall performance in the named application. Indicative scores (in white in the corresponding bars) and related indicative rankings among the different BESSs in the considered four application areas (1 represents the best and 8 the worst rank) are given in the rows below (ETS = energy time shift, DC = decentralized grid, WES = wind energy support, PR = primary regulation). Ranking values are indicated in gray as in most cases differences among scores can be considered marginal.
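The weight variation described above can be sketched as follows: all pairwise comparisons are set to 1, a single judgement (one main criterion against another) is stepped over the AHP scale from 1 to 9, and the priority vector is recomputed each time. Deriving the weights from the principal eigenvector via power iteration is the standard AHP approach, but its use here, the function names, and the index order of the criteria are assumptions made for illustration.

```python
def ahp_weights(pairwise, iterations=100):
    """Approximate the AHP priority vector (principal eigenvector) by power iteration."""
    n = len(pairwise)
    w = [1.0 / n] * n
    for _ in range(iterations):
        w_new = [sum(pairwise[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(w_new)
        w = [x / total for x in w_new]  # renormalize so the weights sum to 1
    return w


def sweep(n_criteria, favoured, other, scale=range(1, 10)):
    """Vary one pairwise judgement (favoured vs. other) over the 1-9 AHP scale.

    All remaining comparisons are kept at 1 (equal importance).
    """
    results = {}
    for s in scale:
        m = [[1.0] * n_criteria for _ in range(n_criteria)]
        m[favoured][other] = float(s)
        m[other][favoured] = 1.0 / s  # reciprocal judgement
        results[s] = ahp_weights(m)
    return results


# Hypothetical index order: 0 economic, 1 environmental, 2 technical, 3 social.
weights_by_scale = sweep(4, favoured=0, other=1)
for s, w in weights_by_scale.items():
    print(s, [round(x, 3) for x in w])
```

Each resulting weight vector would then be passed to the aggregation step to obtain the rankings for that scale value; the sketch stops at the weights.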

The Potential Impact of Recycling on Final Results
The end-of-life stage, i.e., the impacts and/or benefits associated with the treatment or recycling of spent batteries or other components of the storage installation is not considered in this assessment. This simplification might be valid for comparing similar technologies such as different LiBs, which, despite using different cathode materials, show similar layouts and levels of integration. However, for other battery types, e.g., the VRFB, NaNiCl, or VRLA, potential recycling benefits would differ and might change the outcome of the comparison significantly.
For instance, a VRFB consists mainly of industrial process equipment, which can be dismantled comparatively easily by mechanical processes, allowing monofractions to be recovered on a macroscale and recycled easily. [8] In contrast, the mechanical separation of all components of the highly integrated LiB cells is extremely costly, which is why these are usually shredded entirely, yielding a heterogeneous mixture of all cell components (and materials). Separating the individual materials in a pure form from this mixture is difficult, which is why usually only the most valuable substances are recovered, lowering the overall recycling efficiency.
To illustrate the potential implications of recycling for the conducted MCDA, a simplified estimation of the impact of recycling on the LCA results is shown in Figure 11. The analysis is limited to environmental criteria (LCA results) and three selected BESS technologies, i.e., LiB-LFP (high ranks), LiB-LTO (intermediate ranks), and the VRFB (low ranks), based on published literature data. [98] It has to be mentioned that BoP components are not fully considered here and that recycled materials are assumed to substitute the primary demand for materials such as copper, nickel, vanadium, or steel. Despite the simplified approach, the results show that considering recycling leads to significant changes in the environmental performance of the VRFB in relation to the two LiB alternatives, and would probably also do so under economic and sociopolitical aspects. Considering this in the MCDA would be a highly relevant task for future work in the field.
Figure 11. Illustrative example of the potential impact of recycling in relation to a base case without recycling, calculated using data and assumptions from ref. [98].
www.advancedsciencenews.com www.entechnol.de Energy Technol. 2019, 1901019
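The substitution assumption stated above (recycled materials replace primary material demand) corresponds to what is often called the avoided-burden approach. A minimal sketch of such a credit calculation is given below; the function name, the material list, and all numbers are hypothetical and do not reproduce the data from ref. [98].

```python
def net_impact(base_impact, recovered_kg, primary_impact_per_kg, recycling_impact=0.0):
    """Net impact after crediting avoided primary material production.

    base_impact:           impact without recycling
    recovered_kg:          mass of each recovered material, keyed by name
    primary_impact_per_kg: impact of producing 1 kg of primary material, by name
    recycling_impact:      burden of the recycling process itself
    """
    credit = sum(
        mass * primary_impact_per_kg[name] for name, mass in recovered_kg.items()
    )
    return base_impact + recycling_impact - credit


# Hypothetical numbers for illustration only (arbitrary impact units).
primary = {"steel": 2.0, "copper": 4.0}
result = net_impact(100.0, {"steel": 10.0, "copper": 5.0}, primary, recycling_impact=8.0)
print(result)
```

Under this assumption, a technology from which large monofractions can be recovered receives a larger credit than one whose materials end up in a mixed fraction, which is the mechanism behind the shift in relative performance discussed above.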

Discussion
The strength of MCDA, its capacity to combine numerous heterogeneous and often fundamentally different criteria to obtain a meaningful recommendation, is also one of its major weaknesses. Different, often rather qualitative, evaluations need to be quantified to calculate a score, which is often a difficult task that in some cases brings along significant uncertainties, which have to be considered when interpreting results. Some of these are discussed by way of example in the following, namely those related to the environmental assessment and to its use within an MCDA including multiple stakeholders.

Uncertainties to the Environmental Impacts
The environmental scores are obtained via a detailed LCA conducted explicitly within this study, using the ReCiPe endpoint methodology for quantifying the environmental impacts. As previously mentioned, the endpoint assessment has the advantage of aggregating results (potential environmental impacts) to just three key impact categories that are intuitively comprehensible to any stakeholder (human health, environment, and resources), easing the weighting of the three categories substantially. However, in this case, the aggregation has been made in a previous step that is part of the impact assessment methodology. In LCA, environmental impacts are usually first calculated on "midpoint" level, corresponding to a broad set of heterogeneous impact categories that each represent a single environmental effect [e.g., greenhouse gas (GHG) concentration in the atmosphere, emission of acidifying substances, and many more; 18 in total for ReCiPe midpoints], without considering potential damages caused by it. Significant modeling effort is made in the impact assessment model to aggregate these effects into actual environmental damages: GHG concentration in the atmosphere needs to be mapped to temperature increase, temperature increase then to, e.g., an increase in drought periods and sea-level rise, and these then to potential loss of species or of biodiversity (environment) and to impacts on human health (due to, e.g., reduced agricultural yields leading to malnutrition, sickness due to water shortage and insufficient drinking water quality, the spread of tropical diseases such as malaria, and so on). This is a highly complex modeling framework and contains huge uncertainties in many aspects of the cause-effect chain. In fact, different impact assessment methodologies can yield very different final results for the same process, already on midpoint, but much more so on endpoint level.
Because of these drawbacks, the use of solely endpoint indicators is usually not recommended in scientific works. [99] However, the numerous midpoints are hardly comprehensible and thus not weighable for a nonexpert, which is why using the endpoint approach can nevertheless be considered the better choice in this case (reduced uncertainty in the weighting process due to better understanding of the impacts).

Inherent Uncertainty of MCDA
There is a multitude of critical literature available in the field of MCDA, as highlighted in the study by Baumann et al. [27] It has to be considered that every decision maker is influenced by the decision environment and vice versa. [35] Furthermore, involved stakeholders may have an individual perception of certain criteria as a result of the sociotechnical regime in which they are situated (e.g., utility company vs NGO). Consequently, there is always a certain degree of uncertainty inherent in any MCDA approach, [100] which should be addressed accordingly (e.g., by highlighting that results depend on the consulted stakeholders). This was done in this assessment. It is, however, highly important to point out that no MCDA approach allows general recommendations to be derived. Rather, each MCDA has to be seen within its individual context. The ranking results of the assessment are thus contrasted with other existing literature in the field of energy storage, using the results from the study by Baumann et al. [27] (Table 5), to provide a broader picture. In addition, Table 5 provides, for all analyzed studies, information about the multi-attribute decision-making (MADM) method used within the MCDA, the analyzed applications, the system design (i.e., whether crucial information about power and capacity is provided by the corresponding source), and stakeholder participation. LiBs are summarized in one category, and NaNiCl is attributed to the group "other." It becomes clear that rankings can vary heavily depending on the weights, methods, technologies, and application fields considered in an MCDA, which is why the results obtained here can hardly be compared with other assessments.
Furthermore, BESSs are not the only energy storage technology available for the considered applications, so an interesting extension would be to also include alternative energy storage technologies (e.g., flywheels) in future assessments.

Conclusions
This study provides an explorative MCDA for different utility-scale battery storage technologies in four different application areas, based on the input of 72 stakeholders who participated in expert surveys for defining and quantifying the considered environmental, economic, technological, and social criteria. Environmental and economic main criteria are considered almost equally important by stakeholders and dominate the technical and social criteria. Calculating the consensus among participants reveals that there is no common preference and that further discussion is required regarding the robustness of the weighting process. This issue becomes more evident when the 13 single stakeholder groups are analyzed. Here too, the consensus remains moderate among most groups. The quantification of all considered criteria is a challenging and time-consuming step. There are significant discrepancies in data quality between the four main criteria categories. Environmental aspects are quantified by LCA, where the robustness and quality of results are highly dependent on the availability of data. Although the authors' own recent inventory data are used for the LCA, there are still some major gaps and uncertainties related, e.g., to recycling potentials, the ReCiPe assessment method, and the battery models themselves. In contrast, the results obtained for the economic aspects seem to be relatively robust and are widely in line with those of other authors. There is, however, still a need for a better understanding of the role of "BoP" cost, which contributes significantly to the total economic performance. The results provided for the social aspects are based on expert estimates and may only partially reflect the acceptance of different technologies within the public.
Although acceptance is generally considered a key for the successful deployment of larger technology solutions, this aspect does not provide decision support for the considered battery technologies, as the acceptance values are equal for all technologies. This might be attributable not only to lacking data but also to the similarity (from a citizen perspective) of the assessed technologies (battery storage systems), for which no significant difference in risk perception exists at the moment. More robust results could be achieved by also including other storage technologies or when more information is available about, e.g., the VRFB. Results should thus be taken with care and should rather be used as a base for further assessments in the field of social aspects (especially technology acceptance requires significant research efforts). In consequence, the obtained rankings for the eight considered battery technologies within the four application cases should be seen as rather indicative. VRLA is ranked low in all cases because of relatively low efficiencies and cycle lifetimes. The ranking of the VRFB is highly dependent on the considered use case and is favored by high energy-to-power ratios. LiBs seem to be the most recommendable technology among the evaluated BESSs for most application areas (with the exception of the LiB-LTO, which is only suitable for low E/P ratios). Interestingly, for all assessed BESSs, ETS arises as the most promising application field, in sharp contrast to actually installed BESSs, which often target PR. This can be explained by a missing indicator describing the profitability of all considered applications. In any case, rankings should not be seen as final but rather as a base for further discussion about technology impacts, use, and design in the face of sustainability.
The low rankings of some technologies do not indicate that they are "nonsustainable" or worse than other options per se; they might simply be more suitable for other applications. In general, the suitability of a BESS technology for a particular business case should be evaluated thoroughly in every case and under consideration of the specific stakeholder groups, as it is not possible to provide a generic ranking. This also becomes evident in the sensitivity analyses and the comparison with other MCDA literature. [27]
Table 5. Comparison of results with studies in the field of MCDA related to energy storage based on the study by Baumann et al. [27] Some previous studies [11-15,17,29,34,101-104] are contrasted to this study regarding the used MADM, the applications, the system design, and information about stakeholder participation and final rankings. This study is indicated in bold letters at the bottom of this