Sustainable decision making for chemical process systems via dimensionality reduction of many objective problems

Recent global events and the rise of sustainable investing have made clear that the chemical and energy industry must consider sustainability goals beyond profit maximization to remain competitive. Multiobjective optimization provides an ideal framework for analyzing sustainability tradeoffs, but when four or more objectives are considered, the ability to rigorously solve problems and interpret results is lost. This necessitates an approach to systematically reduce the dimensionality of many objective problems to three or fewer objectives. In this work, an algorithm to group objectives based on their correlating nature a priori to solving the full space problem is proposed. It utilizes community detection on a novel weighted objective correlation graph to identify two or three groups of correlated objectives. Results from three representative case studies demonstrate that objective groupings obtained from this algorithm minimize the amount of tradeoff information lost and outperform intuitive groupings by economics or the environment.


| INTRODUCTION
In recent times, the idea of sustainable investing has come into the mainstream. 1 In this scheme, shareholders and board members judge industrial performance not just by the traditional economic bottom line of profits, but also by environmental, social, and governance (ESG) issues. Indeed, the events of the last 5 years illustrate the pitfalls of making decisions solely based on economics: a pandemic has exposed vulnerabilities in global supply chains, 2 social unrest and lawsuits have occurred due to inherently inequitable decisions that disproportionately harm minority communities, 3 and the effects of climate change, largely driven by the use of economically preferred fossil fuels, are becoming more apparent. 4 The realities of today make it clear that a modern chemical industry cannot remain competitive by solely maximizing profits as is traditionally done, and that decisions made at all levels in the chemical enterprise, including process design, strategic planning, and real time operation and control, must weigh the tradeoffs of a large number of different objectives within the scope of both economics and ESG.
One approach to sustainable decision making is to attempt to monetize ESG outcomes to arrive at a single, profit-based objective. 5,6 This approach is commonly applied for carbon emissions via either a carbon tax [7][8][9] or a social cost of carbon. 10 A few approaches within the life cycle assessment framework attempt to be more systematic about how monetization is performed, usually by attempting to define a marginal or opportunity cost for each sustainability objective. 11 An analogous approach to monetization is the use of multiattribute decision making methods, 12 which attempt to quantify the objective preferences of a decision maker by generating weights that correspond to these preferences. These approaches are beneficial in that the optimization algorithm returns a single solution that can be directly implemented. However, a key limitation of any monetization or multiattribute approach is that the solution obtained is typically highly sensitive to how different objectives are weighted: for example, considering social costs of carbon which may vary over orders of magnitude depending on the source, 13 or using different LCA-based approaches, can result in very different decisions being made. As such, the question of what makes a solution sustainable can vary wildly between decision makers who may hold different values, including between different organizations as well as individuals at the same organization.
An alternative approach to monetization is to solve a multiobjective optimization problem. Here, the solution is not a single decision but is a manifold of possible decisions representing tradeoffs between objectives (the Pareto frontier), where each point along the Pareto frontier represents the best one can do for one objective without making any of the others worse. While this approach does not give a single implementable solution, obtaining and visualizing the full Pareto frontier of objective tradeoffs enables making fully informed sustainable decisions, the responsibility of which ultimately lies with human actors and can be supported by preference ranking algorithms. 14 Multiobjective approaches have been applied quite broadly in chemical process systems research. 15 With respect to sustainability, it is most common to analyze the tradeoffs between a single environmental and a single economic objective, with example applications in plant design, [16][17][18] supply chain management, 19,20 and process operations, 21,22 to name a few.
A limitation of multiobjective approaches for sustainable decision making is that for problems of four or more objectives (many-objective problems, or MaOPs), visualization of the objective tradeoffs becomes unintuitive and rigorously generating a complete set of solution points becomes computationally prohibitive. A common approach to addressing this challenge is to lump different sustainability goals into intuitive groupings, such as economic, environmental, and social groupings based on the three pillars of sustainability. 23 Examples of this include work by Santibanez-Aguilar et al., 24 who combine metrics for damages to human health, the ecosystem, and resource extraction using a life cycle assessment derived tool in the planning and site selection of biorefineries. Garcia et al. 25 propose combining the effects of agricultural wastes, land use, and ecosystem services into a single environmental green GDP objective when designing food-energy-water-waste nexus systems. Wheeler et al. 26 propose an environmental objective that takes into account the potential for a process or supply chain to cause key outputs, such as ozone depletion or ocean acidification, to exceed "planetary boundaries" or upper bounds beyond which deleterious impacts on the planet are observed.
Mota et al. 27 propose a social objective that combines the goals of creating jobs and limiting inequity by weighting job creation activity in lower GDP regions, which they used in analyzing possible locations for expansion of an electronics manufacturer. Alternatively, one can also try to aggregate all outcomes into a single objective: a popular method for doing this is the eco-efficiency concept, which normalizes the various economic and environmental objectives and assigns weights based on social relevance and a process' impact on a specific outcome relative to known global parameters. 28 As an example that demonstrates the shortcomings of the aforementioned approaches, consider the design of an ammonia production system in a water scarce region where one could choose either to produce requisite hydrogen from fossil fuels (resulting in inherent carbon emissions) or from electrolysis of water (using large quantities of water). For such a system, there is a clear tradeoff between the two objectives of carbon emissions and water usage, but information about this tradeoff would be lost if both quantities were aggregated into a single environmental objective. This example demonstrates the importance of choosing objective groupings more systematically based on their correlating (i.e., both objectives point to similar solutions) vs. competing (i.e., a large tradeoff exists between objectives) nature to preserve tradeoff information. While there exist methods to achieve this which utilize principal component analysis, 29 aggregation trees, 30 or dominance preservation strategies, 31 these methods require the generation of at least part of the high-dimensional Pareto frontier for the original MaOP. As such, they may be susceptible to bias based on which solution points are generated and are not particularly helpful in reducing the computational burden of solving the problem.
In this work, we assert that systematic objective dimensionality reduction for (mixed-integer) linear MaOPs can be performed a priori to obtaining any part of the solution of the MaOP on the basis of problem structure. We propose a graph structure to represent variable-constraint-objective connectivity, which we use to develop an objective correlation graph with edge weights corresponding to the competing vs. correlating nature of the two objectives. From the objective correlation graph, we apply a community detection approach to identify 2-3 groups of objectives, such that objectives in the same group are correlated and those in different groups are competing. We also present an information loss metric and demonstrate that our approach is able to choose groupings that preserve as much information about objective tradeoffs as possible. The remainder of this article is structured as follows: Section 2 will provide background on solving multiobjective optimization problems, as well as on identifying and exploiting optimization problem structure using graph theory. Section 3 will provide the details of the proposed algorithm and how it utilizes the problem structure of MaOPs to determine the strength of links between objective functions. Section 4 will examine three case studies adapted from the sustainable process systems literature and utilize the proposed algorithm to analyze the structure of MaOPs adapted from the original formulations. Finally, Section 5 will include some concluding remarks and discussion of areas for future work.

| Multiobjective optimization
A general multiobjective optimization problem can be written as follows:

$$\min_{x} \; \{ f_1(x), f_2(x), \ldots, f_I(x) \} \quad \text{s.t.} \quad x \in X \qquad (1)$$

where $f_i$ represents the function for objective $i$, $x$ are the decision variables, and $X$ is the set of all values $x$ that are feasible to the problem.
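Typically, it is impossible to optimize each objective $f_i$ simultaneously, such that no single solution exists for problem (1). Instead, the solution is described using the concept of dominance: a feasible solution $\hat{x}$ dominates another feasible solution $x$ if $f_i(\hat{x}) \le f_i(x)$ for all objectives $i$ and $f_j(\hat{x}) < f_j(x)$ for at least one objective $j$. In words, this means that solution $\hat{x}$ performs no worse than $x$ in all objectives, and strictly better in at least one objective. Using this concept, the Pareto frontier can be defined as the set of all feasible solutions that are not dominated by any other feasible solution.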
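The dominance test and the resulting nondominated set can be illustrated with a short Python sketch. It assumes all objectives are minimized and works on a finite list of objective vectors; the function names and the four candidate points are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if objective vector a dominates b (all objectives minimized)."""
    no_worse = all(ai <= bi for ai, bi in zip(a, b))
    strictly_better = any(ai < bi for ai, bi in zip(a, b))
    return no_worse and strictly_better


def pareto_filter(points: List[Sequence[float]]) -> List[Sequence[float]]:
    """Keep only the points that no other point in the list dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical two-objective candidates; (3, 4) is dominated by (2, 3).
candidates = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0)]
front = pareto_filter(candidates)
```

The pairwise filter is quadratic in the number of points, which is adequate for small illustrative sets but not a substitute for the rigorous solution methods discussed next.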
Solution approaches to multiobjective problems seek to find the Pareto frontier by identifying nondominated solutions. The most common approaches can be broken into two categories. The first is scalarization approaches, which generate Pareto optimal points through solving a set of single objective optimization problems. Examples of methods in this classification include weighted sum, 32 epsilon constraint, 33 and Chebyshev scalarization approaches. 34 While rigorous deterministic global optimization can be used to solve each single objective problem and guarantee that each point found is Pareto optimal, the number of single objective problems to solve scales exponentially with the number of objectives, such that these approaches are impractical to implement for MaOPs. The second class of approaches is evolutionary algorithms, which use biological principles of natural selection such as mutation and recombination to drive a population of feasible points toward optimal solutions. 35 While these methods tend to work reasonably well in practice and are more scalable than scalarization approaches, they are ultimately stochastic and heuristic approaches which do not provide any guarantees of solution quality.
As such, an important goal of this work is to systematically reduce the dimensionality of MaOPs to three objectives or fewer, in order to apply a rigorous scalarization method for the determination of the Pareto frontier.
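To make the exponential scaling of scalarization concrete, the sketch below counts the single objective solves needed for a full epsilon constraint grid. It assumes one objective is optimized while each remaining objective is bounded at the same number of uniformly spaced levels; the function name and the choice of 40 levels are illustrative assumptions.

```python
def epsilon_constraint_subproblems(n_objectives: int, steps: int) -> int:
    """Count single objective solves for a full epsilon constraint grid.

    One objective is optimized; each of the other (n_objectives - 1)
    objectives is constrained at `steps` grid levels.
    """
    if n_objectives < 1 or steps < 1:
        raise ValueError("n_objectives and steps must be positive")
    return steps ** (n_objectives - 1)


# Number of solves for 2 to 5 objectives at 40 levels per constrained objective.
counts = {n: epsilon_constraint_subproblems(n, 40) for n in (2, 3, 4, 5)}
```

Under these assumptions, going from three to four objectives multiplies the number of solves by the grid resolution, which is why reducing a MaOP to three or fewer objectives matters for rigorous methods.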

| Graph representation of optimization problems
For complex optimization problems that cannot be readily solved by off-the-shelf solvers in relevant amounts of time, such as the MaOPs of interest in this work, it is often useful to identify and exploit problem structure to derive an approach that makes solving the problem easier. A natural way to achieve this is by representing the optimization problem as a graph, or a set of nodes and edges that capture the connectivity of different objects (i.e., variables and constraints) within the optimization problem. 36 Once such a graph is developed, an effective approach for structure identification is community detection, which identifies subgroups within a graph on the basis of maximizing a quantity called modularity, effectively generating subgroups such that nodes within the same subgroup interact strongly, while minimal interaction occurs between nodes in different subgroups. 37 While modularity maximization is a known NP-hard problem, several well-known greedy algorithms give good heuristic solutions such as spectral partitioning, 38 the Louvain algorithm (or fast unfolding), 39 and the Leiden algorithm, 40 the latter of which is used in this work.
Community detection has been shown to be a powerful tool for identifying structure in optimization problems amenable to decomposition. Early work in this area looked at identifying distributed optimization structures for augmented Lagrangian solution approaches, 41 as well as structure within a model predictive control problem for obtaining distributed controller architectures. 42 This approach was later generalized to identify communities that correspond to optimization subproblems with minimal complicating variables or constraints, and thus amenable to various decomposition approaches. 43 This approach was extended using a stochastic block modeling approach to identify both community and core-periphery structure in optimization problems that can be exploited using various decomposition solution approaches. 44 Other recent work in this area developed a new overlapping Schwarz type decomposition rooted in a problem's graph structure and the exponential decay of sensitivity propagation through graphs. 45 Beyond identifying structures for decomposition, graph-theoretic methods can also be used to identify symmetry in optimization problems, 46 which can degrade performance of global nonconvex solvers. In this work, we seek to build a graph structure relating multiple objectives with weights corresponding to their correlating vs. competing nature. From this, a community detection approach can be applied to determine subgroups of objectives such that objectives in the same subgroup are correlated, while those in different subgroups are competing.
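As a concrete reference for the modularity quantity that community detection maximizes, the following sketch evaluates the standard weighted modularity of a given partition, with an optional resolution parameter. The example graph (two triangles joined by one edge) and the function name are hypothetical; this only scores a partition and is not the Leiden algorithm itself.

```python
from typing import List, Sequence


def modularity(adj: Sequence[Sequence[float]],
               labels: Sequence[int],
               resolution: float = 1.0) -> float:
    """Weighted modularity of a partition of an undirected graph.

    adj is a symmetric adjacency matrix; labels[i] is the community of node i.
    """
    degree = [sum(row) for row in adj]
    two_m = sum(degree)  # twice the total edge weight
    q = 0.0
    for i in range(len(adj)):
        for j in range(len(adj)):
            if labels[i] == labels[j]:
                q += adj[i][j] - resolution * degree[i] * degree[j] / two_m
    return q / two_m


# Hypothetical graph: triangles {0, 1, 2} and {3, 4, 5} joined by edge 2-3.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
adj: List[List[float]] = [[0.0] * 6 for _ in range(6)]
for u, v in edges:
    adj[u][v] = adj[v][u] = 1.0

q_split = modularity(adj, [0, 0, 0, 1, 1, 1])   # one community per triangle
q_single = modularity(adj, [0, 0, 0, 0, 0, 0])  # everything in one community
```

Splitting the graph along the single bridging edge scores higher than keeping all nodes together, which is the behavior that community detection exploits.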

| PROPOSED ALGORITHM
This section provides the framework and details of the mathematical algorithm to reduce objective space dimensionality in many objective optimization problems. The algorithm assumes an optimization problem with $I$ objectives, $V$ variables, $M$ inequalities, and $N$ equality constraints. In particular, we consider the linear many objective optimization problem formulated as follows:

$$\min_{x} \; \{ c_1^T x, \ldots, c_I^T x \} \quad \text{s.t.} \quad A x \le b, \quad D x = e \qquad (4)$$

Here, $c_i$ are known cost vectors of size $V$ for the different objectives,

| Edge weight determination
To determine the edge weights of the objective correlation graph, we recall that for linear programs like the MaOP (4) considered in this work, the optimal solution is guaranteed to lie on the boundary of the feasible region. Thus, to determine if two objectives are likely to be competing or correlating, it is useful to consider the projections of their cost vectors onto active constraint hyperplanes of the optimization problem. If these projections point in (nearly) the same direction along all surfaces, it is likely the objectives will be correlated, while if the projections point in different directions on at least some surfaces, conflict and tradeoffs are likely to occur. We formalize the approach mathematically as follows: consider two objective cost vectors $c_i$ and $c_j$ interacting along an inequality constraint with normal vector $a_k$ (a row of the matrix $A$). For each cost vector, we obtain the vector components normal ($c^N_{ik}$) and projected ($c^P_{ik}$) onto the constraint surface, using the following equations:

$$c^N_{ik} = \frac{c_i \cdot a_k}{\lVert a_k \rVert^2} \, a_k \qquad (5)$$

$$c^P_{ik} = c_i - c^N_{ik} \qquad (6)$$

A graphical representation of these various vectors is displayed in the accompanying figure.

Since conflict along any constraint surface can cause a tradeoff between objectives, while correlation requires overlap on all constraint surfaces, it makes sense to weight findings of conflict more heavily when combining the interactions along different constraints.
As such, we weight each interaction (W ijk ) using a logistic function that provides high weights when conflicts are found, and lower weights when correlation is found: In this equation, α and β represent hyperparameters to the algorithm.
The hyperparameter α should vary between 0 and 1 and represents a maximum "discount rate" for correlated objective-constraint-objective triplets to ensure that many correlated constraints do not overwhelm a smaller number of more informative competing constraints. The hyperparameter β should be positive, and governs the smoothness of the logistic curve, with larger values making the curve more step-like at $S_{ijk} = 0$. Empirically, we have determined that values of α = 0.9 and β = 100 tend to work well in practice.
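The qualitative behavior of this weighting can be sketched in Python. The exact logistic expression is not reproduced in the text above, so the functional form below is an assumption chosen only to match the description: the weight approaches 1 for conflicting interactions (negative strength), approaches 1 - α for correlated ones, and steps at a strength of zero with sharpness β. The function names are hypothetical.

```python
import math


def sigmoid(t: float) -> float:
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def interaction_weight(strength: float, alpha: float = 0.9, beta: float = 100.0) -> float:
    """Assumed logistic weight: near 1 for conflict, near (1 - alpha) for correlation."""
    if not 0.0 <= alpha <= 1.0 or beta <= 0.0:
        raise ValueError("alpha must lie in [0, 1] and beta must be positive")
    return 1.0 - alpha * sigmoid(beta * strength)


# Weights across the range of strengths, from full conflict to full correlation.
weights = [interaction_weight(s) for s in (-1.0, -0.05, 0.0, 0.05, 1.0)]
```

With α = 0.9, a correlated interaction counts for roughly one tenth of a conflicting one under this assumed form, so a handful of conflicting constraints is not drowned out by many correlated ones.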
Equality constraints will always be active, so the component of the cost vector normal to the constraint surface is unimportant. Equations (5)-(6) are used with $d_k$ (a row of the matrix $D$) in place of $a_k$ to determine the component of the cost vector projected onto the constraint surface. Strengths along equality constraints are then found again using (7). Since the Pareto solution will always lie along the equality constraint surface, weights $W_{ijk}$ are set to 1 for all equality constraints.
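Both inequality and equality constraints rely on the same split of a cost vector into a part normal to the constraint surface and a part lying in it. The sketch below implements that split with plain Python lists. The cosine of the angle between two projected vectors is used here as an alignment measure; that choice is an assumption for illustration, since the strength expression (7) is not reproduced in the text above. All vectors are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def split_cost_vector(c: Sequence[float],
                      a: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Split c into its component along normal a and its projection onto the surface."""
    scale = dot(c, a) / dot(a, a)
    normal = [scale * ak for ak in a]
    projected = [ck - nk for ck, nk in zip(c, normal)]
    return normal, projected


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between u and v (0.0 if either vector is zero)."""
    nu, nv = math.sqrt(dot(u, u)), math.sqrt(dot(v, v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return dot(u, v) / (nu * nv)


# Hypothetical constraint normal and three cost vectors in three variables.
a_k = [0.0, 0.0, 1.0]
c_1, c_2, c_3 = [1.0, 0.0, 1.0], [1.0, 1.0, -2.0], [-1.0, 0.0, 3.0]
_, p_1 = split_cost_vector(c_1, a_k)
_, p_2 = split_cost_vector(c_2, a_k)
_, p_3 = split_cost_vector(c_3, a_k)

align_12 = cosine(p_1, p_2)  # projections partly aligned
align_13 = cosine(p_1, p_3)  # projections point in opposite directions
```

In this example the first and third objectives look compatible in their raw third components but conflict once projected onto the constraint surface, which is the kind of interaction the edge weights are meant to capture.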
To determine the total correlation strengths, which we denote $S^A_{ij}$, we calculate a weighted average of strengths along all constraints determined to be possibly active, and rescale values such that they are between 0 and 1. Note that the matrix of $S^A_{ij}$ values is the adjacency matrix of the objective correlation graph, that this matrix will always be symmetric (i.e., $S^A_{ij} = S^A_{ji}$), and that by convention, diagonal elements of this adjacency matrix are always set to zero (i.e., $S^A_{ii} = 0$). Values of this adjacency matrix near zero correspond to an objective pair with a large expected amount of conflict, while values close to one imply that the two objectives are expected to be correlating.
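A minimal sketch of this aggregation step follows. It assumes per-constraint strengths lie in [-1, 1] and that the rescaling to [0, 1] is the affine map (value + 1) / 2; both are assumptions, since the aggregation formula is not reproduced in the text above. The strengths, weights, and function names are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def pair_correlation(strengths: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted average of strengths, rescaled from [-1, 1] to [0, 1] (assumed map)."""
    total_weight = sum(weights)
    if total_weight == 0.0:
        return 0.5  # assumption: no informative constraint gives a neutral value
    average = sum(w * s for s, w in zip(strengths, weights)) / total_weight
    return 0.5 * (average + 1.0)


def adjacency_matrix(n_objectives: int,
                     pair_data: Dict[Tuple[int, int], Tuple[Sequence[float], Sequence[float]]]
                     ) -> List[List[float]]:
    """Symmetric adjacency matrix with zero diagonal from per-pair constraint data."""
    adj = [[0.0] * n_objectives for _ in range(n_objectives)]
    for (i, j), (strengths, weights) in pair_data.items():
        adj[i][j] = adj[j][i] = pair_correlation(strengths, weights)
    return adj


# Hypothetical data: (strengths, weights) along the constraints linking each pair.
pair_data = {
    (0, 1): ([0.9, 0.8, -0.7], [0.1, 0.1, 1.0]),  # one heavily weighted conflict
    (0, 2): ([0.9, 0.8], [0.1, 0.1]),             # only correlated constraints
    (1, 2): ([-0.5], [1.0]),
}
adj = adjacency_matrix(3, pair_data)
```

For the first pair, a plain average of the three strengths would suggest correlation, while the weighted average is pulled toward conflict by the single heavily weighted constraint.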
Note that for large-scale problems, it can be inefficient to compute the effect of every constraint on every objective pair. In such cases, we provide in this algorithm the option of being more selective about which values of $S_{ijk}$ are obtained by using a variable-constraint-objective graph. This graph is a tripartite graph where nodes correspond to variables, constraints, or objectives in the original optimization formulation, and edges exist between variable and constraint/objective nodes if the variable appears within the constraint/objective. From this, we can identify primary linking constraints, which contain one or more variables shared by two objectives, and secondary linking constraints, which contain at least one variable unique to each of the two objectives considered. An example of the variable-constraint-objective graph of a simple problem is shown in Figure 3, displaying examples of both primary and secondary linking constraints.
As primary and secondary linking constraints comprise the shortest paths from objective to constraint to objective in the graph, it is expected that they capture the most important objective interactions. This argument aligns with recent findings that for large scale, structured optimization problems, the sensitivity of the optimal solutions at one node with respect to perturbations at another decays with respect to the distance between nodes, 47 although this approach can be obfuscated by formulations with a large number of "auxiliary" variables or constraints.
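The identification of primary and secondary linking constraints can be sketched from variable membership alone. The example below uses a small hypothetical problem in which objective 1 uses x1 and x2, objective 2 uses x1 and x3, and objective 3 uses x4; the membership sets and the function name are assumptions for illustration. A constraint that qualifies as both is reported as primary, which is a design choice of this sketch.

```python
from typing import Dict, List, Set, Tuple


def linking_constraints(obj_vars: Dict[int, Set[str]],
                        con_vars: Dict[str, Set[str]],
                        i: int, j: int) -> Tuple[List[str], List[str]]:
    """Return (primary, secondary) linking constraints for objectives i and j."""
    shared = obj_vars[i] & obj_vars[j]
    only_i = obj_vars[i] - obj_vars[j]
    only_j = obj_vars[j] - obj_vars[i]
    primary, secondary = [], []
    for name, variables in con_vars.items():
        if variables & shared:
            primary.append(name)      # contains a variable shared by both objectives
        elif variables & only_i and variables & only_j:
            secondary.append(name)    # contains a variable unique to each objective
    return primary, secondary


# Hypothetical variable membership of objectives and constraints.
obj_vars = {1: {"x1", "x2"}, 2: {"x1", "x3"}, 3: {"x4"}}
con_vars = {"f1": {"x1", "x3"}, "f2": {"x2", "x4"}}

links_12 = linking_constraints(obj_vars, con_vars, 1, 2)
links_13 = linking_constraints(obj_vars, con_vars, 1, 3)
```

Restricting the strength calculation to the constraints returned here avoids looping over every constraint for every objective pair in large problems.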

| Determination of objective groups
Now that a way to systematically determine edge weights in the objective correlation graph has been presented, groupings of correlated objectives are obtained by performing community detection using the Leiden algorithm. 40 This algorithm takes as input the adjacency matrix of the objective correlation graph, as well as a hyperparameter which governs the resolution of communities. In most cases, it is desirable to identify two or three communities of objectives, since two or three-objective optimization problems are not computationally prohibitive to solve and give interpretable Pareto frontiers. To achieve this, we begin by setting the resolution hyperparameter so high that each objective is in its own community, and then gradually decrease this value until the desired number of communities is achieved. Once the community structure is obtained, the dimensionality of the original MaOP is reduced by combining objectives within the same community into a single objective. There are a variety of ways that this can occur: two popular approaches are to simply add all of the objectives in the same group together, or to neglect all but one objective from the group. 29 Different approaches can be more beneficial depending on the application and how the Pareto frontier will be used; however, the remainder of this article will consider grouping by adding together objectives, and comparison of different grouping approaches will be considered beyond the scope of this work.
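The resolution sweep and the summation of grouped objectives can be sketched as follows. Because the number of objectives is small, this sketch swaps the Leiden algorithm for an exhaustive search over all partitions that maximizes resolution-weighted modularity; that substitution, the starting resolution, the shrink factor, and all numerical values are assumptions for illustration.

```python
from typing import Iterator, List, Sequence, Tuple


def set_partitions(n: int) -> Iterator[List[int]]:
    """Yield every community labeling of n nodes (restricted-growth form)."""
    def extend(labels: List[int]) -> Iterator[List[int]]:
        if len(labels) == n:
            yield list(labels)
            return
        for lab in range(max(labels) + 2):
            labels.append(lab)
            yield from extend(labels)
            labels.pop()
    if n > 0:
        yield from extend([0])


def modularity(adj: Sequence[Sequence[float]], labels: Sequence[int],
               resolution: float) -> float:
    degree = [sum(row) for row in adj]
    two_m = sum(degree)
    q = 0.0
    for i in range(len(adj)):
        for j in range(len(adj)):
            if labels[i] == labels[j]:
                q += adj[i][j] - resolution * degree[i] * degree[j] / two_m
    return q / two_m


def group_objectives(adj: Sequence[Sequence[float]], target: int,
                     start: float = 4.0, shrink: float = 0.9,
                     floor: float = 1e-3) -> Tuple[List[int], float]:
    """Lower the resolution until at most `target` communities remain."""
    resolution = start
    while resolution > floor:
        labels = max(set_partitions(len(adj)),
                     key=lambda lab: modularity(adj, lab, resolution))
        if len(set(labels)) <= target:
            return labels, resolution
        resolution *= shrink
    return [0] * len(adj), resolution


def combine_by_sum(cost_vectors: Sequence[Sequence[float]],
                   labels: Sequence[int]) -> List[List[float]]:
    """Add together the cost vectors of objectives in the same community."""
    groups = {}
    for c, lab in zip(cost_vectors, labels):
        acc = groups.setdefault(lab, [0.0] * len(c))
        for v, value in enumerate(c):
            acc[v] += value
    return [groups[lab] for lab in sorted(groups)]


# Hypothetical correlation graph: objectives 0-1 and 2-3 are strongly correlated.
adj = [[0.0, 0.9, 0.1, 0.1],
       [0.9, 0.0, 0.1, 0.1],
       [0.1, 0.1, 0.0, 0.8],
       [0.1, 0.1, 0.8, 0.0]]
labels_2, res_2 = group_objectives(adj, target=2)
labels_3, res_3 = group_objectives(adj, target=3)
reduced = combine_by_sum([[1, 0], [2, 1], [0, 3], [1, 1]], labels_2)
```

Exhaustive search is only viable for a handful of objectives, since the number of partitions grows very quickly; for larger graphs a heuristic such as the Leiden algorithm used in this work is the appropriate tool.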

| Information loss metric
Assuming grouped objectives are combined either by neglecting all but one objective, or using a weighted sum, Pareto frontiers obtained using the reduced-space formulation will be a subset of the full-space Pareto frontier. For each unique point $p$, a matrix $K_p$ is created that contains all Pareto optimal values for objectives in $C$ at all points $j \in J_p$ where the objective values for objectives in $\mathcal{R}$ are held constant. In each $K_p$ the values are scaled using:

$$\hat{K}_{p,i,j} = \frac{K_{p,i,j} - l_i}{u_i - l_i} \qquad (10)$$

where $\hat{K}_{p,i,j}$ is the scaled value from objective $i$ at point $j$, $K_{p,i,j}$ is the original value, $l_i$ is the lower bound or minimum value found for objective $i$ through the entire full space Pareto frontier, and $u_i$ is the upper bound or maximum value found for objective $i$ through the entire full space Pareto frontier. Equation (10) gives us scaled values for each objective in $C$, such that they range between 0 and 1. With these scaled values, we can determine the total information lost at each point $p$. Total information loss for each point $p$ is determined by:

$$B_p = \sum_{i \in C} \left( \max_{j \in J_p} \hat{K}_{p,i,j} - \min_{j \in J_p} \hat{K}_{p,i,j} \right)$$

where $B_p$ is the information lost at unique point $p$, $\max_{j \in J_p} \hat{K}_{p,i,j}$ is the maximum scaled value for objective $i$, and $\min_{j \in J_p} \hat{K}_{p,i,j}$ is the minimum scaled value for objective $i$. A $B_p$ value of 0 tells us that no tradeoff exists between grouped objectives at point $p$, while a larger value of $B_p$, up to a maximum of $|C|$, indicates that a wider range of tradeoffs between combined objectives is being neglected by combining them.
Finally, the average of all $B_p$ values is taken to obtain the total average information loss. This is a single value that describes the information lost by grouping the chosen set of objectives together. Its utility is in comparing different choices of grouping objectives together to determine which group results in a lower average information loss and a more valuable and informative Pareto frontier.
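The information loss calculation can be sketched on a small set of full-space Pareto points. The sketch assumes that a unique point p corresponds to one distinct combination of values of the objectives that are not combined, and it rounds those values to form grouping keys; the rounding tolerance, function name, and the three example points are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def average_information_loss(pareto_points: List[Sequence[float]],
                             combined: Sequence[int],
                             remaining: Sequence[int],
                             ndigits: int = 9) -> Tuple[float, Dict[tuple, float]]:
    """Average scaled range of the combined objectives at fixed remaining objectives."""
    # Bounds of each combined objective over the entire full-space Pareto set.
    lower = {i: min(pt[i] for pt in pareto_points) for i in combined}
    upper = {i: max(pt[i] for pt in pareto_points) for i in combined}

    # Group points whose remaining-objective values coincide.
    groups = defaultdict(list)
    for pt in pareto_points:
        key = tuple(round(pt[r], ndigits) for r in remaining)
        groups[key].append(pt)

    losses = {}
    for key, pts in groups.items():
        b_p = 0.0
        for i in combined:
            span = upper[i] - lower[i]
            if span == 0.0:
                continue  # objective is constant everywhere, so no tradeoff is hidden
            scaled = [(pt[i] - lower[i]) / span for pt in pts]
            b_p += max(scaled) - min(scaled)
        losses[key] = b_p
    return sum(losses.values()) / len(losses), losses


# Hypothetical Pareto points (objective 0, objective 1, objective 2).
points = [(0.0, 10.0, 1.0), (5.0, 5.0, 1.0), (10.0, 0.0, 2.0)]
avg_loss, per_point = average_information_loss(points, combined=(0, 1), remaining=(2,))
```

Here the two points sharing the same value of objective 2 hide a tradeoff between objectives 0 and 1, while the third point hides none, so the average sits between the two.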

| CASE STUDIES
The proposed algorithm will be demonstrated on three representative studies adapted from the sustainable process systems literature. Each of the studies has been chosen to represent MaOP formulations which include varying numbers of variables and constraints as well as integer and binary variables. These cases will demonstrate the ability of this method to identify and group objectives which are most strongly correlated in a range of problem formulations. All optimization and calculations were completed on an Intel i9-10900 CPU with 64 GB of RAM using CPLEX 20.1, 48 JuMP v0.21.10, 49 and Julia v1.5. 50

| UK energy mix
This case study is included as it is a relatively simple problem formulation that has only three objectives with a full three dimensional Pareto frontier presented in the original work, Limleamthong and Guillen-Gosalbez. 51 This work studies the energy sector in the UK, analyzing total energy mix including conventional sources and a variety of alternatives. Energy technologies including nuclear, wind, natural gas, coal, and biomass are studied. The objectives within the formulation of this study are cost, global warming potential, and worker injuries. Variables model the total electricity generated by each of the studied technologies. This leads to a 6-variable, 3-objective study with the only constraints being that the total generation is equivalent to the demand of the nation and bounds on the possible values for each variable. The full optimization problem formulation is:

$$\min_{G} \; \left\{ \sum_j c_j G_j, \;\; \sum_j w_j G_j, \;\; \sum_j i_j G_j \right\} \quad \text{s.t.} \quad \sum_j G_j = D, \quad G_j^{\min} \le G_j \le G_j^{\max} \;\; \forall j$$

where $G_j$ is the amount of energy generated using technology $j$, $c_j$ is the cost per kWh, $w_j$ is the global warming potential in kg CO2 equivalent per kWh, $i_j$ is worker injuries in number of injuries per kWh, $G_j^{\min}$ and $G_j^{\max}$ are the lower and upper bounds for energy generated with each technology, and $D$ is the total energy demand which must be satisfied.

FIGURE 3 Example variable-constraint-objective graph. In red, $f_1$ is a primary constraint link between objectives 1 and 2, as it contains $x_1$, which appears in both objectives. In blue, $f_2$ is a secondary constraint link between objectives 1 and 3, as it contains $x_2$ from objective 1 and $x_4$ from objective 3.

Pareto frontiers presented in the article and shown in Figure 4, with each objective rescaled to lie between 0 (the single objective minimum) and 1 (the worst case for each objective observed on the Pareto frontier), indicate that there is no strong agreement among the three objectives. However, our goal for this case study was to see if our algorithm could successfully identify the objective pair resulting in the least tradeoff information lost. After running this problem formulation through the proposed algorithm, the correlation strength weights shown in Figure 5 were found between each pair of objectives.
Running the Leiden algorithm to detect the best grouping into two communities trivially groups the two objectives with the largest edge weight, giving one community with only the cost objective and one community with both the global warming potential and worker injury objectives. The complete algorithm of determining edge weights and objective groupings takes 0.36 s for this case study.
Physically, the results suggest that there is some agreement between how much an energy technology emits and how many worker injuries it typically incurs. Looking at the input data, we can confirm this finding: the lowest emitting technology (nuclear) also has the second fewest worker injuries, while the highest emitting technology (coal) also has the highest worker injuries. Similarly, it makes sense that the second strongest pair has high correlation, as natural gas is the technology with both the lowest cost and the fewest worker injuries. However, the constraints limit the ability of natural gas to meet all of the demand, and the second lowest cost technology (coal) is the worst with worker injuries. Finally, it makes sense that cost and emissions are the most competing objectives, as the two lowest cost technologies … also matches results of using the principal component analysis method from Saxena et al., 29 which identified the emissions objective for removal, by, for example, grouping it with worker injuries.

| Sustainable ammonia supply chain in Minnesota
To demonstrate the application of the proposed algorithm to a MaOP with many variables and few objective functions, the optimization of the ammonia supply chain as presented in Palys et al. 9 will be used. Specifically, we adapt the formulation for modular production units by considering the additional objectives of carbon emissions and water usage, while separating capital and operating costs into their own objectives. For a full listing of the notation for this problem, we refer the reader to the original work. We review the problem formulation below, with the operating cost objective function given by: where ζ is the capital cost of constructed renewable plants, σ is the operating cost factor for renewable plants, $x_r$ is the installed renewable capacity at site $r$, $\tau_{i,j}$ is the transportation cost, and $y_{i,j}$ is the amount transported from site $i$ (renewable production, $r$, conventional production, $p$, distribution facility, $d$) to $j$ (distribution facility, $d$, consumption site, $f$). The capital cost for modular, renewable-powered production is given by the following: where $\rho_m$ is the cost of one module of size $m$, $z_{n,m}$ is a binary variable which is one when $n$ modules of size $m$ are built and zero otherwise, and $\gamma_m$ is the mass production factor. An emission model is also included as an objective and is drawn from a related optimization study. 20 Emissions of the supply chain are given by: where $\varepsilon_{i,j}$ are emissions resulting from transporting ammonia from $i$ to $j$, and $\eta_p$ are the emissions from conventional production of ammonia at site $p$. The problem is bounded by the following constraints.
Linearizing the capital cost objective yields the two constraints: where $w_{m,r}$ is the number of modules of size $m$ at renewable production site $r$. The production at each site is defined from this variable: where $\pi_m$ is the amount of ammonia produced by a module of size $m$.
The supply chain is required to meet the demand at each site: where $\delta_f$ is the ammonia demand at site $f$. At each conventional production site, there is an upper bound on the production capacity: where $\xi_p$ is the capacity at each production site. A mass balance is imposed on distribution facilities. Additionally, there is an upper bound placed on the amount of renewable ammonia production at each site which arises from available wind power: where $\xi_r$ is the maximum ammonia production at site $r$. An additional objective of water usage for each production site was modeled: where $\omega_i$ is the water consumed in ammonia production at site $i$. A simplifying assumption was made in the modeling of this objective that the only significant difference in water consumption among production methods was the stoichiometric amount of water needed for hydrogen production with either an electrochemical technology for renewable production or a fossil fuel-based technology. Additionally, capital cost and operating costs were split into separate objectives. The resulting problem formulation is a MILP with four objectives: capital cost, operating cost, carbon emissions, and water consumption. Implementing the proposed algorithm on the linear reformulation of this case study yields the objective correlation graph shown in Figure 6.
The algorithm identifies operating cost and carbon emissions objectives as the best pair to combine, with the total algorithm requiring 3.34 s to determine graph edge weights and best objective groupings. While this finding suggests combining an economic and an environmental objective, which traditionally is not done, we note that this grouping intuitively makes sense for this particular problem: when deciding to reduce carbon emissions by building new distributed wind-powered ammonia facilities, we are inherently reducing operating costs, as the feedstocks of onsite wind power, water, and air are essentially free, and transportation costs are significantly lower as these facilities are built closer to the farms which use the ammonia produced. As combining these objectives is not a traditional grouping, the information lost was compared with the conventional grouping that combines both economic objectives using the classical notion of net present value. Using a resolution of 40 steps along each dimension, for a total of 64,000 optimization iterations, the information lost by combining economic objectives was found to be an average of 0.0883. Following the combination suggested by community detection on the correlation strength graph, the average information lost was found to be 0.0159. This is an 82% decrease in the information lost on the Pareto frontier. As such, using the algorithm's identified groups in place of the conventional groups preserves additional information about the tradeoff between emissions and capital cost by grouping operating cost with emissions, with which it is correlated, rather than capital cost, with which it competes. This additional information can provide decision makers with a more complete view of tradeoff options when determining sustainability preferences and making a final design decision.
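The figures quoted above can be checked with a few lines of arithmetic. The sketch assumes that the 64,000 runs arise from gridding three of the four objectives at 40 levels each, which is consistent with the numbers reported but is a reading of the text rather than a stated fact; the variable names are hypothetical.

```python
steps_per_dimension = 40
gridded_objectives = 3  # assumption: four objectives, three of them gridded

# Total single objective runs for the full grid.
total_runs = steps_per_dimension ** gridded_objectives

loss_economic_grouping = 0.0883   # capital cost grouped with operating cost
loss_algorithm_grouping = 0.0159  # operating cost grouped with carbon emissions

# Relative decrease in average information loss.
relative_decrease = (loss_economic_grouping - loss_algorithm_grouping) / loss_economic_grouping
percent_decrease = round(100 * relative_decrease)
```

Both the run count and the reported percentage follow directly from the quoted values under this reading.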

| Energy technology selection optimization
The multiobjective energy storage system selection optimization study presented in Li et al. 52 where $z_{ij}$ is a binary variable representing the selection of technology $i$ in use case $j$, $TE_i$ is the number of deployments of technology $i$, $EC_{ij}$ is the levelized cost of storage, $RE_i$ is the combined environmental impact of each technology, $D_i$ is the discharge duration, and $P_i$ is the rated power. Practically, this problem can be simplified by calculating the values of the coefficients on the inequalities and eliminating any technology that is infeasible prior to optimization. This calculation and removal of infeasible technologies is performed prior to implementation of the proposed algorithm.
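The pre-filtering step described above can be sketched as a simple screen applied before optimization. This is a minimal illustration only: the `Technology` class, the function name, the direction of the inequalities (a technology is kept when its discharge duration and rated power meet or exceed the use-case requirement), and all numerical values are hypothetical and are not taken from Li et al.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Technology:
    name: str
    discharge_hours: float  # plays the role of D_i
    rated_power_mw: float   # plays the role of P_i


def feasible_technologies(techs, min_discharge_hours, min_power_mw):
    """Keep technologies that satisfy both (assumed) use-case requirements."""
    return [
        t for t in techs
        if t.discharge_hours >= min_discharge_hours
        and t.rated_power_mw >= min_power_mw
    ]


# Hypothetical technologies and requirements, for illustration only.
techs = [
    Technology("tech_a", discharge_hours=8.0, rated_power_mw=100.0),
    Technology("tech_b", discharge_hours=0.5, rated_power_mw=100.0),
    Technology("tech_c", discharge_hours=8.0, rated_power_mw=1.0),
]
kept = feasible_technologies(techs, min_discharge_hours=4.0, min_power_mw=10.0)
kept_names = [t.name for t in kept]
print(kept_names)
```

Screening in this way removes binary variables that could never be selected, so the algorithm only sees objectives evaluated over feasible choices.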
Within the original environmental objective, fossil-fuel depletion

Future work will attempt to extend this algorithm to nonlinear MaOPs, which pose additional challenges: single-objective optimal points are no longer guaranteed to lie on the boundary of the feasible space, and constraint normals and objective gradients are no longer constant. We also intend to apply this framework to distributed and decomposed optimization problems in order to identify correlation between subproblem objectives, which can be useful in building minimal communication architectures that still give convergence of the decomposition solution algorithm. Finally, we plan to examine more deeply the parametric sensitivity of objective correlations, which will be important for moving horizon scheduling and control problems that must be solved repeatedly with different initial conditions over time.

NOTATION Objective Reduction Algorithm
known coefficients, and $e$ is a known $N$-dimensional vector. The decision variables $x$ can be either continuous or integer. Without loss of generality, objective functions are all formulated as minimization problems. The only assumption made about the problem is that the inequality constraint matrix $A$ fully bounds the problem, such that no decision variables or objectives can feasibly diverge to $\pm\infty$; we note that for most problems of practical interest, this assumption is not restrictive. The algorithm takes the three coefficient matrices, $C = [c_1 | \cdots | c_I]^T$, $A$, and $D$, as input and identifies groups of objectives with expected correlating behaviors with respect to the optimization/decision variables. To do this, we propose a weighted objective correlation graph, whereby nodes correspond to different objectives in the MaOP and edges connecting the different objectives are weighted between 0 and 1, corresponding to strongly conflicting to strongly correlating objectives. The general weighted objective correlation graph structure and proposed weighting scheme are depicted in Figure 1. The challenge of constructing this graph is intelligently and systematically determining the edge weights, as once these are obtained, community detection can be used to identify groups of objectives such that objectives in the same group are strongly correlated, while those in different groups conflict with each other.
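To make the graph structure concrete, the sketch below stores a weighted objective correlation graph as a dictionary of edge weights and groups objectives joined by strongly correlating edges. Note that this uses a simple threshold with connected components as a stand-in, not the community detection method used in this work, and the objective names, the edge weights, and the 0.5 threshold are all hypothetical.

```python
def group_objectives(weights, threshold=0.5):
    """Group objectives joined by edges with weight >= threshold.

    Stand-in for community detection: connected components found
    with a small union-find structure.
    """
    nodes = sorted({n for edge in weights for n in edge})
    parent = {n: n for n in nodes}

    def find(n):
        # Follow parent links to the root, compressing the path as we go.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for (u, v), w in weights.items():
        if w >= threshold:
            parent[find(u)] = find(v)

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), []).append(n)
    return sorted(groups.values())


# Hypothetical edge weights in [0, 1]: 0 = strongly conflicting,
# 1 = strongly correlating.
weights = {
    ("f1", "f2"): 0.90,
    ("f1", "f3"): 0.20,
    ("f1", "f4"): 0.30,
    ("f2", "f3"): 0.10,
    ("f2", "f4"): 0.15,
    ("f3", "f4"): 0.80,
}
groups = group_objectives(weights)
print(groups)
```

A hard threshold ignores the relative strength of edges within a group, which is one reason a modularity-based community detection method is preferable in practice.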

Figure 2. For normal and projected components, the vectors are normalized to $\hat{c}^N_{ik}$ and $\hat{c}^P_{ik}$, respectively, such that they point in the same direction as the original vectors but are of length 1. In the case where there is no normal or projected component, the normalized vector is

frontier. A well-performing objective reduction will retain a larger proportion of full-space Pareto optimal solutions in its Pareto frontier, or equivalently, each point on the reduced space Pareto frontier will embed only a small amount of tradeoff information from grouped variables. Here, we propose an information loss metric to quantify this performance. First, a solution set that is a representative sample of the full-dimensional Pareto frontier is required. Two sets of objectives are identified: the set 𝒞 contains all objectives that are being combined, and the set ℛ contains all objectives that are kept as individuals. From the Pareto frontier solution set, unique solution values are recorded for the objectives in set ℛ; this gives a set $P$ of unique points that are each a combination of values for objectives in ℛ. At each unique point $p$, one can often find multiple different Pareto optimal solutions that vary the values of the objectives in 𝒞 that are grouped together, giving another set of points $J_p$.
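The bookkeeping described above can be sketched as follows. The code only rescales the combined objectives to the range 0 to 1 using their minimum and maximum over the sampled frontier, and collects the set $J_p$ for each unique point $p$; it does not reproduce the information loss formula itself. The function names and the three-objective sample are hypothetical.

```python
from collections import defaultdict


def rescale(points, combined):
    """Rescale the combined objectives to [0, 1] over the whole sample."""
    lo = {i: min(pt[i] for pt in points) for i in combined}
    hi = {i: max(pt[i] for pt in points) for i in combined}
    scaled = []
    for pt in points:
        new = list(pt)
        for i in combined:
            span = hi[i] - lo[i]
            # Assumption: a constant objective is mapped to 0.
            new[i] = (pt[i] - lo[i]) / span if span > 0 else 0.0
        scaled.append(tuple(new))
    return scaled


def group_by_retained(points, retained, combined):
    """Map each unique retained-objective point p to its set J_p."""
    groups = defaultdict(list)
    for pt in points:
        key = tuple(pt[i] for i in retained)
        groups[key].append(tuple(pt[i] for i in combined))
    return dict(groups)


# Hypothetical frontier sample; columns are objectives 0, 1, and 2.
pareto = [(1.0, 10.0, 5.0), (1.0, 20.0, 3.0), (2.0, 10.0, 1.0)]
scaled = rescale(pareto, combined=[1, 2])
J = group_by_retained(scaled, retained=[0], combined=[1, 2])
print(J)
```

Here objective 0 is retained and objectives 1 and 2 are combined, so the point with retained value 1.0 carries two full-space solutions whose tradeoff would be hidden after grouping.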
gies are the highest emitting, and vice versa. The results of the algorithm are aligned with the calculated average information lost for each of the possible pairs from the full space Pareto frontier. If cost and global warming potential are grouped, the average information loss is 0.736. Combining the cost and worker injury objectives results in an average information loss of 0.213. The algorithm's identified group of global warming potential and worker injury objectives results in an average information loss of 0.105. The results of our algorithm are also supported by Figure 4, where the red points are the parts of the full space Pareto frontier obtained for each of the three possible groupings of two objectives. It is evident that the selected grouping of global warming potential and worker injuries retains points that capture the largest range of values of the three objectives. Based on these results, the algorithm has successfully identified the best objective grouping if one were to be chosen. Furthermore, the identification of the worker injury-global warming potential grouping as best

FIGURE 4 Three-dimensional Pareto frontier for the UK energy case study. Points highlighted in red indicate solutions retained in the two-dimensional Pareto frontier when (left) cost and global warming potential, (center) cost and worker injuries, or (right) global warming potential and worker injuries are grouped together.

FIGURE 5 Objective correlation graph with correlation strengths as edge weights and identified communities in each color. Objectives are cost (EC), global warming potential (GW), and worker injuries (WI).
contains three objectives. The objectives are to minimize both the levelized cost of energy storage (EC) and an environmental impact metric, ReCiPe (RE), while the technology deployment number (TE) objective is maximized. The problem is formulated so that the feasibility of each technology for different use cases is determined by discharge duration and rated power parameters. These parameters are varied in order to explore which technologies are best for large-scale energy management, transmission and distribution support, customer energy management, and distributed energy systems applications. The full optimization formulation is:

(FD), particulate matter formation (PM), human toxicity (HT), and climate change impact (CC) are the individual factors that are added together. Combining these metrics gives an understanding of the overall behavior of the specific solution with respect to the environment, but separating them allows a more detailed understanding of how the different energy technologies have varied impacts on certain environmental metrics. In Oliveira et al., 53 the underlying data for these four metrics are presented for several of the studied technologies. Extracting these data and expanding the objective formulation of the optimization problem in Li et al. gives a six-objective optimization with the original technology and cost objectives and the four newly separated environmental measures. This case study neglects technologies whose ReCiPe component values were not given by Oliveira et al. but were assumed in Li et al., giving energy storage technology options of pumped hydro storage (PHS), compressed air energy storage (CAES), lead-acid batteries, sodium sulfur batteries (NaS), sodium nickel chloride batteries (NaNiCl), and lithium-ion batteries.

FIGURE 6 Objective correlation graph with correlation strengths as edge weights and identified communities in each color. Objectives are capital cost (CC), water use (WU), operating cost (OC), and carbon emissions (EM).

Two use cases from the original study are examined. The first use case is transmission and distribution support, use case A3 in the original work. Running the algorithm takes 0.38 s and gives the correlation strength weights and objective groups shown on the objective correlation graph in Figure 7. The groups that result from community detection are one with only the TE objective, one with the FD and CC objectives, and a final group containing the EC, PM, and HT objectives. This grouping indicates that combining all environmental objectives into one gives the misleading result that a single technology, pumped hydro, is superior in this
use case when instead there is an embedded tradeoff such that other technologies perform better for specific environmental metrics. Next, the case where the energy storage technology is used for customer energy management is explored; this is use case A8 in the original work. Figure 8 shows the weighted objective correlation graph and the communities identified by the algorithm, which takes 0.41 s to run. The groups that result from community detection are one with only the TE objective, one with only the CC objective, and a final group containing the EC, PM, FD, and HT objectives. Aside from the correlation strengths that link the technology objective to any other, the lowest correlation strength is 0.68. This indicates that the CC objective could reasonably be switched with the economic objective, returning the original RE objective. Since only a small selection of technologies is available for each of these use cases, the entire Pareto frontier can be examined to understand the resulting objective groups and demonstrate the physical insights generated by our algorithm. Figure 9 shows the optimal technology choices using the objective groups determined by the algorithm, each individual objective, and the original objective functions for both use cases. In the transmission and distribution support case, the original objectives all result in PHS as the optimal choice. Looking at the individual objectives shows that in order to minimize both the FD and CC objectives, NaNiCl is chosen. The algorithm's resulting objective groups capture this behavior by combining the EC objective with the PM and HT objectives, leaving FD and CC in their own combined objective. The CAES technology also gives optimal performance for the individual HT objective; however, this technology is only weakly Pareto optimal as it is dominated by PHS, which performs equally well in HT and better in the other objectives. Overall, it is apparent that our algorithm is able to preserve tradeoff information that is lost
when all environmental objectives are grouped together using the ReCiPe formulation, and gives results that make physical sense based on the environmental impacts of the various technologies considered. In the customer energy management use case, the original EC and TE objectives result in the choice of PHS, while the RE objective results in choosing NaS. Examining the individual objectives, it is apparent that the NaS technology optimizes all four environmental metrics, but FD, PM, and HT perform equally well when choosing PHS. HT is also optimal using CAES. The algorithm assigned FD, PM, and HT to a group with the EC objective. The overlap in technologies optimizing many objectives explains the high correlation strengths in the objective correlation graph for this use case, excluding the technology objective.

FIGURE 7 Objective correlation graph with correlation strengths as edge weights and colors representing detected communities for the transmission and distribution support use case.

FIGURE 8 Objective correlation graph with correlation strengths as edge weights and colors representing detected communities for the customer energy management use case.

| CONCLUSIONS AND FUTURE WORK
In this work, we presented a novel algorithm for reducing the dimensionality of linear MaOPs a priori to generating points which are Pareto-optimal in the original full space problem. This was achieved on the basis of a weighted objective correlation graph, where weights were determined based on the overlap of cost vector projections on constraint surfaces. From this graph, community detection using the Leiden algorithm was performed to identify two to three groups of objectives whereby objectives in the same group are correlated, while objectives in different groups are competing. The algorithm provides a method to reduce an intractable, uninterpretable high dimensional MaOP to a tractable two or three objective problem that generates Pareto frontiers which are interpretable and retain as much information
as possible from the original full-space problem. Through the analysis of three representative case studies relevant to sustainable chemical and energy production, we demonstrated the efficacy of our algorithm in systematically generating objective groups which preserve more information about tradeoffs than "intuitive" groupings of all economic or environmental objectives. As such, decision makers analyzing the Pareto solutions will be able to make more informed decisions based on full knowledge of the scope of tradeoffs among various sustainability outcomes.

$A$: coefficient matrix for inequality constraints
$a_k$: the $k$th row of matrix $A$
$b$: vector of upper bounds for inequality constraints
$C$: matrix of all objective cost vectors
$c_i$: cost vector of the $i$th objective
$c^N_{ik}$: component of cost vector $c_i$ normal to the constraint surface defined by $a_k$
$c^P_{ik}$: component of cost vector $c_i$ along the plane of the constraint surface defined by $a_k$
$\hat{c}^N_{ik}$: component of cost vector $c_i$ normal to the constraint surface defined by $a_k$, normalized to length 1
$\hat{c}^P_{ik}$: component of cost vector $c_i$ along the plane of the constraint surface defined by $a_k$, normalized to length 1
$D$: coefficient matrix for equality constraints
$d_k$: the $k$th row of matrix $D$
$e$: vector of constants for the right hand side of equality constraints
$S_{ijk}$: strength of correlation between objectives $i$ and $j$ along constraint surface $k$
$S^A_{ij}$: total correlation strength between objectives $i$ and $j$
$W_{ijk}$: weight of the contribution of $S_{ijk}$ for calculating total correlation strength
𝒞: set of all objectives combined into a new grouped objective
$J_p$: set of Pareto optimal points found when keeping the objective values for objectives in ℛ constant at the point $p$ value
$K_p$: matrix of objective values at point $p$
$K_{pij}$: value of objective $i$ at point $p$ in the reduced space Pareto frontier and point $j$ in the full space Pareto frontier with constant objective values in ℛ
$\hat{K}_{pij}$: objective value $K_{pij}$ rescaled to be between 0 and 1
$l_i$: minimum value of objective $i$ in the entire full space Pareto frontier
$P$: set of points on the reduced space Pareto frontier
ℛ: set of all objectives retained as individual objectives
$u_i$: maximum value of objective $i$ in the entire full space Pareto frontier

UK Energy Mix Case Study
$c_j$: cost of generating energy by technology $j$
$D$: total energy demand
$G_j$: amount of energy generated by technology $j$
$G^{min}_j$: minimum amount of energy generated by technology $j$
$G^{max}_j$: maximum amount of energy generated by technology $j$
$i_j$: worker injuries caused by technology $j$
$J$: set of energy technologies
$w_j$: global warming potential of emissions from technology $j$

FIGURE 9 Results of many-objective energy technology selection. Technologies that optimize a combined or individual objective are highlighted, and those deemed infeasible due to inadequate discharge duration and rated power are grayed out. Lighter blue highlighting indicates that the solution is weakly Pareto optimal. Individual objectives listed multiple times are optimal for multiple technologies. In both cases, O1, O2, and O3 represent the green, orange, and pink objective groups, respectively, on each objective correlation graph.
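The normal and projected cost vector components listed in the notation can be illustrated with a standard orthogonal decomposition of a cost vector relative to a constraint row. This is a minimal sketch under stated assumptions: the function names and example vectors are hypothetical, and returning a zero vector when a component vanishes is an assumption made here for illustration.

```python
import math


def decompose(c, a):
    """Split c into a part normal to the surface with normal a (nonzero)
    and a part lying along that surface."""
    scale = sum(ci * ai for ci, ai in zip(c, a)) / sum(ai * ai for ai in a)
    normal = [scale * ai for ai in a]
    projected = [ci - ni for ci, ni in zip(c, normal)]
    return normal, projected


def normalize(v, tol=1e-12):
    """Scale v to length 1.

    Assumption: a vector of (near) zero length is returned as zeros.
    """
    length = math.sqrt(sum(x * x for x in v))
    if length < tol:
        return [0.0] * len(v)
    return [x / length for x in v]


# Hypothetical cost vector and constraint row.
c = [3.0, 4.0]
a = [1.0, 0.0]
c_normal, c_projected = decompose(c, a)
c_normal_hat = normalize(c_normal)
c_projected_hat = normalize(c_projected)
print(c_normal_hat, c_projected_hat)
```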