A method for designing minimum‐cost multisource multisink network layouts

Systems engineers are equipped to design complex networked systems such as infrastructures. A key goal is cost minimization over a vast solution space. However, finding a minimum‐cost system while comprehensively satisfying different stakeholders is challenging and lacks proper methodological support. Stakeholders often employ their own expert estimations for lack of suitable decision‐support methods. In these settings, systems engineers typically require mid‐fidelity, easy‐to‐use methods. We present a rigorous method that quickly finds minimum‐cost solutions for networks with multiple sources and sinks, focusing on pipeline topology, length, and capacity. It can serve as a discussion tool in multiactor design processes, to demarcate the design space, indicate sources of uncertainty, and provoke further analyses, different designs, or contractual negotiations. It is applicable to a wide variety of cases, including many prominent infrastructures needed to mitigate CO₂. We prove that the optimal layout is a minimum‐cost Gilbert tree, and develop a heuristic based on the Gilbert‐Melzak method. We demonstrate the method's efficacy for a case set regarding solution quality, computational time, and scalability. We also show its efficiency and usefulness for systems engineers in real‐world settings. Systems engineers can use the generated cost‐optimal system designs to benchmark any design changes in real‐world negotiation processes.


INTRODUCTION
Networked infrastructures such as roads, telecom, gas, and water pipelines or power grids provide essential utilities and services to society. Common characteristics of such infrastructures include high initial capital costs, generally long lifetimes, and, as a consequence, irreversibility once construction has finished. 1
• Biogas, produced on farms in rural areas, can be used to partially replace natural gas. The volume of biogas produced often surpasses the demand on the farm. Farms could transport the biogas to a network that connects various farms. Such a network is a determining cost factor. Therefore, farmers, or their intermediaries, search for a cost-minimal architecture or topology of the biogas network to reduce the cost of biogas distribution. This case is further explored in Section 4.4 of this paper.
• A prominent option for meeting Europe's CO₂ targets is to capture CO₂ at various large-capacity point sources such as power plants or steel mills, and transport this CO₂ to subsurface or subsea storage facilities such as depleted gas fields. Realizing carbon capture and storage (CCS) is crucial, as it is the largest individual measure in terms of CO₂ reduction until 2030 in the Netherlands (as agreed upon in the Dutch government agreement, being 18 Mton out of 56 Mton 2 ). However, there is still a huge gap between what is expected and what has been realized. 3 To date, no CO₂ network exists that connects sources and sinks over a wide geographical spread. Such network investments need to be borne by different actors, and it is not yet known which actors. Moreover, such investments have a strong public-private characteristic, given the societal value of CO₂ reduction. 4 It is therefore of crucial importance to find a cost-minimal solution for the design of such a network.
• Many large-scale wind farms will be developed in the North Sea by Germany, the UK, Norway, Denmark, Sweden, and the Netherlands in the next decades. Research 5,6 has shown that, depending on the technologies chosen and the governance framework that is put into place, different topologies of the network will emerge. The theoretical minimum-cost network would give the actors a sense of how far away they are from this minimum.
The systems engineering discipline has traditionally developed methods to minimize the overall cost of complicated engineering systems (eg, airplanes) and has also yielded important methods and tools for developing and scanning the trade space, for multiattribute decision-making, and for dealing with complexity. 7 In the past decade, attention has broadened from the design of highly complicated systems to the design of complex sociotechnical systems. The design of large-scale sociotechnical systems, such as the infrastructure systems presented above, is a multiactor process, 8 pulling systems engineering into the social science domain. This means that new disciplines and approaches are needed to explore and solve those problems: approaches that can deal with the sheer size of the problem in a multiactor context. 9 Infrastructure networks have a large number of degrees of freedom, 10 which makes it hard to intuitively compare the cost-effectiveness of alternatives. This paper develops a method that supports a systems engineer by quickly finding least-cost network topologies. The method is particularly useful in a multiactor process, since it enables a quick analysis of low-cost networks while real-world alternatives are discussed among stakeholders at the negotiation table.
Garber et al 11 recently developed a framework to capture such decision-making by diverse stakeholders. Our infrastructure network design problem is, in their terminology, a cooperation game, especially in those cases where network externalities increase when more sources and sinks are connected. Scholars argue that the decision process needs rigorous models to estimate values for alternative designs, to explore the trade space, to find theoretical minimum-cost targets, and to find the main value drivers. 12 This requires a deterministic, unbiased, and traceable calculation of those values. Supplying the stakeholders with an agreed-upon, efficient (fast) calculation process will allow them to execute several sensitivity analyses to explore how their stakes change with changes in assumptions or parameters. Such an approach might go against some developments in the systems engineering discipline, where more and more details from social processes and values are being incorporated into increasingly opaque models.
Our proposed approach, using mid to low fidelity, relatively simple models, has shown to contribute to trust in the negotiation process and therefore enhances the possibility of a reasonable outcome of the systems engineering process. 8 Finding adequate models and processes that produce agreed upon values for the huge trade spaces in the sociotechnical systems of systems that we are dealing with is therefore one of the key challenges in systems engineering.
Systems engineering processes are inherently combinatorially complicated, given the many design variables and their combinations into solutions. One important design variable in the design of networked systems is the topology of the network. However, the topology of networked systems under uncertainty and in the context of many actors has received relatively little attention in the systems engineering field. This is due, on the one hand, to the fact that in many design cases the network topology was more or less fixed (for example, by street patterns used to lay out city infrastructure) and, on the other hand, to the fact that topological design is a mathematically complex problem in itself. For infrastructure developments like the ones we introduced earlier, the location of splitting nodes and junctions is still undecided, and capacities of links are often uncertain, rendering the topology challenge even more daunting.
For our examples (biogas, CO₂, and grids at sea), such decision-making is inherently multiactor. Examples of intrinsic uncertainties are the capacity of sources and sinks (how much biogas will be produced, how much CO₂ can the aquifer store, how many wind farms will be built and connected), the cost of right of way (licensing, buying out of property), and the willingness of actors to participate (veto rights, political uncertainties, societal acceptance in the CO₂ case). Exploring these uncertainties in the multiactor design process will help to demarcate the design space and to pinpoint the main areas of uncertainty, thus provoking further sensitivity analyses or different designs, or moving actors into new contractual negotiations. The sensitive and dynamic nature of such design and negotiation processes requires methods that are able to quickly but accurately scan the trade space, so that different solutions can be explored collectively by the stakeholders, for example, by experimenting with parameter values or with assumptions.
This paper describes such an efficient method to assess the relative cost of different network topologies. Our method is to be used by systems engineers and is especially suitable to explore the trade space in an objective manner in a multiactor setting. The method enables systems engineers to explore the range of most cost-effective physical networks for those cases where topology and capacity are still uncertain and encompass large trade spaces. For the scope of this paper, we disregard control and institutional layers in our algorithm, as well as those laws of nature that may govern the more detailed (multiphase) physical flow within the pipes of those networks, even though they may be mildly dependent on the network topology. We generalize away from those details to develop and test different algorithms, while assuming that the required capacity per producer or consumer can be determined.
It is important to emphasize that we focus on developing an adequate, rational, and straightforward tool that is to be used in social settings of negotiating actors. Following Refs. 13 and 14, we explicitly and intentionally chose not to include such social factors in our method, as this more often than not leads to unrealistic models using simple weight factors for different actors' goals, ultimately leading to opaque model outcomes and reduced trust in those outcomes by actors. By excluding many of those socioeconomic speculations and assumptions from our method, we trust in the shared use of the method in a real-life multiactor setting to account for many of such social values.
Our method determines a network layout that minimizes the initial investment costs that depend on both the length and the capacity of the pipelines, satisfying the demands of the consumers, for multiple sources (suppliers) and multiple sinks (consumers). For ease-of-reading, we will use the term pipelines in this paper to denote various infrastructure connections. We will show in this paper how our algorithm is able to find optimal network topologies, or trees, by explaining how we used the theory and how this leads to feasible and cost-minimum designs in reasonable computing time. The latter performance criterion of our approach is important, given that the algorithm needs to perform in a context where a systems engineer would want to redesign and test solutions as quickly as possible. Also, since we target real-world multiactor settings, table top drawing sessions, or serious-gaming sessions, computational performance of our algorithm is of crucial importance.
The next section will start with a literature overview of energy network design methods. We will root our approach in operations research, this being an important contributor to many systems engineering optimization problems. After that section, we will formulate the mathematical design problem and discuss our assumptions, the latter being of importance to the users of the approach. We go into detail in explaining the underpinnings of our approach and algorithm.
Detailed descriptions can be found in the Appendices and are relevant for users who want to understand the algorithm. Also, it allows systems engineers to use the optimal topologies in discussion with other stakeholders and adjust parameters in the algorithm during these debates.
We develop our solution method, and we will show its efficacy and efficiency in Section 4 of this paper. Finally, we test the usability and usefulness of the method on a number of case studies executed by groups of systems engineering students who were challenged with a network design problem under uncertainty in a multiactor setting.

MODELS AND METHODS FOR NETWORK SYSTEM DESIGN
In the previous section, we established the need, from a systems engineering perspective, for an accurate yet quick optimization method for network topologies for infrastructures. We therefore conducted a thorough review of recent literature from the operations research and systems engineering fields. This revealed operations research papers on the design of new energy networks, like CO₂ networks 15-21 or heating networks, 22-24 but also new networks to transport hydrogen, 25 animal waste, 26 water, 27-31 or oil. 32,33 Table A2 shows a short summary of the optimization problems discussed in the papers 15-33 and the methods used to solve these problems. However, none of these papers incorporates the system design variables in the way that is core to our paper: exploring system options that address the development of a network with uncertain sources and sinks, with an undetermined topology, and executed in a multiactor setting. The papers do contribute various interesting (parts of) answers on how to model particular design variables and constraints that span our trade space, and on how to solve the design problems. We discuss them below.

Design variable: system topology
Most authors design the network layout by minimizing investment costs, although some take into account the operational costs on the longer term as well. Only Bietresato et al, 26 Ivić et al, 31 Liu et al, 33 and Steele et al 30  Other approaches that look for minimum-length networks do allow extra splitting points to make the network shorter. When these newly introduced nodes can be located on optimal positions, they are called Steiner points as they were first defined by Jacob Steiner. The minimum length tree is then called a Steiner minimal tree (SMT). 36 For a general number of nodes, the ratio between the length of the SMT and the length of the MST has a lower theoretical bound equal to 0.82. 37 It therefore makes sense to find the Steiner tree when looking for a mini-

Design variable: link capacity
However, in general, the investment costs do not only depend on the length of the new pipelines, but also increase when larger pipeline capacities are needed. The extra capacity costs can be taken into account by using a cost function that depends on both the length and the capacity. This capacity cost function is often formulated as a concave function f(q), with q the capacity of the pipeline. 15,17,25,36,38-40 Some, however, use only pipelines of specific discrete sizes and adapt their cost function accordingly. 27,28,30,32,41,42 Zhang and Zhu 41 first determine the pipeline capacity on a continuous scale, after which they round it to one of the available sizes, in order to allow the use of a nonlinear programming (NLP) solver.
In this paper, we use the continuous concave function 40 $f(q) = q^{\beta}$, $0 \le \beta \le 1$. If $\beta = 0$, the capacity does not influence the investment costs. In that case, the problem translates to finding an SMT. If $\beta = 1$, there is no cost reduction (ie, economy of scale) for combined pipelines. The function 17 for costs of pipelines in the CO₂ network corresponds to a $\beta$-value of around 0.6. The same holds for the cost function 25 for a hydrogen network. The discrete cost table for the pipelines in a water network 28 is best fit with $\beta$ around 0.7. In most practical cases, $\beta$ will probably be somewhere around 0.6.
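To make the role of the capacity cost exponent concrete, the following minimal sketch evaluates the per-unit-length cost of one shared pipeline against two parallel ones. The function names, the symbol name `beta` for the exponent, and the treatment of zero capacity are illustrative assumptions, not part of the method's published implementation.

```python
def capacity_cost(q: float, beta: float) -> float:
    """Cost per unit length of a pipeline with capacity q, using f(q) = q**beta."""
    if q < 0 or not 0.0 <= beta <= 1.0:
        raise ValueError("require q >= 0 and 0 <= beta <= 1")
    if q == 0:
        return 0.0  # assumption for this sketch: no pipeline is needed for zero capacity
    return q ** beta


def merge_saving(q1: float, q2: float, beta: float) -> float:
    """Relative saving per unit length of one shared pipe versus two parallel pipes."""
    separate = capacity_cost(q1, beta) + capacity_cost(q2, beta)
    combined = capacity_cost(q1 + q2, beta)
    return 1.0 - combined / separate


if __name__ == "__main__":
    for beta in (0.0, 0.6, 1.0):
        print(beta, round(merge_saving(1.0, 1.0, beta), 3))
```

With two equal flows, the saving from combining pipelines is largest when the exponent is 0 and vanishes when it is 1, which matches the two limiting cases described above.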
If the capacity does not influence the investment cost, then there is no difference in minimum-cost network between a multisource multisink and one-source multisink case. Although demand or supply requirements will also in that case determine the needed capacity of the pipelines, these capacities have no influence on the final investment costs. If capacity does play a role in the investment cost, then this will also influence the optimal network topology. The models in the literature do not endogenously take into account this effect of capacity increases or decreases of links as a result of adding new nodes.

Design variable: node function
The demand and supply functions can be formulated in different ways.
Thomas and Weng 40 and Trietsch 38 determine in advance, for each pair of network nodes, the flow between these nodes that the network should be able to transport. They do not explicitly distinguish between the roles of the nodes as sources or sinks. Because of this, the minimum-cost network might no longer be a tree (or forest) but could also contain cycles. These networks are called Gilbert networks (GNs) 40 or G-Steiner trees, 38 as they were first mentioned by Gilbert 36 and Gilbert and Pollak. 43 In our paper, we assume an explicit division between sources and sinks, but we do not require flows between specified pairs of nodes. We only require that the network should be able to satisfy the minimum demand from sinks and/or the maximum supply from sources. This requirement is often used in multisource multisink networks to determine the optimal flow through an existing network, 44,45 but as far as we are aware, it has not been used to design a minimum-cost network. We assume that this is a realistic requirement for new infrastructures and therefore include it in our system problem formulation.

Design constraint: operations and regulation
In many infrastructures, the network and its operation are regulated.
The supply or demand itself is often not regulated, or only partly. For example, in the case of CO₂ networks, regulation may be applied to the price setting of the sinks and of the network, as these will be monopolies or oligopolies at best. Offshore grids and biogas networks are regulated for the same reason: they constitute a monopoly. There might also be requirements on minimum quantities to be delivered when, for example, biogas producers enter into a contract with gas distributors and resellers. Large wind farms would typically contract out a certain capacity to the market, and any under- or overestimations of that power (due to wind fluctuations) would be bought or sold on the spot market or other reserve markets. Existing methods in the literature have practically disregarded this set of constraints for large networked system designs. In developing our algorithm and model for solving the network topology problem, we have ensured that such constraints can be implemented via the (un)certainty profiles of the sources and sinks.

Solving approaches
In general, there are three main approaches to solve networked system design problems. The first one uses heuristics and algorithms from graph theory and geometry, 17,25,26,28,39,40,42 like the Gilbert-Melzak method, 46-48 to find the optimal topology and the optimal location of Steiner points (added nodes to minimize path length).
The second approach formulates the problem as a Mixed Integer (Non-)Linear Program (MI(N)LP). 15,20-22,30,32,41,45,49 These authors find (sub)optimal networks using existing solver tools like modeling system for mathematical programming and optimization. 20 In this paper, we formulate the multisource multisink network design question as: How to find a minimum-cost network topology that connects sources and sinks with sufficient network capacity while guaranteeing that all demand of the sinks can be delivered by the supply of the sources?
Our algorithm will follow the geometric graph theoretical approach, as this is computationally fast, and is comprehensible for actors and systems engineers in infrastructure design settings. After we have explained our approach in more detail in the next chapter, we discuss a few related approaches from the literature in Section 3.4.4.

NETWORK DESIGN ALGORITHMS
For the design of an energy network with uncertain sources and sinks, in line with the literature explored above, we need to develop a cost-efficient network connecting sources and sinks that allows the introduction of new splitting or connecting nodes.

General approach
We describe our general approach on the basis of an example for the development of a biogas infrastructure. In many rural areas, farmers operate manure and organic waste digesters to turn the waste into biogas. This biogas is partly reused on the farm, but the overproduction can be fed back into the main gas grid. However, in order to reach this main gas grid, the farmers need to cooperate and build a cost-efficient network that connects their multiple sources (the digesters) to one sink (the main gas network). In recent real-world cases, the designer of such a network would make an inventory of the sources and their locations, and would then develop a topology for a shared network or grid, based upon expert estimations of the most efficient layouts. However, adding additional collection points for the biogas before transporting the gas further downstream to the main gas pipeline is often disregarded. When the topology is constructed by such qualitative approaches and negotiations among farmers, adding such collection points would make the design space too large to handle manually. However, such additional collection points would ultimately lead to a more cost-effective network, as the total length of the pipelines would typically decrease.
FIGURE 1 Example with two sources (red, supply per source: 11) and 11 sinks (blue, demand per sink: 2), respectively, (A) with no pipelines, (B) minimum spanning tree, (C) $\beta = 0$, (D) $\beta = 0.5$, and (E) $\beta = 0.99$
In our approach, we use graph theory to construct a GN: 40 this will be the shortest length network (pipelines to be laid out by farmers), by allowing the introduction of new nodes (new biogas collection points). The following elements are taken into account in our algorithm to develop minimum-cost networks:
• A GN is a network $G$ that connects a given set of terminals, satisfying given flow demands $q_{[i,j]}$ from node $i$ to node $j$, and has a cost function.
• We include $\beta$ as a design variable, which represents the capacity cost exponent, a proxy for economies of scale in the investment cost of a pipeline.
• GNs or trees can contain Steiner points; these are the "splitting/collection points" that would be added to create a more cost-effective network.
• The size of the flows among the different nodes is included in the design space.
This is a very challenging set of design variables and constraints for minimizing the cost of the network, for which we develop an effective and efficient solution algorithm in the remainder of this section. Figure 1 shows how such networks may look, given this algorithm. Figure 1A shows an example with two sources that can both supply 11 m³/s and 11 sinks that all need 2 m³/s. Figure 1B shows the MST to connect both sources with all sinks. The thickness of the network pipelines is relative to the required capacity of the pipelines in order to satisfy all demand, although extra cost for capacity is not taken into account in determining this tree ($\beta = 0$). Figure 1C shows the network when extra points can be included to reduce the total length of the network.
Capacity costs are still not incorporated by setting $\beta = 0$. It is clear that the network length, and thereby the investment costs, can be reduced significantly by adding these extra splitting points. Figure 1D shows the result for a medium influence of the capacity on the investment costs ($\beta = 0.5$). The effect of incorporating these costs is that higher capacity, ie, more expensive, pipelines are now shorter in favor of lower capacity, ie, cheaper, ones. This effect is even more evident when $\beta$ is increased to 0.99 in Figure 1E. In that case, it is hardly profitable to combine pipelines going to different sinks, and most pipes connect a source with a sink in the shortest way. The following sections explain how the algorithm combines our design variables to span our trade space and how it then finds the cost-minimal solution in this space.

Objective: finding a minimum-cost Gilbert tree
Crucial for this design problem is that we include the volume of the flows among the different nodes in the design space, and this becomes part of our design variables; as we explained in the previous section, this is crucial, but has not been done yet in the literature. Because we include this design variable into our problem, our problem translates mathematically to finding a so-called minimum-cost Gilbert tree (MCGT).
Our algorithm intends to find this particular tree. We first explain this challenge.
A GN 40 is a network $G$, not necessarily a tree, which connects a given set of terminals, satisfying given flow demands $q_{[i,j]}$ from node $i$ to node $j$ and with a cost function
$$C(G) = \sum_{e \in E(G)} l_e \, f(q_e) \qquad (1)$$
in which $E(G)$ is the set of all edges in $G$, $l_e$ is the length of edge $e$, $q_e$ is the capacity of $e$, and $f(q)$ is a nonnegative, nondecreasing, triangular function of the capacity $q$. The network can contain Steiner points. If the network has a tree (or forest) topology, we will call it a Gilbert tree (GT). The minimum-cost GN (MCGN) is the network among all possible GNs with total minimum costs.
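As an illustration of this cost function, the sketch below sums edge length times capacity cost over all edges of a given network, assuming the power-law capacity cost used in this paper. The data layout (a dictionary of node coordinates and a list of edge triples), the function name, and the parameter name `beta` for the capacity cost exponent are hypothetical choices made for this example only.

```python
import math


def network_cost(nodes, edges, beta):
    """Total investment cost: sum over edges of length * capacity**beta.

    nodes: dict mapping node name -> (x, y) coordinates
    edges: list of (u, v, capacity) triples
    beta:  capacity cost exponent, 0 <= beta <= 1
    """
    total = 0.0
    for u, v, capacity in edges:
        length = math.dist(nodes[u], nodes[v])  # Euclidean edge length
        total += length * capacity ** beta
    return total


if __name__ == "__main__":
    # One source "s" feeding two sinks "a" and "b" through direct pipelines.
    nodes = {"s": (0.0, 0.0), "a": (3.0, 4.0), "b": (3.0, -4.0)}
    edges = [("s", "a", 2.0), ("s", "b", 2.0)]
    print(network_cost(nodes, edges, beta=0.0))  # length only
    print(network_cost(nodes, edges, beta=1.0))  # length times capacity
```

Evaluating the same layout for several exponent values gives a quick sense of how strongly capacity drives the total cost of a candidate design.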
Our problem is a generalization of the MCGN problem described by Thomas and Weng. 40 We use the continuous concave function $f(q) = q^{\beta}$, $0 \le \beta \le 1$; however, where they set in advance the required flows between each pair of nodes, we only set the demand and supply of the nodes, and leave the decision about which source will supply which sink as part of the decision problem. If we make this decision upfront, our problem directly transforms into theirs, see Figure 2.
Thomas and Weng 40 make no difference between sources and sinks, but they only require a network with sufficient capacity to supply the flow requirements between each pair of nodes. In that case, for higher capacity cost exponents (near 1), it might be profitable to build a complete network with a connection between each node pair.
Moreover, their MCGN might contain cycles. In our problem, however, capacity is only needed from sources to sinks. Given that, a network that minimizes the investment costs (1) will always have a tree topology, as stated in Theorem 1. do not know about any efficient algorithm for solving this problem.
They suggest to use the Gilbert-Melzak method 36 for finding a suboptimal solution together with a global optimization technique.

Improving networks: the adapted Gilbert-Melzak method
Our approach will scan the solution space by using different promising STs as a starting point. We apply our method, the adapted Gilbert- In Heijnen et al, 52 we gave a short explanation of the Gilbert-Melzak method, as we used it for one-source multisink networks. In this paper, we will adapt the method to use it for the multisource multisink case.
For the illustration here, we start from an initial noncrossing ST that connects all terminals and had the required capacities as weights on the edges. This initial tree was then used to search for improvements.
The same approach is used for multisource multisink networks as long as the initial ST satisfies the constraints (B.2)-(B.4) defined in B.1. Each change in the tree needs to preserve these constraints.
In each step, the current tree (or forest) is locally changed as long as a network with lower total cost is found. The improvements are based on the observation that it is profitable to partly join two adjacent connections if their interior angle is small, ie, smaller than the angle constraint. 40 This angle constraint depends on the value of $\beta$. See, for example, the MST in Figure 1B. Almost all angles are small, indicating that when extra costs for capacity are not so high, ie, $\beta$ is near 0 (Figure 1C), investment costs can be reduced by combining pipelines. This will not be the case if $\beta$ is high (Figure 1E).
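The dependence of the angle threshold on the capacity cost exponent can be illustrated with the standard force-balance condition for a weighted Steiner point, where each edge pulls with a weight equal to its cost per unit length. The sketch below assumes this textbook condition corresponds to the angle constraint of ref. 40; the function name and the parameter name `beta` are hypothetical.

```python
import math


def optimal_split_angle(q1: float, q2: float, beta: float) -> float:
    """Angle (degrees) between two branches with flows q1 and q2 at an optimally
    placed Steiner point, where they merge into one edge carrying q1 + q2.

    Assumes edge weights w = q**beta and the force-balance condition
    cos(theta) = (w3**2 - w1**2 - w2**2) / (2 * w1 * w2).
    """
    if q1 <= 0 or q2 <= 0:
        raise ValueError("flows must be positive")
    w1, w2, w3 = q1 ** beta, q2 ** beta, (q1 + q2) ** beta
    cos_theta = (w3 ** 2 - w1 ** 2 - w2 ** 2) / (2.0 * w1 * w2)
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding
    return math.degrees(math.acos(cos_theta))


if __name__ == "__main__":
    for beta in (0.0, 0.5, 1.0):
        print(beta, optimal_split_angle(1.0, 1.0, beta))
```

Under this assumption, two adjacent edges meeting at an angle smaller than the returned value are candidates for being partly joined. For equal flows the threshold shrinks from 120 degrees when the exponent is 0 to 0 degrees when it is 1, in line with the behavior described for Figures 1C and 1E.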
If the angle satisfies this constraint, adding a splitting point, ie, Steiner point, S will lower the total costs of the network. Using a geometric approach, the optimal location (see  The original procedure of the one-source multisink network needs some adaptations to be applicable on a multisource multisink network.
The main one is the fact that the direction of an edge can change if a sink, after a local network change, is supplied by a different source.
If capacity needs in the same edge have opposite directions, they are subtracted and only the net capacity is assigned. The mathematical specification of the adapted procedure to assign capacity to the edges is explained in 0.4. Figure 4 shows the working of this procedure with a small example of four sources and three sinks. The supply of the sources and the demand of the sinks are denoted at the upper left of the nodes. To start off the procedure, the supply is defined as a negative value (Figure 4A). The procedure starts by selecting the leaves of the tree and adding the required capacity to the incident leaf edge (Figure 4B).
These edges are removed and the supply and demand of the new leaf nodes are updated with the capacity of the removed edges ( Figure 4C).
The procedure is repeated until all edge capacities are set ( Figure 4F).
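The leaf-stripping procedure described above can be sketched as follows. This is a minimal illustration under the assumptions that the network is a single tree and that total supply equals total demand; the function name, the data layout, and the example values are hypothetical and do not reproduce the example of Figure 4.

```python
from collections import defaultdict


def assign_capacities(edges, balance):
    """Assign net capacities to the edges of a tree by repeatedly removing leaves.

    edges:   list of (u, v) pairs forming a tree
    balance: dict node -> demand (positive) or supply (negative);
             nodes not listed (eg, Steiner points) are treated as 0
    Returns a dict {(tail, head): capacity} oriented along the net flow.
    """
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    residual = {node: balance.get(node, 0.0) for node in neighbours}
    flows = {}
    leaves = [node for node in neighbours if len(neighbours[node]) == 1]
    while leaves:
        leaf = leaves.pop()
        if not neighbours[leaf]:
            continue  # last remaining node, no edge left to process
        (parent,) = neighbours[leaf]
        need = residual[leaf]  # > 0: flow must enter the leaf; < 0: flow leaves it
        if need >= 0:
            flows[(parent, leaf)] = need
        else:
            flows[(leaf, parent)] = -need
        residual[parent] += need  # opposite needs cancel, leaving the net value
        neighbours[parent].discard(leaf)
        neighbours[leaf].clear()
        if len(neighbours[parent]) == 1:
            leaves.append(parent)  # parent has become a new leaf
    return flows


if __name__ == "__main__":
    tree = [("S1", "T1"), ("T1", "T2"), ("T2", "S2")]
    balance = {"S1": -3.0, "S2": -1.0, "T1": 2.0, "T2": 2.0}
    print(assign_capacities(tree, balance))
```

In the usage example, source S1 supplies 3 units while its neighboring sink only needs 2, so the middle edge carries a net flow of 1 from T1 to T2 and the remaining unit for T2 arrives from S2.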

Initial starting points: various spanning trees
The method in §3. 3  There are, however, some special STs that might have low investment costs in special cases.

Minimum spanning tree
When the capacity cost exponent $\beta = 0$, ie, when only the edge length influences the investment cost, the MST is the lowest cost ST. Figure 5A gives an example of this tree.
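For readers who want to reproduce this starting point, a minimum spanning tree over terminal coordinates can be computed with Prim's algorithm. The sketch below is a generic textbook implementation on the complete Euclidean graph, not the code used for the results in this paper; the function name and input format are assumptions.

```python
import math


def minimum_spanning_tree(points):
    """Prim's algorithm on the complete Euclidean graph.

    points: list of (x, y) coordinates
    Returns a list of (i, j) index pairs, one per tree edge.
    """
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n  # cheapest known connection to the tree
    best_from = [-1] * n        # tree node providing that connection
    best_dist[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the node outside the tree that is closest to it.
        i = min((k for k in range(n) if not in_tree[k]), key=best_dist.__getitem__)
        in_tree[i] = True
        if best_from[i] >= 0:
            edges.append((best_from[i], i))
        for k in range(n):
            if not in_tree[k]:
                d = math.dist(points[i], points[k])
                if d < best_dist[k]:
                    best_dist[k] = d
                    best_from[k] = i
    return edges


if __name__ == "__main__":
    pts = [(0, 0), (1, 0), (5, 0), (1, 1)]
    print(minimum_spanning_tree(pts))
```

The resulting edge list can then be given capacities and used as one of the initial trees that the improvement step starts from.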

Hub network
On the other hand, when the capacity cost exponent $\beta = 1$, ie, when two parallel edges of capacity 1 are just as expensive as one edge of capacity 2, direct edges between sources and sinks result in the lowest costs.
In a one-source multisink network, this will lead to a star topology with the source functioning as hub. In a multisource multisink network, there are multiple stars when all sources are directly connected to all sinks. To reduce costs, priority is given to the shortest length edges between sources and sinks. If the capacity of a source is not sufficient to supply the nearest sinks, connections to the next nearest sink are also needed. We call this network the hub network (HN). Figure 5B gives an example of the HN.
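One possible reading of this construction is a greedy assignment that processes source-sink pairs in order of increasing distance and connects a pair whenever the source still has supply and the sink still has unmet demand. The sketch below implements that reading only; the greedy rule, the function name, and the data layout are assumptions and may differ from the procedure actually used to build the HN.

```python
import math


def hub_network(sources, sinks):
    """Greedy sketch of a hub network built from direct source-sink edges.

    sources: dict name -> ((x, y), supply)
    sinks:   dict name -> ((x, y), demand)
    Returns a list of (source, sink, flow) edges.
    """
    supply_left = {s: supply for s, (_, supply) in sources.items()}
    demand_left = {t: demand for t, (_, demand) in sinks.items()}
    # All source-sink pairs, shortest connections first.
    pairs = sorted(
        (math.dist(s_pos, t_pos), s, t)
        for s, (s_pos, _) in sources.items()
        for t, (t_pos, _) in sinks.items()
    )
    edges = []
    for _, s, t in pairs:
        flow = min(supply_left[s], demand_left[t])
        if flow > 0:
            edges.append((s, t, flow))
            supply_left[s] -= flow
            demand_left[t] -= flow
    return edges


if __name__ == "__main__":
    sources = {"A": ((0, 0), 3), "B": ((10, 0), 3)}
    sinks = {"t1": ((1, 0), 2), "t2": ((2, 0), 2), "t3": ((9, 0), 2)}
    print(hub_network(sources, sinks))
```

In the usage example, source A cannot fully supply its second-nearest sink t2, so t2 also receives a connection from source B, mirroring the case described above in which a source's capacity is insufficient for its nearest sinks.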

Minimum-cost spanning tree
If $\beta$ is low, a minimum-cost ST (MCST) will probably resemble a minimum length ST, and if $\beta$ is high, it will probably have more direct connections between sources and sinks. We use this assumption to propose an Repeat the steps 1-5 for all edges in the tree and select the best tree so far. The procedure is repeated as long as trees with lower costs can be found. We call this tree the minimum-cost spanning tree (MCST). Figure 6 gives an example of this procedure.
Note that even in case $\beta = 0$, the MCST does not have to be equal to the MST, since the MCST can also be a forest instead of a tree, which the MST cannot.

Benchmarking system designs
From the literature overview in Section 2, we found some papers discussing problems similar to the one in this paper. We briefly discuss these and describe to what extent our approach presented above would or would not be able to use the results of those papers for benchmarking against our system design outcomes. We found two papers for which the cases might not directly be used with our method, but that describe the proposed algorithms clearly, This method can be used to benchmark our results.
• André et al 25  The comparable methods from the literature discuss CO₂ networks. However, the methods are not restricted to these types of networks. We did not find useful equivalent methods applied to other types of networks. None of the available literature allowed us to benchmark the outcomes of our algorithm against theoretical or other cases. We therefore set out to test the performance (efficiency, efficacy, and usability) of our algorithm on our hypothetical and real-world cases.
The results are reported in the next section.

METHOD PERFORMANCE RESULTS
This section shows the performance results of our method. First, we test the efficiency of our method in Sections 4.1-4.3. Since the MCGT problem is NP-hard, no algorithm is known that determines optimal solutions in polynomial time. To analyze our method, we compare our results with computable optimal layouts for very small problems. We also apply our method to different initial STs to compare the results. Moreover, we use the two alternative optimization techniques from the literature. 17,25 In Section 4.4, we test the efficacy of our method for an empirical systems engineering problem: the development of a biogas network in a real-world setting. Finally, in Section 4.5, we illustrate the usability of our method by describing the outcomes of a number of systems engineering student groups that used the method in various systems design problems. These students were relatively unfamiliar with the underlying math but were reasonably experienced in working with different systems engineering models and approaches in multiactor decision-making processes.

Method performance for small problems
For small problems, the optimal solution can be found by generating all possible Steiner tree topologies, adding the minimum required capacities to the edges, moving the Steiner points to their optimal locations using the adapted Gilbert-Melzak method, and calculating the total costs.
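As an illustration of what locating a single Steiner point involves, the sketch below finds the point that minimizes the weighted sum of distances to its neighbors. It uses the Weiszfeld iteration as a numerical stand-in and is not the adapted Gilbert-Melzak construction itself. The function name `weighted_steiner_point` is hypothetical, and the weights are assumed to be the edge capacities raised to the capacity cost exponent.

```python
from math import dist

def weighted_steiner_point(points, weights, iters=200, eps=1e-12):
    """Weiszfeld iteration for the point minimizing sum(w_i * |s - p_i|)."""
    total = sum(weights)
    # Start at the weighted centroid of the neighboring nodes.
    s = (
        sum(w * p[0] for p, w in zip(points, weights)) / total,
        sum(w * p[1] for p, w in zip(points, weights)) / total,
    )
    for _ in range(iters):
        num_x = num_y = den = 0.0
        for p, w in zip(points, weights):
            d = dist(s, p)
            if d < eps:          # iterate landed on a neighbor: stop there
                return p
            num_x += w * p[0] / d
            num_y += w * p[1] / d
            den += w / d
        s = (num_x / den, num_y / den)
    return s

# Toy data (invented): equilateral triangle, equal capacities.
beta = 0.6                                   # assumed capacity cost exponent
capacities = [1.0, 1.0, 1.0]
terminals = [(0.0, 0.0), (2.0, 0.0), (1.0, 3 ** 0.5)]
s = weighted_steiner_point(terminals, [q ** beta for q in capacities])
```

For three neighbors with equal weights at the corners of an equilateral triangle, the optimal point is the center of the triangle. When one weight is at least as large as the sum of the others, the optimum coincides with that neighbor, so the Steiner point degenerates.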
We again use Prüfer sequences (see B.6) to efficiently generate all possible Steiner tree topologies, using the following characteristics:
• All Steiner points have a degree equal to 3.
• A Steiner tree with N terminals can have at most N−2 Steiner points.
To obtain all possible Steiner tree topologies, we generate all Prüfer sequences L with the following characteristics:

Method performance for many larger examples
To obtain statistically significant differences (if any), we randomly generated 100 examples with the characteristics described in Table 1. We generated the lowest cost GT on the three initial STs: MST, HN, and MCST. In terms of relative cost deviations, the network layout from the delta change algorithm was on average 4% more expensive than the best tree found by our method. The network layout from Kazmierczak's algorithm was on average 5.9% more expensive than the best tree from our method.
Of the three initial STs in our method, the minimum-cost ST is on average only 4.2% more expensive than the best tree found. The GT resulting from this initial tree has an average cost deviation from the best tree of 1.5%. The results (Table 3) reveal that the use of the Δ-tree as an additional initial ST gives very good results, both in finding the best minimum-cost tree and in computational time.
In case of time restrictions, one might use only a selection of the initial trees to obtain final results. We analyzed the cost and time results for all different combinations of initial STs. The results (in Appendix C) show that a good selection would be to combine the HN with the Δ-tree (Table 4).
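The comparison of combinations of initial STs can be sketched as follows. The function `evaluate_combinations`, the input layout, and the toy numbers are hypothetical. The sketch assumes that the initial trees in a combination are processed one after the other, so that their run times add up, and that the result of a combination is the cheapest GT among its members.

```python
from itertools import combinations

def evaluate_combinations(costs, times):
    """costs[tree][k], times[tree][k]: final GT cost and run time for example k."""
    names = sorted(costs)
    n = len(next(iter(costs.values())))
    # Best tree per example over all initial STs.
    best = [min(costs[t][k] for t in names) for k in range(n)]
    report = {}
    for r in range(1, len(names) + 1):
        for combo in combinations(names, r):
            found = [min(costs[t][k] for t in combo) for k in range(n)]
            hits = sum(f == b for f, b in zip(found, best))
            misses = [f / b - 1 for f, b in zip(found, best) if f != b]
            report[combo] = {
                "best_tree": hits,
                "avg_rel_cost_if_not": sum(misses) / len(misses) if misses else 0.0,
                "avg_time": sum(sum(times[t][k] for t in combo) for k in range(n)) / n,
            }
    return report

# Toy data (invented, three examples): not results from the paper.
costs = {"HN": [100.0, 110.0, 90.0], "MST": [105.0, 100.0, 90.0], "Delta": [100.0, 100.0, 99.0]}
times = {"HN": [1.0, 1.0, 1.0], "MST": [2.0, 2.0, 2.0], "Delta": [0.5, 0.5, 0.5]}
report = evaluate_combinations(costs, times)
```

In the toy data, the combination of the HN and the Δ-tree finds the best tree in all three examples at a lower total time than any combination that includes the MST, which mirrors the kind of trade-off discussed above.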

Method performance for large networks
In

FIGURE 7 Bar chart of the number of times (out of 40) the best tree was found in sets 1-3, respectively
Also for the larger networks, the Δ-tree as initial tree gives the best results. As could be expected, the MST gives better results for lower values of the capacity cost exponent than for higher values, whereas the opposite holds for the HN.
To show a possible relation between the results and the network size, Figure 8 shows a bar chart of the average relative costs com-

Method efficacy for finding a biogas network
We also apply all methods to a real-world biogas network case. From Table 5, the advantages of our method are clear. The best network is found by applying our method to the Δ-tree (Figure 10F).
André et al 25 (Figure 10A)
Kazmierczak et al 17 (Figure 10B) build the network step-by-step and allow the addition of splitting points. However, because they fix the pipelines from earlier steps, these splitting points will almost never be in an optimal position. Moreover, shorter, and therefore cheaper, pipelines are added first, forcing the network into a certain layout that cannot be changed afterward. Besides, the computational time for
Figure 10F).

Method usability
engineering domain. The student groups who selected our method explored the following system design problems, showcasing that our method is widely usable for various network design problems:
• expansion of gas networks to accommodate shale gas exploration in the Netherlands,
• creation of power grids in rural India,
• hydrogen refueling infrastructure,
• district heating network,
• international electricity grid interconnections, and
• alternative urban heating systems (no more natural gas).
The specific network outcomes of these projects are not relevant for discussion in this paper, but the general observations from the application of the method by students confirm our claim that the method is easy to use and produces relevant results. In many cases, students realized more cost-effective (cheaper) networks than the ones they would develop by manually exploring the (vast) design space. Especially in "green-field" cases, students were unable to introduce new nodes for reducing system cost without the support of our method. We also observed that students used the method as a transparent approach to find a theoretical minimum-cost target, before delving into other operational, institutional, or other stakeholder-related issues with colleagues in their group. Issues that would surface in those social processes, and to which the network topology would need to be adapted in an iterative fashion, included, for example,
• operational constraints, eg, downtime issues due to (lack of) maintenance;
• power play of actors in the network, eg, blocking optimal location of splitting nodes;
• need for coordination and public acceptance, eg, the acceptance of shale gas networks;
• speed of infrastructure rollout in relation to required related developments, eg, hydrogen vehicle adoption;
• speed of innovation of competing technology, eg, heat pump technology; and
• market design, eg, transmission fees.
A very interesting finding was that students would not resort to adapting or refining the algorithm in order to include these issues in solving the design problem, but rather would use the optimal networks as a starting point for the ensuing discussions. This intuitive approach kept the model and the solutions transparent, and helped to identify clearly the difference between rational cost-optimal solutions and modified solutions that would actually work and be accepted in practice. These observations strengthen our assumption, supported by the complex decision-making literature, that many stakeholder and social issues should not be modeled endogenously, but should rather be kept for open discussion with the stakeholders.
Although our observations concerning our method were made with students in a learning situation, there is no reason to assume that they would not translate to actual real-world situations.

CONCLUSIONS
The purpose of this paper is to develop a method to support infrastructure system design processes in multiactor contexts. In these multiactor settings with diverging stakes and interests, actors need to develop large complex sociotechnical systems of systems, scanning vast trade spaces under a plethora of diverging assumptions. The systems engineering toolbox and theories point to the need to estimate the value of different design options, while problematizing such value determination in a multiactor complex setting. We showed that many developments point in the direction of more complicated models that try to collapse social and engineering design variables and processes into one big model or framework. This paper begs to differ.
Our paper has shown that complex systems engineering processes do not necessarily require more complicated tools or methods to include all stakeholder values in one overarching model from the outset. On the contrary, we posit that when a fast, reliable, and straightforward method is used, this creates so much time and room for actors to explore the system on their own terms that the discussions will focus on the assumptions and parameters instead of the contested outcomes. Such discussions have been shown to increase trust in the design process and may therefore yield (more) acceptable outcomes within acceptable times. This effect is confirmed in the complex decision-making literature.
From several contemporary examples in the infrastructure design field (CO₂, biogas, and offshore grids), we derived the need for quick, yet accurate, algorithms to find low-cost or minimum-cost network topologies. We also concluded that such an algorithm should empower the different actors to bring in their assumptions and uncertainties independently, so as to explore the individual stakes and sensitivities in a shared network development project.
We presented an effective and robust method to find a minimum-cost layout for new infrastructures of multiple sources and multiple sinks following a geometric graph-theoretical approach. The location of these sources and sinks and their demand or supply are known.
The approach determines the topology of the network, and the length and capacity of the pipelines to be constructed. This paper proves that a minimum-cost multisource multisink network will never contain a cycle and is therefore an MCGT (or forest), not a network. This MCGT can best be found by applying the adapted Gilbert-Melzak method starting from a well-chosen initial ST.
A core finding of this paper is that finding a very good candidate for the initial ST is crucial, because the process is deterministic and the starting topology determines the final result. We found several good candidates for this purpose. The MCST, found by making profitable edge turns that improve the MST, gives good results but becomes computationally challenging for very large networks. The Δ-tree 25 also gives very good results as initial tree and can, in general, be found in less time. It is crucial, however, to use a set of starting trees to increase the probability of finding the lowest cost network layout. A combination of the defined HN and the Δ-tree has been shown to be Pareto-optimal with respect to minimal costs and minimum computation time.
Our approach outperforms existing methods by 4% to 6% of the final network costs on average. Even though this may seem small, the real-world monetary value can be significant, since infrastructure development involves large budgets. More important, however, is that our approach is in general fast and finds a minimum-cost network at lower cost than conventional methods. This makes our method substantially more suitable to support a multiactor systems engineering process than any conventional method, as it can scan vast trade spaces in a fraction of the time.
Our proposed method is fast and gives good results, and in our future research endeavors, we will apply the method in real-world mul- Using such relatively straightforward mathematical optimal design in a highly stakeholder-driven systems engineering process will lead to rationalization of the process, and could move negotiators and systems engineers away from "estimates" of experts who typically do not include additional hubs (Steiner points) in their informal designs.
Our method therefore illustrates that multiactor complex design challenges do not, as a matter of course, require complex, all-encompassing methods into which all values of different actors are collapsed. Instead, we have shown and underpinned that simple, straightforward methods may even be more useful in some of such cases. Our findings thus call for a serious discussion in the systems engineering discipline regarding the extent to which we should mathematically try to include all social phenomena into our models. Instead of the observed trend in the systems engineering community toward including and endogenously modeling all of these social science aspects and turning them, erroneously, into one-dimensional performance criteria, we should find a proper balance between mathematical modeling and actual stakeholder interaction.

Table A2 gives an overview of the optimization problems discussed in the literature, as described in Section 2, and the methods used to solve these problems. Table A1 gives the abbreviations used in Table A2.

B.1 Mathematical formulation
Mathematically, we translate the problem as follows, comparable to the formulation used by Xue et al. 39 Let A = [a 1 , a 2
In this paper, we assume that each node is either a sink or a source. In case a node is both sink and source, we only use the net demand or supply to determine the network capacity.
Let G be a network connecting the sources with the sinks. We want to find the network G that minimizes the total investment costs, given
The sinks and sources (together called the terminals) can be both transmission nodes and supply or demand nodes, respectively, which leads to the following constraints for all source nodes v ∈ A
Moreover, for all sink nodes v ∈ B
Without loss of generality, we assume that the total supply of the sources covers at least the total demand of the sinks.
The formulated problem is NP-hard. 39

B.2 Formal optimization
The optimization problem formulated above can be rewritten as a formal nonlinear programming (NLP) problem.
Given the following input:
- the capacity cost exponent, with a value between 0 and 1,
- N the total number of nodes to be connected,
- the capacity demand of nodes.
Find the decision variables:
- $q_{[i,j]}$, $1 \le i, j \le 2N - 2$, the capacity of an edge from node i to node j,
For which
is minimized with
Under the constraints:
This optimization problem has many local minima.

TABLE A2
25  HD  1N  IC  PC  LY, SZ  TR  HA
26  AW  N1  LN  LY  SN  TR  HA
16  CO  NN  IC  SC  LY, SZ  TR  DS
32  OI  N1  IC  SC, PR  LY, SZ  TR  HA
15  CO  NN  IC  SC  LY, SZ  ON  TR  DS
29  WT  N1  IC  PC, PR  LY, SZ, PP  TR  GA
24  HT  1N  IC  SC, PC  LY, SZ  SN  TR  GS
18  CO  NN  IC, OC  PC  LY, SZ  TR  GA, HA
31  WT  1N  LN  PC,

We define $r = \min(q_e \mid e \in Cyc)$ as the minimum capacity of all edges in the cycle, and for each edge we define $a_e = q_e - r$, $\forall e \in CW$, and $b_e = q_e - r$, $\forall e \in CCW$, so all $a_e$ and $b_e$ are nonnegative.
Let f(q) be a concave function. Comparable to Xue et al, 39 we define a new function F by
Since F is a linear combination of concave functions with coefficients $l_e > 0$, F itself will also be concave. Then,
Therefore, either adding capacity r to all e ∈ CW and subtracting capacity r from all e ∈ CCW will reduce the network costs, or adding capacity r to all e ∈ CCW and subtracting capacity r from all e ∈ CW will reduce the network costs.
However, if we add capacity r (or −r) to the cycle edges, the network costs will be reduced, but the cycle will not necessarily be broken. When we repeat the procedure, each cycle in the network will eventually be broken, and the network will have a tree or forest topology with lower costs than the original network. ■
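The concavity step in this proof can be made explicit with a worked sketch. The definition of F used below is an assumption: it is one function that is consistent with the definitions of r, a_e, and b_e given above, and it may differ from the function used by the authors.

```latex
\begin{align*}
F(x) &= \sum_{e \in CW} l_e\, f(a_e + x) + \sum_{e \in CCW} l_e\, f(b_e + 2r - x),
        \qquad 0 \le x \le 2r, \\
F(r) &= \sum_{e \in Cyc} l_e\, f(q_e)
        \quad \text{(cost of the cycle edges in the original network)}, \\
F(0) &= \sum_{e \in CW} l_e\, f(q_e - r) + \sum_{e \in CCW} l_e\, f(q_e + r), \\
F(2r) &= \sum_{e \in CW} l_e\, f(q_e + r) + \sum_{e \in CCW} l_e\, f(q_e - r), \\
F(r) &\ge \tfrac{1}{2} F(0) + \tfrac{1}{2} F(2r) \;\ge\; \min\{F(0),\, F(2r)\}.
\end{align*}
```

Under this assumed definition, all arguments of f stay nonnegative on the interval, and the last line follows from the concavity of F because r is the midpoint of 0 and 2r. At least one of the two capacity shifts therefore does not increase the cost of the cycle edges, and the decrease is strict when f is strictly concave and r > 0.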

B.4 Capacity assignment procedure
Before an initial ST can be used to find local changes, sufficient, but not too much, capacity needs to be assigned to the edges in such a way
5. Repeat from step 3 until all capacities are set.
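For the special case in which total supply equals total demand, the capacity of each tree edge follows directly from the net supply on either side of that edge. The sketch below covers only this balanced case and is not the procedure described above, which also has to handle surplus supply. The function name `assign_capacities` and the input layout are hypothetical.

```python
def assign_capacities(edges, net_supply):
    """Edge capacities for a tree whose total supply equals total demand.

    edges: list of (u, v) pairs forming a tree.
    net_supply: {node: supply - demand}; values must sum to zero.
    Returns {(u, v): capacity} with the same edge keys as given.
    """
    adj = {n: [] for n in net_supply}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    root = next(iter(net_supply))
    parent = {root: None}
    order = [root]
    for node in order:                      # breadth-first traversal
        for nb in adj[node]:
            if nb not in parent:
                parent[nb] = node
                order.append(nb)
    subtotal = dict(net_supply)
    for node in reversed(order[1:]):        # children before their parents
        subtotal[parent[node]] += subtotal[node]
    cap = {}
    for u, v in edges:
        child = v if parent.get(v) == u else u
        # Flow over the edge equals the net imbalance of the child's subtree.
        cap[(u, v)] = abs(subtotal[child])
    return cap

# Toy data (invented): a path S1 - T1 - T2 - S2.
edges = [("S1", "T1"), ("T1", "T2"), ("T2", "S2")]
net_supply = {"S1": 5.0, "T1": -4.0, "T2": -3.0, "S2": 2.0}
cap = assign_capacities(edges, net_supply)
```

In the toy path, S1 sends 5 units to T1, which keeps 4 and passes 1 on to T2, while S2 delivers the remaining 2 units to T2.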

B.5 Importance of initial spanning trees
The Gilbert-Melzak method is a deterministic method. Starting from one specific initial ST, it will always result in the same GT.

Theorem 2.
Every GT topology S on n terminals can be found by applying the adapted Gilbert-Melzak procedure on a specific ST T on these n terminals.
Proof. The adapted Gilbert-Melzak procedure used here to find an MCGT topology from a specific initial ST improves the tree locally in an iterative process, starting with the worst angle in the current tree and continuing as long as improvements in the total costs can be made.

B.6 Spanning trees from Prüfer sequences
Prüfer sequences of length N − 2 have a one-to-one relation with all STs on N nodes. 53 Using Prüfer sequences, all possible STs can easily be generated. There are N^(N−2) different STs on N nodes, both crossing and noncrossing. We check them all, although crossing trees will always be more costly than noncrossing trees. 47 For 15 randomly generated experiments with a total number of nodes N = 4, 5, or 6 and a total number of sources of 1, 2, or 3, we generate all STs and assign the required capacities to the edges using the procedure from §B.4. We applied the adapted Gilbert
was used. The "Best tree" column tells how many times the lowest cost tree was found by the selection of the initial STs out of the 100 examples. The "Average relative costs if not" column shows the average cost deviation from the best tree when the initial ST selection did not find the best tree. It is an indication of how bad the results are when the best tree is not found. Remember that the best tree is not always the global minimum-cost tree. The "Average time (s)" column gives the average time in seconds that was needed to find the end results for this combination of initial STs.
the end results of the GTs resulting from the MCST have the lowest spread. The interquartile range of the GTs resulting from the Δ-tree is smaller, but the Δ-tree leads in some cases to more extreme results.
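The enumeration of all STs from Prüfer sequences, as described in B.6, can be sketched as follows. This is a minimal illustration with hypothetical function names (`prufer_to_edges`, `all_spanning_trees`) and nodes labelled 0 to N − 1. It only generates the topologies and does not assign capacities or compute costs.

```python
from itertools import product
import heapq

def prufer_to_edges(seq, n):
    """Decode a Prüfer sequence of length n - 2 into the n - 1 tree edges."""
    degree = [1] * n
    for x in seq:
        degree[x] += 1                     # degree = occurrences + 1
    leaves = [i for i in range(n) if degree[i] == 1]
    heapq.heapify(leaves)
    edges = []
    for x in seq:
        leaf = heapq.heappop(leaves)       # smallest remaining leaf
        edges.append((leaf, x))
        degree[x] -= 1
        if degree[x] == 1:                 # x has become a leaf itself
            heapq.heappush(leaves, x)
    u = heapq.heappop(leaves)
    v = heapq.heappop(leaves)
    edges.append((u, v))                   # the last two nodes form the final edge
    return edges

def all_spanning_trees(n):
    """Yield the edge list of every labelled spanning tree on nodes 0..n-1."""
    for seq in product(range(n), repeat=n - 2):
        yield prufer_to_edges(seq, n)

trees = list(all_spanning_trees(4))
```

Because the degree of a node equals its number of occurrences in the sequence plus one, a node of degree 3 appears exactly twice. This property can be used to restrict the enumeration to sequences that satisfy a degree requirement, such as the one for Steiner points in Section 4.1.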