Optimal connected subgraphs: Integer programming formulations and polyhedra

Connectivity is a central concept in combinatorial optimization, graph theory, and operations research. In many applications, one is interested in finding an optimal subset of vertices with the essential requirement that the vertices are connected, but not how they are connected. In other words, it is not relevant which edges are selected to obtain connectivity. This article is concerned with the exact solution of such problems via integer programming. We analyze and compare (mixed) integer programming formulations with respect to the strength of their linear programming relaxations. Along the way, we also provide a tighter (compact) description of the connected subgraph polytope—the convex hull of subsets of vertices that induce a connected subgraph. Furthermore, we give a (compact) complete description of the connected subgraph polytope for graphs with no four independent vertices.


INTRODUCTION
In many clustering and network analysis applications one is interested in finding an optimal subset of vertices with the main requirement being that the vertices are connected, but not how they are connected. In other words, one looks for a subset of vertices such that the subgraph induced by these vertices is connected. Which edges are selected to obtain connectivity is not relevant.
Applications of such an induced connectivity span a diverse set of areas: Computational biology [13], wildlife conservation [11], computer vision [9], social network analysis [30], political districting [19], wireless sensor network design [6], and even robotics [4]. Besides this practical relevance, connectivity is also a central and well-studied theoretical concept.
From an optimization perspective, a fundamental prototype for induced connectivity problems is the maximum-weight connected subgraph problem (MWCSP); see, for example, [1]. Given an undirected graph G = (V, E) and vertex weights p : V → ℝ, the task is to find a connected subgraph S = (V(S), E(S)) ⊆ G such that ∑_{v∈V(S)} p(v) is maximized. The literature also describes variations of the MWCSP such as the rooted and the budget-constrained problem; see [2]. Another well-known optimization problem that is based on induced connectivity is the unweighted (as well as uniformly weighted) Steiner tree problem: Any solution (i.e., Steiner tree) consisting of n nodes will be of weight n − 1; it

Definitions and notation
For the vertices and edges of an undirected graph G we write V(G) and E(G), respectively. For a directed graph D, we write A(D) for its set of arcs. For a subset of vertices U ⊆ V, we define E[U] := {{u, v} ∈ E | u, v ∈ U}. Further, the notation n := |V| and m := |E| will be used. For U ⊆ V define δ(U) := {{u, v} ∈ E | u ∈ U, v ∈ V∖U}, and for a subgraph G′ ⊆ G and U′ ⊆ V(G′) define δ_{G′}(U′) := {{u, v} ∈ E(G′) | u ∈ U′, v ∈ V(G′)∖U′}. A corresponding notation is used for directed graphs (V, A): For U ⊆ V define δ⁺(U) := {(u, v) ∈ A | u ∈ U, v ∈ V∖U} and δ⁻(U) := δ⁺(V∖U). For a single vertex v, we use the short-hand notation δ(v) := δ({v}), and accordingly for directed graphs. We define the neighborhood of a vertex set U ⊆ V as N(U) := {v ∈ V∖U | ∃u ∈ U, {u, v} ∈ δ(U)}.
For a single v ∈ V, we set N(v) := N({v}). For directed graphs, we define N⁺(U) := {v ∈ V∖U | ∃u ∈ U, (u, v) ∈ δ⁺(U)}, and N⁻(U) analogously with δ⁻(U). We denote by α(G) the maximum number of independent vertices in graph G. Let (V, A) be a directed graph, let r, t ∈ V, and consider an r − t flow f in (V, A). We denote the net flow value of f by |f| := f(δ⁺(r)) − f(δ⁻(r)). Given an MWCSP instance (V, E, p) we define T_p := {v ∈ V | p(v) > 0}.
Let v and w be two distinct vertices of G. A subset C ⊆ V∖{v, w} is called a (v, w)-separator, or (v, w)-node-separator, if there is no path from v to w in the graph (V∖C, E[V∖C]). The family of all (v, w)-separators is denoted by 𝒞(v, w). Note that 𝒞(v, w) = ∅ if and only if {v, w} ∈ E. For directed graphs, we say that C ⊆ V∖{v, w} is a (v, w)-separator if all directed paths from v to w contain a vertex from C.
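As a quick illustration of the separator definition, the following sketch checks by breadth-first search whether a vertex set C is a (v, w)-separator in an undirected graph. The adjacency-dictionary representation and the function name `is_node_separator` are assumptions made for this example only.

```python
from collections import deque

def is_node_separator(adj, v, w, C):
    """Return True if no v-w path survives after deleting the vertex set C."""
    C = set(C)
    if v in C or w in C:
        raise ValueError("a (v, w)-separator must not contain v or w")
    seen, queue = {v}, deque([v])
    while queue:
        u = queue.popleft()
        if u == w:
            return False          # w is still reachable, so C does not separate
        for nb in adj[u]:
            if nb not in C and nb not in seen:
                seen.add(nb)
                queue.append(nb)
    return True

# 4-cycle a - b - c - d - a (hypothetical example graph)
adj = {"a": ["b", "d"], "b": ["a", "c"], "c": ["b", "d"], "d": ["a", "c"]}
print(is_node_separator(adj, "a", "c", {"b"}))        # d still connects a and c
print(is_node_separator(adj, "a", "c", {"b", "d"}))   # both a-c paths are cut
```

For adjacent vertices such as a and b, no candidate set can pass the check, which matches the remark that the family of (v, w)-separators is empty exactly when {v, w} is an edge.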
For any function x : M → ℝ with M finite, and any M′ ⊆ M define x(M′) := ∑_{i∈M′} x(i). Given an IP formulation F we denote its optimal objective value by v(F). Further, we denote the optimal objective value and the set of feasible points of its LP-relaxation by v_LP(F) and 𝒫_LP(F), respectively. If we want to emphasize a specific problem instance I, we also write F(I).

Preliminaries: MWCSP and related problems
The MWCSP is NP-hard; see, for example, [23]. It is even NP-hard to approximate the MWCSP within any constant factor, as shown in [1]. Note that in the case of only non-negative vertex weights, the MWCSP reduces to finding a connected component of maximum vertex weight; in the case of only non-positive vertex weights, the empty set constitutes an optimal solution.
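To make the problem statement concrete, here is a minimal brute-force sketch that enumerates all vertex subsets of a tiny graph and keeps the heaviest one that induces a connected subgraph. It is exponential and meant only for illustration; the function names and the example instance are assumptions, and the empty set is treated as feasible with weight 0, in line with the remark on non-positive weights.

```python
from itertools import combinations

def induces_connected(subset, adj):
    """True if the vertex set `subset` induces a connected subgraph (empty set allowed)."""
    subset = set(subset)
    if not subset:
        return True
    start = next(iter(subset))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w in subset and w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == subset

def mwcsp_brute_force(adj, p):
    """Enumerate all vertex subsets; return (best weight, best vertex set)."""
    vertices = sorted(adj)
    best_weight, best_set = 0, frozenset()   # the empty set has weight 0
    for size in range(1, len(vertices) + 1):
        for subset in combinations(vertices, size):
            weight = sum(p[v] for v in subset)
            if weight > best_weight and induces_connected(subset, adj):
                best_weight, best_set = weight, frozenset(subset)
    return best_weight, best_set

# Path a - b - c - d - e with mixed weights (hypothetical instance).
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "e"], "e": ["d"]}
p = {"a": 3, "b": -1, "c": 2, "d": -5, "e": 3}
print(mwcsp_brute_force(adj, p))
```

On this path, paying for the cheap negative vertex b to join a and c beats every single vertex, whereas bridging over d is too expensive.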

Rooted MWCSP
A close relative of the MWCSP is the rooted maximum-weight connected subgraph problem (RMWCSP); see, for example, [2]. It incorporates the additional condition that a non-empty set T_f ⊆ V needs to be part of any feasible solution. For simplicity, we usually assume that p(t) = 0 for all t ∈ T_f.

Unweighted Steiner tree problem
Given an undirected connected graph G = (V, E) and a set T ⊆ V of terminals, the unweighted Steiner tree problem in graphs (USPG) is to find a tree S ⊆ G with T ⊆ V(S) such that |E(S)| is minimized. The USPG can also be seen as a Steiner tree problem with uniform edge weights. Moreover, the USPG can be formulated as an RMWCSP by setting T f ∶= T and assigning each nonterminal vertex a weight of −1. Many of the hardest Steiner tree benchmark instances are unweighted; see [25] for an overview. Moreover, many theoretical articles consider just the unweighted case; see, for example, [17], who describe an exact polynomial-space algorithm for the USPG.

Steiner arborescence problem
Several results of this article rely on the Steiner arborescence problem (SAP), which is defined as follows: We are given a directed graph D = (V, A), costs c : A → ℝ_{≥0}, a set T ⊆ V of terminals, and a root r ∈ T. The SAP requires an arborescence (i.e., directed tree) S ⊆ D with T ⊆ V(S) that is rooted at r, such that c(A(S)) is minimized.

FORMULATIONS FOR ROOTED CONNECTED SUBGRAPHS
This section is concerned with connected subgraph problems where a predefined, non-empty set of vertices needs to be part of any feasible solution. We start with formulations for the SAP. While the SAP is not based on induced connectivity itself, it forms the base of several other results in this article.

Formulations for the Steiner arborescence problem
Consider an SAP instance (V, A, T, r, c). Associate with each arc a ∈ A a binary variable y(a) indicating whether a is contained in the Steiner arborescence (y(a) = 1) or not (y(a) = 0). A natural formulation by [35] (i.e., one in the original variable space) can thereupon be stated as:

Formulation 1. Directed cut formulation (DCut)
min cᵀy (1)
s.t. y(δ⁻(U)) ≥ 1 for all U ⊆ V∖{r} with U ∩ T ≠ ∅, (2)
y(a) ∈ {0, 1} for all a ∈ A. (3)

One verifies that the constraints (2) ensure the existence of (directed) paths from the root to each terminal in a feasible solution. We note that a feasible but not optimal solution to DCut is not necessarily the incidence vector of a Steiner arborescence. Indeed, the convex hull of all y ∈ ℕ_0^A that satisfy (2) is of blocking type, that is, its recession cone equals ℝ^A_{≥0}. Another well-known formulation, see, for example, [35], is based on flows. This formulation associates with each terminal t ∈ T∖{r} an r − t flow f^t.

Formulation 2. Directed multicommodity flow formulation (DF)
min cᵀy (4)
s.t. f^t(δ⁻(v)) − f^t(δ⁺(v)) = 1 if v = t, and = 0 if v ∈ V∖{r, t}, for all t ∈ T∖{r}, v ∈ V∖{r}, (5)
f^t ≤ y for all t ∈ T∖{r}, (6)
f^t ≥ 0 for all t ∈ T∖{r}, (7)
y(a) ∈ {0, 1} for all a ∈ A. (8)
By using the max-flow/min-cut theorem, one shows that DF is an extended formulation of DCut, that is, proj_y(𝒫_LP(DF)) = 𝒫_LP(DCut); see, for example, [14]. Both formulations can be strengthened by the so-called flow-balance constraints from [24]:
y(δ⁻(v)) ≤ y(δ⁺(v)) for all v ∈ V∖T. (9)
We will refer to the extensions of the above formulations that additionally include (9) as DCut_FB and DF_FB, respectively. We end this section with a (new) result for SAP, which will be used several times in the following.
Proof. For the cases |T| = 1 and |T| = 2, the lemma already holds without the flow-balance constraints: The case |T| = 1 is clear, and the case |T| = 2 (corresponding to the shortest-path problem with non-negative weights) results in a totally unimodular constraint matrix. So let (V, A, T, c, r) be an SAP with two terminals t, u besides the root r. We additionally require that a feasible solution does not have any leaves apart from r, t, u. For this so-called two-terminal Steiner tree problem, a complete polyhedral description is given by [3]: […] The above description is based on the following observation: Any feasible arborescence for the two-terminal Steiner tree problem consists of a path from r to a splitter node s, as well as an s − t and an s − u path. Note that any of these paths can be a single node. The flow from r to the terminals t and u is split into a common part f, and two separate parts f^t and f^u. Let (f^t, f^u, y) be an optimal LP solution to DF_FB. Assume that this solution is minimal, that is, for any feasible solution […] is contained in the polyhedron described above. Define for all a ∈ A: […] First, we show (10). Let v ∈ V. Because of the assumed minimality of (f^t, f^u, y), we have that y = max{f^t, f^u}. Together with (15) we get: […] (19) If v = r, then […] and thus (19) implies that (10) holds. If v ∈ {t, u}, then […] and […] Finally, if v ∈ V∖{r, t, u}, the flow-balance constraints imply that (19) is non-negative.
Next, consider (11) and, equivalently, (12). By definition it holds that […], which implies (11). Likewise, (13) follows from the definition of f̃, f̃^t, and f̃^u. ▪
Note that the lemma is best possible in the sense that there exist SAP instances with |T| = 4 such that v_LP(DCut_FB) ≠ v(DCut_FB); see, for example, [28,31].
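For small instances, the directed cut constraints and the flow-balance constraints can be checked by plain enumeration. The sketch below assumes that the cut constraints read y(δ⁻(U)) ≥ 1 for every U ⊆ V∖{r} containing a terminal, and that the flow-balance constraints read y(δ⁻(v)) ≤ y(δ⁺(v)) for every non-terminal v; both forms, the function names, and the dictionary-based arc representation are assumptions of this example rather than a restatement of the article's displays.

```python
from itertools import combinations

def dcut_violations(vertices, y, root, terminals, eps=1e-9):
    """Return all sets U (root not in U, U meets T) with y(delta^-(U)) < 1."""
    others = [v for v in vertices if v != root]
    violated = []
    for size in range(1, len(others) + 1):
        for U in combinations(others, size):
            U = set(U)
            if not U & set(terminals):
                continue
            inflow = sum(val for (a, b), val in y.items() if a not in U and b in U)
            if inflow < 1 - eps:
                violated.append(U)
    return violated

def flow_balance_violations(vertices, y, terminals, eps=1e-9):
    """Return non-terminals whose inflow exceeds their outflow."""
    bad = []
    for v in vertices:
        if v in terminals:
            continue
        inflow = sum(val for (a, b), val in y.items() if b == v)
        outflow = sum(val for (a, b), val in y.items() if a == v)
        if inflow > outflow + eps:
            bad.append(v)
    return bad

# Arborescence r -> s, s -> t, s -> u with terminals r, t, u (hypothetical instance).
V = ["r", "s", "t", "u"]
T = {"r", "t", "u"}
y_tree = {("r", "s"): 1.0, ("s", "t"): 1.0, ("s", "u"): 1.0}
print(dcut_violations(V, y_tree, "r", T), flow_balance_violations(V, y_tree, T))
```

The enumeration is exponential in |V|; in practice such constraints are separated with max-flow computations, which this sketch deliberately avoids.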

Rooted maximum-weight connected subgraphs
This section discusses the directed variant of the RMWCSP; see [2]: Given a directed graph D = (V, A), vertex weights p : V → ℝ, a non-empty set T_f ⊆ V and an r ∈ T_f, find a connected subgraph S ⊆ D containing T_f such that any v ∈ V(S) can be reached from r on a directed path in S, and p(V(S)) is maximized. Any undirected RMWCSP can be formulated in directed form by choosing an arbitrary r ∈ T_f and replacing each edge by two anti-parallel arcs.
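The transformation from the undirected to the directed problem is mechanical. A minimal sketch, with a hypothetical function name and a deterministic (but otherwise arbitrary) choice of the root:

```python
def to_directed_rmwcsp(edges, fixed_terminals):
    """Replace every edge by two anti-parallel arcs and pick a root from T_f."""
    if not fixed_terminals:
        raise ValueError("T_f must be non-empty")
    arcs = []
    for u, v in edges:
        arcs.append((u, v))
        arcs.append((v, u))
    root = sorted(fixed_terminals)[0]   # any element of T_f would do
    return arcs, root

arcs, root = to_directed_rmwcsp([("a", "b"), ("b", "c")], {"b", "c"})
print(arcs, root)
```

Vertex weights carry over unchanged, so they are not part of the sketch.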
Note that any solution to the directed RMWCSP can be represented as an arborescence. This observation leads to the following IP formulation, see, for example, [2], which is based on a well-known formulation for SAP, see, for example, [21]. Define for each v ∈ V a variable x(v) ∈ {0, 1} that is equal to 1 if and only if vertex v is part of the solution. Analogously, define for each a ∈ A a variable y(a) ∈ {0, 1}.

Formulation 3. Rooted Steiner arborescence formulation (RSA)
Constraints (26) establish the relation between the arc variables and (the actually redundant) vertex variables. Constraints (27) correspond to constraints (2) in the DCut formulation. The constraints make sure that in a feasible solution S, for any v ∈ V(S) there is an r − v path in S as well. Finally, constraints (28) assure that all fixed terminals are contained in any feasible solution.
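Read together with the description above, the rooted Steiner arborescence formulation can be sketched as follows. This is a reconstruction under assumptions: only the roles of (26), (27), and (28) are taken from the text, while the exact index sets and the objective are inferred from the problem definition.

```latex
\begin{align*}
\max\quad & p^{T} x \\
\text{s.t.}\quad
 & y(\delta^-(v)) = x(v) && \text{for all } v \in V \setminus \{r\}, \tag{26}\\
 & y(\delta^-(U)) \ge x(v) && \text{for all } U \subseteq V \setminus \{r\},\ v \in U, \tag{27}\\
 & x(t) = 1 && \text{for all } t \in T_f, \tag{28}\\
 & x(v),\ y(a) \in \{0, 1\} && \text{for all } v \in V,\ a \in A.
\end{align*}
```

With x(v) = 1, constraint (27) reduces to the directed cut constraint y(δ⁻(U)) ≥ 1 of the DCut formulation, which is the correspondence the text points out.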
In [2], a new formulation for the directed RPCSTP based on node-separators is introduced. Note that the use of node-separators for modeling connectivity is already suggested in [18].

Formulation 4. Rooted node separator formulation (RNCut)
Constraints (32) ensure connectivity by enforcing that, for any solution vertex v, every (r, v)-separator contains at least one solution vertex as well. Constraints (33) ensure the inclusion of all fixed terminals.
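A sketch of the node separator model that is consistent with this description is given below. It is an assumed reconstruction: the separator family 𝒞(r, v) is written as \mathcal{C}(r, v), and the index sets are inferred from the way (32) and (33) are used in the text.

```latex
\begin{align*}
\max\quad & p^{T} x \\
\text{s.t.}\quad
 & x(C) \ge x(v) && \text{for all } v \in V \setminus \{r\},\ C \in \mathcal{C}(r, v), \tag{32}\\
 & x(t) = 1 && \text{for all } t \in T_f, \tag{33}\\
 & x(v) \in \{0, 1\} && \text{for all } v \in V.
\end{align*}
```

For vertices v that are adjacent to the root, the family of separators is empty, so (32) imposes no condition on them.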
Besides the two IP models introduced above, several other formulations for RMWCSP (sometimes including a budget constraint) have been introduced in the literature; see, for example, [2,11]. However, one can show that these formulations are weaker with respect to the LP-relaxation than both of the above models; see [2] for some such results. Another example is the formulation from [10] that is based on single-commodity flows. This formulation can also be shown to be weaker than Formulation 3 by using max-flow/min-cut arguments, similar to corresponding results for minimum spanning tree or Steiner tree problems, which can be found, for example, in [22].
In [2], it is stated that the LP-relaxations of the RNCut and RSA models yield the same optimal value. Unfortunately, this claim is not correct, as the following proposition shows. Appendix B gives a counterexample and furthermore provides some insight into how the node separator constraints fail to capture structures accurately described by edge cut constraints.
Proposition 2. It holds that proj_x(𝒫_LP(RSA)) ⊂ 𝒫_LP(RNCut) and the inclusion can be strict.
Proof. The inclusion is essentially shown in [2]. For the sake of completeness, we nevertheless prove it in the following. Let (x, y) ∈ 𝒫_LP(RSA). We will show that x ∈ 𝒫_LP(RNCut). Consider any v ∈ V∖({r} ∪ N⁺(r)) and a non-empty C ∈ 𝒞(r, v). Let V_r be the vertices that are reachable from r in the graph (V∖C, E[V∖C]). We obtain: […] which shows that (32) is satisfied. Thus, x ∈ 𝒫_LP(RNCut).
Finally, an RMWCSP instance for which proj_x(𝒫_LP(RSA)) ⊊ 𝒫_LP(RNCut) holds is given in Figure B1 in Appendix B.
Because of x([…]) = 0.5, we have that either y((a, […])) < 0.5 or y((b, […])) < 0.5 holds. Thus, we have either […], which contradicts (27). In particular, it holds that […]
One can strengthen the RSA formulation by inequalities similar to the flow-balance constraints. However, these constraints depend on the objective vector (i.e., they are only valid for specific node-weight assignments), so they cannot directly be used for polyhedral results.
We refer to the strengthened formulation as RSA_FB. One readily obtains the following result from Lemma 1.
[…] For each t ∈ T_p, we add a new terminal t′ to T′ and arcs (r, t′) of weight p(t) and (t, t′) of weight 0 to A′. It holds that […]; recall that we assume T_p ∩ T_f = ∅. Any optimal LP solution (x, y) to RSA_FB can be extended to a feasible LP solution y′ to DCut_FB defined by y′((t, t′)) := x(t), y′((r, t′)) := 1 − x(t) for all t ∈ T_p, as well as y′(a) := y(a) for all a ∈ A. Thus, […] Because I′ has at most three terminals, Lemma 1 and (36) imply that the above inequalities are satisfied with equality. Consequently, v_LP(RSA_FB(I)) = v(RSA_FB(I)). ▪

Unweighted Steiner tree problems
This section analyzes and compares two formulations for the USPG. First, we state the node-separator formulation from [16]. Note that in [16], a more general version for the prize-collecting USPG is used. However, the prize-collecting USPG is essentially a MWCSP. The results of this section can be partly extended to this more general variant (which is done in Section 3 for the non-rooted case), but for simplicity, we now consider the USPG only.

Formulation 5. Terminal node separator formulation (TNCut)
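A sketch of the terminal node separator model, reconstructed under assumptions from how constraint (39) and the objective value are used in the proof of Proposition 7 (a tree on the selected vertices has x(V) − 1 edges; the fixing of terminal variables is an additional assumption):

```latex
\begin{align*}
\min\quad & x(V) - 1 \\
\text{s.t.}\quad
 & x(C) \ge 1 && \text{for all } t, u \in T,\ t \ne u,\ C \in \mathcal{C}(t, u), \tag{39}\\
 & x(t) = 1 && \text{for all } t \in T, \\
 & x(v) \in \{0, 1\} && \text{for all } v \in V.
\end{align*}
```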
Second, we look at the well-known bidirected cut formulation (BDCut) for (U)SPG. This formulation corresponds to the DCut formulation for the SAP obtained by replacing each edge of the SPG by two anti-parallel arcs of the same weight, and choosing an arbitrary terminal as the root.

Exactness of the bidirected cut formulation
This section formulates conditions under which the bidirected cut formulation has no integrality gap. We start with a direct consequence of Lemma 1, which applies also to weighted SPG.
A simple reduction technique for USPG is to contract adjacent terminals (and delete one edge from each resulting pair of multi-edges). The following proposition shows that the absolute integrality gap of BDCut is invariant under this operation. This property will be exploited in (the subsequent) Theorem 6 to reduce the size of instances with a small number of independent vertices.
Proposition 5. Let I be a USPG instance with adjacent terminals t, u. Let I′ be the USPG instance obtained by contracting t and u. It holds that v_LP(BDCut(I)) = v_LP(BDCut(I′)) + 1. (42)
Proof. Throughout the proof, we assume that u is the root for the BDCut formulation, that is, r = u. It is well known that the choice of the root does not affect v_LP(BDCut); see, for example, [21] (this result also follows from the proof of Theorem 6). Furthermore, let D′ = (V′, A′) be the bidirected graph obtained by contracting r and t and let r′ be the new vertex. So, […] Let y be an optimal LP solution to BDCut(I). The optimality of y implies that y(δ⁻(t)) = 1; see [31]. Create an optimal solution ỹ (which can possibly be equal to y) as follows. […]
With this result at hand, we obtain the following theorem (recall that α(G) denotes the independence number of graph G). […] Furthermore, because of α(G) ≤ 3 it holds that |T′| ≤ 3. For |T′| < 3, the BDCut formulation is well known to have no integrality gap. So assume |T′| = 3. By construction of I′, the terminals form an independent set. Further, let y be an optimal LP solution to BDCut(I′) with an arbitrary r ∈ T′ being the root.
Suppose that v_LP(BDCut(I′)) ≠ v(BDCut(I′)). By Lemma 1, there is a v ∈ V′∖T′ such that y(δ⁻(v)) > y(δ⁺(v)). (43) Because of α(G) ≤ 3, at least one of the terminals needs to be adjacent to v. We may assume that this property holds for r. Otherwise, we can readily create another optimal LP solution ỹ that satisfies (43) and has a root adjacent to v: Assume that a t ∈ T∖{r} is adjacent to v and let f^t be a unit flow from r to t such that f^t ≤ y; define ỹ((q, u)) := y((q, u)) − f^t((q, u)) + f^t((u, q)) for all (u, q) ∈ A′. Define a new LP solution y′ from y as follows: y′((r, v)) := y(δ⁺(v)), y′(a) := 0 for all a ∈ δ⁻(v)∖{(r, v)}, and y′(a) := y(a) for all remaining arcs a. Note that because of (43) it holds that y′(A′) < y(A′). It remains to be shown that y′ is feasible. Suppose that there is a U ⊆ V∖{r} with U ∩ T′ ≠ ∅ and y′(δ⁻(U)) < 1. Because y is feasible, it has to hold that v ∈ U. Let Ũ := U∖{v}. By the construction of y′ it holds that y(δ⁻(Ũ)) = y′(δ⁻(Ũ)) = y′(δ⁻(Ũ)) + y′((r, v)) − y′(δ⁺(v)) ≤ y′(δ⁻(U)) < 1, which contradicts the feasibility of y. Consequently, we have shown that v_LP(BDCut(I′)) = v(BDCut(I′)) and, thus, v_LP(BDCut(I)) = v(BDCut(I)). ▪
The theorem is best possible; that is, there exist USPG instances such that α(G) = 4 and v_LP(BDCut) ≠ v(BDCut); see, for example, [14,15].

Comparison of edge and node-based formulation
Formulation 5 (TNCut) was used within a branch-and-cut algorithm by the most successful solver at the 11th DIMACS Challenge [12]. Furthermore, the solver was able to solve to optimality several USPG benchmark instances that had been unsolved for more than a decade. Thus, one might wonder how this formulation theoretically compares with the better-known bidirected cut formulation. As the next proposition shows, BDCut is always at least as strong as TNCut and the relative gap can be quite large.

Proposition 7. It holds that v_LP […], where the supremum is taken over all USPG instances.
Proof. For the first inequality consider an optimal LP solution y to BDCut. Define x ∈ ℝ^V by x(v) := y(δ⁻(v)) for all v ∈ V∖{r} and x(r) := 1. The optimality of y implies x(v) ≤ 1 for all v; see [31]. Let t, u ∈ T with t ≠ u and C_tu ∈ 𝒞(t, u). We will show that C_tu satisfies (39). If C_tu ∩ T ≠ ∅, then x(C_tu) ≥ 1, because x(q) ≥ 1 for all q ∈ T due to (2) and the definition of x. Thus, (39) holds. If C_tu ∩ T = ∅, let U_r be the connected component in the graph induced by V∖C_tu with r ∈ U_r. By definition of C_tu, either t ∉ U_r or u ∉ U_r. Therefore, y(δ⁺(U_r)) ≥ 1, which implies y(δ⁻(C_tu)) ≥ 1 because of δ⁺(U_r) ⊂ δ⁻(C_tu). Now we obtain from the definition of x that x(C_tu) = ∑_{v∈C_tu} y(δ⁻(v)) ≥ y(δ⁻(C_tu)) ≥ 1. Finally, by construction of x we have that x(V) − 1 = y(A); note that y(δ⁻(r)) = 0 because y is optimal.
For (44) we construct the following family of USPG instances. For any k ≥ 3 let I_k be the USPG instance with k + k² nodes, k + k² edges, and k terminals defined as follows. Let t_i for i = 1, … , k be the terminals and define for […]; see Figure 1. A feasible (and indeed optimal) LP solution x to TNCut(I_k) is given by x(t) := 1 for all terminals t and x(v) := 0.5 for any Steiner node v. Its objective is k²/2 + k − 1. On the other hand, it holds that v_LP(BDCut(I_k)) = v(BDCut(I_k)), because I_k consists of a cycle (and is thus in particular series-parallel); see, for example, [20]. Any optimal Steiner tree in I_k contains all edges except for those between two adjacent terminals t_i and t_{i+1}. Thus, v_LP(BDCut(I_k)) = v(BDCut(I_k)) = (k + 1)(k − 1) = k² − 1. Consequently, v_LP(BDCut(I_k)) / v_LP(TNCut(I_k)) = (k² − 1)/(k²/2 + k − 1), which tends to 2 for k → ∞. This concludes the proof. ▪
FIGURE 1 USPG instance I_3. Terminals are drawn as squares.
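The arithmetic behind the instance family can be replayed in a few lines. The sketch below builds the cycle under the assumption that exactly k Steiner nodes lie between consecutive terminals (this matches the stated counts of k + k² nodes and edges and the tree size (k + 1)(k − 1)); the function names are hypothetical, and the TNCut objective is assumed to be x(V) − 1.

```python
from fractions import Fraction

def build_cycle_instance(k):
    """Cycle with k terminals and k Steiner nodes between consecutive terminals."""
    nodes, terminals = [], []
    for i in range(k):
        terminals.append(("t", i))
        nodes.append(("t", i))
        nodes.extend(("s", i, j) for j in range(k))
    # consecutive entries of `nodes` are adjacent; the last one closes the cycle
    edges = [(nodes[i], nodes[(i + 1) % len(nodes)]) for i in range(len(nodes))]
    return nodes, edges, terminals

def tncut_lp_value(k):
    # x = 1 on the k terminals, 0.5 on the k*k Steiner nodes, objective x(V) - 1
    return k + Fraction(k * k, 2) - 1

def optimal_tree_edges(k):
    # an optimal tree drops one terminal-to-terminal segment of k + 1 edges
    _, edges, _ = build_cycle_instance(k)
    return len(edges) - (k + 1)

for k in (3, 10, 100):
    ratio = Fraction(optimal_tree_edges(k)) / tncut_lp_value(k)
    print(k, float(ratio))
```

The printed ratios increase with k and stay below 2, in line with the limit used in the proof.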

Corollary 8. The (relative) integrality gap of TNCut is at least 2.
Note that one can strengthen TNCut by constraints that correspond to the flow-balance constraints for BDCut; see [16]. However, if compared to BDCut_FB, the results of Proposition 7 remain the same for this stronger version of TNCut.

FORMULATIONS FOR NON-ROOTED CONNECTED SUBGRAPHS
In this section, we consider the undirected MWCSP. Some of the following results can also be extended to the directed case. However, the undirected MWCSP is the more common (and, arguably, also more natural) problem.

Node-based formulations
This section considers formulations for the MWCSP that use only node variables. Probably the best-known one, see, for example, [34], is given below.

Formulation 6. Node separator formulation (NCut)
Constraints (47) guarantee that in a feasible solution S, for any distinct v, w ∈ V(S), at least one node of each (v, w)-separator is also contained in S.
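Based on this description and on the way the separator inequalities are used in the later proofs (for example, in the form x(u) + x(s) − x(C_su) = 1 for a tight constraint), the node separator model can be sketched as follows. The objective and index sets are assumptions of this reconstruction.

```latex
\begin{align*}
\max\quad & p^{T} x \\
\text{s.t.}\quad
 & x(C) \ge x(v) + x(w) - 1 && \text{for all } v, w \in V,\ v \ne w,\ C \in \mathcal{C}(v, w), \tag{47}\\
 & x(v) \in \{0, 1\} && \text{for all } v \in V.
\end{align*}
```

If both v and w are selected, the right-hand side equals 1 and every separator must contain a selected vertex; otherwise the inequality is trivially satisfied.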
In this section, we are interested in cases where NCut has no integrality gap. Recalling the invariance of the BDCut integrality gap under the contraction of terminals, one might wonder whether a corresponding property holds for the MWCSP and the NCut formulation. As shown in the next proposition, the answer is yes. As before, we will exploit this property for showing an integrality condition based on the independence number, namely by exhaustively contracting all adjacent vertices of positive weight. Note that when contracting adjacent vertices t, u ∈ T_p into a new vertex t′, we set p(t′) := p(t) + p(u).

Proposition 9. v LP (NCut) is invariant under the contraction of adjacent vertices of positive weight.
Proof. Let I be an MWCSP instance with an edge {t, u} ∈ E such that t, u ∈ T_p. Let I′ = (V′, E′, p′) be the instance obtained from I by contracting {t, u} into a new vertex t′. It holds that v_LP(NCut(I′)) ≤ v_LP(NCut(I)), because any x′ ∈ 𝒫_LP(NCut(I′)) can be mapped to an x ∈ 𝒫_LP(NCut(I)) with […] For the opposite case, let x be an optimal LP solution to NCut(I). The optimality of x and the fact that {t, u} ∈ E imply […] Assume that x′(t′) ∈ (0, 1); otherwise, the proof is already complete. It remains to be shown that x′ ∈ 𝒫_LP(NCut(I′)). Suppose this is not the case. Then there are a, b ∈ V′ and an a-b separator C′_ab ⊂ V′ such that […] Because x is feasible, t′ ∈ C′_ab. Thus, we obtain from (50) that […] and therefore min{x(a), […] Now we return to the original instance I. Because x is optimal, and x(t) = x(u) < 1, there is a q ∈ V∖{t, u} and a C_qt ∈ 𝒞(q, t) such that x(t) + x(q) − x(C_qt) = 1. Similarly, there is an s ∈ V∖{t, u} and a C_su ∈ 𝒞(s, u) with x(u) + x(s) − x(C_su) = 1. At least one such combination q, C_qt, or s, C_su satisfies u ∉ C_qt or t ∉ C_su; otherwise, we could increase x(u) and x(t). Assume w.l.o.g. u ∉ C_qt. From (53), we obtain […] Thus, (53) and (52) imply a, b ∉ C_qt. We note that C_qt ∉ 𝒞(a, q), because (52) and (53) imply […] Likewise, C_qt ∉ 𝒞(b, q). Consequently, any path from {t, u} to a or b needs to cross C_qt; otherwise, the latter would not separate q and t. Therefore, C̃_ab := (C′_ab∖{t′}) ∪ C_qt separates a and b (in the original graph). However, from (50) and (54) we obtain […] which contradicts the feasibility of x. ▪
Furthermore, one obtains the following optimality criterion:

Proof. Consider an MWCSP I = (G, p) with |T_p| ≤ 2. The case |T_p| ≤ 1 is clear. Let {a, b} := T_p and assume p(a) ≥ p(b). Thus, there is a minimal optimal LP solution x such that x(a) = 1. Let (V, A) be the bidirected equivalent of G. Create a new directed graph (V′, A′) by replacing each node v ∈ V∖{a, b} by two nodes v_1, v_2 and arcs (v_1, v_2), (v_2, v_1). Further, all ingoing arcs of v become ingoing arcs of v_1, and all outgoing arcs of v are now outgoing arcs of v_2. Define arc capacities k for each pair of these new arcs by x(v); for any (remaining) arc e ∈ A set k(e) := ∞. By the max-flow/min-cut theorem there is an a-b flow f with |f| = x(b) in this extended network. Define the directed MWCSP I_r := ((V, A), T_f, r, p) with T_f := {a} and r := a, and set y := f↾A. Because of the optimality and minimality of x it holds that (x, y) ∈ 𝒫_LP(RSA(I_r)). Thus, v_LP(NCut(I)) ≤ v_LP(RSA(I_r)). Furthermore, y satisfies constraints (35). Because of v(NCut(I)) = v(RSA(I_r)), Lemma 3 implies that v_LP(NCut(I)) = v(NCut(I)). ▪
Figure 2 shows an MWCSP instance with |T_p| = 3 and v_LP(NCut) ≠ v(NCut). It holds that v(NCut) = 1, but v_LP(NCut) = 1.5 (set the values of all negative weight node variables to 0.5 and the remainder to 1).
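The node-splitting construction used in the proof is the usual way to turn vertex capacities into arc capacities, and the same device yields a test for the node separator inequalities. The sketch below is an illustration under assumptions, not the article's procedure: it uses a single arc from the "in" copy to the "out" copy of each vertex (instead of the pair of arcs in the proof), a large finite number in place of infinite capacities, a plain Edmonds-Karp max-flow, and the inequality form x(C) ≥ x(a) + x(b) − 1. All function names are hypothetical.

```python
from collections import deque, defaultdict

def min_weight_separator(edges, x, a, b):
    """Minimum total x-weight of a vertex set separating non-adjacent a and b."""
    big = sum(x.values()) + 1.0               # stands in for infinite capacity
    cap = defaultdict(float)
    adj = defaultdict(set)

    def add_arc(u, v, c):
        cap[(u, v)] += c
        adj[u].add(v)
        adj[v].add(u)                          # reverse direction for residual arcs

    for v in x:                                # split every vertex into in/out copies
        add_arc((v, "in"), (v, "out"), big if v in (a, b) else x[v])
    for u, v in edges:                         # each edge becomes two uncapacitated arcs
        add_arc((u, "out"), (v, "in"), big)
        add_arc((v, "out"), (u, "in"), big)

    source, sink, flow = (a, "out"), (b, "in"), 0.0
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:    # BFS for a shortest augmenting path
            u = queue.popleft()
            for w in adj[u]:
                if w not in parent and cap[(u, w)] > 1e-12:
                    parent[w] = u
                    queue.append(w)
        if sink not in parent:
            return flow                        # max flow equals min separator weight
        bottleneck, w = float("inf"), sink
        while parent[w] is not None:
            bottleneck = min(bottleneck, cap[(parent[w], w)])
            w = parent[w]
        w = sink
        while parent[w] is not None:
            cap[(parent[w], w)] -= bottleneck
            cap[(w, parent[w])] += bottleneck
            w = parent[w]
        flow += bottleneck

def ncut_violated(edges, x, a, b, eps=1e-9):
    """True if some (a, b)-separator C has x(C) < x(a) + x(b) - 1."""
    return min_weight_separator(edges, x, a, b) < x[a] + x[b] - 1 - eps

edges = [("a", "c"), ("c", "b")]
x = {"a": 1.0, "b": 1.0, "c": 0.5}
print(min_weight_separator(edges, x, "a", "b"), ncut_violated(edges, x, "a", "b"))
```

If a and b are adjacent, the returned value is at least the large stand-in capacity, which signals that no separator exists.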
Finally, by combining the previous two propositions we obtain a significantly shorter proof of a main result from [34].
Proof. Let p ∈ ℝ^V. If α(G) ≤ 2, then Proposition 9 implies that the MWCSP (G, p) can be transformed to an MWCSP with at most two positive weight vertices without changing v_LP(NCut). Now, Proposition 10 gives v_LP(NCut) = v(NCut). Because p can be chosen arbitrarily,

Indegree constraints
Given an undirected graph G = (V, E), q ∈ ℤ^n is an indegree vector if there is an orientation of G in which each vertex v ∈ V has indegree q_v. For each indegree vector q, the corresponding indegree inequality is given as ∑_{v∈V} (1 − q_v) x(v) ≤ 1, (57) where x ∈ ℝ^V_{≥0} are the node variables. [26] shows that the indegree inequalities describe the connected subgraph polytope if G is a tree. Furthermore, [34] gives conditions for (57) to be facet-inducing and shows that the constraints can be separated in linear time. It is further shown that the constraints (57) can strengthen the NCut formulation.
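To illustrate why these inequalities can be handled efficiently, the following sketch evaluates the indegree inequality for one particular orientation: every edge is oriented toward its endpoint with the smaller x-value, which minimizes ∑ q_v x(v) and therefore maximizes the left-hand side. The inequality form ∑ (1 − q_v) x(v) ≤ 1 and the greedy orientation rule are stated here as assumptions of the example; the function name is hypothetical.

```python
def most_violated_indegree(edges, x):
    """Orient each edge toward its lower-valued endpoint and evaluate the inequality.

    Returns the indegree vector q and the left-hand side sum((1 - q_v) * x_v).
    """
    q = {v: 0 for v in x}
    for u, v in edges:
        head = u if x[u] <= x[v] else v    # ties are broken toward the first endpoint
        q[head] += 1
    lhs = sum((1 - q[v]) * x[v] for v in x)
    return q, lhs

# Path a - b - c: selecting a and c without b is disconnected.
edges = [("a", "b"), ("b", "c")]
q, lhs = most_violated_indegree(edges, {"a": 1.0, "b": 0.0, "c": 1.0})
print(q, lhs)
```

For the disconnected selection {a, c} the left-hand side exceeds 1, whereas for the connected selection of all three path vertices it equals 1.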

Edge-based formulations
An edge-based formulation for the directed MWCSP is introduced in [1], based on a transformation to the prize-collecting SPG. We will use essentially the same formulation for the undirected MWCSP, but without the transformation to the prize-collecting SPG, and thus with a different objective function. Consider the bidirected equivalent D = (V, A) to the given undirected graph. Let (V_r, A_r) be the directed graph defined as follows with an additional node r: V_r := V ∪ {r} and A_r := A ∪ {(r, v) | v ∈ V}. Define the following extended MWCSP formulation based on the new graph (V_r, A_r).

Formulation 7. Extended Steiner arborescence formulation (ESA)
ESA is almost the same as RSA (Formulation 3). The additional constraint (61) ensures that at most one arc incident to the (artificial) root node r is selected. Otherwise, a solution could consist of several connected components in the original graph (V, E).
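Following this description, the extended formulation can be sketched as below, with δ⁻ and δ⁺ taken in the extended graph (V_r, A_r). The sketch is an assumed reconstruction: only the roles of (59), (60), and (61) are taken from the surrounding text, and the extended arc set is assumed to add one arc (r, v) for every v ∈ V.

```latex
\begin{align*}
\max\quad & p^{T} x \\
\text{s.t.}\quad
 & y(\delta^-(v)) = x(v) && \text{for all } v \in V, \tag{59}\\
 & y(\delta^-(U)) \ge x(v) && \text{for all } U \subseteq V,\ v \in U, \tag{60}\\
 & y(\delta^+(r)) \le 1, \tag{61}\\
 & x(v),\ y(a) \in \{0, 1\} && \text{for all } v \in V,\ a \in A_r.
\end{align*}
```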
The remainder of this section aims to prove an integrality condition for proj_x(𝒫_LP(ESA)) based on the independence number. Our approach can be divided into two parts. First, we show that for any MWCSP instance with 1 ≤ |T_p| ≤ 3 there is an optimal LP solution (x, y) with x(v) = 1 for a v ∈ T_p. In the second part (following Lemma 15), we use this v as a root node and apply the same principal ideas already used in Section 2.3.1 for the USPG: We show the invariance of the integrality gap under edge contraction and reduce any MWCSP instance with bounded independence number to an MWCSP instance with a bounded number of positive vertices. We start with an easy technical result.

Lemma 12.
Let (x, y) be an optimal LP solution to ESA, and let v ∈ V. There is a ỹ ∈ ℝ^{A_r} with ỹ((r, v)) = x(v) such that (x, ỹ) is an optimal LP solution to ESA.

Lemma 13.
Let (x, y⁺) be an optimal LP solution to ESA⁺. Then (x, y) ∈ ℝ^{V∪A_r} with y(a) := y⁺(a) for a ∈ A_r⁺ and y(a) := 0 for a ∈ A_r∖A_r⁺ is an optimal LP solution to ESA.
Proof. Let ESA′ be the reduced version of ESA where constraints (60) are only enforced for vertices […] In this proof, we only consider minimal optimal LP solutions, that is, solutions for which no entry can be reduced without losing either feasibility or optimality. First, we show that any optimal LP solution to ESA⁺ is also optimal for ESA′. To this end, we show the existence of an optimal LP solution (x′, y′) to ESA′ such that y′((r, v)) = 0 for all v ∈ V∖T_p. Assume there is an optimal LP solution (x′, y′) to ESA′ with y′((r, v)) > 0 for a v ∈ V∖T_p. Because (x′, y′) is optimal, there is an r − t flow f^t with f^t ≤ y′ for a t ∈ T_p with |f^t| = y′((r, v)). We can now proceed as in Lemma 12 to revert the flow going to t. The resulting optimal solution (x, ỹ) satisfies ỹ((r, v)) = 0 and ỹ((r, u)) ≤ y′((r, u)) for all u ∈ V∖{t}.
Second, we show that any optimal LP solution (x′, y′) to ESA′ with y′((r, v)) = 0 for all v ∈ V∖T_p satisfies constraints (60) also for v ∈ U with v ∉ T_p. We use essentially the same line of argumentation as used in [21] for the SPG bidirected cut formulation. Suppose there is a U ⊆ V and a u ∈ U with […] Choose such a U with |U| as small as possible. Because of (65), there is an e ∈ δ⁻(u)∖δ⁻(U) such that y′(e) > 0. Because of the minimality of (x′, y′), there is a W ⊆ V and a t ∈ W ∩ T_p such that e ∈ δ⁻(W) and […] Because of e ⊆ U and |e ∩ W| = 1, one obtains |U ∩ W| < |U|. We will show that U ∩ W satisfies (65), which contradicts the minimality of |U|. By standard graph theory we have that y′(δ⁻(U)) + y′(δ⁻(W)) ≥ y′(δ⁻(U ∩ W)) + y′(δ⁻(U ∪ W)).
With (66), it follows that y′(δ⁻(U)) ≥ y′(δ⁻(U ∪ W)), which leads to a contradiction. ▪
Further, we require the following result. The (quite lengthy) proof is given in the Appendix.

Lemma 15.
If |T_p| ≤ 3, then there is an optimal LP solution (x, y) to ESA such that x(t) ∈ {0, 1} for all t ∈ T_p.
As the last piece, we have the now familiar contraction result (with a slight generalization).

Proposition 16. v LP (ESA) is invariant under the contraction of adjacent vertices of non-negative weight.
The proposition can be proven in a similar way as Proposition 5, with a few additional technical details. We now reach the main result of this section. […] Also, I′ satisfies |T′_p| ≤ 3 and the vertices T′_p are independent. By Lemmas 12 and 15, there is an optimal LP solution (x, ỹ) to ESA(I′) such that x(u) ∈ {0, 1} for all u ∈ T′_p, and y((r, t)) = 1 for one […] For simplicity, we deviate from the assumption that fixed terminals have 0 weight. It holds that v(ESA(I′)) = v(RSA(I′_t)) and v_LP(ESA(I′)) = v_LP(RSA(I′_t)). We will show that v_LP(RSA(I′_t)) = v(RSA(I′_t)), which concludes the proof. Let (x, y) be the restriction of (x, ỹ) to (V′, A′). Note that (x, y) is an optimal LP solution to RSA(I′_t). Suppose that (67) does not hold. Thus, by Lemma […] The case |T′_p| < 3 can be readily ruled out by a flow argument. So assume |T′_p| = 3. Because of α(G) ≤ 3, at least one vertex u ∈ T′_p is adjacent to v. Recall that x(u) ∈ {0, 1}. If x(u) = 0, we reduce the problem to the support graph of (x, y), which corresponds to the case |T′_p| < 3. So assume x(u) = 1. Further, construct an optimal solution (x, ỹ) to I′_u with root u analogously to Lemma 12. In this way, y(δ⁺(v)) < ỹ(δ⁻(v)) holds again (for the same v as above). In the following, assume u = t. Define a new LP solution (x′, y′) from y as follows. For a_0 := (t, v) set y′(a_0) := y(δ⁺(v)). For any a ∈ δ⁻(v)∖{a_0} set y′(a) := 0. For all a ∈ A′∖δ⁻(v) set y′(a) := y(a). Set x′(v) := y(δ⁺(v)), and x′(w) := x(w) for all w ∈ V∖{v}. By construction of I′_t it holds that p(v) < 0 (otherwise, v would have been contracted into u). Thus, p′ᵀx′ > p′ᵀx. The feasibility of (x′, y′) can be seen as in the proof of Theorem 6. ▪
Note that there are graphs with α(G) = 4 such that proj_x(𝒫_LP(ESA)) is not integral. For an example, extend the graph in Figure 2 as follows. Add a new vertex v and edges between v and the (three) vertices of negative weight shown in the figure.

Comparison of the formulations
A result from [1] states that the directed equivalents of ESA and (a slight generalization of) NCut induce the same polyhedral relaxation of the directed connected subgraph polytope. This result suggests that the same relation holds for the undirected case. Unfortunately, the result from [1] is not correct (the proof suffers from a similar problem as that discussed in Appendix B for the rooted case). The strict inclusion result given in the next proposition can indeed also be extended to the directed case.

Proposition 18.
The following inclusion holds and can be strict:

$$\operatorname{proj}_x(\mathcal{P}_{LP}(\mathrm{ESA})) \subseteq \mathcal{P}_{LP}(\mathrm{NCut}).$$

Proof. Let $(x, y) \in \mathcal{P}_{LP}(\mathrm{ESA})$ and let $a, b \in V$, $a \ne b$. Let $C$ be an $(a, b)$-separator and let $U_a$ be the connected component in the graph $(V \setminus C, E[V \setminus C])$ with $a \in U_a$. Define $\bar{U}_b := V \setminus U_a$ and $\bar{U}_a := U_a \cup C$. Because of $\bar{U}_a \cap \bar{U}_b = C$, one obtains where we use $\delta^- := \delta^-_{D_r}$. Thus, $= y(\delta^+(r)) + y(\delta^-(C))$ (72)

An example for a strict inclusion is given in Figure 2. Consider the following point that is in $\mathcal{P}_{LP}(\mathrm{NCut})$, but not in $\operatorname{proj}_x(\mathcal{P}_{LP}(\mathrm{ESA}))$: Set the $x$ values of all negative weight node variables to 0.5 and the remainder to 1. To see that this point is indeed not in $\operatorname{proj}_x(\mathcal{P}_{LP}(\mathrm{ESA}))$, consider arc variables $y$ such that $y(\delta^-(v)) = x(v)$ for all vertices $v$. Because of $x(d) = 1$, we can proceed as in Lemma 12 and assume that $y(\{r, d\}) = 1$. First, we have $y(\delta^-(\{a, b, c\})) \ge x(a) = 1$. Because of

Next, we consider the indegree constraints. Following [34], we define While $\mathcal{P}' \nsubseteq \mathcal{P}_{LP}(\mathrm{NCut})$, see, for example, [34], the indegree constraints cannot improve the ESA formulation, as the following proposition shows.

Proposition 19.
The following inclusion holds and can be strict:

$$\operatorname{proj}_x(\mathcal{P}_{LP}(\mathrm{ESA})) \subseteq \mathcal{P}'.$$

Proof. Consider an undirected graph $G$, and let $D$ be its bidirected equivalent. Furthermore, let $D_r$ be the extended, directed graph on which ESA is defined. Let $(x, y) \in \mathcal{P}_{LP}(\mathrm{ESA})$. First, note that constraints (59) and (60) imply for all $\{v, w\} \in E$ that Consider an arbitrary indegree vector. It holds that $\sum_{v \in V}$ which implies that (57) is satisfied by $x$; thus, $x \in \mathcal{P}'$.

For a strict inclusion consider the graph in Figure 2 and the point $x$ as defined in the proof of Proposition 18. To see that this point satisfies all indegree constraints, consider an indegree vector $q$ that minimizes $\sum_{v \in V} q_v x(v)$. Because all adjacent vertices of $a$, $d$, and $f$ have value 0.5, we obtain $q_a = q_d = q_f = 0$. Thus, , which shows that the indegree constraint is satisfied. ▪

Summarizing the results of this section, one obtains:

$$\operatorname{proj}_x(\mathcal{P}_{LP}(\mathrm{ESA})) \subseteq \mathcal{P}_{LP}(\mathrm{NCut}) \cap \mathcal{P}',$$

and the inclusion can be strict.
Finally, note that by using one flow for each vertex, similar to the DF formulation, it is also possible to obtain a compact extended formulation for the connected subgraph polytope that is equivalent to ESA, and thus (strictly) stronger than the combined node-separator and indegree formulation.
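For very small graphs, membership of a point in the node-separator and indegree relaxations compared in this section can be checked by brute force. The following Python sketch assumes the standard forms of the two inequality families from the literature, namely $x(a) + x(b) - x(C) \le 1$ for non-adjacent $a, b$ and every $(a, b)$-separator $C$, and $\sum_{v \in V} (1 - q_v) x(v) \le 1$ for every indegree vector $q$ of an orientation of $G$. These forms are an assumption here, since the numbered constraints are not restated in this section; the function names and the path graph are hypothetical, and the enumeration is exponential.

```python
from itertools import combinations, product


def reachable(adj, start, removed):
    """Nodes reachable from start after deleting the nodes in removed."""
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in removed and w not in seen:
                seen.add(w)
                stack.append(w)
    return seen


def satisfies_node_separator(nodes, edges, x, eps=1e-9):
    """Check x(a) + x(b) - x(C) <= 1 for all non-adjacent a, b and separators C."""
    adj = {v: set() for v in nodes}
    for u, w in edges:
        adj[u].add(w)
        adj[w].add(u)
    for a, b in combinations(nodes, 2):
        if b in adj[a]:
            continue  # adjacent pairs have no separator
        rest = [v for v in nodes if v not in (a, b)]
        for k in range(len(rest) + 1):
            for C in combinations(rest, k):
                if b in reachable(adj, a, set(C)):
                    continue  # C does not separate a from b
                if x[a] + x[b] - sum(x[c] for c in C) > 1 + eps:
                    return False
    return True


def satisfies_indegree(nodes, edges, x, eps=1e-9):
    """Check sum_v (1 - q_v) x(v) <= 1 for the indegree vector q of every orientation."""
    for heads in product((0, 1), repeat=len(edges)):
        q = {v: 0 for v in nodes}
        for (u, w), h in zip(edges, heads):
            q[w if h else u] += 1  # the chosen head gains one unit of indegree
        if sum((1 - q[v]) * x[v] for v in nodes) > 1 + eps:
            return False
    return True


# Hypothetical example: the path a - b - c.
nodes = ["a", "b", "c"]
edges = [("a", "b"), ("b", "c")]
x_bad = {"a": 1.0, "b": 0.5, "c": 1.0}
x_ok = {"a": 0.5, "b": 0.0, "c": 0.5}
print(satisfies_node_separator(nodes, edges, x_bad), satisfies_indegree(nodes, edges, x_bad))
print(satisfies_node_separator(nodes, edges, x_ok), satisfies_indegree(nodes, edges, x_ok))
```

On the path, the point with value 0.5 on the middle vertex and 1 on both end vertices is cut off by both families, whereas the midpoint of the two single-vertex solutions satisfies both.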

CONCLUSION
This article has analyzed node-based and edge-based formulations for combinatorial optimization problems based on induced connectivity. Furthermore, we have shown conditions for the LP-relaxations to be tight. In particular, a (compact) complete description of the connected subgraph polytope for graphs with no four independent vertices has been given. Overall, it has been demonstrated that the edge-based formulations consistently provide stronger LP-relaxations than their node-based counterparts. For MWCSP, the considered edge cut formulation has been shown to be strictly stronger than the combination of the well-known node-separator and indegree formulations.
Finally, we note that the theoretical predominance of edge-based formulations over node-based ones is complemented by recent computational results: In [32], an MWCSP solver that uses the ESA + FB formulation is shown to significantly outperform all other (and in particular node-based) MWCSP solvers from the literature.

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available from the corresponding author upon reasonable request.

Because of (A8) and (A9), we obtain

Define $\tilde{y} \in \mathbb{R}^{A_r^+}$ by $\tilde{y}(e) := \max\{y(e), f_a(e), f_b(e)\}$ for all $e \in A_r^+$, and define $\tilde{x} \in \mathbb{R}^V$ accordingly. It holds that $p^T \tilde{x} = p^T x$, and $\tilde{x}(c) = 1$. ▪

APPENDIX B: NODE SEPARATORS AND REJOINING OF FLOWS
Consider the directed RMWCSP instance $(G, T, p, r)$ with $G = (V, A)$ depicted in Figure B1. A proof from [2] intends to show that $v_{LP}(\mathrm{RNCut}) \le v_{LP}(\mathrm{RSA})$ holds. For this purpose, the authors consider an arbitrary solution $x \in \mathcal{P}_{LP}(\mathrm{RNCut})$ and construct an auxiliary graph $G'$ by replacing each node $v \in V \setminus \{r\}$ with an arc $(v_1, v_2)$. All ingoing arcs of $v$ become ingoing arcs of $v_1$, and all outgoing arcs of $v$ become outgoing arcs of $v_2$. Moreover, (non-negative) capacities $k'$ on $G'$ are introduced for each arc $(v', w')$ of $G'$ by

$$k'(v', w') := \begin{cases} x(v), & \text{if } (v', w') = (v_1, v_2) \text{ for some } v \in V \setminus \{r\}, \\ 1, & \text{otherwise.} \end{cases}$$

Figure B2 shows an auxiliary support graph of the instance illustrated by Figure B1.

FIGURE B1 Directed RMWCSP instance

FIGURE B2 Illustration of an auxiliary support graph $G'$ corresponding to the instance in Figure B1 regarding the optimal solution $x(v) = 0.5$, $v \in V \setminus T$, and $x(t) = 1$, $t \in T$, to the RNCut formulation

It is possible to send a flow with flow value $x(v)$ from root node $r$ to each arc $(v_1, v_2)$ with $v \in V \setminus \{r\}$ because of constraints (32). Let $f_v(j, l)$ be the amount of a flow with source node $r$, sink node $v \in V \setminus \{r\}$, and flow value $x(v)$ sent along arc $(j, l)$. Define the arc variables $\hat{y}(j, l)$, $(j, l) \in A$, of the RSA formulation as follows:

$$\hat{y}(j, l) := \begin{cases} \max_{v \in V \setminus \{r\}} f_v(j_2, l_1), & j, l \in V \setminus \{r\}, \\ \max_{v \in V \setminus \{r\}} f_v(j, l_1), & j = r,\ l \in V \setminus \{r\}. \end{cases}$$
Hence, the arc variables of the instance in Figure B2 are given by $\hat{y}(j, l) = 0.5$ for each $(j, l) \in A$. Moreover, define the node variables as $\hat{x}(v) = \hat{y}(\delta^-(v))$. Thus, in our case, it holds that $\hat{x}(a) = \hat{x}(b) = \hat{x}(c) = \hat{x}(e) = 0.5$ and $\hat{x}(d) = 1$. The proof from [2] claims that $x(v) = \hat{x}(v)$, $v \in V$, follows from this definition of the variables. However, this claim is not true because of $0.5 = x(d) \ne \hat{x}(d) = 1$, and therefore no solution can be constructed in this way from the solution $x$ to the RNCut model.

In summary, and somewhat broadly speaking, the weaker LP-relaxation can be explained as follows. The RNCut formulation can be interpreted as a multi-commodity flow problem in an enlarged graph. However, enlarging the graph opens new possibilities for what is sometimes called rejoining of flows [31]: Flows for different commodities enter a node on different arcs, but leave on the same arc. Such a rejoining can lead to an increased integrality gap.
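The rejoining effect can be reproduced numerically with a minimal Python sketch. The instance below is hypothetical and is not the one from Figure B1; the flows are chosen by hand rather than computed, and they are stored directly on the original arcs $(j, l)$ instead of the corresponding arcs of $G'$. The sketch only evaluates the definitions of $\hat{y}$ and $\hat{x}$ given above.

```python
def rejoin(arcs, flows):
    """Compute y_hat(j, l) = max_v f_v(j, l) and x_hat(v) = y_hat(delta^-(v))."""
    y_hat = {arc: max(f.get(arc, 0.0) for f in flows.values()) for arc in arcs}
    x_hat = {}
    for (_, head), val in y_hat.items():
        x_hat[head] = x_hat.get(head, 0.0) + val  # sum over arcs entering head
    return y_hat, x_hat


# Hypothetical instance: root r, all other nodes with x(v) = 0.5.
arcs = [("r", "a"), ("r", "b"), ("a", "d"), ("b", "d"), ("d", "c"), ("d", "e")]
x = {v: 0.5 for v in "abcde"}

# One hand-picked flow of value x(v) per sink v (assumed, not computed).
flows = {
    "a": {("r", "a"): 0.5},
    "b": {("r", "b"): 0.5},
    "d": {("r", "a"): 0.5, ("a", "d"): 0.5},
    "c": {("r", "a"): 0.5, ("a", "d"): 0.5, ("d", "c"): 0.5},
    "e": {("r", "b"): 0.5, ("b", "d"): 0.5, ("d", "e"): 0.5},
}

y_hat, x_hat = rejoin(arcs, flows)
print(y_hat)
print(x_hat)
```

The flows for sinks c and e enter d on different arcs and leave it through the same node capacity, so every arc receives the value 0.5 and the inflow of d adds up to 1, although $x(d) = 0.5$.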