### Keywords:

• branch and cut;
• clique;
• cutting plane;
• separation;
• stable set;
• stable set polytope

### Abstract

This tutorial provides an overview of various characteristics of effective branch and cut type algorithms for the maximum stable set problem. We discuss several facet-defining inequalities for the stable set polytope along with their separation routines. In particular, we review implementation tweaks for the separation routines and reference empirical studies, illustrating the performance of these cutting planes for benchmark graphs. In addition to the polyhedral study, we present basic preprocessing, discuss heuristic methods particularly suited within a branch and cut framework, and examine a branching rule.

### 1. Introduction

Let $G=(V,E)$ be an undirected, simple graph with node set $V$ and edge set $E \subseteq V \times V$. $G_c$ is graph $G$ with a node-weight function $c$. A stable set of a graph $G$ is a set of nodes $S$ with the property that the nodes of $S$ are pairwise non-adjacent. A stable set is also known as an independent set, co-clique or anticlique. We abbreviate a stable set as . A stable set of graph $G_c$ that maximizes the total weight $\sum_{v \in S} c(v)$ is called a maximum stable set, or a maximum-weight stable set, and is denoted by the symbol . A maximum stable set is called a maximum-cardinality stable set or a maximum-size stable set for the special case in which the weighting function is equal to 1 on every node ($c \equiv 1$). The weight of a maximum stable set is denoted by $\alpha(G_c)$ and, in the case of $c \equiv 1$, it is the cardinality of a maximum stable set of the corresponding unweighted graph. In this case, we neglect $c$ and simply write $\alpha(G)$. The size of a maximum-cardinality stable set is called the stability number or the stable set number. We distinguish between a maximum stable set and a maximal stable set; a maximal stable set is an inclusion-maximal stable set. This means that there is no stable set with a higher weight that contains the maximal stable set.

The weighted graph in Fig. 1 has eight nodes, $v_1, \dots, v_8$. The nodes and their weights are written as a pair for each node. The set $\{v_2, v_8\}$, for instance, is a stable set of $G$. The dark-shaded nodes $v_2$, $v_4$ and $v_7$ form a maximum stable set with weight $\alpha(G_c) = 6$. It can be seen from this graph that a maximum stable set is not uniquely determined: the set $\{v_2, v_4, v_8\}$ is also a maximum stable set with weight 6. The light-shaded nodes $v_3$ and $v_5$ form a maximal stable set of $G$ with weight 5, because $\Gamma(\{v_3, v_5\}) \cup \{v_3, v_5\} = V$, where the neighborhood $\Gamma(W)$ for $W \subseteq V$ denotes the set of all nodes that are adjacent to at least one node in $W$.
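The difference between stable, maximal stable and maximum stable sets can be checked mechanically on small graphs. The following is a minimal sketch, assuming a graph stored as a dictionary of neighbor sets; the weighted five cycle and all function names are hypothetical illustrations (the edges of Fig. 1 are not reproduced here), and the enumeration is only sensible for very small instances.

```python
from itertools import combinations

def is_stable(adj, nodes):
    """True if no two nodes in `nodes` are adjacent."""
    return all(v not in adj[u] for u, v in combinations(nodes, 2))

def is_maximal_stable(adj, nodes):
    """True if `nodes` is stable and every other node has a neighbor in it."""
    s = set(nodes)
    return is_stable(adj, s) and all(adj[v] & s for v in adj if v not in s)

def max_weight_stable_set(adj, weight):
    """Brute-force enumeration of all node subsets (exponential time)."""
    best, best_w = set(), 0
    nodes = sorted(adj)
    for r in range(1, len(nodes) + 1):
        for cand in combinations(nodes, r):
            w = sum(weight[v] for v in cand)
            if w > best_w and is_stable(adj, cand):
                best, best_w = set(cand), w
    return best, best_w

# Hypothetical weighted five cycle 1-2-3-4-5-1 (not the graph of Fig. 1).
adj = {1: {2, 5}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4, 1}}
weight = {1: 3, 2: 1, 3: 1, 4: 2, 5: 1}
best, alpha = max_weight_stable_set(adj, weight)
```

In this example the set `{2, 5}` is maximal (every other node has a neighbor in it) but not maximum, which mirrors the distinction drawn for the light-shaded nodes above.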

The complementary concept to a stable set is a clique, i.e., a set of pairwise adjacent nodes, which induces a complete subgraph. The clique number is the maximum weight of a clique in $G$ and is denoted by $\omega(G_c)$. Note that a clique in $G$ corresponds to a stable set in the complement graph $\overline{G}$ with the same weight and vice versa. Hence, there is a bijection between the stable sets in $G$ and the cliques in $\overline{G}$, which gives

$$\alpha(G_c) = \omega(\overline{G}_c). \tag{1}$$

This shows that the maximum stable set problem and the problem of determining a maximum clique are equivalent.
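The correspondence between stable sets and cliques of the complement graph can be illustrated as follows. This is a small sketch under the assumption of an unweighted graph given as neighbor sets; the path graph and the helper names `complement` and `largest` are hypothetical.

```python
from itertools import combinations

def complement(adj):
    """Complement graph: two distinct nodes are adjacent iff they are not adjacent in adj."""
    nodes = set(adj)
    return {v: nodes - adj[v] - {v} for v in adj}

def is_stable(adj, nodes):
    return all(v not in adj[u] for u, v in combinations(nodes, 2))

def is_clique(adj, nodes):
    return all(v in adj[u] for u, v in combinations(nodes, 2))

def largest(adj, predicate):
    """Size of a largest node subset satisfying `predicate` (brute force)."""
    nodes = sorted(adj)
    return max(len(c) for r in range(len(nodes) + 1)
               for c in combinations(nodes, r) if predicate(adj, c))

# Hypothetical example: the path 1-2-3-4.
adj = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
co = complement(adj)
```

Every stable set of `adj` is a clique of `co`, so the stability number of the path equals the clique number of its complement.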

It is well known that it is NP-hard to determine a maximum stable set in an arbitrary graph; for a proof see, for instance, Garey and Johnson (1979). This is still true in the special case in which the weighting function is equal to 1. Furthermore, it can be shown that for fixed ɛ>0, there is no polynomial time algorithm for approximating the stability number within a factor of under the assumption that (cf. Arora and Safra, 1992; Fujisawa et al., 1995).

The maximum stable set problem is a classical combinatorial optimization problem. It is not only of theoretical interest, but has applications in various fields (Bomze et al., 1999; Rebennack, 2006; Avenali, 2007; Butenko, 2003). Therefore, it is not surprising that there is a huge body of literature on exact and heuristic approaches for solving instances of the maximum stable set problem (clique problem). Branch and cut algorithms are one of several techniques used to tackle these computationally challenging problems. Other popular methods include branch and bound (Pardalos and Rodgers, 1992; Pardalos et al., 1993), branch and price (Warrier et al., 2005), continuous formulation approaches (Gibbons et al., 1997), global optimization (Pardalos and Phillips, 1990), Lagrangian relaxation (Campelo and Correa, forthcoming) and semidefinite programming (Dukanovich and Rendl, 2007).

The purpose of this tutorial is to provide an overview of the various contributions towards branch and cut algorithms for the maximum stable set problem. Our focus lies in examining polyhedral structures of the stable set problem. A non-expert should be able to implement a moderately sophisticated branch and cut algorithm after studying this tutorial. However, this tutorial is not intended to be a comprehensive literature review. The remainder is largely based on Rebennack (2006, 2008) and Rebennack et al. (2011).

This tutorial is organized as follows. We start with the concepts of general branch and cut algorithms in Section 'Branch and cut algorithms: general concepts'. The stable set polytope is discussed in Section 'Stable set polytope', where various facet-defining inequalities for the stable set polytope are reviewed. The separation of the previously presented inequalities is the topic of Section 'Separation'. Preprocessing is presented in Section 'Preprocessing', followed by branching in Section 'Branching' and heuristic methods in the subsequent section. We conclude with Section 'Conclusions'.

### 2. Branch and cut algorithms: general concepts

In this section, we discuss some general concepts of branch and cut algorithms for integer programming (IP). This part is inspired by Christof (1997) and Theis (2005) and is mainly based on Elf et al. (2001) and Kallrath and Wilson (1997).

We want to solve an IP problem in the following generic form:

$$\max\; c^{\top}x \tag{2}$$

$$\text{s.t.}\quad Ax \le b \tag{3}$$

$$x \in \mathbb{Z}^n \tag{4}$$

with integral decision variables $x$. The task is to maximize the linear function $c^{\top}x$ while the variables must adhere to the set of linear constraints (3). Following this section, we focus on the special case of binary variables $x$.

The difficulty in solving an IP problem of form (2)–(4) lies in the integrality constraints (4). In fact, these constraints make the problem NP-hard, see Karp (1972) and Borosch and Treybig (1976). This still holds true for the case in which the decision variables are binary. By contrast, linear programming (LP) problems are of form (2)–(3) with continuous decision variables. LP problems can be solved in polynomial time; it is still unknown whether LP problems can be solved in strongly polynomial time, see Rebennack (2008).

One approach to solving the IP problem is to compute the convex hull of all feasible solutions of the IP problem, i.e., $P_I := \operatorname{conv}\{x \in \mathbb{Z}^n : Ax \le b\}$. The optimization problem over this convex hull is then an LP, as all vertices of $P_I$ are integer feasible solutions. If we can describe (in polynomial time) this polytope with a polynomially sized list of inequalities, we can compute the optimum in polynomial time. Unfortunately, in general, the number of facets of a polytope $P_I$ may be exponential in the number of variables or it may be NP-hard to compute certain facets.

Consider now Fig. 2, which shows the graphical representation of an IP problem with two decision variables and six constraints. The integer polytope (shaded area) is given as the convex hull of all feasible integer solutions and contains the optimal integer solution. However, relaxing the integrality constraints leads to a different solution. Note that in order to compute the optimal solution via LP techniques, it is not necessary to derive all the facets. Instead, the two dashed facets passing through the optimal integer solution are enough in our example to describe the integer polytope at the optimal solution point.

In order to solve the system (2)–(4), one can omit the integrality condition on $x$ to obtain a so-called LP-relaxation, or more precisely, an LP-domain-relaxation. The solution $x^*$ of the LP-relaxation may be fractional, but its objective value provides an upper bound (UB) for the maximization IP (2)–(4). By adding special inequalities to the system $Ax \le b$, the LP-relaxation polytope more closely approximates polytope $P_I$. “Special” means that, if $x^*$ is a vector satisfying $Ax^* \le b$ but $x^* \notin P_I$, we want an inequality $a^{\top}x \le a_0$ that is valid for $P_I$ and violated by $x^*$, a so-called cutting plane, separating $x^*$ from $P_I$. Determining whether such an inequality exists and computing one, if possible, is called the separation problem. For instance, in Fig. 2, inequality (11) is a cutting plane for the solution of the LP-relaxation. An important theorem in this context is that, under some technical conditions, the optimization problem can be solved in polynomial time if and only if the separation problem is polynomially solvable (Grötschel et al., 1988). This is known as the fundamental theorem of the equivalence of separation and optimization. In order to solve an IP problem, one could generate such cutting planes until the optimal solution is found. The first cutting plane method goes back to Miliotis (1978). In practice, however, a combination of cutting planes with the branch and bound technique is more successful. This combination is called Branch and Cut. The use of that method was first published by Grötschel et al. (1984) and its name was introduced by Padberg and Rinaldi (1987). A detailed history of Branch and Cut methods is presented in Applegate et al. (2006).

The core of a branch and cut algorithm is displayed in Fig. 3. After initialization and preprocessing, this algorithm uses a tree to maintain subproblems. It begins the bounding process with its root. After solving an LP problem (initially, this is the LP-relaxation), the solution must be checked for LP feasibility (if the LP-relaxation is infeasible, then the IP is infeasible as well). In the case of feasibility, the local upper bound (LUB) is updated. If the LUB is smaller than the lower bound (LB), the subproblem is fathomed; for maximization problems, the objective function value of any feasible solution defines an LB. Otherwise, the algorithm checks if the solution is integer feasible. If it is an integer solution, the best solution for this subproblem has been computed and this node of the tree can be fathomed. Generally, the solution is not integer feasible and heuristics are used to compute feasible solutions. This is done in the step “expand LP” of Fig. 3. The separation routine computes cutting planes that are added to the LP-relaxation. If the improvement of the LP objective function value after adding new constraints is too small or the separation routines do not compute any new constraint, the algorithm branches, creating new subproblems; the collection of all such subproblems is called the tree. If the selection of a subproblem of the tree is successful, which means that the tree is not empty and that the UB is greater than the LB, the algorithm reenters the bounding phase. This process repeats until the tree is empty.
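The interplay of bounding, fathoming and branching can be sketched without an LP solver. The following is a simplified illustration, not the algorithm of Fig. 3: the LP-relaxation and the separation step are replaced by a combinatorial bound from a greedy clique cover (a stable set contains at most one node per clique), and positive node weights are assumed. All names are hypothetical.

```python
def clique_cover_bound(adj, weight, nodes):
    """Upper bound on the weight of a stable set inside `nodes`:
    cover the nodes greedily by cliques and sum the heaviest node of each."""
    order = sorted(nodes, key=lambda v: -weight[v])
    used, bound = set(), 0
    for v in order:
        if v in used:
            continue
        clique = [v]
        used.add(v)
        for u in order:
            if u not in used and all(u in adj[w] for w in clique):
                clique.append(u)
                used.add(u)
        bound += weight[v]  # v is the heaviest node of its clique
    return bound

def branch_and_bound(adj, weight):
    best_set, best_w = set(), 0             # incumbent; best_w is the LB
    tree = [(set(), set(adj))]              # subproblem: (chosen, candidates)
    while tree:
        chosen, cand = tree.pop()
        value = sum(weight[v] for v in chosen)
        if value > best_w:                  # new feasible solution improves the LB
            best_set, best_w = set(chosen), value
        if not cand:
            continue
        lub = value + clique_cover_bound(adj, weight, cand)
        if lub <= best_w:                   # fathom: the local bound cannot beat the LB
            continue
        v = max(sorted(cand), key=lambda u: len(adj[u] & cand))
        tree.append((chosen, cand - {v}))                  # branch x_v = 0
        tree.append((chosen | {v}, cand - adj[v] - {v}))   # branch x_v = 1
    return best_set, best_w

# Hypothetical weighted five cycle 1-2-3-4-5-1.
adj = {1: {2, 5}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4, 1}}
weight = {1: 3, 2: 1, 3: 1, 4: 2, 5: 1}
solution, value = branch_and_bound(adj, weight)
```

In a branch and cut code, `clique_cover_bound` would be replaced by the objective value of the LP-relaxation strengthened with cutting planes; the fathoming test stays the same.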

While the LP-relaxation of (2)–(4) provides a global upper bound (GUB), each subproblem computes an LUB, which is only valid for all its children in the tree. In some cases, the particular integer programming problem instance is too difficult to be solved exactly. Therefore, one can stop the algorithm after a certain number of iterations. A strength of branch and cut algorithms lies in the fact that the solution quality is provided for the obtained solution (assuming that the problem is feasible and that at least one feasible solution has been computed). This quality is measured by the gap, which is calculated by $(\mathrm{GUB} - \mathrm{LB})/\mathrm{GUB}$.
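As a small worked example of the gap formula, the sketch below uses a hypothetical helper `optimality_gap`; the numbers 8.5 and 6 are the LP-relaxation value and the maximum stable set weight reported for the graph of Fig. 1 later in this tutorial.

```python
def optimality_gap(gub, lb):
    """Relative gap (GUB - LB) / GUB for a maximization problem with GUB > 0."""
    if gub <= 0:
        raise ValueError("the gap is only defined here for a positive upper bound")
    return (gub - lb) / gub

gap = optimality_gap(8.5, 6.0)   # roughly 0.29, i.e. a gap of about 29%
```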

It is not always helpful to add inequalities to the LP-relaxation. The system $Ax \le b$ may grow too big, increasing the computation time. Therefore, after each separation phase, some inequalities are eliminated and stored in a pool. This pool can also be checked every few iterations to see whether it contains a violated constraint, which is then added again to the system $Ax \le b$. This process is called pool separation.
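Pool separation amounts to scanning stored inequalities for violations by the current LP solution. The following is a minimal sketch, assuming inequalities of the form $\sum_v a_v x_v \le \text{rhs}$ stored as coefficient dictionaries; the class `CutPool` and its methods are hypothetical names, not part of any particular solver.

```python
class CutPool:
    """Stores inequalities sum(coeffs[v] * x[v]) <= rhs that were removed from the LP."""

    def __init__(self):
        self.cuts = []

    def add(self, coeffs, rhs):
        self.cuts.append((dict(coeffs), rhs))

    def separate(self, x, tol=1e-9):
        """Return the stored cuts violated by the point x, most violated first."""
        violated = []
        for coeffs, rhs in self.cuts:
            lhs = sum(a * x.get(v, 0.0) for v, a in coeffs.items())
            if lhs > rhs + tol:
                violated.append((lhs - rhs, coeffs, rhs))
        violated.sort(key=lambda t: -t[0])
        return [(coeffs, rhs) for _, coeffs, rhs in violated]

pool = CutPool()
pool.add({5: 1, 6: 1, 7: 1}, 1)   # a triangle inequality
pool.add({1: 1, 2: 1}, 1)         # an edge inequality
x = {1: 0.5, 2: 0.5, 5: 0.5, 6: 0.5, 7: 0.5}
readd = pool.separate(x)          # only the triangle inequality is violated
```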

In the next sections, we focus on various aspects (preprocessing, generation of feasible solutions, separation and branching) of a branch and cut algorithm tailored to solve maximum stable set problems; but we study the stable set polytope first.

### 3. Stable set polytope

The separation procedures are the heart of any branch and cut algorithm. Specifically, one is interested in computing facet-defining inequalities; these are inequalities that are not dominated by any other valid inequality. The generation of facet-defining inequalities is the topic of Section 'Separation', which is based on the polyhedral study of the stable set problem in this section. As such, all the results obtained are valid both for maximum-cardinality stable set problems and for maximum-weight stable set problems. We do not go into the dark alleys of polyhedral theory, but instead refer to Grötschel et al. (1988), Brønsted (1983), and Ziegler (1995). Let us now have a closer look at the polytope of the stable set problem.

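For this purpose, let us define the incidence vector of a stable set $S$ of graph $G$ as a $|V|$-dimensional vector $x^S$ with the following components

$$x^S_i := \begin{cases} 1, & \text{if } v_i \in S, \\ 0, & \text{otherwise.} \end{cases} \tag{12}$$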

This allows us to formulate:

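Definition 3.1. (Schrijver, 2003) The stable set polytope of a graph $G=(V,E)$ is the convex hull of the incidence vectors of all stable sets in $G$. It is denoted by

$$P_{\mathrm{STAB}}(G) := \operatorname{conv}\{x \in \{0,1\}^{|V|} : x \text{ is the incidence vector of a stable set of } G\}.$$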

If it is clear from the context which graph is meant, we write $P_{\mathrm{STAB}}$ instead of $P_{\mathrm{STAB}}(G)$. The stable set polytope is bounded by the $n$-dimensional unit cube, where $n = |V|$, and, therefore, it is indeed a polytope. The definition of a stable set implies that the unit vectors are always incidence vectors of stable sets. The zero vector is trivially the incidence vector of a stable set, namely the empty set, and therefore, the stable set polytope is full-dimensional. This implies that all facets of $P_{\mathrm{STAB}}$ are inequality constraints.

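A maximum stable set corresponds to a vertex of $P_{\mathrm{STAB}}$ and its incidence vector to a binary vector $x$. As the weight of a stable set with incidence vector $x$ is equal to $c^{\top}x$, this expression is a candidate for the objective function of an IP formulation. To complete the IP formulation, consider the following system:

$$x_i \ge 0 \quad \text{for all } v_i \in V \tag{13}$$

$$x_i + x_j \le 1 \quad \text{for all } v_iv_j \in E \tag{14}$$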

The definition of an incidence vector of a stable set implies the so-called non-negativity inequalities (13). They are always facet-defining for $P_{\mathrm{STAB}}$, as there are $n$ affinely independent incidence vectors of stable sets that have value zero in a given entry (the zero vector and the $n-1$ remaining unit vectors). The inequalities (14) ensure that there cannot be a pair of adjacent nodes in one stable set, which is a direct consequence of its definition. Constraints of this type are called edge inequalities. Hence, inequalities (13) and (14) are both valid for the stable set polytope. Inequalities (14) are not generally facet-defining, see Section 'Clique-constraint stable set polytope'. If graph $G$ contains no isolated nodes¹, then all integer solutions of the inequality system (13) and (14) are incidence vectors of stable sets of $G$. However, if the graph contains isolated nodes, each variable corresponding to an isolated node is unbounded. To avoid this, one can bound it from above by 1. For the purpose of this tutorial, we consider only graphs without isolated nodes. Inequalities (14), together with the non-negativity inequalities, are called trivial inequalities. These observations lead us to a linear IP formulation for the maximum stable set problem:

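$$\max\; c^{\top}x \quad \text{s.t.} \quad Ax \le \mathbf{1} \tag{15}$$

$$x \in \{0,1\}^{|V|} \tag{16}$$

Here, $\mathbf{1}$ denotes the all-ones vector.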

In this integer program, vector $x$ corresponds to an incidence vector of a stable set and matrix $A^{\top}$ is the node-edge incidence matrix of $G$. Inequalities (15) are equivalent to (14).
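The constraint matrix of this formulation has one row per edge with ones in the columns of the two endnodes. A minimal sketch of how it could be built and checked, assuming nodes numbered $0, \dots, n-1$ and hypothetical function names:

```python
def edge_node_matrix(n, edges):
    """Matrix A: row e has ones in the two columns of the endnodes of edge e."""
    rows = []
    for i, j in edges:
        row = [0] * n
        row[i] = row[j] = 1
        rows.append(row)
    return rows

def satisfies_edge_formulation(A, x):
    """Check x >= 0 and A x <= 1 (componentwise)."""
    return all(v >= 0 for v in x) and all(
        sum(a * v for a, v in zip(row, x)) <= 1 for row in A)

# Hypothetical example: a triangle on the nodes 0, 1, 2.
A = edge_node_matrix(3, [(0, 1), (1, 2), (0, 2)])
```

The transpose of `A` is the node-edge incidence matrix mentioned above; binary vectors satisfying the check are exactly the incidence vectors of stable sets, whereas fractional vectors such as $(\tfrac12, \tfrac12, \tfrac12)$ also pass it.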

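As it is NP-hard to determine a maximum stable set in an arbitrary graph, it is also NP-hard to find an optimal solution of the integer program (15)–(16). This suggests the use of branch and cut algorithms. Consider now the LP relaxation of (15)–(16) for a connected graph, given by

$$\max\; c^{\top}x \quad \text{s.t.} \quad Ax \le \mathbf{1} \tag{17}$$

$$x \ge 0 \tag{18}$$

Its polytope is described by

$$P_{\mathrm{RSTAB}}(G) := \{x \in \mathbb{R}^{|V|} : x \text{ satisfies (13) and (14)}\}$$

and called a stable set polytope relaxation.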

To assess how well $P_{\mathrm{RSTAB}}$ approximates $P_{\mathrm{STAB}}$, we solve the LP relaxation (17)–(18) for the graph given in Fig. 1, for example, with the simplex method. We obtain a fractional solution vector with an objective function value of 8.5. This vector is not an incidence vector of a stable set. A maximum stable set for this graph has weight 6 (see Section 'Introduction'). Thus, the absolute difference of the two objective function values is 2.5 and the percentage difference, relative to the optimal weight, is 41.7%. The percentage difference becomes even larger if we solve the maximum-cardinality stable set problem only for the triangle² formed by nodes $v_5$, $v_6$ and $v_7$. In this case, the objective function value is 1.5 while the cardinality of a maximum stable set is 1, which is a difference of 50%. In general, the optimal objective value of the LP relaxation for an arbitrary graph with weighting $c \equiv 1$ is greater than or equal to $|V|/2$, independent of its structure. This property shows that the stable set polytope relaxation is not a tight approximation of the stable set polytope in general. This suggests that a further polyhedral study of the stable set polytope is necessary to enable us to solve the maximum stable set problem via branch and cut algorithms.
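The general lower bound on the LP value can be verified in one line. The derivation below is a sketch added for illustration; $\bar{x}$ is a symbol introduced here for the vector with all entries equal to $\tfrac12$, and $\mathbf{1}$ denotes the all-ones vector.

```latex
\bar{x} := \left(\tfrac12, \dots, \tfrac12\right) \in \mathbb{R}^{|V|}:
\qquad \bar{x}_i \ge 0 \ \ \forall\, v_i \in V,
\qquad \bar{x}_i + \bar{x}_j = 1 \le 1 \ \ \forall\, v_iv_j \in E
\quad\Longrightarrow\quad
\max\{\mathbf{1}^{\top}x : x \text{ satisfies (13) and (14)}\}
\;\ge\; \mathbf{1}^{\top}\bar{x} = \frac{|V|}{2}.
```

For the triangle this gives the value $3/2$ quoted above, compared with a stability number of 1.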

Note from our example that the solution is $\{0, \tfrac12, 1\}$-valued, which is due to structural properties of the stable set polytope relaxation. This was observed by Balinski:

Corollary 3.2. (Balinski, 1970; Nemhauser and Trotter, 1974). The vertices of $P_{\mathrm{RSTAB}}(G)$ are $\{0, \tfrac12, 1\}$-valued.

To motivate the next theorem, we look again at the graph of Fig. 1, but in this case we delete the three edges $v_1v_5$, $v_5v_7$ and $v_6v_8$. Solving the LP relaxation (17)–(18) for the resulting graph leads to the following result: $x^* = (0, 1, 0, 0, 1, 0, 1, 0)$. In this case, a $(0, 1)$-valued solution indicates that the stable set is $\{v_2, v_5, v_7\}$. Again, this has a special reason. The new graph is bipartite. In Fig. 4, one can see the bipartition $V_1$ and $V_2$ of the resulting graph.

The following theorem summarizes this observation.

Theorem 3.3. (Grötschel et al., 1988). The non-negativity inequalities (13), together with the edge inequalities (14), are sufficient to describe $P_{\mathrm{STAB}}(G)$ if and only if $G$ is bipartite and has no isolated nodes.

For bipartite graphs, Theorem 3.3 implies that the maximum stable set problem can be solved in polynomial time (e.g., by solving the linear program (17)–(18)). Thus, using the LP-domain-relaxation in a branch and cut algorithm solves the problem at the root node of the branch and bound tree. Therefore, there is no need in a branch and cut framework to check whether the graph is bipartite or not. Nevertheless, it can be checked in linear time whether a graph is bipartite. The polynomial time algorithm designed by Hopcroft and Karp in 1973 can be adapted to solve the maximum stable set problem for bipartite graphs, cf. Grötschel et al. (1988).
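The linear-time bipartiteness test mentioned above is commonly done by a breadth-first two-coloring. A minimal sketch, assuming the graph is given as a dictionary of neighbor sets; the function name `bipartition` is a hypothetical choice.

```python
from collections import deque

def bipartition(adj):
    """Return a bipartition (V1, V2) of the graph, or None if it contains an odd cycle."""
    side = {}
    for start in adj:                      # handle every connected component
        if start in side:
            continue
        side[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in side:
                    side[v] = 1 - side[u]  # neighbors go to the opposite side
                    queue.append(v)
                elif side[v] == side[u]:   # edge inside one side: odd cycle
                    return None
    v1 = {v for v, s in side.items() if s == 0}
    return v1, set(adj) - v1

four_cycle = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {3, 1}}
triangle = {1: {2, 3}, 2: {1, 3}, 3: {1, 2}}
```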

#### 3.1. Cycle-constraint stable set polytope

$P_{\mathrm{RSTAB}}(G)$ and $P_{\mathrm{STAB}}(G)$ are only equal if $G$ is bipartite. Figure 1 illustrates that the inequalities (13) and (14) might not be sufficient to describe the stable set polytope. Among the simplest non-bipartite graphs are the graphs induced by odd cycles; the subgraph $G[W] := (W, E(W))$ induced by node set $W \subseteq V$ is the subgraph of $G$ with node set $W$ and edge set $E(W)$, containing all edges of graph $G$ with both endnodes in $W$. A three cycle and a five cycle can be seen in Fig. 5. These graphs are induced subgraphs of Fig. 1. It turns out that the point $N = (\tfrac12, \dots, \tfrac12)$ lies in $P_{\mathrm{RSTAB}}$ but not in $P_{\mathrm{STAB}}$ for the two graphs shown in Fig. 5; in particular, $N$ is a vertex of $P_{\mathrm{RSTAB}}$. The values in the nodes of the two odd cycles are optimal values of the corresponding variables in the LP relaxation (17)–(18).

This suggests a new class of inequalities for $P_{\mathrm{STAB}}(G)$; these inequalities are called odd-cycle inequalities

$$\sum_{v_i \in V(C)} x_i \le \frac{|V(C)| - 1}{2} \quad \text{for all odd cycles } C \text{ in } G \tag{19}$$

with $V(C)$ denoting the set of nodes contained in cycle $C$. By construction, the odd-cycle inequalities are valid for the stable set polytope, as the cardinality of any stable set in an odd cycle can be at most the greatest integer smaller than half the length of the cycle³. The polytope satisfying the non-negativity, edge and odd-cycle inequalities is called the cycle-constraint stable set polytope of $G$ and is denoted by

$$P_{\mathrm{CSTAB}}(G) := \{x \in \mathbb{R}^{|V|} : x \text{ satisfies (13), (14) and (19)}\}.$$
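Checking a given odd cycle against a fractional point is straightforward. The following sketch, with the hypothetical helper `odd_cycle_violation`, evaluates the left-hand side minus the right-hand side of an odd-cycle inequality; exact fractions are used to avoid rounding issues.

```python
from fractions import Fraction

def odd_cycle_violation(adj, cycle, x):
    """Amount by which x violates the odd-cycle inequality of `cycle`
    (positive means violated). `cycle` lists the nodes in cyclic order."""
    k = len(cycle)
    if k < 3 or k % 2 == 0:
        raise ValueError("an odd cycle with at least three nodes is required")
    if any(cycle[(i + 1) % k] not in adj[cycle[i]] for i in range(k)):
        raise ValueError("consecutive nodes must be adjacent")
    return sum(x[v] for v in cycle) - Fraction(k - 1, 2)

# Hypothetical five cycle 1-2-3-4-5-1.
adj = {1: {2, 5}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4, 1}}
half = {v: Fraction(1, 2) for v in adj}          # satisfies all edge inequalities
stable = {1: 1, 2: 0, 3: 1, 4: 0, 5: 0}          # incidence vector of {1, 3}
```

The all-$\tfrac12$ point violates the inequality by $\tfrac12$, while the incidence vector of a maximum stable set of the cycle satisfies it with equality.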

A graph $G$ is called t-perfect⁴ if $P_{\mathrm{CSTAB}}(G) = P_{\mathrm{STAB}}(G)$, which means that the inequalities of $P_{\mathrm{CSTAB}}$ are sufficient to describe the stable set polytope. Bipartite and almost-bipartite graphs⁵ are examples of t-perfect graphs. To check whether a graph is t-perfect or not is in co-NP: a non-integer vertex of $P_{\mathrm{CSTAB}}$ would show that the graph is not t-perfect. As in the previous section, the special structure of t-perfect graphs helps us to find a maximum stable set, as stated in the next corollary.

Corollary 3.4. The maximum stable set problem in a t-perfect graph can be solved in polynomial time.

We show in Section 'Odd-cycle inequalities' that the separation problem for the odd-cycle inequalities lies in P. This property, combined with the theorem of the equivalence of separation and optimization (see Section 'Branch and cut algorithms: general concepts'), provides a proof for Corollary 3.4.

As mentioned above, facets of $P_{\mathrm{STAB}}$ are computationally desirable for branch and cut algorithms because they are not dominated by any other valid inequality. Thus, facets contribute more toward closing the gap than any inequality which is dominated by the facet. A necessary condition for an odd-cycle inequality to be facet-defining for $P_{\mathrm{STAB}}$ is that its odd cycle is chordless; an edge joining two nodes of a cycle, which is not itself an edge of the cycle, is called a chord. Consider Fig. 6 (a). The chord $v_1v_4$ breaks the five cycle into the three cycle $(v_1, v_1v_4, v_4, v_4v_5, v_5, v_5v_1, v_1)$ and the four cycle $(v_1, v_1v_2, v_2, v_2v_3, v_3, v_3v_4, v_4, v_4v_1, v_1)$. In general, if there is a chord, one obtains an odd cycle of smaller length and an even cycle. The inequality corresponding to the smaller odd cycle, together with the edge inequalities, dominates the odd-cycle inequality; this domination shows that an odd-cycle inequality with the cycle containing a chord cannot define a facet. In our case, one obtains

$$(x_1 + x_4 + x_5) + (x_2 + x_3) \le 1 + 1 = 2. \tag{20}$$
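Whether a cycle has a chord can be tested by inspecting all pairs of non-consecutive cycle nodes. A minimal sketch with the hypothetical helper `chords`, applied to a five cycle with the chord joining nodes 1 and 4 as in the example above:

```python
def chords(adj, cycle):
    """List all chords of `cycle` (nodes given in cyclic order) in the graph adj."""
    k = len(cycle)
    found = []
    for i in range(k):
        for j in range(i + 2, k):
            if i == 0 and j == k - 1:
                continue                     # first and last node are consecutive
            if cycle[j] in adj[cycle[i]]:
                found.append((cycle[i], cycle[j]))
    return found

# Five cycle 1-2-3-4-5-1 with the additional edge {1, 4}.
with_chord = {1: {2, 5, 4}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5, 1}, 5: {4, 1}}
hole = {1: {2, 5}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4, 1}}
```

A separation routine could use such a test to shorten a violated odd cycle before the corresponding inequality is added.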

A graph which is a chordless cycle is called a hole. If an odd cycle induces an odd hole, then the corresponding odd-cycle inequality is called an odd-hole inequality. In general, we have

Corollary 3.5. (Nemhauser and Trotter, 1974). Let $G$ be an odd hole. Then the odd-cycle inequality (19) of $G$ is facet-defining for $P_{\mathrm{STAB}}(G)$.

Corollary 3.5 makes the assumption that $G$ has to be an odd hole. In all other cases, the structure of the graph dictates whether the odd-cycle inequalities (19) are facet-defining for $P_{\mathrm{STAB}}(G)$ or not. We will see in Section 3.3 that the two odd-hole inequalities indicated by Fig. 5 are facet-defining for the graph of Fig. 1. By contrast, Fig. 6 (b) shows a chordless five cycle with an additional node. The inequality

$$x_1 + x_2 + x_3 + x_4 + x_5 + 2x_6 \le 2 \tag{21}$$

is valid and dominates the odd-cycle inequality $x_1 + x_2 + x_3 + x_4 + x_5 \le 2$; this domination demonstrates that this odd-cycle inequality has to be lifted to become a facet-defining inequality.

##### 3.1.1. Lifting

The process called lifting is the extension of a valid inequality for a polytope $P$ to a valid inequality for a higher dimensional polytope $P'$. For this, one must compute appropriate coefficients for the additional variables corresponding to the extra dimensions of $P'$ compared with $P$. If all the coefficients of these additional variables are zero, then the process is called trivial lifting. This is the simplest way of lifting and leads in any case to a valid inequality of $P_{\mathrm{STAB}}$. However, the main purpose of lifting is to produce a facet-defining inequality of $P'$ from a facet-defining inequality for $P$. As a facet of $P$ is in general only a lower-dimensional face of $P'$, its dimension has to be increased to be facet-defining for $P'$. Consider again the odd-cycle inequality $x_1 + x_2 + x_3 + x_4 + x_5 \le 2$ for the graph of Fig. 6 (b), which defines a facet for the odd hole induced by the nodes $v_1, \dots, v_5$. We have already seen that this inequality does not define a facet of the whole graph. The addition of $2x_6$ gives us inequality (21). This is a facet-defining inequality, as the characteristic vectors corresponding to the stable sets $\{v_1, v_3\}$, $\{v_1, v_4\}$, $\{v_2, v_4\}$, $\{v_2, v_5\}$, $\{v_3, v_5\}$ and $\{v_6\}$ are linearly independent and satisfy (21) with equality. Note that with a coefficient of 1 for variable $x_6$, inequality (21) is still valid but does not define a facet; if the coefficient is greater than 2, the inequality is no longer valid. The next theorem shows that there are always such integer coefficients for $P_{\mathrm{STAB}}$.

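Theorem 3.6. (Nemhauser and Trotter, 1974). Let $G=(V,E)$ be a graph and $W \subseteq V$. Suppose

$$\sum_{v_i \in W} a_i x_i \le a_0$$

is facet-defining for $P_{\mathrm{STAB}}(G[W])$. Then there are integers $\beta_i$ for all $v_i \in V \setminus W$ such that

$$\sum_{v_i \in W} a_i x_i + \sum_{v_i \in V \setminus W} \beta_i x_i \le a_0$$

is facet-defining for $P_{\mathrm{STAB}}(G)$.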

We neglect the details of the proof but instead examine a construction scheme for the coefficients. Consider first the special case with $|V \setminus W| = 1$, say $V \setminus W = \{v_j\}$. Let

be a modified weighted stable set problem where $A_j$ denotes the $j$th column of node-edge incidence matrix $A^{\top}$ and . If $v_j$ is not adjacent to any node of $W$, the coefficient $\beta_j = 0$. For the case of $|V \setminus W| > 1$, each coefficient can be computed in the way described, after which the corresponding node can be added to set $W$. Hence, the new coefficients may depend on the previously computed ones. Note that the results depend on the sequence in which the coefficients are computed.

This idea of lifting was first introduced by Padberg (1973). It is called sequential lifting. Padberg applied it to the case in which set $W$ indicates an odd cycle in $G$. This is a special case of Theorem 3.6 and provides the information that an odd-cycle inequality can be lifted to a facet whenever the odd cycle is chordless, inducing an odd hole in $G$. A slight modification of sequential lifting can be found in Giandomenico and Letchford (2006).
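For very small graphs, sequential lifting can be carried out by brute force. The sketch below assumes the commonly used lifting rule $\beta_j = a_0 - \max\{\sum_i a_i x_i\}$, where the maximum is taken over the stable sets of the nodes already in the inequality that contain no neighbor of $v_j$; this rule and all function names are assumptions made for illustration, not a transcription of the construction above.

```python
from itertools import combinations

def max_lhs(adj, coeffs, forbidden):
    """Largest left-hand side over stable sets of the current support avoiding `forbidden`."""
    nodes = [v for v in coeffs if v not in forbidden]
    best = 0
    for r in range(1, len(nodes) + 1):
        for cand in combinations(nodes, r):
            if all(v not in adj[u] for u, v in combinations(cand, 2)):
                best = max(best, sum(coeffs[v] for v in cand))
    return best

def sequential_lifting(adj, coeffs, rhs, order):
    """Lift the inequality sum(coeffs[v] * x[v]) <= rhs, one node of `order` at a time."""
    lifted = dict(coeffs)
    for v in order:
        # Setting x_v = 1 forces all neighbors of v to zero.
        lifted[v] = rhs - max_lhs(adj, lifted, adj[v])
    return lifted

# Five cycle 1..5 with a hub 6 adjacent to all rim nodes.
wheel = {1: {2, 5, 6}, 2: {1, 3, 6}, 3: {2, 4, 6}, 4: {3, 5, 6},
         5: {4, 1, 6}, 6: {1, 2, 3, 4, 5}}
# Five cycle 1..5 with a node 6 adjacent to node 1 only (hypothetical example).
pendant = {1: {2, 5, 6}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4, 1}, 6: {1}}
cycle_ineq = {v: 1 for v in range(1, 6)}
```

For the wheel the hub receives the coefficient 2, in line with the lifted five-cycle example; for the pendant node the coefficient is 0, because a stable set of size two remains available on the cycle.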

#### 3.2. Clique-constraint stable set polytope

Since the non-negativity and edge inequalities are sufficient to describe $P_{\mathrm{STAB}}$ on bipartite graphs, and a graph is bipartite if and only if it contains no odd cycles, one might think that the odd-cycle inequalities are enough to describe $P_{\mathrm{STAB}}$ in general. Unfortunately, there is a graph with four vertices which is not t-perfect. Consider the example of Fig. 7, illustrating the induced subgraph $K_4 = G[\{v_5, \dots, v_8\}]$ of Fig. 1. The numbers inside the nodes represent the optimal solution values of the corresponding variables for optimizing over $P_{\mathrm{CSTAB}}(K_4)$.

We recognize that the graph of Fig. 7 is complete and, hence, a clique. This suggests the following inequalities

$$\sum_{v_i \in Q} x_i \le 1 \quad \text{for all cliques } Q \subseteq V \tag{22}$$

which are called clique inequalities. Now consider

Theorem 3.7. (Padberg, 1973). Let $G$ be a graph with node set $V$ and let $Q \subseteq V$ be a clique. Inequality (22) is valid for $P_{\mathrm{STAB}}(G)$. It defines a facet of $P_{\mathrm{STAB}}(G)$ if and only if $Q$ is a maximal clique in $G$.

Theorem 3.7 has the following implication for the edge inequalities (14): they are facet-defining for $P_{\mathrm{STAB}}$ only if the edge in the graph is a maximal clique; otherwise, they are dominated by any clique inequality with the clique containing this edge. The clique inequalities and the odd-cycle inequalities are the same for triangles.
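Since only maximal cliques yield facets, a clique found by a heuristic is usually extended greedily before its inequality is used. A minimal sketch with hypothetical helper names, applied to a complete graph on the nodes 5 to 8; the point with all entries $\tfrac13$ is an assumed example that satisfies all edge and triangle inequalities of this graph.

```python
from fractions import Fraction

def extend_to_maximal_clique(adj, clique):
    """Greedily add nodes adjacent to all current clique members."""
    clique = set(clique)
    for v in sorted(adj):
        if v not in clique and clique <= adj[v]:
            clique.add(v)
    return clique

def clique_violation(clique, x):
    """Left-hand side minus right-hand side of the clique inequality (positive: violated)."""
    return sum(x[v] for v in clique) - 1

k4 = {v: {5, 6, 7, 8} - {v} for v in (5, 6, 7, 8)}
x = {v: Fraction(1, 3) for v in k4}
maximal = extend_to_maximal_clique(k4, {5, 6})
```

The edge inequality of $\{5, 6\}$ is satisfied by `x`, whereas the inequality of the extended clique is violated by $\tfrac13$.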

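The clique-constraint stable set polytope is given by

$$P_{\mathrm{QSTAB}}(G) := \{x \in \mathbb{R}^{|V|} : x \text{ satisfies (13) and (22)}\}.$$

Any $x \in P_{\mathrm{QSTAB}}(G)$ also satisfies the edge inequalities (14). In the special case that the clique-constraint stable set polytope and the stable set polytope are the same for graph $G$, i.e., $P_{\mathrm{QSTAB}}(G) = P_{\mathrm{STAB}}(G)$, then $G$ is called perfect.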

##### 3.2.1. Perfect graphs

A coloring of $G$ is a partition of $V$ into disjoint stable sets. The coloring number, denoted by $\chi(G)$, is the smallest number of stable sets needed for a coloring of $G$. To partition the node set of a graph $G$ into disjoint stable sets, one needs at least as many stable sets as the size of a maximum clique in $G$, because no two nodes of a clique can share a stable set. This gives the well-known inequality

$$\omega(G) \le \chi(G) \tag{23}$$

which is true for any graph G.
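The inequality between clique number and coloring number can be strict. The sketch below computes both numbers by brute force for small unweighted graphs; the function names are hypothetical, and the exponential enumeration is meant for illustration only.

```python
from itertools import combinations

def clique_number(adj):
    """Size of a largest clique (brute force)."""
    nodes = sorted(adj)
    best = 0
    for r in range(1, len(nodes) + 1):
        if any(all(v in adj[u] for u, v in combinations(c, 2))
               for c in combinations(nodes, r)):
            best = r
    return best

def is_k_colorable(adj, k):
    """Backtracking test: can the nodes be split into at most k stable sets?"""
    nodes = sorted(adj)
    color = {}

    def place(i):
        if i == len(nodes):
            return True
        v = nodes[i]
        for c in range(k):
            if all(color.get(u) != c for u in adj[v]):
                color[v] = c
                if place(i + 1):
                    return True
                del color[v]
        return False

    return place(0)

def coloring_number(adj):
    k = 1
    while not is_k_colorable(adj, k):
        k += 1
    return k

five_cycle = {1: {2, 5}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4, 1}}
four_cycle = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {3, 1}}
```

For the five cycle the clique number is 2 but three stable sets are needed, so the inequality is strict; for the bipartite four cycle both numbers equal 2.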

In 1961, Berge called a graph $G$ perfect if

$$\omega(G[W]) = \chi(G[W]) \quad \text{for all } W \subseteq V. \tag{24}$$

It can be shown that the polyhedral definition, $P_{\mathrm{STAB}}(G) = P_{\mathrm{QSTAB}}(G)$, is equivalent to the graph-theoretical one stated in equation (24), cf. Grötschel et al. (1988). The next theorem provides a characterization of perfect graphs. It was already posed as a conjecture by Berge in 1962 but was only proven in 2002 by Chudnovsky, Robertson, Seymour and Thomas.

Theorem 3.8. (Strong Perfect Graph Theorem, Chudnovsky et al., 2006; Ramírez-Alfonsín and Reed, 2001). A graph is perfect if and only if neither it nor its complement contains an odd hole of length at least five as an induced subgraph.

This theorem can be stated more simply as the assertion that a graph is perfect if and only if it contains no odd hole and no odd antihole; an odd antihole is the complement of an odd hole. In honor of Berge, a perfect graph is sometimes also called a Berge graph. Examples of the wide class of perfect graphs are bipartite graphs, line graphs of bipartite graphs⁶ and triangulated graphs⁷. Each complement of a perfect graph is a perfect graph; this property is known as the Weak Perfect Graph Theorem, proven by Lovász in 1972. Consider the following

Theorem 3.9. (Grötschel et al., 1988). The maximum stable set problem for perfect graphs can be solved in polynomial time.

The proof of Theorem 3.9 uses an infinite class of inequalities which are called orthonormal representation inequalities. They take the form $\sum_{v_i \in V} (d^{\top}u_i)^2 x_i \le 1$ with real vectors $u_i$ satisfying $\|u_i\| = 1$ and $u_i^{\top}u_j = 0$ for all pairs of non-adjacent nodes $v_i, v_j$, and an arbitrary vector $d$ with $\|d\| = 1$. Here, $\|\cdot\|$ denotes the Euclidean norm. The convex set of all vectors satisfying the non-negativity and the orthonormal representation inequalities is called a theta body. The theta body defines a polytope if and only if graph $G$ is perfect. It can be shown that the orthonormal representation inequalities generalize the clique inequalities. For the special case of a perfect graph, the theta body is equal to $P_{\mathrm{QSTAB}}$ and $P_{\mathrm{STAB}}$. Furthermore, it can be shown that the separation problem for the orthonormal representation inequalities can be solved in polynomial time. This implies that the maximum stable set problem for perfect graphs can also be solved in polynomial time. This is quite remarkable, as the clique separation problem is NP-hard, see Section 'Clique inequalities'. Indeed, optimizing over $P_{\mathrm{QSTAB}}$ is in general NP-hard, too, and has only been proven to be polynomial for perfect graphs. We neglect the details here and refer the interested reader to Grötschel et al. (1988).

Checking if a graph is perfect can be done in polynomial time, but the algorithms are quite sophisticated. Despite recent progress, the best known algorithms have a running time of $O(|V|^9)$, cf. Chudnovsky et al. (2005). For further studies of perfect graphs, we recommend the book of Ramírez-Alfonsín and Reed (2001).

#### 3.3. Additional valid inequalities for $P_{\mathrm{STAB}}$

In this section, we examine some additional inequalities, valid for the stable set polytope of a graph G. The following is based on Schrijver (2003), Giandomenico and Letchford (2006), and Balas and Padberg (1976).

Consider Fig. 8 (a). It shows an antihole with seven nodes, which is the complement graph of an odd hole. We recognize the so-called antihole inequalities

$$\sum_{v_i \in V(\overline{C})} x_i \le 2 \quad \text{for all antiholes } \overline{C} \text{ in } G \tag{25}$$

which are valid for $P_{\mathrm{STAB}}(G)$. Note that an antihole with five nodes is isomorphic to an odd hole with five nodes. The inequality corresponding to an antihole with six nodes is equal to the sum of its two triangle inequalities. For an antihole with $n \ge 6$ nodes, adding all its triangle inequalities leads to the following inequality

$$\sum_{i=1}^{n} x_i \le \left\lfloor \frac{n}{3} \right\rfloor \tag{26}$$

For more than eight nodes, all triangle inequalities are not enough to describe the antihole. Note, the odd-cycle inequalities are also valid for even cycles, but they are equal to the sum of the edge inequalities. By contrast, the antihole inequalities provide additional information for antiholes of an even order >4.
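The statements about triangles in antiholes can be checked by enumeration. The following sketch builds the complement of a cycle on the nodes $0, \dots, n-1$ and lists its triangles; the function names are hypothetical.

```python
from itertools import combinations

def antihole(n):
    """Complement of the cycle 0-1-...-(n-1)-0: nodes are adjacent iff not consecutive."""
    return {i: {j for j in range(n)
                if j != i and (i - j) % n not in (1, n - 1)}
            for i in range(n)}

def triangles(adj):
    """All node triples that are pairwise adjacent."""
    return [t for t in combinations(sorted(adj), 3)
            if all(v in adj[u] for u, v in combinations(t, 2))]

def stability_number(adj):
    """Size of a largest stable set (brute force)."""
    best = 0
    for r in range(1, len(adj) + 1):
        if any(all(v not in adj[u] for u, v in combinations(c, 2))
               for c in combinations(sorted(adj), r)):
            best = r
    return best

six, seven = antihole(6), antihole(7)
```

The antihole on six nodes has exactly two triangles, which together cover every node once. The antihole on seven nodes has seven triangles with every node in three of them, and its stability number is 2.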

In Section 'Lifting', we considered an odd hole having an additional node which is adjacent to all other nodes. Such an additional node is called a hub and the graph is called a wheel. For the special case of a wheel whose rim is a five cycle, we recognized a facet-defining inequality for that particular graph. This type of inequality is known as an odd-wheel inequality and has the general format

$$\sum_{v_i \in V(C)} x_i + \frac{|V(C)| - 1}{2}\, x_u \le \frac{|V(C)| - 1}{2} \tag{27}$$

with $C$ as an odd cycle and hub $u \notin V(C)$ with $uv \in E$ for all $v \in V(C)$. In the case of $|V(C)| = 3$, inequality (27) is a clique inequality. Inequality (27) is valid for $P_{\mathrm{STAB}}(G)$ and defines a facet if $G$ is isomorphic to an odd wheel. For an example of an odd wheel with eight nodes, see Fig. 8 (b).

Let $p$ and $q$ be integers satisfying $p > 2q+1$ and $q > 1$. A graph $G$ is called a web if $G$ is isomorphic to the graph consisting of the nodes $\{v_1, \dots, v_p\}$ with an edge $v_iv_j$, if and only if modulo (n−2). A web is abbreviated $W(p, q)$. A graph is called an antiweb, denoted by $AW(p, q)$, if and only if it is the complement of a web $W(p, q)$. Examples can be seen in Figs. 8 (c) and (d). The following inequalities

• $\sum_{v \in V(W(p,q))} x_v \le q$ (28)
• $\sum_{v \in V(AW(p,q))} x_v \le \lfloor p/q \rfloor$ (29)

are called web inequalities and antiweb inequalities, respectively. Both types of inequalities are valid for PSTAB(G). The web inequalities (27) define facets if p and q are relatively prime⁸ and G=W(p, q), while the antiweb inequalities (28) are facet-defining for PSTAB(AW(p, q)) if there is no integer k with p=k·q.

Let us return to graph G defined in Fig. 1. If we add the five cycle, the triangle and the four-clique inequalities to PRSTAB(G), it can be shown that PRSTAB(G) is equal to PSTAB(G) for this graph G. Unfortunately, the inequalities that have been discussed until now are not sufficient to describe PSTAB in general. Therefore, consider graph G of Fig. 9. We observe two triangles and two five cycles. Adding their corresponding inequalities to PRSTAB leads to a polytope with 35 vertices9. There is exactly one non-integer vertex, which is given by

This implies that the added inequalities are not sufficient to describe PSTAB(G). In order to calculate a complete description of the facets of PSTAB(G), we use the POlyhedron Representation Transformation Algorithm (PORTA). PORTA is a software package that uses Fourier–Motzkin elimination to analyze polytopes and polyhedra. The output of PORTA for graph G of Fig. 9 reads as follows.

“DIM” represents the number of variables, and the vector after “VALID” lies inside the polytope described by the inequalities of “INEQUALITIES_SECTION”, which shows that the polytope is not empty. Let us have a closer look at the inequalities. Constraints (a) to (h) are the non-negativity inequalities (13) and the inequalities (i) to (m) are some edge inequalities (14). Inequalities (n) and (o) correspond to maximal cliques, while (p) and (q) are five-cycle inequalities. Inequality (r) does not belong to any class of inequalities studied so far. Nevertheless, it defines a facet.

Inequality (r) belongs to the large class of so-called rank inequalities. Let G=(V, E) be a graph and W ⊆ V. Then, these inequalities read

• $\sum_{v \in W} x_v \le \alpha(G[W])$ (30)

By construction, inequalities (29) are valid for PSTAB(G). The edge, odd-cycle, clique, antihole, web and antiweb inequalities all belong to the class of rank inequalities. As these subclasses are not always facet-defining, rank inequalities are not facet-defining for PSTAB(G) in general. Not every valid inequality is a rank inequality: the inequality of an odd wheel with five or more nodes, for instance, is not.
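To make the rank inequality concrete, the following minimal sketch computes its right-hand side α(G[W]) by brute-force enumeration. The function names `stability_number` and `rank_inequality` are illustrative choices, not taken from the paper, and the enumeration is only sensible for very small node sets W.

```python
from itertools import combinations

def stability_number(nodes, edges):
    """Brute-force alpha(G[W]) for the subgraph induced by `nodes`."""
    nodes = list(nodes)
    node_set = set(nodes)
    # Keep only edges with both end nodes inside W.
    induced = {frozenset(e) for e in edges if set(e) <= node_set}
    for size in range(len(nodes), 0, -1):
        for cand in combinations(nodes, size):
            if all(frozenset(p) not in induced for p in combinations(cand, 2)):
                return size
    return 0

def rank_inequality(W, edges):
    """Return (sorted W, rhs) representing x(W) <= alpha(G[W])."""
    return sorted(W), stability_number(W, edges)

# Five-hole: here the rank inequality coincides with the odd-cycle inequality.
hole5 = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)]
print(rank_inequality({1, 2, 3, 4, 5}, hole5))
```

For a clique the same routine returns right-hand side 1, and for an antihole it returns 2, in line with the inequality classes listed above.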

### 4. Separation

In the previous section, we described some classes of inequalities valid for the stable set polytope. The number of valid inequalities may grow exponentially and therefore it is not helpful to add all of them to an LP relaxation of PSTAB. In any case, it is not necessary to describe the complete polytope to solve a particular problem. One would add only the “important” inequalities. This is done sequentially by solving an LP-relaxation and adding violated inequalities to the formulation. Finding these violated inequalities constitutes the separation problem. We start with the separation of the odd-cycle inequality, which is based on Cheng and Cunningham (1997), Gerards and Schrijver (1986), and Grötschel and Pulleyblank (1981).

#### 4.1. Odd-cycle inequalities

The separation problem of the odd-cycle inequalities (19) for a solution x and graph G reads as follows: Either find an odd-cycle inequality that is violated by x, or prove that x satisfies all odd-cycle inequalities of G. This can be done by computing a “minimum-weight” odd cycle in a graph with an appropriate weighting function. If the solution x satisfies the odd-cycle inequality corresponding to that minimum-weight odd cycle, then all odd-cycle inequalities are satisfied. Otherwise, a (maximally) violated odd-cycle inequality has been obtained. Therefore, we first consider an algorithm that computes a minimum-weight odd cycle of an arbitrary graph with non-negative edge weights. Second, we define edge weights that depend on the current LP solution.

#### Algorithm 4.1 Minimum odd cycle in a graph

Input
• Edge-weighted graph Gc=(V, E, c), weighting c is non-negative
Output
• Minimum-weight odd cycle C* of Gc with weight h
•  // Construct auxiliary bipartite graph H:=(VH, EH)

• 1: VH:={v+, v− | v ∈ V} // Duplicate all nodes of graph G

• 2: EH:={u+v−, u−v+ | uv ∈ E}

• 3: cH(u+v−):=cH(u−v+):=c(uv) for all uv ∈ E // Define the weights of H

•  // Initialize odd cycle C* and its weight h

• 4: C*:=∅ and h:=∞

•  // Construct for each node of Gc a minimum-weight odd cycle

• 5: for all nodes u ∈ V do

• 6:  Compute a minimum-weight path (u+, u1−, u2+, … , uk+, u−) from node u+ to node u− in graph H with respect to cH, k even

• 7:  Obtain the closed walk WG:=(u, uu1, u1, u1u2, u2, … , uk, uku, u) in graph G

• 8:  Construct odd cycle C=(v0, v1, v2, … , vm, v0) with weight cG from walk WG, m even

• 9:  if cG<h then

• 10:   C*:=C and h:=cG // Update minimum-weight odd cycle

• 11:  end if

• 12: end for

• 13: return Odd cycle C* and its weight h.

Consider now Algorithm 4.1. In the first three steps, an auxiliary graph H=(VH, EH) is constructed. Node set V is duplicated and the two copies are called V+ and V−. An edge u+v− or u−v+ is in EH if and only if uv ∈ E. As there is no edge between two nodes of V+ and no edge between two nodes of V−, H is bipartite with bipartition (V+, V−). The weights are copies of the weights of graph G. In step 6, a minimum-weight path from node u+ to node u− is computed. As nodes u+ and u− lie on different sides of the bipartition, every such path contains an odd number of edges. The corresponding odd closed walk in G is constructed in step 7 by deleting the indices + and −. Note that this walk can contain edge and node repetitions. In order to obtain an odd cycle in G, one has to delete the repeated nodes and edges. One idea is to start with node u and mark all visited nodes and edges along the closed walk. If a marked node is visited again, all nodes and edges along this closed subwalk can be removed from the walk. In the case of an edge repetition, the last visited node and the repeated edge must be removed from the walk. After this is done, all node and edge repetitions have been eliminated. The resulting closed walk can have odd or even length or may be empty. If the closed walk is odd, one has found a minimum-weight odd cycle in Gc containing u, as c is non-negative. In the other two cases, one of the removed closed subwalks has odd length. Each of these odd cycles has weight less than or equal to that of any odd cycle containing u. Hence, we store one of the odd cycles in C. As all edge weights are non-negative, the removal of nodes from the walk does not destroy the correctness of this algorithm. As each node and edge in WG only has to be marked with its frequency, the procedure described above can be done in time linear in the length of WG.

Therefore, a minimum-weight path from u+ to u− in H, with respect to weighting cH, corresponds to an odd cycle in Gc whose weight is less than or equal to that of any odd cycle containing node u. As such an odd cycle is computed in step 5 for every node u of graph G, Algorithm 4.1 computes a minimum-weight odd cycle in G.
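The construction described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the graph is passed as a dictionary of non-negative edge weights, the two copies v+ and v− are encoded as the states `(v, 0)` and `(v, 1)`, and the helper names are hypothetical. Repetitions are removed with a stack, which discards even closed subwalks and returns the first odd one.

```python
import heapq

def min_weight_odd_cycle(weights):
    """weights: {(u, v): w} with w >= 0 for an undirected simple graph.
    Returns (cycle as node list, weight), or (None, inf) if no odd cycle exists."""
    adj = {}
    for (u, v), w in weights.items():
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    wt = {frozenset(e): w for e, w in weights.items()}

    def shortest_odd_closed_walk(u):
        # Dijkstra in H: state (v, 0) plays the role of v+, (v, 1) of v-.
        start, goal = (u, 0), (u, 1)
        dist, pred = {start: 0.0}, {}
        heap, tie = [(0.0, 0, start)], 1
        while heap:
            d, _, state = heapq.heappop(heap)
            if d > dist[state]:
                continue
            if state == goal:
                break
            v, side = state
            for nb, w in adj[v]:
                nxt = (nb, 1 - side)
                if d + w < dist.get(nxt, float("inf")):
                    dist[nxt], pred[nxt] = d + w, state
                    heapq.heappush(heap, (d + w, tie, nxt))
                    tie += 1
        if goal not in dist:
            return None
        walk, state = [u], goal
        while state != start:
            state = pred[state]
            walk.append(state[0])
        return walk[::-1]  # closed walk u, ..., u with an odd number of edges

    def extract_odd_cycle(walk):
        stack, pos = [], {}
        for v in walk:
            if v in pos:
                cycle = stack[pos[v]:]      # simple closed subwalk through v
                if len(cycle) % 2 == 1:
                    return cycle
                for x in cycle[1:]:         # drop the even subwalk, keep v
                    del pos[x]
                del stack[pos[v] + 1:]
            else:
                pos[v] = len(stack)
                stack.append(v)
        return None

    best, best_w = None, float("inf")
    for u in adj:
        walk = shortest_odd_closed_walk(u)
        if walk is None:
            continue
        cyc = extract_odd_cycle(walk)
        n = len(cyc)
        w = sum(wt[frozenset((cyc[i], cyc[(i + 1) % n]))] for i in range(n))
        if w < best_w:
            best, best_w = cyc, w
    return best, best_w
```

Because removing closed subwalks never increases the weight when all weights are non-negative, the cycle extracted for node u weighs at most as much as any odd cycle through u, which is the property the argument above relies on.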

Recognize that the computational complexity is dominated by the for loop in step 5 and the construction of a shortest path in step 6. As all weights of auxiliary graph H are positive, one can use the algorithm of Dijkstra to calculate the shortest path, which has running time . These observations are summarized in

Proposition 4.1.. A minimum-weight odd cycle in a graph G can be computed with Algorithm 4.1 in.

Now, consider Fig. 10 (a). In this case, we say that all edges have weight zero. In Fig. 10 (b) the auxiliary graph constructed by Algorithm 4.1 is shown. All edge weights are 0 by construction. We start by computing a minimum-weight odd cycle containing node v1. A resulting shortest path from node v1+ to node v1− is represented by the marked edges. The translation of this shortest path into a closed walk can be seen in the top graph of Fig. 10 (c). We recognize that node v3 is contained twice in the walk. Using the described method leads to the removal of nodes v4, v5 and v6 from the walk. In this case, the remaining closed walk is an odd cycle containing node v1. The computation of an odd cycle including v4 is shown in the graph below. Again, there is a node repetition and we remove nodes v1 and v2. After that, we recognize that edge v3v4 is contained twice in the walk. Eliminating this edge shows that there is no odd cycle containing v4, and the resulting walk becomes empty. As a by-product, we obtain an odd cycle containing nodes v1, v2 and v3.

Next, we want to use Proposition 4.1 to show that the separation problem for the odd-cycle inequalities can be solved in polynomial time. We therefore have to define an edge weighting. Let $\bar{x}$ be a vector satisfying the non-negativity (13) and edge inequalities (14), for instance, an optimal solution of the LP relaxation over PRSTAB(G). Define an edge weighting of graph G depending on $\bar{x}$ as

$$c_{uv} := 1 - \bar{x}_u - \bar{x}_v \quad \text{for all } uv \in E.$$

Suppose that C is an odd cycle in G. Then the weight of C with respect to c is

$$c(C) = \sum_{uv \in E(C)} \left(1 - \bar{x}_u - \bar{x}_v\right) = |C| - 2 \sum_{v \in C} \bar{x}_v.$$

An odd-cycle inequality $\sum_{v \in C} x_v \le (|C|-1)/2$ in G is violated by vector $\bar{x}$ if and only if

$$c(C) < 1.$$

Therefore, a most violated odd-cycle inequality corresponds to an odd cycle in G having minimum weight with respect to c. This leads to Algorithm 4.2. In step 1, the edge weights for graph G are calculated and in step 2, a minimum-weight odd cycle is computed. Step 3 provides information as to whether or not all odd-cycle inequalities are satisfied. Because computations are carried out in double precision, one should check the violation with a small tolerance ɛminViol. However, from a theoretical point of view, ɛminViol has to be fixed to zero to get an exact separation routine. We obtain the following theorem.
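The violation test can be illustrated with a few lines of Python. The sketch assumes the usual edge weights c_uv = 1 − x_u − x_v for this separation scheme, so that the weight of a cycle equals |C| − 2x(C). The function names and the default tolerance are illustrative assumptions.

```python
def odd_cycle_weight(cycle, x):
    """c(C): sum over the cycle edges of (1 - x_i - x_j), i.e. |C| - 2 x(C)."""
    n = len(cycle)
    return sum(1.0 - x[cycle[i]] - x[cycle[(i + 1) % n]] for i in range(n))

def is_violated(cycle, x, eps=1e-6):
    """True if x(C) <= (|C| - 1) / 2 is violated, i.e. c(C) < 1 - eps."""
    return odd_cycle_weight(cycle, x) < 1.0 - eps

five_hole = [1, 2, 3, 4, 5]
half = {v: 0.5 for v in five_hole}    # x(C) = 2.5 > 2, so the cut is violated
safe = {v: 0.4 for v in five_hole}    # x(C) = 2.0, inequality is tight
print(is_violated(five_hole, half), is_violated(five_hole, safe))
```

The amount of violation is (1 − c(C))/2, so the all-one-half vector reaches the largest possible violation of 1/2 for a point that satisfies the edge inequalities.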

Theorem 4.2.. The separation problem for the class of odd-cycle inequalities can be solved in polynomial time.

We must now show that Algorithm 4.2 has polynomial running time in order to prove Theorem 4.2. Observe that the running time is dominated by the computation of the minimum-weight odd cycle. As the constructed weights are all non-negative, Algorithm 4.1 can be used, so the running time is the one stated in Proposition 4.1. Note that it is quite important that all trivial and edge inequalities are met by the vector to be separated. Otherwise, some weights c would become negative and Dijkstra's algorithm would not be applicable. In this case, one would have to use a more general algorithm, e.g., Bellman–Ford, which can handle negative weights.

Unfortunately, negative edge weights could produce negative cycles depending on the topology of the graph and the problem instance could have no finite solution. But actually, the difficulty with negative edge weights can be easily avoided by separating the edge and trivial inequalities (which can be done in ) before executing Algorithm 4.2.

#### Algorithm 4.2 Odd-cycle inequalities separation

Input
• Graph G=(V, E), vector x̄ that satisfies the edge inequalities, tolerance ɛminViol
Output
• Maximum violated odd-cycle inequality
•  // Calculate weight c.

• 1: c(uiuj):=1−x̄i−x̄j for all uiuj ∈ E

•  // Compute minimum-weight odd cycle in G

• 2: Use Algorithm 4.1 to compute C* and h from Gc

• 3: if h ≥ 1−ɛminViol then

• 4:  return There is no violated odd-cycle inequality in G.

• 5: else

• 6:  return C* induces a maximum-violated odd-cycle inequality in graph G with respect to vector x̄.

• 7: end if

Theorem 4.2 is quite remarkable as the number of odd cycles in a graph can be exponential. The rationale is that there is no need to check all odd cycles.

The first known algorithm for the separation of the class of odd-cycle inequalities dates back to Grötschel and Pulleyblank 1981. This algorithm also makes use of an auxiliary graph. The auxiliary graph is no longer bipartite and the minimum-weight odd cycle is computed via a perfect matching. Unfortunately, it is quite time consuming to calculate a perfect matching which can be done, for instance, in . The odd-cycle inequalities separation based on perfect matchings then has a running time of .

##### 4.1.1. Implementation aspects

Let us suggest a few tweaks when implementing an odd-cycle inequality separation routine. The following modifications turned out to be computationally effective when solving stable set problems in a branch and cut framework (Rebennack et al., 2011).

1. Reduce the size of H
A node whose corresponding variable has a value of 0 or 1 can be deleted from the auxiliary graph H. The reason is that no odd cycle containing such a node can violate its odd-cycle inequality.
2. Prefer small odd cycles
Assuming that all trivial inequalities are satisfied by a solution vector x̄, an odd-cycle inequality can be violated by at most 1/2, independent of its size. Therefore, small odd cycles should be preferred during separation. By the definition of the edge weights, the edges u+v− and u−v+ of auxiliary graph H have weight zero whenever x̄u+x̄v=1. Hence, a minimum-weight path in H may contain additional edges and nodes which are not required in order to obtain a shortest path. To avoid such additional edges and nodes, one can add a small positive value, for instance 10⁻⁶, to the weights which are zero. In addition, this has the benefit that in some cases unnecessary even cycles are not generated by the algorithm, cf. Fig. 10 (c).
3. Accept any violated inequality
The odd-cycle inequalities separation aims to find the most violated odd-cycle inequality in a graph. In a branch and cut framework, however, it is more interesting to obtain any violated inequality, within a tolerance ɛ>0. Therefore, we consider each odd cycle computed in step 2 of Algorithm 4.2 and check whether its weight is smaller than 1−ɛ.
4. Remove chords
Recall that an odd cycle has to be chordless for the odd-cycle inequality to have a chance of being facet-defining for PSTAB; however, chordlessness is not a sufficient condition for being facet-defining, see Section 'Cycle-constraint stable set polytope'. If we consider Fig. 6 (a) again and compute an odd cycle starting with node v2 or v3, we see that the resulting cycle contains a chord. This is independent of whether we add a small value to its weights or not. Let us now discuss an extension of Algorithm 4.2 with the purpose of deleting all chords. Algorithm 4.3 requires an odd cycle C with one principal property. We assume that the cycle is free from node and edge repetitions, which is already satisfied according to the definition of a cycle. The real restriction is that the nodes of C do not induce a smaller odd cycle which includes vi. The odd cycle computed in Algorithm 4.1 meets these criteria if small values are added to all weights of H which are 0. Let us now consider the details of Algorithm 4.3. After relabeling the nodes of cycle C, the algorithm checks for each node whether it is adjacent to every second node after it (step 6). It is enough to check only nodes which have an odd distance in the cycle; otherwise, C would contain an odd subcycle. This is why we need the assumption already discussed. If two nodes in step 6 are adjacent, a chord has been detected. In step 7, the even cycle containing node vi is removed from the cycle. The procedure starts again with this smaller odd cycle (step 9). Observe that the assumptions are also met by the smaller cycle. In the last step, a chordless odd cycle is returned. The running time is linear in the size of C as each possible chord is only considered once. Therefore, one could extend Algorithm 4.2 by calling Algorithm 4.3 after step 2. This leads to a polynomial time separation routine for the odd-hole inequalities. Note that this separation is exact.
5. Check size and lift
Check the size of the computed chordless odd cycles via Algorithm 4.3. If the size is equal to three, then increase this triangle to (any) maximal clique. This lifts the corresponding triangle inequality to be facet-defining.
6. Speed-up
Compute only odd cycles in G for a node u, if u is not contained in an odd cycle already computed during this separation. Although this is a heuristic, the speed up of the separation routine can be quite substantial.

Note that suggestions 1–5 do not change the exactness of the separation routine.
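The chord removal of suggestion 4 can be illustrated with a simplified variant. The sketch below is not Algorithm 4.3: it scans all node pairs and therefore takes more than linear time, but it needs no assumption on odd subcycles. It uses the fact that a chord splits an odd cycle into one odd and one even cycle and always keeps the odd part. The function name and the adjacency-set representation are assumptions made for illustration.

```python
def remove_chords(cycle, adj):
    """cycle: list of distinct nodes of an odd cycle (consecutive nodes adjacent).
    adj: dict node -> set of neighbours. Returns a chordless odd cycle."""
    cycle = list(cycle)
    changed = True
    while changed:
        changed = False
        n = len(cycle)
        for i in range(n):
            for j in range(i + 2, n):
                if i == 0 and j == n - 1:
                    continue  # these two nodes are consecutive on the cycle
                if cycle[j] in adj[cycle[i]]:
                    # The chord splits C into two cycles; exactly one is odd.
                    inner = cycle[i:j + 1]
                    outer = cycle[j:] + cycle[:i + 1]
                    cycle = inner if len(inner) % 2 == 1 else outer
                    changed = True
                    break
            if changed:
                break
    return cycle

def make_adj(edges):
    """Build adjacency sets from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

# Five-cycle with chord 1-3: the triangle 1, 2, 3 survives.
adj = make_adj([(1, 2), (2, 3), (3, 4), (4, 5), (5, 1), (1, 3)])
print(remove_chords([1, 2, 3, 4, 5], adj))
```

If the returned cycle is a triangle, suggestion 5 applies and the triangle can be extended to a maximal clique.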

#### 4.2. Clique inequalities

Computing a maximum clique in a graph is NP-hard, as it is equivalent to finding a maximum stable set in the complement graph. The separation problem for clique inequalities asks to find a violated clique inequality for a given solution x, or to prove that all clique inequalities are satisfied. This is equivalent to finding a maximum-weight clique in a node-weighted graph G where x defines the node weights. This implies that the separation problem for the clique inequalities is NP-hard.

It might not be a good idea to solve NP-hard problems as subproblems. However, computational tests reveal that the clique inequalities help to close the gap substantially, cf. Rossi and Smriglio 2001, Rebennack et al. 2011. Thus, an effective branch and cut algorithm must utilize heuristic separation methods for the clique inequalities.
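One simple heuristic of this kind is a greedy clique construction, sketched below. It is an illustrative assumption and not the specific routine used in the cited studies: starting from each node, it repeatedly adds the candidate with the largest LP value that is adjacent to all nodes chosen so far, and reports the best maximal clique whose weight exceeds 1 by more than a tolerance.

```python
def greedy_clique_cut(adj, x, eps=1e-6):
    """adj: dict node -> set of neighbours; x: dict node -> LP value.
    Returns (clique, x(clique)) for the most violated clique found,
    or (None, None) if the heuristic finds no violated clique inequality."""
    best, best_val = None, 1.0 + eps
    for seed in sorted(adj, key=lambda v: -x[v]):
        if x[seed] <= 0.0:
            break  # remaining seeds carry no weight
        clique, cand = [seed], set(adj[seed])
        while cand:
            v = max(cand, key=lambda w: x[w])
            clique.append(v)
            cand &= adj[v]  # keep only common neighbours of the clique
        val = sum(x[v] for v in clique)
        if val > best_val:
            best, best_val = set(clique), val
    return best, (best_val if best is not None else None)

# Triangle 1, 2, 3 with a pendant node 4; all LP values equal one half.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
print(greedy_clique_cut(adj, {1: 0.5, 2: 0.5, 3: 0.5, 4: 0.5}))
```

Since the routine is heuristic, a result of `(None, None)` does not prove that all clique inequalities are satisfied.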

#### 4.3. Rank inequalities

In this section, we discuss a heuristic separation of the large class of rank inequalities. The core of this procedure is a graph shrinking mechanism, called edge projection, introduced by Mannino and Sassano 1996. The idea is to reduce the size of the graph in order to facilitate the computation of violated rank inequalities, which are then antiprojected to the whole graph.

#### Algorithm 4.3 Delete chords

Input:
• Graph G=(VE), odd cycle C and start node vi

Ensure that C has no node repetition and there is no odd cycle in G[V(C)] containing vi

Output:
• Chordless odd cycle
•  // Relabel the nodes in C

• 1: with wiC

• 2: j:=2

• 3: for all do

• 4:  k:=j+2

• 5:  for all do

• 6:   if wjwkE then

• 7:

• 8:    vi:=wj

• 9:    goto 1

• 10:   end if

• 11:   k=k+2

• 12:  end for

• 13:  j=j+2

• 14: end for

• 15: return Chordless odd cycle

Let us look in detail at this shrinking method. The smaller graph is obtained by removing both nodes u, v of a particular edge e=uv, along with all the nodes in their common neighborhood Γuv:=Γ(u)∩Γ(v). Thus, the edge set

is removed from the graph as well. Additional edges in

are added to the graph. Those additional edges are called false edges, i.e., is the set of false edges. This allows

Definition 4.3.. The graph with and edge set is called the projection of e in G.

Note that the projected graph is smaller and, most likely, denser than the original graph G. This is appealing, as one would expect to find violated inequalities more easily in .
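The projection step can be sketched as follows. The sketch assumes the way edge projection is usually stated: after u, v and their common neighborhood are removed, every remaining neighbor of u is joined to every remaining neighbor of v, and the pairs that were not adjacent before are recorded as false edges. The function name `project_edge` and the adjacency-set representation are illustrative.

```python
def project_edge(adj, u, v):
    """adj: dict node -> set of neighbours. Returns (projected adjacency, false edges)."""
    assert v in adj[u], "uv must be an edge of G"
    common = adj[u] & adj[v]
    removed = common | {u, v}
    left = adj[u] - removed    # neighbours of u that are not neighbours of v
    right = adj[v] - removed   # neighbours of v that are not neighbours of u
    proj = {w: set(nb) - removed for w, nb in adj.items() if w not in removed}
    false_edges = set()
    for a in left:
        for b in right:
            if b not in proj[a]:
                false_edges.add(frozenset((a, b)))
    for e in false_edges:
        a, b = tuple(e)
        proj[a].add(b)
        proj[b].add(a)
    return proj, false_edges

# Path a - u - v - b: the projection joins a and b by a false edge.
path = {"a": {"u"}, "u": {"a", "v"}, "v": {"u", "b"}, "b": {"v"}}
print(project_edge(path, "u", "v"))
```

With this construction, a stable set of the projected graph cannot contain a remaining neighbor of u together with a remaining neighbor of v, so it can always be extended by u or by v in the original graph.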

We must compute valid inequalities for . It turns out that the edge e is the crucial component for generating valid inequalities. Consider edge with a maximum stable set in G such that . Such an edge e is called projectable in G. The next theorem shows that the projectability of an edge is a sufficient condition to construct rank inequalities valid for PSTAB(G).

Theorem 4.4. (Rossi and Smriglio, 2001). Let e=uv be a projectable edge in G and. If x(W)≥l is a valid rank inequality for, then x(W)+xuv)+xu+xvl+1 is valid for PSTAB(G).

It is to check whether an edge is projectable or not (Mannino and Sassano, 1996). Thus, for computational efficiency, an alternative characterization for edges to be projected based on satisfying Theorem 4.4 is desirable. Consider strongly projectable edges: if edge e is projectable in every induced subgraph of G containing both u and v, then edge e is called strongly projectable. A handy characterization of strongly projectable edges is given by Rossi and Smriglio.

Theorem 4.5. (Rossi and Smriglio, 2001). An edge e=uv ∈ E is strongly projectable in G if and only if it is not the central edge of an induced subgraph isomorphic to a diamond, Fig. 11 (a), a bull, Fig. 11 (b), or a double fork, Fig. 11 (c).

It can be checked in polynomial time whether an edge is strongly projectable or not. The following necessary condition for a strongly projectable edge holds:

Corollary 4.6. (Rebennack et al., 2011). If edge uv ∈ E is strongly projectable in G, then the common neighborhood Γuv is a clique.

The definition of a strongly projectable edge together with Theorem 4.5 implies the following

Theorem 4.7. (Mannino and Sassano, 1996). An edge e=uv ∈ E is projectable in G if it is not the central edge of an induced subgraph isomorphic to a diamond, a bull or a double fork.

Unfortunately, many stable set instances contain quite a limited number of strongly projectable edges (Rebennack et al., 2011). To use the projection method anyway, one might remove some of the edges that destroy the strong projectability. Therefore consider

Corollary 4.8. (Rossi and Smriglio, 2001). Let e=uvE. If is a clique, then uv is strongly projectable.

This suggests the following method. First, select a clique from set , and second, remove all edges uw with . This leads to the strongly projectable edge uv. Note that the deletion of edges does not alter the antiprojection procedure of Theorem 4.4. The drawback of removing these edges is that the graph is changed structurally. Thus, it might be favorable to choose a clique Q of large size.

Figure 12(a) shows graph G with strongly projectable edge e=v3v5. The projection of v3v5 results in the graph shown in Fig. 12 (b). The clique inequality x1+x2+x6≤1 is a valid inequality for but not for PSTAB(G). The antiprojection results in the inequality which is facet-defining for PSTAB(G).

By contrast, edge e=v3v5 is not strongly projectable for the graph shown in Fig. 13 (a). Using Corollary 4.8 with the choice of Q={v6} results in the removal of edge v4v5. The resulting projection is shown in Fig. 13 (b). Using the same clique inequality results in the antiprojected inequality x1+x2+x3+x5+x6≤2.

An interesting question is what influence the antiprojection step has on the tightness of the computed inequalities. This is answered by the next lemma.

Lemma 4.9. (Rebennack et al., 2011). A facet-defining inequality for is not, in general, facet-defining for. For a rank inequality to be facet-defining for PSTAB(G), it is not necessary that the inequality be facet-defining for before the antiprojection.

The second statement is illustrated via Fig. 14. The inequality induced by the projected graph in Fig. 14 (b) is not facet-defining, because it is dominated by the two clique inequalities x1+x4+x5≤1 and x6+x7+x8≤1. However, we have seen in Section 'Additional valid inequalities for PSTAB' that the graph in Fig. 14 (a) is facet-inducing.

Next, let us look at a criterion for the case in which the facet-defining inequalities for the projected graph remain facet-defining after antiprojection. It turns out that the false edges are the key to whether or not an antiprojected inequality is facet-defining. A false edge is called critical for an inequality and a strongly projectable edge e, if inequality is not valid for if is removed from .

Proposition 4.10.. (Rebennack et al., 2011). Let e=uvE be strongly projectable, , and x(W)l be facet-defining for. If for e and x(W)l there is a critical edge, and a stable set in such that and, then the inequality x(W∪{uv}) ≤l+1 is facet-defining for PSTAB(G−Γuv).

If W is a maximal clique in (of size ≤3) and one of its edges is a false edge, then the assumptions of Proposition 4.10 are satisfied; similarly, if induces an odd hole in and contains a false edge. However, lifted odd-cycle inequalities do not meet this criteria, cf. Rebennack et al. 2011.

The next proposition extends the concept of edge projection to general, valid inequalities for .

Proposition 4.11.. (Rebennack et al., 2011). Let e=uvE be a strongly projectable edge and be valid for and not dominated by a non-negativity facet. The inequality

for all is valid for PSTAB(G).

The edge projection can also be performed again on a projected graph and the theory reviewed in this section remains the same.

##### 4.3.1. Implementation aspects

The strength of edge projection lies in the reduction of the size of the graph and the simultaneous increase in the density of the smaller graph. The density increases even further when the edge projection is used iteratively. The theory of edge projection generalizes in a straightforward manner when iterating the edge projection, but the implementation itself can be quite challenging. As the separated inequalities have to be antiprojected, the information on how each inequality was computed has to be stored. A tree structure seems to be the most appropriate choice. Each node of the tree stores the information of one projection. Once an inequality has been found, it can be antiprojected by walking up the tree.

The edge projection is shown as a flowchart in Fig. 15. At first, an edge has to be selected as the candidate for a projectable edge. This selection is crucial. Any violated inequality containing false edges has to be antiprojected. This implies that the end nodes of the projected edges are added to the inequality and the right-hand side is increased. If the weights of these added nodes are too small, an inequality that is violated in the smaller graph may be satisfied after the antiprojection. A possible choice is to select only edges for which the sum of the weights of their end nodes is >0.6.

As the strongly projectable edges tend to be rare (see Section 'Rank inequalities'), the neighborhood of one of the end nodes of the edge must usually be modified. Once a strongly projectable edge uv has been found, the graph must be projected. This means that the nodes u and v, as well as their common neighborhood Γuv, have to be removed from the graph and the false edges must be added. In order to manage this update quickly, one could store the graph in an adjacency matrix. Each update step can then be done in constant time. Once the projection is complete, a separation routine can be called; alternatively, one may project again. One can use any separation procedure which computes rank inequalities¹⁰; Rossi and Smriglio suggested clique separation (F. Rossi and S. Smriglio, 2006, personal communication). A likely reason is that the new graph is denser, which can be exploited by this separation. In addition, heuristic procedures for clique separation are quite fast compared with other separation routines. The use of the clique separation is also justified by the theory, as there is a good chance that the antiprojected inequalities of maximal cliques are facet-defining, cf. Proposition 4.10. Good computational performance of edge projection with clique separation is also reported in Rebennack et al. 2011.

Another element of the edge projection process is the antiprojection, where the data of the projection step(s) are required. The tree structure of the edge projection process allows for retrieving the necessary information for the antiprojection process. After some inequalities have been computed, one can project again to find more violated inequalities or one can terminate the program. It is also possible to go backward in the tree, i.e., to reverse some of the edge projections. The idea is to project until the tree is empty and to go a few projection iterations back before calling the separation procedure.

In this section, we briefly describe additional separation routines for PSTAB (in alphabetical order).

1. Antiholes
To the best knowledge of the authors, there is no effective separation routine for antihole inequalities (24). In fact, it is not even known whether the antihole separation problem is polynomial or NP-hard.
2. Local cuts
The principle of local cuts dates back to Applegate et al. 2001, when this concept was applied for the first time to the traveling salesman problem. Assume that there is a subgraph G′ of the stable set instance containing n′ nodes, that all m′ feasible stable sets for G′ are known, and that a solution vector x′ is given. Then, the idea is as follows: solve a linear program with m′ variables and n′ constraints, where the optimal objective function value provides information on whether x′ lies inside PSTAB(G′) or not. If x′ does not lie inside the convex hull of all stable set solutions, then the dual information of the linear program yields a valid cut for PSTAB(G′), which can then be trivially lifted to be valid for PSTAB(G), cf. Section 'Lifting'. For more details on local cuts, we refer the reader to Warrier 2007 and Rebennack et al. 2011. Local cuts are time consuming to generate. Typically, a suitable subgraph has to be generated, all stable set solutions must be enumerated (of which there can be exponentially many in n′), and a (large) LP problem must be solved. Effective branch and cut algorithms might use these cuts rarely, but computational evidence shows that these cuts can be effective toward closing the gap when all other separation routines fail to generate new and/or effective cuts (Rebennack et al., 2011).
3. Mod-k cuts
Mod-k cuts belong to the class of Chvátal–Gomory cuts with rank one. Like the Chvátal–Gomory cuts, mod-k cuts are general classes of inequalities not specifically tailored to the stable set polytope. Suppose there is a system of linear inequalities Ax ≤ b with integral coefficients. The idea of mod-k cuts is to compute a vector μ with positive integer entries such that the inequality system multiplied by μ can be strengthened by dividing it by a positive integer k. See Caprara et al. 2000 and Fricke 2007 for more details on mod-k cuts. For the maximum stable set problem, mod-k cuts are computationally effective for random graphs with low density, cf. Rebennack et al. 2011.
4. Web inequalities
Polynomial time separation routines for the web inequalities (27) and antiweb inequalities (28) are presented by Cheng and de Vries 2002. Empirical studies show that these inequalities usually occur in graphs with high density (e.g., >0.7). The complexity of these separation algorithms might be too high for effective cutting plane algorithms.
5. Wheel inequalities
Cheng and Cunningham 1995 describe a separation algorithm for the wheel inequalities (26). Treating each node of the graph G as a possible hub of an odd wheel, the idea is to construct a new graph , where the separation of the odd-cycle inequalities of leads to a separation of the odd-wheel inequalities for graph G. The total running time is dominated by the odd-cycle inequalities separation routine. The order of the complexity is or , depending on the implemented odd-cycle inequalities separation routine. Though the odd-wheel separation is polynomial, its complexity is quite high for efficient implementations of a branch and cut algorithm.
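The mod-k idea from item 3 can be made concrete with a small sketch. It assumes integer data and non-negative integer multipliers and only shows how a given multiplier vector μ yields a cut; finding a good μ, which is the actual separation task, is not addressed. The function name `mod_k_cut` is an illustrative choice.

```python
def mod_k_cut(A, b, mu, k):
    """Combine the rows of A x <= b (integer data) with non-negative integer
    multipliers mu. If every combined coefficient is divisible by k, divide
    by k and round the right-hand side down; otherwise return None."""
    rows, cols = len(A), len(A[0])
    coeff = [sum(mu[i] * A[i][j] for i in range(rows)) for j in range(cols)]
    rhs = sum(mu[i] * b[i] for i in range(rows))
    if any(c % k for c in coeff):
        return None
    return [c // k for c in coeff], rhs // k

# Edge inequalities of a triangle: x1+x2 <= 1, x2+x3 <= 1, x1+x3 <= 1.
A = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]
b = [1, 1, 1]
print(mod_k_cut(A, b, [1, 1, 1], 2))
```

With μ = (1, 1, 1) and k = 2, the three edge inequalities sum to 2x1 + 2x2 + 2x3 ≤ 3, and dividing by 2 with rounding gives the triangle inequality x1 + x2 + x3 ≤ 1.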

### 5. Preprocessing

“Given a formulation, preprocessing refers to elementary operations that can be performed to improve or simplify the formulation by tightening bounds on variables, fixing values and so on. Preprocessing can be thought of as a phase between formulation and solution” (Nemhauser and Wolsey, 1988, p. 17).

The citation above includes the main ideas of a preprocessing phase. In the case of a branch and cut algorithm for the maximum stable set problem, we focus on eliminating or fixing variables and on some structural properties of maximal stable sets. Preprocessing algorithms usually require linear time, or at most polynomial time. It is also typical that small problem instances can be solved immediately in the preprocessing phase. In the following, we examine special structures and results on how to improve a branch and cut algorithm via preprocessing.

There are some fundamental ideas for preprocessing for the maximum stable set problem. Let Gc=(VEc) be a node-weighted graph.

• One can delete all nodes of the graph with negative or zero weight:
• If the graph is not connected, one can solve the maximum stable set problem by solving the problem for each connected component separately:

The connected components can be identified, for instance, with a depth-first-search algorithm in linear time with respect to the size of the adjacency structure. This is particularly important for very sparse stable set instances, e.g., so-called call graphs resulting from telecommunications applications (Abello et al., 1999; Hayes, 2000).

• As a special case of the last observation, all isolated nodes with positive weight are part of a (all) maximum stable set(s). Let δ(v) denote the degree of node v which is the number of adjacent nodes to v. Then, one obtains
This can be done in if stored in an adjacency list.
• If the weight of a node is greater than or equal to the sum of the weights of all its neighbors, it can be fixed:

Suppose there is a clique $Q$ and a node $v\in V$ whose neighborhood is a subset of the clique. If the weight of node $v$ is greater than or equal to the weight of each other node of the clique, this node can be added to a stable set and the whole clique can be removed from the graph. We obtain the following result.

Lemma 5.1. (Rebennack et al., 2011). Let $G_c=(V,E,c)$ be a node-weighted graph. If there exists a node $v_i\in V$ and a clique $Q$ of $G$ with the property $\Gamma(v_i)\subseteq Q$ and $c_i\ge c_j$ for all $v_j\in Q$, then there is a maximum stable set of $G$ which contains node $v_i$.
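A simple way to test the condition of Lemma 5.1 for a given node is to take $Q=\Gamma(v_i)$: if the lemma applies for any clique containing the neighborhood, it also applies for the neighborhood itself. The sketch below assumes the same hypothetical `adj`/`weight` dictionaries as a graph representation; the function name is illustrative only.

```python
def lemma_5_1_applies(adj, weight, v):
    """True if the neighborhood of v is a clique whose nodes all weigh at most weight[v]."""
    nbrs = adj[v]
    is_clique = all(b in adj[a] for a in nbrs for b in nbrs if a != b)
    return is_clique and all(weight[v] >= weight[u] for u in nbrs)
```

If the test succeeds, node `v` can be fixed into the stable set and `v` together with its neighborhood can be removed from the graph.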

The next theorem states that all nodes which have value 1 in an optimum solution of the LP relaxation of the edge formulation can be fixed without changing the optimum solution value.

Theorem 5.2. (Nemhauser and Trotter, 1974). Suppose $x^*$ is an optimal $(0,\tfrac{1}{2},1)$-valued solution of the LP relaxation of the edge formulation. There is a maximum stable set in $G$ that contains $\{v_i\in V : x^*_i=1\}$.

### 6. Branching

One standard branching idea for a 0–1 IP problem is to generate two subproblems: one (fractional) variable is set to a value of 1 in one subproblem and to a value of 0 in the other subproblem. However, this branching strategy leads to a very unbalanced branch and bound tree for maximum stable set problems. The reason is that setting a variable to a value of 1 has an impact on its neighborhood (all of its neighbors are excluded from the stable set), while setting a variable to a value of 0 has no consequence for the other variables. The branching idea of Balas and Yu (1986) avoids such unbalanced trees.

Let $G'=(V',E')$ be the subgraph induced by the set of nodes that are not fixed in the current subproblem. The goal in each subproblem is to find a maximum stable set (in $G'$) or to prove that $\alpha(G')\le LB$. Let $W\subseteq V'$ and assume that $\alpha(G[W])\le LB$ can be calculated. $W=V'$ implies that the subproblem can be fathomed. Otherwise, if $\alpha(G')>LB$, any maximum stable set must contain at least one node of the set $Z=V'\setminus W=\{v_1,\dots,v_p\}$. Based on that observation, Balas and Yu showed that every stable set whose cardinality exceeds the LB must be contained in one of the sets

This is also true for the weighted case with . This branching rule leads to p new subproblems. In each subproblem, node vi is set to a value of 1 and all nodes of are set to a value of 0. This ensures that each subproblem has one node which is fixed to be contained in a stable set, balancing the branch and bound tree.
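To make the rule concrete, the sketch below generates the fixings of the $p$ subproblems for one valid variant of this scheme, in which $v_i$ is taken to be the first node (in the chosen order) of $Z=V'\setminus W$ that belongs to the stable set. This particular choice is an assumption for illustration and may differ in detail from the sets used by Balas and Yu; the function name and the `adj` dictionary are hypothetical as well. In subproblem $i$, node $v_i$ is fixed to 1, and its free neighbors together with $v_1,\dots,v_{i-1}$ are fixed to 0.

```python
def branching_subproblems(free_nodes, adj, W, order=None):
    """Return a list of (node fixed to 1, set of nodes fixed to 0), one per node of Z."""
    free = set(free_nodes)
    # Z = free nodes outside W; the order of Z influences the tree, see the text.
    Z = list(order) if order is not None else sorted(free - set(W))
    subproblems = []
    for i, v in enumerate(Z):
        fixed_to_zero = (adj[v] & free) | set(Z[:i])
        subproblems.append((v, fixed_to_zero))
    return subproblems
```

Every stable set that contains at least one node of $Z$ is feasible for exactly one of these subproblems, namely the one indexed by its first node in $Z$, so no candidate better than the LB is lost.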

The size of W and the ordering of the nodes in Z can affect the performance of the branch and cut algorithm. The larger W is, the fewer subproblems are generated (which is desirable). However, the size of W is strongly affected by the quality of the LB. W can be determined for the cardinality stable set problem, for instance, with a clique covering (Balas and Yu, 1986; Rossi and Smriglio, 2001). Also, matchings or holes have been considered as a means to compute W (Sewell, 1998). The method of choice may also depend on whether the problem is a maximum-weighted stable set problem or a maximum-cardinality stable set problem.
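For the cardinality case, the clique-covering idea rests on a simple bound: a stable set contains at most one node of each clique, so a node set covered by at most LB cliques has stability number at most LB. A greedy sketch of this construction is given below. It is only an illustration under assumed names (`clique_cover_W`, `adj`, `lb`) and does not reproduce the specific covering procedures of the cited papers.

```python
def clique_cover_W(nodes, adj, lb):
    """Greedily build at most `lb` disjoint cliques; return (W, cliques)."""
    uncovered = set(nodes)
    cliques = []
    while uncovered and len(cliques) < lb:
        seed = min(uncovered)              # deterministic choice for the example
        clique = {seed}
        candidates = adj[seed] & uncovered
        while candidates:
            u = min(candidates)
            clique.add(u)
            candidates &= adj[u]           # keep only common neighbors; drops u itself
        uncovered -= clique
        cliques.append(clique)
    W = set().union(*cliques) if cliques else set()
    return W, cliques
```

The nodes left uncovered form the set $Z$ on which the branching takes place, so larger cliques lead to fewer subproblems.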

In addition, the choice of the branching variable has a great impact on overall performance. By branching on nodes with a high degree, the size of the tree can be reduced, which was empirically shown by Carraghan and Pardalos (1990). The reason is the previously mentioned observation that fixing the branching node also fixes its neighborhood. Sorting the nodes in each subproblem in ascending order of degree seems logical, given that the size of the tree is expected to decrease. However, sorting the nodes is computationally expensive, as the degrees of all nodes have to be calculated in each branching step before sorting. Interestingly, Sewell (1998) showed experimentally that for sparse graphs, the sorting is still efficient.

### 7. Generation of LBs via heuristics

LBs on the maximum (weighted) stable set are obtained by the (weighted) cardinality of any stable set. In a branch and cut algorithm as outlined in Fig. 3, LBs can only be obtained from an integer solution of an LP relaxation problem. Because all feasible solutions provide an LB, the use of heuristics is a prime means of computing LBs. The literature is rich on heuristic approaches for the stable set problem as well as for the maximum clique problem. Any of those methods might be worth considering in a branch and cut algorithm. A thorough literature review on heuristic methods is beyond the scope of this tutorial. Continuous approaches have been successfully applied for solving maximum stable set problems (maximum clique problems) (Gibbons et al., 1996; Busygin et al., 2002). Further, we want to mention the QUALEX algorithm (Busygin, 2006, 2007). The reason is twofold. First, the algorithm performs very well on various benchmark problems in terms of solution time as well as quality. Second, there is a well coded and maintained software package available (http://www.stasbusygin.org/).

Let us now discuss two heuristic methods that are particularly useful in a branch and cut algorithm for the maximum stable set problem.

#### 7.1. Rounding heuristic

After an LP problem has been solved, one applies an integer feasibility test. If this test fails, the separation or branching step is called. Between these steps, one can use heuristics to increase the global LB. With the LP solution, we construct a maximal stable set with Algorithm 7.1. The heuristic places all nodes whose corresponding variable has a solution value of 1 in a set $S$. This set is a stable set, assuming that solution $x$ satisfies the edge inequalities. The neighbors of the added nodes are marked such that they cannot become a member of $S$ (step 5). All nodes that are not added to $S$ are put into a set $F$ together with their weights (step 7). The weights are the LP solution values multiplied by the node weights of function $c$. Afterward, node set $F$ is sorted in ascending order with respect to the weighting (step 10). Step by step, nodes are taken from $F$ and added to set $S$ until no nodes are left in $F$. Before a node is added, it is checked whether the node has been marked before. This guarantees that $S$ defines a stable set in each step. The computational time is dominated by the sorting in step 10. Hence, Algorithm 7.1 has a running time of $O(|V|\log|V|)$.
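The rounding heuristic can be sketched in a few lines. The version below is an illustration under several assumptions that are not part of the original description: the graph is a dictionary `adj` of neighbor sets, a small tolerance `tol` decides whether an LP value counts as 1, and the sorted candidates are processed from the largest product of LP value and weight downward, which is one natural reading of the selection step.

```python
def rounding_heuristic(adj, x, weight, tol=1e-9):
    """Round an LP solution x (satisfying the edge inequalities) to a maximal stable set."""
    S, blocked, F = set(), set(), []
    for v in adj:
        if x[v] >= 1 - tol:
            S.add(v)              # variables at value 1 go into S directly
            blocked |= adj[v]     # their neighbors can no longer be chosen
        else:
            F.append((weight[v] * x[v], v))

    # Assumption: the most promising candidates (largest weighted LP value) come first.
    F.sort(key=lambda item: item[0], reverse=True)
    for _, v in F:
        if v not in blocked:
            S.add(v)
            blocked |= adj[v]
    return S
```

For example, on a cycle of five nodes with unit weights and all LP values equal to 1/2, the sketch returns a stable set with two nodes, which is optimal for that graph.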

#### 7.2. Improvement heuristic

Consider the graph of Fig. 16. Assume that a maximum-cardinality stable set problem is given along with a maximal stable set $S$ as indicated by the black filled nodes $v_2$, $v_4$, $v_5$, $v_7$ and $v_{11}$. The maximal stable set is not of maximum size. One could, for instance, put nodes $v_9$ and $v_{10}$ into $S$ and remove node $v_5$. This results in a cardinality increase. The obtained set (gray filled nodes) is again a stable set. The first reason is that the distance of node $v_5$ to all other nodes of set $S$ is at least 3. Here, the distance of two nodes is defined as the size of a shortest path connecting them. The second reason is that there are two shortest paths $(v_5,\,v_5v_9,\,v_9,\,v_9v_3,\,v_3,\,v_3v_2,\,v_2)$ and $(v_5,\,v_5v_{10},\,v_{10},\,v_{10}v_{13},\,v_{13},\,v_{13}v_{11},\,v_{11})$ connecting $v_5$ with $v_2$ and $v_{11}$, which have no node in common other than $v_5$. As the two nodes $v_9$ and $v_{10}$ are not adjacent, they can be exchanged in $S$ with $v_5$.

The described method is generally valid; we use this idea to formulate Algorithm 7.2. Three arbitrary nodes of a stable set are selected in step 3. Node $v_{top}$ is a candidate to be removed from the stable set. Therefore, two shortest paths of length 3, from node $v_{top}$ to $u$ and from $v_{top}$ to $v$, must be calculated. This is done in steps 4 and 6. If these paths exist, one has to check whether the exchange of node $v_{top}$ with the two nodes $u_1$ and $v_1$ of the paths leads to a stable set. The nodes $u_1$ and $v_1$ have to be non-adjacent. Another condition is that no node of the current stable set other than $v_{top}$ is adjacent to node $u_1$ or $v_1$. These conditions are checked in step 7. If they are met, the cardinality of the stable set can be increased (steps 8 and 9). After such an exchange, it is possible that the resulting stable set is not maximal even if its cardinality is larger than that of the original set. The reason is that one of the neighbors of the removed node $v_{top}$ may have no neighbor in the new stable set. This property is exploited in step 11.

#### Algorithm 7.1 Rounding Heuristic

Input:
• Graph $G=(V,E)$, LP solution vector $x$ and weight function $c$

Ensure that $x$ satisfies the edge inequalities

Output:
• Maximal stable set $S$

•  // Initialize
• 1: $S=\emptyset$, $F=\emptyset$, $y_i=-1\ \forall i\in V$
•  // Add all variables to $S$ which have a value of 1
• 2: for all $v_i\in V$ do
• 3:  if $x_i=1$ then
• 4:   $S=S\cup\{v_i\}$
• 5:   $\forall v_j\in\Gamma(v_i)$: $y_j=0$
• 6:  else
• 7:   $F=F\cup\{(v_i,\,c_ix_i)\}$
• 8:  end if
• 9: end for
• 10: Sort $F$ with respect to the weighting
•  // Add nodes to $S$ until the stable set is maximal
• 11: while $F\neq\emptyset$ do
• 12:  Select element $(v_i,\cdot)$ from set $F$
• 13:  $F=F\setminus\{(v_i,\cdot)\}$
• 14:  if $y_i\neq 0$ then
• 15:   $S=S\cup\{v_i\}$
• 16:   $\forall v_j\in\Gamma(v_i)$: $y_j=0$
• 17:  end if
• 18: end while
• 19: return Maximal stable set $S$

The complexity of checking all possible combinations might be too high for an efficient branch and cut algorithm. Therefore, one could randomly select some combinations and bound the running time by a time limit tmax (step 2). In order to check all combinations, one would have to check possibilities for each node of , u and v, and compute all combinations of the two paths with length 3. If δmax denotes the maximal degree in G, then there are at most such paths for each pair of nodes. This yields a total running time of .

Observe that Algorithm 7.2 can be generalized to the weighted case with . The exchange of the nodes of a stable set leads to a higher weight if c(vtop)<c(u1)+c(v1). This requirement has to be added to step 7.

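#### Algorithm 7.2 Improve Stable Set

Input:
• Graph $G=(V,E)$, stable set $S$ and time limit $t_{max}$

Output:
• Stable set $S'$ with $|S'|\ge|S|$

•  // Initialize
• 1: $S'=S$; start CPU timer $t_{CPU}$
•  // Try to improve $S'$ until time limit is reached
• 2: while $t_{CPU}<t_{max}$ do
• 3:  Select three nodes $v_{top}$, $u$ and $v$ from the stable set $S'$, arbitrarily
• 4:  Calculate path $P_u=(v_{top},\,v_{top}u_1,\,u_1,\,u_1u_2,\,u_2,\,u_2u,\,u)$ with Algorithm 7.3
• 5:  if $P_u\neq\emptyset$ then
• 6:   Calculate path $P_v=(v_{top},\,v_{top}v_1,\,v_1,\,v_1v_2,\,v_2,\,v_2v,\,v)$ with Algorithm 7.3
•    // Check the sufficient conditions for the exchange
• 7:   if $P_v\neq\emptyset$ and $u_1v_1\notin E$ and $(\Gamma(u_1)\cup\Gamma(v_1))\cap S'=\{v_{top}\}$ then
• 8:    $S'=S'\setminus\{v_{top}\}$
• 9:    $S'=S'\cup\{u_1,v_1\}$
• 10:   end if
• 11:   Add all possible nodes of $\Gamma(v_{top})$ to $S'$
• 12:  end if
• 13: end while
• 14: return $S'$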
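A compact sketch of the exchange step is shown below. It deviates from the listing in two labeled ways: instead of selecting triples at random, it enumerates ordered triples and restarts after every successful exchange, and it checks the exchange conditions while searching for the two paths. The names `find_exchange` and `improve_stable_set`, the `adj` dictionary of neighbor sets, and the requirement that nodes be sortable are assumptions made for the example.

```python
import itertools
import time

def find_exchange(adj, S, vtop, u, v):
    """Return (u1, v1) that can replace vtop in the stable set S, or None."""
    for u1 in adj[vtop]:
        # u1 must start a path vtop-u1-u2-u and see no node of S except vtop.
        if not (adj[u1] & adj[u]) or adj[u1] & S != {vtop}:
            continue
        for v1 in adj[vtop]:
            if v1 == u1 or v1 in adj[u1]:
                continue  # the two new nodes must be distinct and non-adjacent
            if adj[v1] & adj[v] and adj[v1] & S == {vtop}:
                return u1, v1
    return None

def improve_stable_set(adj, S, t_max=1.0):
    """Repeatedly swap one node for two until no exchange is found or time runs out."""
    S = set(S)
    start = time.process_time()
    improved = True
    while improved:
        improved = False
        for vtop, u, v in itertools.permutations(sorted(S), 3):
            if time.process_time() - start >= t_max:
                return S
            pair = find_exchange(adj, S, vtop, u, v)
            if pair is None:
                continue
            S.remove(vtop)
            S.update(pair)
            # Step 11: neighbors of vtop without a neighbor in S can be added.
            for w in adj[vtop]:
                if w not in S and not (adj[w] & S):
                    S.add(w)
            improved = True
            break
    return S
```

On a small graph in which node 5 is linked to nodes 2 and 11 by the paths 5-9-3-2 and 5-10-13-11, the sketch replaces node 5 by nodes 9 and 10, which mirrors the exchange described for Fig. 16.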

Consider once again Fig. 16. The maximal stable set computed with Algorithm 7.2 (gray filled nodes) is not maximum; but, in this case, the presented improvement algorithm fails to find a better stable set. Given that the maximum stable set problem is hard to approximate (cf. Section 'Introduction'), there is no guarantee regarding the quality of this algorithm if $P\neq NP$.

Improvement Algorithm 7.2 can be slightly modified. For instance, the paths $P_u$ and $P_v$ in steps 4 and 6 need not be calculated independently, and the conditions of step 7 can be checked while computing these paths. In addition, the nodes in steps 1 and 2 of Algorithm 7.3 should be selected randomly.

#### Algorithm 7.3 Compute Path

Input:
• Graph $G_c=(V,E,c)$, node $v_{start}\in V$ and node $v_{end}\in V$

Output:
• Path $P$ with length 3 from node $v_{start}$ to node $v_{end}$

•  // Check for all neighbors of node $v_{start}$ and $v_{end}$ if the neighbors are adjacent
• 1: for all $u_1\in\Gamma(v_{start})$ do
• 2:  for all $u_2\in\Gamma(v_{end})$ do
• 3:   if $u_1u_2\in E$ then
• 4:    return $P=(v_{start},\,v_{start}u_1,\,u_1,\,u_1u_2,\,u_2,\,u_2v_{end},\,v_{end})$
• 5:   end if
• 6:  end for
• 7: end for
• 8: return $\emptyset$
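The path computation translates almost directly into code. The sketch below assumes an `adj` dictionary of neighbor sets and returns the node sequence only, rather than the alternating node and edge sequence of the listing. As in the listing, it presumes that the two end nodes are non-adjacent and at distance 3, which holds when both belong to the same stable set and share no neighbor.

```python
def compute_path(adj, v_start, v_end):
    """Return [v_start, u1, u2, v_end] if such a path with three edges exists, else None."""
    for u1 in adj[v_start]:
        for u2 in adj[v_end]:
            if u2 in adj[u1]:
                return [v_start, u1, u2, v_end]
    return None
```

With neighbor sets stored as hash sets, the adjacency test is a constant-time lookup on average, so the search examines at most $\delta_{max}^2$ pairs.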

Computational experiments show the effectiveness of the improvement heuristics for random graphs, while the rounding heuristics perform better on the DIMACS benchmark graphs (Rebennack et al., 2011).

### 8. Conclusions

In this tutorial, we discussed several building blocks of any branch and cut algorithm for the stable set problem. The polyhedral aspects are a main focus, though we also review preprocessing, branching and heuristics tailored to a branch and cut framework.

In conclusion, the maximum stable set problems (maximum clique problems) are still computationally challenging to solve. Polyhedral approaches have reached their limitations in solving the toughest benchmark problems. Problems arising from practical applications are often even harder to solve, as the problems may lead to very large graphs with special structure to be exploited (Pardalos and Rebennack, 2010).

### Acknowledgements

The authors thank Alexandra Newman (Colorado School of Mines) for her careful proofreading of this tutorial. This research was partially supported by DTRA and NSF grants.

1. A node is called isolated if there are no adjacent nodes in the graph.

2. A triangle is a clique with three nodes.

3. The number of nodes contained in a cycle is called the length of the cycle.

4. The “t” stands for “trou”, the French word for “hole”.

5. A graph G is called almost-bipartite if there is a node v such that graph G−v is bipartite.

6. The line graph of G is the graph whose node set is the edge set of G and in which two nodes are adjacent if and only if their corresponding edges are incident to the same node in G.

7. A graph is called triangulated if it does not contain a chordless cycle of length at least four.

8. Two natural numbers p and q are called relatively prime if their greatest common divisor is 1, i.e., gcd(p, q)=1.

9. This is not obvious, but the vertices can be enumerated, for instance, with PORTA.

10. Proposition 4.11 also allows general valid inequalities.

### References

• , , , 1999. On maximum clique problems in very large graphs. In , (eds) External Memory Algorithms, DIMACS Series. American Mathematical Society, Boston, MA, pp. 119–130.
• , , , , 2001. TSP cuts which do not conform to the template paradigm. Computational Combinatorial Optimization LNCS 2241, 157–222.
• , , , , 2006. The Traveling Salesman Problem: A Computational Study. Princeton University Press, Princeton, NJ.
• , , 1992. Probabilistic checking of proofs; a new characterization of NP. In Proceedings 33rd IEEE Symposium on Foundations of Computer Science. IEEE Computer Society, Los Angeles, CA, pp. 2–13.
• , 2007. Resolution branch and bound and an application: the maximum weighted stable set problem. Operations Research 55, 5, 932–948.
• , , 1976. Set partitioning: a survey. SIAM Review 18, 4, 710–760.
• , , 1986. Finding a maximum clique in an arbitrary graph. SIAM Journal on Computing 14, 4, 1054–1068.
• , 1970. On maximum matching, minimum covering and their connections. In (ed.) Proceedings of the Princeton Symposium on Mathematical Programming. Princeton University Press, Princeton, NJ, pp. 303–312.
• , , , , 1999. The maximum clique problem. In , (eds) Handbook of Combinatorial Optimization. Kluwer Academic Publishers, Boston, pp. 1–74.
• , , 1976. Bounds on positive integral solutions of linear Diophantine equations. Proceedings of the American Mathematical Society 55, 299–304.
• , 1983. An Introduction to Convex Polytopes, Graduate Texts in Mathematics. Springer, Berlin.
• , 2006. A new trust region technique for the maximum weight clique problem. Discrete Applied Mathematics 154, 2080–2096.
• , 2007. A Least Squares Framework for the Maximum Weight Clique Problem. Available at http://www.stasbusygin.org (accessed January 2011).
• , , , 2002. A heuristic for the maximum independent set problem based on optimization of a quadratic over a sphere. Journal of Combinatorial Optimization 6, 3, 287–297.
• , 2003. Maximum independent set and related problems, with applications. Ph.D. Thesis, University of Florida.
• , A Lagrangian relaxation for the maximum stable set problem. International Transactions in Operational Research, submitted.
• , , , 2000. On the separation of maximally violated mod-k cuts. Mathematical Programming 87, 1, 37–56.
• , , 1990. An exact algorithm for the maximum clique problem. Operations Research Letters 9, 375–382.
• , , 1995. Separation problems for the stable set polytope. In , (eds) The 4th Integer Programming and Combinatorial Optimization Conference Proceedings, pp. 65–79.
• , , 1997. Wheel inequalities for stable set polytopes. Mathematical Programming 77, 389–421.
• , , 2002. Antiweb-wheel inequalities and their separation problems over the stable set polytopes. Mathematical Programming 92, 1, 153–175.
• , 1997. Low-dimensional 0/1-polytopes and branch-and-cut in combinatorial optimization. Ph.D. Thesis, Ruprecht-Karls-Universität, Heidelberg.
• , , , , , 2005. Recognizing Berge graphs. Combinatorica 25, 143–186.
• , , , , 2006. The Strong Perfect Graph Theorem. Annals of Mathematics 164, 51–229.
• , , 2007. Semidefinite programming relaxations for graph coloring and maximal clique problems. Mathematical Programming B 109, 345–365.
• , , , , 2001. Branch-and-cut algorithms for combinatorial optimization and their implementation in ABACUS. Computational Combinatorial Optimization LNCS 2241, 157–222.
• , 2007. Maximally violated mod-k cuts: a general purpose separation routine and its application to the linear ordering problem. Master's Thesis, Ruprecht-Karls-Universität, Heidelberg.
• , , , 1995. Experimental analyses of the life span method for the maximum stable set problem. The Institute of Statistical Mathematics Cooperative Research Report 75, 135–165.
• , , 1979. Computers and Intractability, A Guide to the Theory of NP-Completeness. W. H. Freeman and Company, New York.
• , , 1986. Matrices with the Edmonds-Johnson property. Combinatorica 6, 365–379.
• , , 2006. Exploring the relationship between max-cut and stable set relaxations. Mathematical Programming 106, 1, 159–175.
• , , , 1996. A continuous based heuristic for the maximum clique problem. In , (eds) Clique, Graph Coloring, and Satisfiability: Second DIMACS Implementation Challenge, Vol. 26 of DIMACS Series. American Mathematical Society, pp. 103–124.
• , , , , 1997. A continuous characterization of the maximum clique problem. Mathematics of Operations Research 22, 3, 754–768.
• , , , 1984. A cutting plane algorithm for the linear ordering problem. Operations Research 32, 6, 1195–1220.
• , , , 1988. Geometric Algorithms and Combinatorial Optimization, Algorithms and Combinatorics 2. Springer, Berlin.
• , , 1981. Weakly bipartite graphs and the max-cut problem. Operations Research Letters 1, 23–27.
• , 2000. Graph theory in practice: part I. American Scientist 88, 1, 9.
• , , 1997. Business Optimization using Mathematical Programming. Macmillan, New York.
• 1972. Reducibility among combinatorial problems. In , (eds) Complexity of Computer Computations: Proceedings of a Symposium on the Complexity of Computer Computations. Plenum Press, New York, pp. 85–103.
• , , 1996. Edge projection and the maximum cardinality stable set problem. DIMACS Series in Discrete Mathematics and Theoretical Computer Science 26, 249–261.
• , 1978. Using cutting planes to solve the symmetric traveling salesman problem. Mathematical Programming 15, 177–188.
• , , 1974. Properties of vertex packing and independence system polyhedra. Mathematical Programming 6, 48–61.
• , , 1988. Integer and Combinatorial Optimization, Wiley-Interscience Series in Discrete Mathematics and Optimization. John Wiley & Sons Inc, New York.
• , 1973. On the facial structure of set packing polyhedra. Mathematical Programming 5, 1, 199–215.
• , , 1987. Optimization of a 532 city symmetric traveling salesman problem by branch and cut. Operations Research Letters 6, 1, 1–7.
• , , , 1993. Test case generators and computational results for the maximum clique problem. Journal of Global Optimization 3, 463–482.
• , , 1990. A global optimization approach for solving the maximum clique problem. International Journal of Computer Mathematics 33, 3–4, 209–216.
• , , 2010. Computational challenges with cliques, quasi-cliques and clique partitions in graphs. In (ed.) Lecture Notes in Computer Science, Vol. 6049/2010. Springer, Berlin, pp. 13–22.
• , , 1992. A branch and bound algorithm for the maximum clique problem. Computers and Operations Research 19, 5, 363–375.
• POlyhedron Representation Transformation Algorithm (PORTA), Version 1.4.0. Available at http://www.informatik.uni-heidelberg.de/groups/comopt/software/PORTA (accessed January 2011).
• , (eds), 2001. Perfect Graphs. John Wiley & Sons, New York.
• , 2006. Maximum stable set problem: a branch & cut solver. Master's Thesis, Ruprecht-Karls-Universität Heidelberg.
• , 2008. Stable set problem: branch & cut algorithms. In , (eds) Encyclopedia of Optimization, (2nd edn), Springer, Berlin, pp. 3676–3688.
• , , , , , , 2011. A branch and cut solver for the maximum stable set problem. Journal of Combinatorial Optimization. DOI: 10.1007/s10878-009-9264-3.
• , , 2001. A branch-and-cut algorithm for the maximum cardinality stable set problem. Operations Research Letters 28, 63–74.
• , 2003. Combinatorial Optimization: Polyhedra and Efficiency, Vol. 24 of Algorithms and Combinatorics. Springer, New York.
• , 1998. A branch and bound algorithm for the stability number of a sparse graph. INFORMS Journal on Computing 10, 4, 438–447.
• , 2005. Polyhedra and algorithms for the general routing problem. Ph.D. Thesis, Ruprecht-Karls-Universität, Heidelberg.
• , 2007. A branch, price, and cut approach to solving the maximum weighted independent set problem. Ph.D. Thesis, Texas A&M University.
• , , , , 2005. A branch-and-price approach for the maximum weight independent set problem. Networks 46, 4, 198–209.
• , 1995. Lecture on Polytopes, Graduate Texts in Mathematics. Springer, Berlin.