Task allocation and trajectory planning for multiple agents in the presence of obstacle and connectivity constraints with mixed‐integer linear programming

This article addresses the problem of maneuvering multiple agents that must visit a number of target sets, while enforcing connectivity constraints and avoiding obstacle as well as interagent collisions. The tool to cope with the problem is a formulation of model predictive control including binary decision variables. In this regard, two mixed‐integer linear programming formulations are presented, considering a trade‐off between optimality and scalability between them. Simulation results are also shown to illustrate the main features of the proposed approaches.

task is necessary to avoid interruptions of the coordination algorithms. 6 For modeling purposes, a common assumption states that a communication link is established between two agents if they are within a given distance. The kind of problem addressed in the present work is similar to Kantaros and Zavlanos (2016) 7 and Stephan et al (2017). 8 In Kantaros and Zavlanos (2016) 7 spatially distributed objectives (therein termed tasks) are assigned to a subgroup of the agents (the so-called leaders), while the remaining ones move to comply with the connectivity constraints. However, the initial task assignment does not consider obstacles and collisions between the agents, which are taken into account later in the planning phase. Another recent work that performs the task assignment and the planning in two distinct hierarchical layers can be found in Stephan et al (2017). 8 Considering a predefined task assignment, that is, an a priori definition of the final configurations for each agent, Filotheou et al (2018) 9 addressed the problem of guiding MAS in which each agent presented nonlinear second-order Lagrangian dynamics while avoiding collisions and maintaining connectivity. Recently Karkoub et al (2019) 10 proposed an algorithm for the control of MAS with unicycle dynamics endowed with interagent collision avoidance and connectivity maintenance capabilities. However, the article is restricted to the agent steering problem, under the assumption that a preplanned trajectory is available. The higher layer concerning trajectory planning was not addressed.
Many control laws have been proposed in the literature to maintain connectivity. 6 In a more precise manner, the interest here is to develop algorithms that guarantee the communication graph stays connected if it is so at the beginning of the task. Early researches in this field usually worked with a concept termed local connectivity, where the control law is developed to ensure that each initially active communication link will remain active for the rest of the task.
Although keeping the local connectivity of each communication link guarantees the connectivity of the overall communication graph, it is often overly conservative and the agent team may be better off to fulfill its objectives if global connectivity is considered. At each time instant, global connectivity refers to the usual connectivity concept from graph theory: the communication graph may have missing links, as long as a path exists between each pair of agents. Since the agents move, creation, and deletion of links may happen dynamically, so global connectivity does not ensure local connectivity.
Task allocation and motion planning stem as fundamental problems in many applications. Many motion planning techniques for MASs are extensions of the ones developed for a single-mobile agent, which may be categorized mainly into potential fields, 11 graph-based, 12 or sampling-based 13 approaches. In a MAS, the agents should not only avoid collision with environment obstacles but also take interagent collision avoidance into account. 5 One alternative commonly present in the literature to simplify the problem is to assume that some agents are "leaders" and others are their so-called "followers." In such formulations, usually only the leaders are assigned tasks and the followers are required to be kept connected, but not necessarily perform any task. One interesting result for single-integrator agents are the bounds on the ratio of leaders to followers that ensure connectivity can be kept. 14 However, such formulations usually attain suboptimal results in view of the simplifications brought by the assignment of agents to the roles of either leaders or followers. Thus, in the present article we aim at coping with a topology in which each agent has the same role, being free to move as to perform the tasks (herein target set visitations) optimally, provided that the constraints are enforced, including connectivity maintenance and collision avoidance.
An approach that uses model predictive control (MPC) with a mixed-integer linear programming (MILP) encoding to allow trajectory planning endowed with circumvention of polygonal obstacles was proposed in Richards and How (2006) 15 in the context of a single-agent subject to disturbances. In the undisturbed case, a similar approach to a MAS was proposed in Schouwenaars et al (2001). 16 Moreover, Prodan et al (2012) 17 proposed a scheme with two layers, namely, a higher layer that assigns tasks represented as target states to each agent at the end of the horizon and a lower layer that optimizes the trajectory of the agents toward their assigned tasks while avoiding collisions between them. Although this method solves task allocation and trajectory planning within a MPC framework, it does so in separate layers, resulting in suboptimality, since the optimization solver is not able to exploit the relationship between task allocation and trajectory planning.
This article presents a MILP formulation for task allocation and trajectory planning, which respects collision avoidance and connectivity constraints. Our work follows on the steps of previous MILP formulations for trajectory planning, 15,16,[18][19][20] but novel contributions are brought to the field. To the best of our knowledge, our formulation is the first one to explicitly consider global connectivity in MAS trajectory planning using an optimization framework. We may add that a MPC formulation permits us to handle constraints that are relevant for real-world agents, such as actuator saturation. In this regard, many proposed control laws only guarantee connectivity maintenance if high bandwidth unbounded control effort is available, 21 which is unrealistic on real-world agents. Furthermore, in our problem, the agents may collect rewards within specific regions during the task. Any agent may collect any reward, thus we also encode this task allocation within the MILP, allowing an optimal choice of which agent should collect which rewards.
The main contributions are 2-fold: (i) the global connectivity problem is reformulated in a graph theoretical framework and we propose constraints that enforce that the graph has a spanning tree, which ensures connectivity. Such an approach in the context of global connectivity maintenance in a MAS is novel, to the best of our knowledge; (ii) we propose alternative constraints to ensure the existence of a spanning tree within the graph, reducing the growth in the number of constraints from exponential to linear regarding the number of agents, at the cost of additional conservatism. Moreover, we propose an algorithm to perform the initial enumeration of the agents and demonstrate that it yields a feasible optimization problem, even with the more conservative constraints.
The remaining of this article is organized as follows. Section 2 provides theory background and explains the mathematical formulation of the trajectory planner. In Section 3, simulation results are shown to validate the MILP approach presented here. Finally, Section 4 concludes and shares our ideas for future investigation. Constant subtracted from the right-hand side of the nonstrict inequalities "≤" to make the left-hand side obey a strict inequality "<" with respect to the remaining terms on the right-hand side x c,max ∈ R Bound on the maximal distance between agents in the x direction for communication range y c,max ∈ R Bound on the maximal distance between agents in the y direction for communication range

O(•)
Notation to indicate that the order (either memory usage or computation time) of an algorithm is dominated by the quantity • T G A tree r Root node of a tree L A list of the vertices obtained by traversing a tree in postorder and their indices obtained during the traversal  i Set of children of the ith node of a tree, where i is the index of this node in postorder 2

MATHEMATICAL FORMULATION
In this section, the problem of maneuvering a connected MAS through an obstacle field populated with target sets is formulated in mathematical terms using graph theory. Two encodings of the connectivity constraints are presented. Both ensure connectivity maintenance throughout the maneuver, the difference being that the first one models exactly the connectivity problem, at the cost of an exponential growth in the number of constraints with the number of agents, whereas the second is more conservative, with the advantage of a polynomial (linear) dependence of the number of constraints upon the number of agents.

Multiagent communication graph
When global connectivity is of concern, we may encode the MAS from a communication standpoint as a time-varying graph G(t) = {V, E(t)}, 6 where the set of vertices V = {v 1 , v 2 , … , v N a } represents the agents and the set of edges represents the active communication links between agents at time t. In this work, we consider that a communication link is established between two agents if they are within communication range of each other.
Since we are not interested in properties of the communication link, but merely in its existence, no weights are considered for the edges. Moreover, communication is considered to be bidirectional, thus the graph G(t) is undirected. Finally, self-links are considered to be of no interest, so no edge (v i ,v i ) is ever present in G(t).
In this graph framework, agent i may send information to agent j if there is a path from v i to v j in G(t). The graph G(t) may be represented by an adjacency matrix A G(t) ∈ R N a ×N a . Let us denote a ij (t) as the element at row i and column j of A G(t) , a ij (t) = 1 if the edge (v i ,v j ) exists in G(t), while a ij (t) = 0 otherwise. Given that G(t) is undirected, we also have a ij (t) = a ji (t),∀i,∀j. Therefore, formally, the MAS has global connectivity at time t if for each v i ∈ V and v j ∈ V, there is a path between v i and v j .
For the sake of providing background for the following discussions, we will introduce some additional concepts from graph theory. For a more in-depth discussion, the interested reader may refer to textbooks on graph theory, such as Cormen et al (2009). 22

Definition 2.
A path is a nonempty graph P = (V P ,E P ) such that V P = {v P,1 , v P,2 , … ,v P,n } and where v P,i ≠v P,j , for i = 1, 2, … ,n and i≠j.

Definition 4.
A tree is a connected undirected graph from which none of the induced subgraphs are cycles.

Definition 5.
A spanning tree T G = (V,E T ) of an undirected graph G = (V,E) is a subgraph of G that is a tree.
Remark 1. A connected graph G has at least one spanning tree, and may have several if there are cycles in G. A graph that is not connected does not have a spanning tree.
Remark 2. A spanning tree is a subgraph with the minimum possible number of edges while maintaining connectivity, since the inexistence of cycles means that removing any edge would split the graph into two connected components. Theorem 1. If an undirected graph G = (V,E) has |E|=|V| − 1 and does not have cycles, then G is a tree.
The proof of Theorem 1 is omitted for brevity, as it is standard in graph theory.

Mixed-integer linear programming for trajectory planning
In the present section, the employed model and notation are introduced alongside a discussion on how the maneuvering problem of an agent avoiding obstacles may be reformulated as a MPC-MILP optimization problem. The adopted formulation is that of Richards and How (2006), 15 but a simplification is made: only the nominal (undisturbed case) is considered. This simplification allows us to focus on the contributions of the present article, namely, coping with trajectory planning for MAS encompassing obstacle and interagent collision avoidance, flexible assignment between agents and visited target sets, and connectivity constraints within a single MILP. However, the constraint-tightening approach in Richards and How (2006) 15 can be seamlessly incorporated to the proposal within the present work to enable robustness to unknown disturbances. A discrete-time linear model involving positions, velocities, and accelerations of the agent in a plane with axes x and y is considered. The inputs are a x and a y (accelerations along each axis). The velocities v x and v y and positions x and y are components of the state vector. The discrete-time model can be written as x k+1 = Ax k + Bu k , with the state and control vectors x T = [x v x y v y ] ∈ R n x and u T = [a x a y ] ∈ R n u , respectively. For a unit sampling time, the model matrices are which correspond to double integrator dynamics in each axis. The maneuver in Richards and How (2006) 15 consists of moving the agent starting at the initial state x initial to a polyhedral target set  = {P target Cx k ≤ q target }, where C is a matrix that extracts position information from the state vector, while avoiding N o polyhedral obstacles  j , 1 ≤ j ≤ N o , and minimizing a weighted time-fuel cost function. The state and control are constrained to admissible sets  and  , respectively, such that 0 ∈  , with  and  also polyhedral. The optimization problem results in a sequence of inputs u k that steer the state from x 0 = x initial to a point in . In this article, the notation for predicted values will simply be an index k referring to the prediction at the kth instant from the current instant (k = 0), as in Fleming et al (2015). 23 The maneuvering optimization problem is posed as: subject to The choice of the weight ∈ R + balances between optimization of the cost of fuel during the maneuver and the time to reach the terminal set, represented by the optimization variable N. The maneuver is concluded at time N + 1, as seen in (3f).
This problem involves two major difficulties for using numerical optimization algorithms, namely: the variable horizon N and the loss of convexity of the feasible set due to the obstacle avoidance constraints in (3e). These issues are both dealt with by the introduction of binary optimization variables in the problem, along with the so-called "big-M" approach. 24 The interested reader is referred to some of the following references for a thorough presentation of the schemes employed in writing a MILP whose solution is equivalent to the solution of Problem 1: Schouwenaars et al (2001), 16 Richards et al (2002), 25 and Richards and How (2006). 15 For general discussions about mixed-integer programming and its application in control we refer the reader to Jünger et al (2009), 26 Vielma (2015), 27 and Prodan et al (2016). 28 The variable horizon issue is dealt with by using one binary variable b hor k for each time instant and imposing the terminal set constraints at every time instant along a fixed horizon T, using the "big-M" method to relax the constraints up until the target is reached. Moreover, the optimal solution of Problem 1 does not impose the state, control, and obstacle avoidance constraints after the variable horizon N (reaching of the target set). Therefore, the state, control, and obstacle avoidance constraints written up to the fixed horizon T are relaxed after the target set is reached. Additionally, the equality constraints in (3b) can be split into two inequalities and relaxed after the horizon. Now, with each constraint relaxed after the horizon, since 0 ∈  , the optimal solution is to make u k = 0, k>N. Notice that for this to hold one must have T greater than or equal to the optimal N. If that is the case, the optimal solution of the MILP problem with the presented "big-M" constraints is the same as the one of the original Problem 1, but can now be obtained via faster and more reliable numerical methods.
On the other hand, the nonconvexity of the exclusion constraints in (3e), that is, the obstacle avoidance constraints, requires additional use of binary variables. If the sets  j , 1 ≤ j ≤ N o are polyhedra with N os faces each, then the obstacles can be represented as the intersection of N os half-planes where the inequality is elementwise. Generalization to polyhedra with different number of faces is straightforward. For a point to be outside the obstacle  j , it suffices that one of the N os inequalities is not respected, therefore the avoidance problem can be written by using "or" constraints, with the choice being that one of the N os inequalities does not hold, that is, ∃n, 1 ≤ n ≤ N os | (p obs j,n ) T Cx k > q obs j,n , where (p obs j,n ) T is the nth row of P obs j and q obs j,n the nth element of q obs j . These "or" constraints can be imposed using the "big-M" method. 24 One interesting enhancement to the formulation in Richards and How (2006) 15 was proposed in Maia and Galvão (2009) 18 involving the inclusion of additional constraints on the obstacle avoidance binary variables. The aim was to ensure that the consecutive positions of the agent at times k and k + 1 did not produce a continuous-time trajectory that intersected the obstacle. This procedure was later made more economic in Richards and Turnbull (2015). 19 The elegant result in Richards and Turnbull (2015) 19 is that, to avoid intersample collision with the obstacles, it suffices to impose the same avoidance collisions at consecutive time instants. A later advancement was reported in Stoican et al (2018). 20 Using the so-called "shadow regions" the authors characterized the regions that may be reached by the agent in one time step without colliding with the obstacles and merged them in a collection of convex polyhedra, which may require less binary variables. However, the ensuing optimization problem is a mixed-integer nonlinear program (MILNP), which is known to be more cumbersome. Simplifications could be introduced to recast the problem into a MILP form, but at the cost of losing some of the benefits of the shadow region method. Therefore, in the present article, the approach in Richards and Turnbull (2015) 19 was adopted for its simplicity.
Finally, to cope with the appearance of the norm one of the control variables in (2), while keeping a linear cost function, one may use additional real-valued variables k, , k = 0, 1, … ,T, = 1, 2, … , n u and constraints as in Section 6.1.1 of Boyd and Vandenberghe (2009). 29 Thus ,(T + 1)n u additional continuous variables and 2(T + 1)n u inequalities are required.

Multiagent trajectory planning with interagent collision avoidance
A formulation of the maneuvering problem using MPC-MILP for multiple agents is presented in Schouwenaars et al (2001). 16 This multiagent planner is not a direct extension of the one presented in Subsection 2.2, since it does not implement some of the features of the latter one, such as the variable horizon and the intersample avoidance constraints. However, two additional characteristics are interesting to present: extension to the multiple agent case and collision avoidance between agents. Such is also the case in the planner formulation in Richards et al (2002), 25 in which the terminal states can be chosen by the agents, but the horizon is still fixed. In this case, the MAS consists of N a agents. Although in Schouwenaars et al (2001) 16 and Richards et al (2002) 25 the possibility of each agent having different dynamics is considered, this will be not addressed in the present work, but the reader can see that extension to this case is easily done. Thus, each agent will be supposed to have the dynamics given as For each agent, the state vector is defined similarly as in the single agent case as The formulation also contemplates state and command limits and obstacle avoidance by employing the following constraints (once again different limits for each agent could be assumed, but this is not done so in the present article to simplify the exposition): (7) Constraints (5), (6), and (7) refer to state limits, command limits, and obstacle avoidance, respectively. In Schouwenaars et al (2001) 16 and Richards et al (2002) 25 rectangular obstacles are considered as defined by their lower left and upper right extremes [x j,min , y j,min ] T and [x j,max , y j,max ] T , respectively, for j = 1, 2, … ,N o . Notice that this is a particular case of the obstacle constraints presented in Subsection 2.2. The constant is subtracted from the right-hand side of the nonstrict inequalities "≤" to make the left-hand side obey a strict inequality "<" with respect to the remaining terms on the right-hand side, thus removing the possibility that the agent occupies a position at the boundary of the obstacles.
In this multiagent setting, agents may collide with each other, so new constraints must be added for interagent collision avoidance. As in the case of obstacle avoidance, assuming that the safe region around an agent is represented by a polyhedron {P col Cx ≤ q col }, to impose that the agents do not collide it is sufficient to impose the following constraints (similar to the obstacle avoidance ones): where each b col i, ,n,k is a binary variable employed for the interagent collision avoidance mechanism, (p col n ) T is the nth row of P col , q col n is the nth element of q col , and has the same role as in (7). The idea here is very similar to the one used for obstacle avoidance, with the remark that, since the safe region around each agent is assumed to be the same, if agent i does not collide with agent , this pair does not need to be considered twice, which allows the reduction in the number of binary variables expressed by the last equality in (8).

Multiagent trajectory planning with connectivity assurance
The problem to be solved in the present article is that of guiding one of N a agents (which one of the agents is also an optimization variable) to a fixed terminal target set, collecting rewards when one of the agents passes through one of the other N t − 1 intermediate targets (if possible and profitable, that is, if it is feasible and if the time and fuel consumption to pass through an intermediate set is compensated by the reward it offers), while maintaining connectivity of the graph representing the communication network of the agents, avoiding collisions between agents and the N o obstacles and between pairs of agents. As such, the proposal within this article allows complete flexibility, since there is no preassignment of an agent to a target set nor a leader is chosen beforehand, rather these choices are made during the optimization to cope with the constraints and achieve the minimal cost. Therefore, this formulation can be regarded as a strategy with no leader, the agents being bound only by collision avoidance and communication ranges around each of them. The cost function that encompasses these objectives is: The agents are assumed to have a certain communication range, which is common for all agents, represented as a polyhedron with N sc faces: {P con Cx ≤ q con }. Therefore, the pair of agents i and is considered connected at instant k if {P con C(x i,k − x ,k ) ≤ q con }. Considering symmetric polygons with respect to the origin allows the conclusion Minor changes must be made to deal with the case where the polyhedron is not symmetric. In particular, in the present article the communication ranges will be defined by the bounds on the absolute values of the distances in each axis, namely, x c,max and y c,max , resulting in a rectangular communication range with sides parallel to the x and y axes. Any other form that results in a polyhedron could be used, therefore this presents no restriction of the proposal. In fact, even scenarios where the communication range may be better described as a circle can be coped with by inscribing a polygon in the circle. The enforcement of the polygonal constraints would ensure that the circle communication connectivity constraints are also respected. The amount of conservatism introduced by the approximation of the circle by a polygon can be reduced by increasing the number of sides of the polygon, at the cost of an increase in the number of constraints.
We remark that the adjacency matrix does not penalize the distances between the agents in our formulation. Penalizing the distances between agents might steer the optimal solution toward keeping the agents within communication range of one another, although no guarantee of connectivity arises by solely doing this without the connectivity constraints. On the other hand, weights depending on the distance between agents might lead to a more cumbersome optimization problem, as the weights in the adjacency matrix themselves would be optimization variables. For example, if one intends to use the euclidean distance, the relation between the weights and the positions of the agents would be nonlinear.
In order to impose the constraints over a horizon of at most T + 1 time steps, T + 1 binary variables are necessary. Moreover, constraints must be added to the classical maneuvering optimization Problem 1, resulting in Problem 2.
subject to For the sake of completeness, all the constraints involved in the proposed formulation are presented in Problem 2; however, we will focus our discussion here on the ones that were added to the problem in the present article. Constraints (11h) impose obstacle avoidance, while (11i) enforce the corner cutting avoidance, as described in Richards and Turnbull (2015). 19 One of the complements of the half-planes defining an obstacle side must be occupied by the agents at every sample time. This is ensured by (11q), in which a small difference regarding the encoding in Schouwenaars et al (2001) 16 and Richards and How (2006) 15 can be noticed: in the former papers, the choice is to consider the binary variable equal to one when the avoidance constraint is relaxed by the "big-M" procedure and zero otherwise, as opposed to the present formulation, where the binary variable is one when the corresponding constraint is imposed and zero in case it is relaxed. This small difference does not imply any negative consequence, if constraints (11h), (11i), and (11q) are modified accordingly, as was done.
Collision between each pair of agents is prevented by constraints (11j), which are similar to obstacle avoidance constraints. Therefore, at least one half-plane associated with the faces of the agent polyhedron must be imposed, which is done by (11r). Again, the same remark highlighted in the previous paragraph is valid: the choice for the encoding is opposed to the one in Schouwenaars et al (2001), 16 but the constraints are accordingly modified and the change is immaterial to the results.
A certain number of target sets may be visited; therefore, there are decision variables related to the choice of visiting a target set or not. These are associated with rewards that diminish the cost function and may only be collected if one of the agents occupies a position within the associated target set at one instant, as imposed in (11k). The decision binary variables are constrained as in (11n) to allow collection of the rewards only once per target. Moreover, the optimization must be considered up to the instant when the last target set is visited. This is ensured by (11o), which imposes that the optimization horizon is greater or equal than all the instants of visitation of the target sets. Moreover, (11p) imposes that the last target set in the list of possible target sets should be the last visited. The maneuver is guaranteed to finish within the maximal horizon T + 1 by the constraint in (11m). This last feature allows a maneuver with a goal end position and consideration of additional opportunities of visiting other intermediate targets if convenient.
The main contribution of this work is related to (11l). This constraint, associated with (11t), (11u), and (11v), imposes that each agent must be within the communication range of at least one other agent and that no pair needs to be considered twice (meaning that, if it was imposed that agent i is already considered in the communication range of agent j, than agent j is not again imposed to be within the communication range of agent i). Note that a polyhedral communication area must be considered. Despite a circular communication area being a more natural model, this is avoided as there would be a need to include quadratic constraints, which would make the optimization problem more challenging. In practice, the communication polyhedron may have as many faces as desired, therefore an arbitrarily precise approximation to a circle may be considered at the cost of adding inequalities to the optimization problem. For illustration, N os = 4 was considered in the simulations in the present article.
These constraints also impose that the graph must be connected. The equalities in (11v) encode the fact that the connectivity is bidirectional; therefore, there is no need to reconsider the terms b con i,j,k in the lower left triangles of the adjacency matrices (recall that the communication graph is time-varying, so we must have one adjacency matrix per time step), actually reducing the number of necessary binary variables.
For each time step k, Theorem 2 formally attests that the constraints (11t) and (11u) only permit generation of connected graphs. Moreover, all possible connected graphs are taken into account, as discussed later. First of all, we may say that these constraints are selecting a spanning tree for the communication graph.

Theorem 2. At time step k, satisfaction of constraints (11t) and (11u) guarantees that the communication graph is connected.
Proof. Theorem 1 imposes two conditions for a graph G = (V,E) to be a tree: |E|=|V| − 1 and no cycles must be present. Constraints (11t) directly impose the first condition. The second one is guaranteed by constraints (11u), which are called subtour elimination constraints in the literature. 30 These constraints prevent the existence of cycles for each induced subgraph, therefore inexistence of cycles for the graph follows.
In fact, assume, for the sake of contradiction, that a cycle exists, then one could consider the induced subgraph G(S) = (S,E(S)), where S is the set of the vertices in this cycle. For this cycle, we have |S| vertices, so counting the edges due to the cycle entails at least |S| edges, that is, |E(S)| ≥ |S| (edges other than the ones in the cycle may also be present) and constraints (11u) do not hold, which is a contradiction. Therefore, the communication graph imposed by constraints (11t) and (11u) must a tree. Since a tree is connected by definition, the connectedness of the communication graph follows. ▪ To verify that the tree imposed by these constraints is really a spanning tree of a less restricted connected communication graph which may be chosen by the optimization, observe that the "big-M" technique used in (11l) works by imposing the constraint if the edge is present in the adjacency matrix, but simply relaxing it otherwise. In other words, the existence of an edge constrains the two agents to be within communication range; however, the inexistence of it leaves the optimization free to decide whether these agents should be placed within communication range or not. Hence, the actual communication graph may have more links than the ones seen by constraints (11t) and (11u), which is actually choosing a spanning tree. Since these constraints are able to generate any spanning tree and no constraints are imposed to the other (redundant) edges of the graph, the optimization is free to select the most convenient connected graph.
On the other hand, the amount of constraints introduced by (11u) may limit the applicability of the proposal in this article due to scalability issues. Since each induced subgraph must be considered, a priori we would need 2 N a of these constraints. In practice, some of these cases do not make sense or trivially hold. For |S| = 0 and |S| = 1, no edges are present. For |S|=N a , constraint (11t) is already a more restricted version. For |S| = 2, the constraints reduce to: which are trivially true given that b con i, ,k are binary variables. Thence, (11u) actually introduces (T + 1)(2 N a − N a − 2 − N a (N a − 1)∕2) constraints, that is still O (T2 N a ), that is, the number of constraints increases exponentially with the number of agents, which is theoretically undesirable for scalability. In practice, in terms of optimization time, this is not much of a downsize, as our simulation results support. On the other hand, if one considers memory usage, a larger number of inequalities may pose stringent requirements on hardware.

More restrictive formulation using fewer constraints
To cope with scalability issues in the number of constraints, an alternative heuristic formulation is presented, which requires less constraints and is shown to always produce a connected graph (though possibly suboptimal). Consider replacing the constraints (11t) and (11u) with the following alternative constraints: which impose that a single edge exists in each row of the adjacency matrix, except for the last one, which must have none.
In this case, we have merely (T + 1)(N a − 1) = O(TN a ) constraints, thus the optimization problem scales much better. The graphs generated by this constraint are also connected, in fact they are also spanning trees as before. This fact is formally stated as Theorem 3.
To better grasp Theorem 3 and other following discussions, let us introduce an intuitive representation of the possible assignments for binary variables b con i, ,k . For a given time step k, we may associate an adjacency matrix to these binary variables. Due to constraint (11v), no self-loops exist and all elements below the main diagonal are zero. Therefore, we may think of possible assignments for b con i, ,k as building a directed graph where b con i, ,k = 1 means the directed edge (v i ,v ) exists.
Proof. Consider any of these such graphs G = (V,E) with |V|=N a . First, since the constraint guarantees that each row of the adjacency matrix has exactly one active column, except for the last none, which has none, and we have N a − 1 rows, the graph has N a − 1 edges. Let us assume, by absurd, that cycles exist in the graph. Without loss of generality, let us pick one of these cycles such that its set of edges is: Figure 1A shows this loop. Considering the edge (v q ,v p ) of this undirected graph, there are two possible assignments for the adjacency matrix: b con q,p = 1 or b con p,q = 1 (where we dropped the time step index for conciseness). Due to (11v), b con i,j = 0, ∀j ≤ i, thus b con q,p = 1 and b con p,q = 1 imply q < p and p<q, respectively. Let us consider these two cases separately. If b con q,p = 1, we must have q < p. Since no other column in row q may be active due to (13), we necessarily have b con q−1,q = 1 for edge (v q−1 ,v q ) to exist. Analogously, we have b con q−2,q−1 = 1, … , b con p+1,p+2 = 1, b con p,p+1 = 1, which impose the following numbering between vertices (due to (11v)): p<p + 1 < · · · < q − 1 < q. This situation is illustrated in Figure 1B. Thence, q < p and p<q creates a contradiction.
Therefore, the graph cannot have any cycle. Since the graph has |E|=|V| − 1 edges and no cycles, by applying Theorem 1, we prove that this graph is a tree. ▪ Still, the not obvious downside is that this new formulation is suboptimal since the alternative set of constraints does not generate some spanning trees. To verify this, we will show a counterexample. If N a = 3, V = {v 1 , v 2 , v 3 } and we have three possible spanning trees: However, the spanning tree G 3 violates constraint (13) and will not be taken into account by the optimization. To see this, consider the adjacency matrix of this graph: where the symbol * was used to denote values that do not matter. The sum of the first row yields 2 in this example, then constraint (13) does not hold. In general, this comes from the fact that agents with a higher index have the privilege of being able to have a higher degree in this formulation. Consider the first agent, the only edge this agent may be in the chosen spanning tree comes from the first row, so the maximum degree for this agent is 1. For agent 2, the edges may come from rows 1 and 2, then the maximum degree is 2 in this case, and so on. Therefore, the ith agent has a maximum degree of i, except for the last one, which has the same maximum degree as the penultimate, that is, N a − 1.
Despite this theoretical limitation, in simulation experiments, if the agents' numbering is initially selected so an initial spanning tree respecting constraint (13) exists, we have observed that this heuristic formulation is able to generate multiagent trajectories very close in terms of optimality to the ones provided by the more complex optimization problem resulting from constraints (11t) and (11u). Indeed, if the communication graph is initially connected, it is always possible to enumerate the agents respecting constraint (13). In the following, Algorithm 1 is proposed to do this numbering and Theorem 4 states that the resulting numbering yields feasibility of constraints (11v) and (13) whenever a connected graph exists.
First, consider the initial connected communication graph. If this graph is not known beforehand, we may construct it by checking if each pair of agents are within communication range of each other, which takes O(N 2 a ) time. As a connected graph, it admits at least one spanning tree. Let us pick any of these spanning trees. A spanning tree may be easily obtained by executing a breadth-first search (BFS) or a depth-first search (DFS), which are known to construct a tree 22 and have computational complexity of O(N 2 a ) if an adjacency matrix is used to represent the graph. Then, recall that in a tree each node has exactly one parent. Therefore, directing each edge of this tree from node to parent, it appears that an assignment respecting constraint (13) exists. We still need to take into account the fact that agents with lower numbers have a lower degree limit. Since the first agent has a degree limit of 1, we must select a leaf for it. In this sense, our intuition says that we should enumerate leaves first, then go on enumerating the parents of these leaves, and so on. In fact, enumerating the agents in postorder 31 results in the required numbering, which needs O(N a ) time. The pseudocode for a postordering of a tree T G with initial index i is presented in Algorithm 1. Moreover, the following theorem states that the resulting numbering from Algorithm 1 ensures feasibility of constraints (11v) and (13).

Theorem 4. If an initially connected graph exists, then labeling according to Algorithm 1 results in feasibility of constraints (11v) and (13).
Proof. If the graph is initially connected, then in view of Remark 1 it has a spanning tree; therefore, its nodes can be labeled using Algorithm 1, and it is assumed in the following that such a labeling has been done. Let us consider a node labeled with i in the spanning tree and its c i children labeled as i,1 , i,2 , … , i,c i .
A possible choice of representation of the adjacency matrix at instant k is to set the binary variables representing the connection between each node i,j and its parent i to one in row i,j , that is, and set the remaining as By applying Algorithm 1 to label the nodes of this tree, it is clear that the label of a parent is greater than the label of all of its children, then it is true that: In view of (15), (16), and (17), both (11v) and (13) are trivially satisfied for rows i,j of the adjacency matrix. This constructive procedure can be applied to each level of the tree, entailing that constraints (11v) and (13) are respected, filling the adjacency matrix up until the level of the root of the tree. The label attributed by Algorithm 1 to the root is N a and since the root has no parent b con N a ,m,k = 0, m = 1, 2, … , N a , which enforces (11v). ▪ Algorithm 2 summarizes the procedure presented here to enumerate the agents in a way that constraint (13) is true (the algorithm assumes DFS, but BFS may be used as well). Considering the computational complexity of each of its parts, the overall procedure takes O(N 2 a ) time, which is already the lowest time complexity we may achieve, given that we have to first build the connected graph by checking each pair of agents.

Algorithm 2.
Enumerates agents so the heuristic formulation may be used ⊳ T G is the tree built by DFS L ← PostOrder(T G , ∅, 1) ⊳ L maps nodes to their postorder numbering and output * is not used end function

SIMULATION RESULTS
Three scenarios are compared in the following subsection to illustrate the use of the proposed approaches: (i) the proposal in Problem 2, (ii) the simplification to Problem 2 proposed in Subsection 2.5, and (iii) the imposition of a totally connected graph in the multiagent problem without using the proposed connectivity assurance method based on writing the adjacency matrix with binary optimization variables. In this last approach, to ensure connectivity, each agent is constrained to be within the polyhedron that represents the communication range of each of the N a − 1 other agents. The adopted conditions for the simulations are presented in Table 1. In this table, the obstacles have N os = 4 facets, the safe regions around the agents have N cs = 4 facets, and the polyhedra representing the communication ranges have N sc = 4 facets, each with the projection of each facet parallel either to the x-or y-axis; therefore, they are represented by the upper and lower limits on x and y. We stress that the representation of the safety regions and communication ranges as squares in these examples do not represent a limitation of the proposed approach, as other polygons with more sides could be used. It is rather an arbitrary choice for illustration. For other more general nonpolygonal safety regions or communication ranges, for example, circles, inner polygonal approximations of arbitrary accuracy are enabled by our proposal.
The software Matlab was used for the simulations, with CPLEX as the optimizer in a computer with an Intel Core i7 6700K processor (4.0 GHz clock) and 24GB RAM.
In all figures depicting trajectories and the associated connectivity graphs in the illustrative examples the following convention is used: the obstacles are depicted as black-filled polygons, the target sets are white-filled polygons with black borders and contain a roman number inside corresponding to the order in Table 1, the trajectories of each agent are depicted with a different color (with the corresponding agent id depicted in the legend), with their positions depicted as crosses (×). The communication range of each agent is depicted as a semitransparent square with sides of 2 length units compatible with the communication range presented in Table 1 and of the same color as the respective agent; however, these are just depicted at the instants of visitation of the target sets, since drawing them throughout the whole trajectory would clutter the figures excessively. Moreover, the connectivity graphs are not depicted at every instant, rather only at instants in which at least one target set is visited, as to avoid showing excessive information. In these cases, dark gray lines are used to represent a connection between agents at that instant.

Comparison of the results in the three simulation scenarios
In employing a numerical optimizer, one may define a maximal time for returning the feasible solution with the least cost at the moment. Figure 2A shows the results with a maximal time of 20 seconds for the proposed Problem 2. As can be seen, the agents move toward the fixed ultimate goal (target set 5). In their trajectory, two other target sets are visited. The connectivity graph of the agents is depicted in Figure 2B,C only at the instants of visitation of the targets, although the graph always remains connected, since showing it at every instant would demand excessive space. It can be noticed that the connectivity is maintained and, in fact, the flexibility of the positions of the agents is used to visit two target sets at once, while maintaining the connectivity. To provide insight into the proposed formulation, the connectivity matrices at the instants of visitation of the target sets are presented:

Parameter name Symbol Values
Maximal horizon T 20 Number of agents N a 6 Number of targets N t 5 Number of obstacles The configurations in the connectivity matrices reflect the graphs in Figure 2B,C. Notice that the matrix in (19) corresponds to a graph in which agent number 1 (red trajectory) is connected to three others, since the terms b con 1,3,4 = b con 1,4,4 = b con 1,6,4 = 1, a situation that could not be achieved with the modified version of Problem 2. On the other hand, (b con i, ,7 ) corresponds to a graph in which no agent has degree greater than two.
As the optimization time is increased to 40 seconds, Figure 3A depicts the ensuing trajectory still visiting three target sets. It can be seen again in Figure 3C that the topology is explored to allow visiting two targets at once, namely, target sets IV and V.
In Figure 4A, it can be seen that with 60 seconds the trajectory with the least cost so far visits all five target sets. Figure 4B-F shows the connectivity graph at the instants each target set is visited. Figures 5, 6, and 7 depict the same scenarios as Figures 2, 3, and 4, respectively, but employing the proposed modified version of Problem 2 in Subsection 2.5. It is most interesting to note that the results, given the same amount of time, are similar to the ones obtained with Problem 2, despite the fact that the modified version has more stringent constraints. As a matter of fact, by comparing the trajectories in Figures 6 and 3, one can see that the modified version was able to visit 4 targets given 40 seconds, whereas the best solution available so far to Problem 2 visits only three. After 60 seconds, both approaches find solutions that visit all five target sets.
For comparison, the trajectories when all the agents are required to remain within the communication range of each other are shown in Figure 8A. The optimal solution up until 20 seconds of computation time can be seen to visit three targets, as well as the case when the minimal connectivity is required in Figure 2A. Figure 8B-D shows the connectivity graph, which is totally connected, at the target visitation instants.
As the optimization time is increased to 40 seconds, the trajectories are displayed in Figure 9A. Similarly to the case of minimal connectivity (Figure 6), in this case four target sets are visited. The fully connected graph can be observed at the target visitation instants in Figure 9B-E.
On the other hand, to find a solution that includes trajectories visiting all five targets, a computation time of 450 seconds is required with the full connectivity imposition. This is 7.5 times the computation time required by the proposed minimal connectivity algorithm. The trajectories are shown in Figure 10A, and the connectivity graphs at each target visitation instant are shown in Figure 10B-F.
As a matter of fact, the capacity of finding solutions with lesser cost than the full-connectivity imposition is inherent to the proposal in this work, since the solution with full connectivity is a particular one to the minimal connectivity algorithm. Table 2 shows the cost value obtained at each computation time with the three approaches, where (i) is Problem 2, (ii) is the modified Problem 2, and (iii) is the fully connected graph version. Three facts deserve commenting: (1) the cost obtained with approaches (i) and (ii) is smaller than the one obtained with the full connectivity given the same optimization time, (2) the best cost obtained after 450 seconds with the full connectivity is still greater than the one obtained after 60 seconds with the proposals (i) and (ii), and (3) the modified Problem 2 (approach (ii)) yields solutions with better cost compared with (i), given the same optimization time). This may be a counterintuitive result, as the approach (ii) is suboptimal when compared with (i), but the smaller number of constraints facilitates the obtainment of better solutions within the same optimization time. Of course, one might expect that, given enough time, the cost of approach (i) becomes smaller than or equal to the cost of approach (ii). Moreover, both approaches (i) and (ii) involve more binary variables than (iii), and still less computation time is required by (i) and (ii) to achieve better solutions, illustrating that the proposed algorithms, though in principle more computationally demanding, in practice show better results within the same amount of computation time.

Results of Algorithm 2
In this subsection, we present results regarding Algorithm 2. We implemented this algorithm in C++ and executions with up to 100 agents were resulting in execution times on order of 10 −5 seconds, thus we may say that its execution time may be neglected when compared to the planning time.
In Figures 11A, 12A, and 13A, the initial communication graphs are shown, with agents assigned arbitrary id numbers, whereas Figures 11B, 12B, and 13B present the resulting numbering employing Algorithm 2. To verify its correctness, see Figure 11A, where the choice of numbering for the agents would not satisfy constraint (13), since the agent labeled with #1 must connect to two others, but this violates constraint (13). On the other hand, with the renumbering in Figure 11B, the label #1 can be seen to have been assigned to the rightmost agent, which needs only to be connected to one neighbor (namely, the one labeled as #2 in Figure 11B), satisfying constraint (13).

CONCLUSION
In this work, we developed mixed-integer linear programs for multiagent task allocation and trajectory planning with obstacle avoidance and global connectivity maintenance. In the considered task, the agents had to reach a final destination, but were allowed to collect rewards during the task. The proposed formulations plan trajectories for the team of agents while deciding which agent should collect which reward.
Two optimization formulations were presented. The first one is optimal, considering the exact number of constraints that are necessary to model the connectivity problem. However, the number of connectivity maintenance constraints grows exponentially in the number of agents. Therefore, we also introduced an heuristic formulation where the connectivity maintenance increases linearly in the number of agents, which is a more scalable solution. We call this approach heuristic, because it does not consider all possible communication graphs the MAS may assume, so it is not ruled out that a lower cost solution using the exact connectivity constraints might be found.
Simulation results were used to evaluate the proposed approaches. The planner indeed found feasible trajectories for the MAS while respecting the constraints and was able to find solutions where the agents collect all rewards  I   II   III   IV   V  IV V  IV V  IV V  IV V  IV V  IV V  IV V  IV V  IV V  IV V  IV V  IV V  IV V  I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I  when the optimization solver was allowed to run for enough time. Moreover, we showed that both formulations obtained better solutions than a planner, which obligates the communication graph to be fully connected during the task.
Further enhancements that could be done with small additional overhead include: (i) implementation of a constraint tightening-approach to ensure robustness to external disturbances; (ii) incorporation of penalties proportional to the projection of the vectors connecting pairs of agents over some predefined directions so as to approximate the interagent distances, thus prioritizing solutions with lower dispersion between the agents; (iii) introduction of bounds on the fuel expense (herein represented by the sum of the absolute values of the control vector) of each agent; and (iv) introduction of a capacity bound of each agent, which is consumed by visiting the target sets, so as to consider knapsack-like problems.