A branch-and-cut approach and alternative formulations for the traveling salesman problem with drone

In this paper, we are interested in studying the traveling salesman problem with drone (TSP-D). Given a set of customers and a truck that is equipped with a single drone, the TSP-D asks that all customers are served exactly once and minimal delivery time is achieved. We provide two compact mixed integer linear programming formulations that can be used to address instances with up to 10 customer within a few seconds. Notably, we introduce a third formulation for the TSP-D with an exponential number of constraints. The latter formulation is suitable to be solved by a branch-and-cut algorithm. Indeed, this approach can be used to find optimal solutions for several instances with up to 20 customers within 1 hour, thus challenging the current state-of-the-art in solving the TSP-D. A detailed numerical study provides an in-depth comparison on the effectiveness of the proposed formulations. More-over, we reveal further details on the operational characteristics of a drone-assisted delivery system. By using three different sets of benchmark instances, consideration is given to various assumptions that affect, for example, technological drone parameters and the impact of distance metrics.


INTRODUCTION
Probing drones and other autonomous units in last-mile delivery is a current trend that is investigated by retailers and logistic service providers such as Amazon.com, Inc. or Deutsche Post DHL Group (see [20] and references therein). The integration of these unmanned units into existing logistic architectures might yield synergetic benefits that could emerge in terms of time or cost savings, both of which might be crucial for effectively carrying out cost-efficient same-day deliveries in the future (refer to [2]). In fact, drones have already become a proven technology in many related public and private sectors including energy, agriculture and forestry, environmental protection, and emergency response (see [21] and references therein).
Generally, we might accept that a drone can move faster between two locations than a truck as a drone has a larger average velocity, can move as the crow flies, and is not subjected to the limitations of a road network such as, for example, congestion. In addition, drones are far more lightweight than trucks, which might cause them to consume less energy when moving from one point to another. However, a drone's payload weight and cargo capacity in terms of volume are inherently limited. Moreover, as multirotor drones rely on comparatively small batteries for powering their flight, their range of operation is quite restricted compared to a delivery truck.  In the TSP-D, it might be useful to visit vertices repeatedly to increase the utilization of the drone Section 2.1, we provide a brief account on the development of novel (M)ILP formulations or any other exact method that comes with an optimality certificate. Afterwards, in Section 2.2, we highlight some additional works that are primarily concerned with heuristics. The works presented in these sections have some overlaps. In fact, many of the authors that introduced a novel MILP formulation, also developed heuristics for solving the problem (see, e.g., [1,20]). More general literature reviews on the potentials of utilizing drones in a civil manner in and outside the context of vehicle routing problems can be found in [21,25]. Murray and Chu [20] introduced the flying sidekick traveling salesman problem (FSTSP) that aims at finding a drone-assisted traveling salesman problem (TSP) tour. In this problem, it is assumed that a depot and a set of customers are given, each of whom must be served exactly once (see Figure 1). Initially, the truck-drone tandem is located at a central depot. The objective is to find a feasible routing of the truck and drone such that all customers are served and both vehicles have returned to the depot as quickly as possible, that is, with minimal makespan. The authors propose a MILP formulation for solving the FSTSP and specify a drone sortie by a triple (i, j, k): the drone is launched at vertex i, serves customer j, i ≠ j, and is retrieved by the truck at a third vertex k, j ≠ k. Notably, in their model, round trips are forbidden, that is, for every sortie (i, j, k) the following condition must hold: i ≠ k. Using a state-of-the-art MILP solver, the authors cannot find any provably optimal solution within a 30-minute time limit for instances with just 10 customers. As an alternative, Murray and Chu [20] propose a heuristic for solving the problem.

Model-based literature
The TSP-D was proposed by Agatz et al. [1] and shares most of the assumptions of the FSTSP. However, there are a few key differences to the FSTSP. First, the drone can be retrieved at the same location from which it was launched (see Figure 1). Second, the truck may visit customers more than once as this could be beneficial in operating the drone (see Figure 2).
Agatz et al. [1] propose a set-covering based ILP formulation for solving the TSP-D. Within this approach, they use an operation as the essential building block (see also Figure 4): an operation is characterized by an ordered sequence of at least two (or more) vertices as visited by the truck. Within this sequence, at most one customer can be served by the drone through a sortie that connects the start and end of the sequence. Using the ILP formulation, the authors can tackle instances with nine customers within no more than 40 seconds. However, due to the exponential number of operations w.r.t. the number of customers, they also report that this approach does not scale well. In fact, on instances with just 11 customers, several hours are required to obtain optimal solutions. Notably, some theoretical insights and heuristics that leverage local search and dynamic programming (DP) are also provided in this work.
In a follow up paper, Bouman et al. [3] provide another method that is based on DP. The authors highlight that the new DP method is able to find optimal solutions much faster compared to the ILP approach presented in [1]. In fact, the speedup is about fortyfold on instances with nine customers. However, it also becomes apparent that the performance degrades as the instance size is increased: for provably optimal solutions, instances with 12 customers already take 1 minute, an increase to 15 customers requires more than 2 hours of computation time, and no solution is found within 12 hours beyond this size. To overcome this issue, they propose a heuristic variant of their DP approach where only a subset of feasible operations is considered. The authors can show that a considerable speedup can be achieved through this procedure with just a slight loss on the solution quality w.r.t. optimal solutions. Es Yurek and Ozmutlu [11] provide an exact method for solving the TSP-D that is based on a decomposition of the problem into two components. First, all feasible tours to the TSP are generated that visit every possible subset of customers. Each tour provides a lower bound (LB) on a TSP-D solution in which the customers that are not covered by the tour must be served by the drone. Starting with the shortest tour (smallest LB), mathematical programming is used to identify the optimal drone sorties. This procedure continues until the best remaining LB is worse than the incumbent TSP-D solution. The authors compare their results to [1,20] and highlight that they can solve instances with 10 customers within 5 minutes and instances with 12 customers within 30 minutes. Although, the authors stress that performance depends largely on the type of instance as well as parameters (e.g., speed of the drone). In particular, their approach performs poorly on densely clustered instances: here, for the case of 12 customers, the algorithm rarely terminates within 1 hour.
In the work of Poikonen et al. [22], a branch-and-bound method is introduced for solving the TSP-D. Each node in the search tree represents a delivery sequence as visited by the truck that might contain only a subset of customers. At each node, for the customers not covered by this sequence, the optimal TSP-D solution can be computed through the DP algorithm proposed in [1]. During the search, child nodes are created by inserting customers into the parental sequence. Poikonen et al. [22] stress that, in the TSP-D, the objective value could also improve if an additional customer is inserted into the sequence. Therefore, as we climb the tree, for some children, the objective value obtained through the DP can in fact be smaller than that of its ancestors. The authors introduce the tree exploration ratio as a parameter that provides a tradeoff on optimality guarantee vs speed. With an optimality guarantee, they can solve instances with nine customers in just 1 minute. However, the authors also report that the solution time becomes intractably large when the number of customers is increased to 14.
Dell'Amico et al. [7] propose a modified FSTSP model that builds on the foundation laid by Murray and Chu [20]. In particular, they propose a new formulation in which some explicit constraints are replaced with exponentially many constraints that can be added in a cutting plane fashion. The affected constraints concern restrictions regarding, for example, the availability of the drone and the correct orientation of sorties (w.r.t. the route of the truck). Dell'Amico et al. [7] can show that these changes lead to an improvement in performance as they can solve several instances with 10 customers optimally within 1 hour. Moreover, in a subsequent work, Dell'Amico et al. [8] propose a more compact formulation. This is achieved by explicitly accounting for the drone's availability at each vertex (a similar approach is used in [27]). The authors can show that the branch-and-cut formulation yields favorable results as instances from [20] with 10 customers can be solved in about 15 minutes on average. Notably, the endurance of the drone has a significant influence. Moreover, a few instances with up to 20 customers can be solved optimally within 1 hour of runtime.
In Schermer et al. [29] a variant of the TSP-D named the TSP-D with robot stations (TSP-D-RS) was proposed. This variant merges the sidekick-based with the station-based approaches (refer to Section 1). The compact MILP formulation presented in this paper also covers the truck-drone as a special case. Using a state-of-the-art solver, the proposed formulation can be used to solve these special cases in relatively short computation time. Indeed, problem instances with ten customers can be solved within 20 seconds and instances with up to 15 customers in less than 15 minutes.
Even though most assumptions are commonly accepted, the existing literature is in fact quite divergent 1 : several further subtle nuances differentiate the various problem statements from one another, making a comparison only valid to a limited extent. Apart from the problem instances and concrete choice of parameters (e.g., the drone's velocity), details of the problem statements differ in whether round trips are permitted, if vertices can be revisited by the truck, and whether both vehicles operate on the same type of network. Moreover, there is no general consensus on if an overhead time is incurred for launching and recovering the drone and how such penalties should be applied concretely. Furthermore, there is no agreement on if a drone is permitted to wait at a customer after making a delivery, thus conserving energy before the return flight, or if, instead, it must hover while waiting for the truck to pick it up again. In any case, there is no generally recognized way of specifying a drone's endurance precisely.
As will be unveiled in much more detail in the following sections, in this paper we provide three new formulations; in particular, one of them can be used to design a branch-and-cut scheme for solving the TSP-D effectively. Indeed, we can optimally solve instances with n = 9 or n = 10 customers in just a few seconds, instances with n = 14 customers within around 5 minutes, and several instances with n = 19 or n = 20 customers within an hour.

Methodology-based literature
Several further works exist that either study variants of the problem (multiple trucks, drones, or different objective functions) or are primarily concerned with heuristics. In what follows, due to the reasons listed at the beginning of this section, we just mention them briefly.
In his thesis, Ponza [24] discusses the possibilities of solving the FSTSP heuristically using metaheuristics such as simulated annealing (SA) or ant colony optimization. Of these possible algorithms, SA is implemented and evaluated. For small-scale instances with at most 10 customers, on which optimal solutions can be obtained through a MILP formulation (that follows the one proposed by Murray and Chu [20]), the SA algorithm can obtain optimal solutions in most cases. After the investigation of these small instances, a more detailed parameter study is performed on larger instances with up to 200 customers.
In [6], de Freitas and Penna propose a hybrid general variable neighborhood search (VNS) for solving the FSTSP and TSP-D (the authors present a similar approach in [5]). In this algorithm, the first step consists in determining the optimal tour to the TSP. Next, a greedy improvement procedure is proposed that takes customers from the truck's route and assigns them to drone sorties. Finally, VNS serves as an in-depth improvement procedure. In particular, the authors propose several problem-specific neighborhood structures. The authors provide a detailed numerical study and show the effectiveness of their approach by testing on the problem instances proposed in [1,20,24].
The work by Ha et al. [15] studies a variant of the TSP-D with a minimal cost (as opposed to time) objective. The authors provide a MILP formulation that is based on the one provided in [20]. However, the primary contribution of the work is the design of an effective greedy randomized adaptive search procedure (GRASP) that can be used to insert drone operations into an existing TSP solution as well as some local search operators for refining the solution quality afterwards. In a subsequent work, the same authors investigate the performance of a hybrid genetic algorithm as an alternative to their GRASP [16]. The authors show that this novel algorithm outperforms existing approaches in terms of solution quality on instances with up to 100 customers.
In a similar manner, Sacramento et al. [26] study a variant of the TSP-D that involves multiple vehicles from a cost-per-mile-oriented perspective. For this purpose, they extend the MILP formulation that was proposed in [20]. In particular, they propose an adaptive large neighborhood search (ALNS) procedure that is capable of solving large-scale instances with up to 250 customers.
Following [23,30], Schermer et al. [27] investigate a variant of the TSP-D that features multiple vehicles and drones per vehicle named the vehicle routing problem with drones (VRPD). A MILP formulation and some effective VIEQs are introduced. However, the primary contribution is the design of a Matheuristic that embeds the drone assignment and scheduling problem. This heuristic is evaluated on instances with up to 100 customers. Related to the this work, Schermer et al. [28] also study a variant of the VRPD where en route operations are permitted, that is, the possibility of launching or recovering drones at some discrete points on each arc (see also [18]). Even though a MILP formulation is introduced, which can be used to address small-scale instances, the primary contribution is the design of a heuristic based on the concept of variable neighborhood search (VNS).

THE TRAVELING SALESMAN PROBLEM WITH DRONE
In this section, we provide a formal specification of the TSP-D. We begin in Section 3.1 with a concise problem description. Afterwards, in Sections 3.2 and 3.3, we provide two compact MILP formulations of the problem.

Problem definition
Suppose that there is a set of customers, each of whom must be served exactly once. In the TSP-D, we assume that there is a single truck that is equipped with a single drone for performing the deliveries. Both vehicles are initially located at a central depot and we look for a feasible route of the truck and admissible use of the drone such that all customers are served and both vehicles have returned to the depot as quickly as possible, that is, minimal makespan is achieved. Furthermore, consistent with most of the previous assumptions in the literature, we accept the following limitations in regard to the nature of the drone [1,20,30]: • When launched from the truck, the drone can serve exactly one customer before it needs to return.
• We assume that the drone's flight time is limited by  time units per sortie. Furthermore, after returning to the truck, the battery of the drone is recharged (or swapped) instantaneously. • Launching and recovering the drone might take t l and t r time units, respectively.
• It is only admissible to operate the drone at vertices (for a relaxation, refer to, e.g., [18,28]).
In the TSP-D, the locations of the depot and the customers are defined by a complete graph (V, A). In this graph, V is the set of vertices and A the set of arcs. The vertex set V contains n vertices associated with the customers, labeled V N = {1, …, n} ⊂ V and two extra vertices 0 and n + 1 that represent the depot location at the start and end of the tour, respectively. To sum up, V = {0, 1, …, n, n + 1}, where 0 ≡ n + 1. We define V L = V⧵{n + 1} and V R = V⧵{0}.
A drone-sortie is characterized by a triple (i, w, j) as follows: the drone is launched from the truck at i ∈ V L , performs a delivery to w ∈ V N , and is retrieved at vertex j ∈ V (i ≠ w, w ≠ j). If j = i, that is, when the sortie is (i, w, i), we call it a round trip; otherwise, if j ∈ V R , i ≠ j, we call it a multileg sortie. Finally, by the parameters d ij and we define the distance required to travel from i to j by truck and drone, respectively. In a similar manner, we use t ij and t to define the time that is spent when traveling from i to j by either vehicle.

A MILP formulation using a disjoint representation of multilegs
For the MILP formulation of the TSP-D, we use the binary decision variables according to Table 1 and the continuous variables according to Table 2. Using these variables, we can formulate the TSP-D as in (1) to (16), where (1) is the objective function and the constraints are given by (2) to (16). In the following, we refer to this model as MILP-A and it is inspired by the one presented in [29]. min The time at which i is reached by either truck or drone. If the drone is retrieved at i, this marks the lower bound at which the last vehicle (truck or drone) has arrived The earliest possible departure time from i In the TSP-D, the truck and the drone should serve all customers in shortest possible time and then return to the depot, which is stated in (1). Constraints (2) and (3) guarantee that the flow of the truck is conserved. Moreover, constraints (4) sustain the flow of the drone during each multileg. As is stated through constraints (5), every customer must be served exactly once. 2 The temporal constraints are defined through (6) to (9). More precisely, (6) as well as (7) and (8) define the transitions of the truck and drone, respectively. Note that any subtour would contradict constraints (6) to (8). Moreover, constraints (9) account for the overhead times as well as the time spent performing round trips for every vertex.
The endurance constraints are given by (10) for multileg and (11) for round trip sorties and incorporate the time during which the drone is in flight. Constraints (12) and (13) account for the availability of the drone at each vertex. In particular, if an arc (i, j) is used by the truck, constraints (12) provide an upper bound on the availability of the drone at j, that is, j . If the drone is available at i, that is, i = 1, the upper bound j ≤ 1 will only hold if either the drone is not launched at i or it is immediately recovered at j; otherwise, j will be bound to zero. In contrast, if the drone is unavailable at i, that is, i = 0, then there must be a recovery at j (from a sortie that began before i) for the drone to be available again. Moreover, constraints (13) are logical constraints: the drone can only be available at vertices that are visited by the truck. Finally, with these limits on the availability of the drone in place, constraints (14) to (16) guarantee that the drone can only be launched from and recovered at vertices where it is also available.
Even though MILP (1) to (16) formulates the TSP-D, some effective valid inequalities exist that significantly strengthen the linear relaxation (refer to [7,27,29]). For the purpose of this work, we integrate them into MILP-A as follows: Here, constraints (17) and (18) account for the time spent traveling on arcs by the truck and drone, respectively. Moreover, these constraints incorporate the time where the truck and drone perform their actions in a synchronized manner, that is, when the drone performs round trips while the truck remains stationary (waiting for the drone to return) and whenever the overhead times t l and t r are applied.

A MILP formulation using a joint representation of multilegs
In MILP-A, we use the variables y iw andỹ (according to Table 1) to differentiate between drone arcs that are part of outbound and inbound flights (each part of a multileg). In principle, instead of using two disjoint sets, it is also possible to use a single set of variables that simply state if an arc is used at all during a multileg flight. Hence, a smaller number of columns is required for formulating the problem. To the best of our knowledge, at this point, there are no models using a two-index formulation for the TSP-D (or related problems) that investigated this possibility.
To this end, we might use the binary decision variables x ij , z iw , and i as well as the continuous decision variables a i and s i according to Tables 1 and 2, respectively. Moreover, we introduce the additional binary decision variables according to Table 3. Then, the alternative MILP formulation of the TSP-D is given by (1) to (3), (6), (11), (13), (16), (19) to (31) and we denote this Through the objective function (1) and constraints (2) and (3), we still aim at minimizing the makespan while preserving the flow of the truck. Moreover, the updated model requires several new constraints that restrict the binary variables r i . Constraints (19) state that every customer must be served exactly once. If the drone is recovered at a vertex j, that is, r j = 1, there are Table 3 Additional binary decision variables used in the joint drone arc formulation of the TSP-D

Variable
Explanation Equal to 1, iff arc (i, j) is part of a multileg sortie (either outbound or inbound) If the drone is recovered after a multileg sortie at i two incoming arcs at j: one for the truck and drone, respectively. In this case, we subtract the binary variable r j to achieve an equilibrium. Constraints (20) and (21) provide conservation of flow for the drone arcs. More precisely, if the drone is not available at some vertex w, that is, w = 0, then the flow must be conserved and both constraints force the number of incoming (drone) arcs to be equal to the number of outgoing arcs. On the other hand, flow must not necessarily be conserved if the drone is available at w, that is, if w = 1 (see also Figure 3). Constraints (22) to (24) determine the value on the variables r j : these constraints force the variables to take value 1 if there is exactly one incoming truck and drone arc and to value 0 otherwise. Moreover, constraints (25) are valid cuts that are logically implied by the remaining model. These constraints state that the drone can only be recovered if it also available at the same vertex.
In this model, the temporal constraints for the truck defined in (6) still hold. Moreover, it suffices to use constraints (26) and (27) for the transitions of the drone. The endurance constraints are given by (11) and (28) for round trip and multileg sorties, respectively. Note that, in this model, (28) now requires an additional Mr w term, that is, the constraint only applies for two consecutive drone arcs that visit a customer (and not the truck) in between (see also Figure 3). Finally, constraints (13), (16), and (29) to (31) are used to determine the availability of the drone and to restrict operations accordingly. If the formulation presented in this section is used, before integrating the VIEQs (17) and (18) into MILP-B, slight adjustments are necessary. In particular, due to the nature of the variables y ij , it is no longer possible to associate the overhead time t l and t r with outgoing and incoming drone arcs, respectively. However, during each multileg, the drone is launched and recovered exactly once. Hence, the total number of multileg arcs y ij with value 1 will always be even and thus we can adapt these VIEQs as follows:

Discussion
As we will see in Section 5, only small-sized instances can be solved through MILP-A and MILP-B. In particular, several constraints involve large M-coefficients; hence, the linear relaxation is quite weak, which results in a slow convergence of the LB. Nevertheless, the proposed valid inequalities strengthen the linear relaxation significantly and, according to Lemma 1, these VIEQs are almost tight.

Lemma 1.
In an optimal solution, the slack value between left-hand-side of the VIEQs (17) and (32) and the makespan, that is, s n + 1 , is equal to the total time that the truck remains stationary at vertices while waiting for the drone to arrive (from a multileg sortie). Proof. The total operating time of the truck can be decomposed into three components as follows: 1. First, the travel time, that is, the time spent moving on arcs.
2. Second, the synchronized waiting time, that is, the time shared by the truck and drone in a coordinated manner while performing interdependent actions: • during launch and retrieval of the drone, that is, whenever the overhead times are applied, • whenever the truck is waiting for a drone to return from a round trip.
• Finally, the nonsynchronized waiting time, that is, the time spent at each vertex by the truck while waiting for the drone to arrive (from a multileg).
Based on the problem definition in Section 3, no other mode of operation does exist for the truck. Note that the VIEQs (17) and (32) account for the travel time and the synchronized waiting time. Therefore, the slack must be equal to the nonsynchronized waiting time. This concludes the proof of Lemma 1. ▪ It is also worth to mention that, based on VIEQs (18) and (33), a similar lemma might be proposed for the drone. However, compared to Lemma 1, one further mode of operation exists for which these VIEQs do not account for: the time during which the drone is carried along an arc by the truck.

A BRANCH-AND-CUT APPROACH FOR THE TSP-D
The MILP formulations presented in Section 3 might be solved by any standard MILP solver, for example, Gurobi Optimizer, which uses a branch-and-cut approach [14]. However, in order to gain more efficiency, we introduce a third MILP formulation that permits a customized design of the branch-and-cut algorithm. This proposal is motivated by three main observations: 1. In optimal solutions, typically, the travel time and synchronized waiting time have the largest contribution to the makespan and, due to Lemma 1, they can be approximated well through the linear constraints (17) and (32). 2. The temporal constraints in MILP-A and MILP-B, presented in Section 3, introduce several M-coefficients (as is the case in related models, e.g., [7,8,15,20]). Consequently, the quality of the linear relaxation is quite weak and the convergence of the LB is slow. However, these formulations are compact, that is, of polynomial size. In fact, in our formulations, only (|V| 2 ) variables and constraints are required for formulating the problem. 3. In the set-covering based ILP formulation proposed by Agatz et al. [1], the so-called operations are used as the essential building block. Operations already explicitly incorporate the nonsynchronized waiting times; therefore, no big-M constraints are necessary. However, there is an exponential number of operations for a given instance; hence, a static formulation does not scale very well to a branch-and-cut framework.
Based on these observations, we introduce a MILP formulation that attempts to leverage the strengths of the MILP and ILP approaches listed above through Lemma 1. Before we present the model, we adapt the concept of an operation o for our purpose (based on [1]). Let an operation o contain a feasible sequence (w.r.t. Section 3) of one or more truck-arcs that we call a directed path P. Each operation o contains exactly one multileg drone sortie that starts at the beginning P, targets a customer (not contained in P), and concludes at the end of P. We introduce the notation (o) ∈ V R to refer to the vertex at the end of the operation o (the path P). Finally, we denote the cardinality of an operation by |o| and define it as the total number of directed arcs that are contained. Figure 4 illustrates an example of this concept. Let O denote the set of all feasible operations and take note that this set grows exponentially with the instance size, that is, with |V|.
In this section, we introduce a MILP formulation of the TSP-D that features an exponential number of constraints. As a result, we propose these constraints to be separated dynamically, that is, through a branch-and-cut framework. For introducing The nonsynchronized waiting time of the truck at vertex j (while waiting for the drone to arrive) The nonsynchronized waiting time of the drone at vertex j (while waiting for the truck to arrive) Sets the position of vertex i in the delivery sequence this formulation, we use the decision variables according to Tables 1 and 4. Using this notation, the MILP generally follows MILP-A and is given by (2) to (5) Let us explain the model in more detail. Generally, this formulation closely follows MILP-A. In fact, we only replace the objective function and drop all of the temporal constraints (and thus, M-coefficients). Hence, in MILP-BC, the objective function is given by (34) and consists of two terms: and (note that a similar objective is used in [7,8]). The value has two bounds which are defined through constraints (35) and (36) for the truck and drone, respectively. These constraints account for the time spent traveling on arcs as well as the nonsynchronized waiting times w t r and w r at each vertex r ∈ V R . Note that the bound is binding in case of constraint (35) while (36) only provides a LB on (refer to Lemma 1 and the comments thereafter). The total synchronized waiting time is defined through Equation (37). This constraint accounts for the time spent performing round trips and the total overhead time. As a result, the functions (o) and (o) are embedded within the scheme shown in Algorithm 1. Indeed, using a state-of-the-art solver, this algorithm can be applied through a callback whenever a possibly new MIP feasible incumbent solution is identified. In more detail, the algorithm works as follows for each operation o that is contained in the current solution: 1. First, the travel times of the truck and drone are calculated. Here, we do not account for the overhead times, as they are already included in the term in the objective function (see objective (34) and constraint (37)). (o) = (t − t t ) 5:

Afterwards
Add cut (38) for operation o 6: else if t t > t then ⊳ In operation o, the drone arrives before the truck. 7: (o) = (t t − t ) 8: Add cut (39) for operation o 9: end if 10: end for Using this algorithm, the corresponding constraints, that is, (38) and (39), are added only as necessary or not before needed to the model. This scheme is guaranteed to converge towards the optimal solution because of Lemmas 1 and 2 as well as Proposition 1, described in what follows.

Lemma 2. An operation o ∈ O is never a subset of any other operation o
Proof. By contradiction, assume that ∃o ′ ∈ O such that o ⊂ o ′ and o ≠ o ′ (the case of o ′ ⊂ o can be treated analogously; hence, we skip it). We differentiate the following cases: 1. either o ′ contains at least one more drone sortie, 2. or o ′ contains at least one more arc that is used by the vehicle.
The first case contradicts the definition of an operation as at most one (multileg) drone sortie can be contained in each operation. For the second case, assume that we were to add an additional arc x ij to the path P. For the path to remain feasible, the only possible location to append one further arc is at the end or the beginning of P. However, this would also contradict the definition of an operation as the drone would no longer be launched (retrieved) at the start (end) of the path. This concludes the proof of Lemma 2.  Figure 4. Here, the operation is given as o = {x , x , x , y ,ỹ } with a cardinality of |o| = 5. In this example, we assume that the arrival time of the truck is t t = t ij + t jk + t kl and the arrival time of the drone is t = t + t . As the overhead times t l and t r are shared by both vehicles and incorporated in the synchronized waiting time , we do not have to consider them here. If t t > t , then the drone has to wait for (o) = (t t − t ) time units at (o) = l ∈ V R and the following drone-cut can be added (refer to Algorithm 1): Otherwise, if t t < t , the truck has to wait for (o) = (t − t t ) time units at l for the drone to arrive. Thus, we can add the following truck-cut: Note that, in the (unlikely) event where t = t t , the arrival of the truck and drone is perfectly synchronous; hence, no cut must be added as neither vehicle has to wait for the other one.
Finally, the temporal constraints (6) to (9) that also prevent subtours are no longer used in the model MILP-BC. As a result, we adapt and use the family of MTZ constraints (refer to [19]) as follows: This concludes the description of MILP-BC. However, in optimal solutions, many operations are of cardinality |o| = 3, that is, the truck does not serve any additional customer in between launch and recovery of the drone (see also Figure 7). Even though, the constraint generation scheme proposed in Algorithm 1 also accounts for such operations, it might be beneficial to introduce some additional VIEQs that match to these types of operations. Therefore, we propose constraints (47)

and (48) as VIEQs to MILP-BC and call the resulting formulation MILP-BC-V I:
In particular, constraints (47) and (48) account for the waiting time of the truck and drone, respectively, at each vertex j, if a sortie o of cardinality |o | = 3 is performed (i.e., if the expression with coefficient M evaluates to 0). Note that, in these cases, a sufficiently large value of M is relatively small; in fact, M = arg max , ∈ 2 ⋅ and M = arg max i, j∈V t ij are sufficient for the value in (47) and (48), respectively.
Finally, as a general remark, note that the variables w t j and w j might also prove useful when some constraints are imposed that might limit the maximum waiting time of the truck or drone or when it is of interest to associate a specific cost with waiting (as it is the case in, e.g., [15]).

COMPUTATIONAL EXPERIMENTS AND THEIR NUMERICAL RESULTS
All experiments were performed on an Intel Xeon E5-2640v3 Processor with access to 8 GB of RAM. As MILP solver, we chose Gurobi Optimizer 9.0.0 [14] with a runtime limit of 1 hour. For the design of the branch-and-cut algorithm introduced in Section 4, a natural choice is to use the callback framework of the same state-of-the-art solver, that is, we apply the cut scheme proposed in Algorithm 1 at each new MIP incumbent node in the branch-and-cut tree. Through this process, we rely on an established external framework for primal heuristics, the generation of general MILP cuts, branching-decisions, and so forth (compare with, e.g., Fischetti et al. [12,13]). As we have outlined towards the end of Section 2, there are no commonly accepted problem assumptions. As a consequence, we selected three different types of benchmark instances to conduct our computational experiments. In Section 5.1, we begin with instances introduced by Bouman et al. [4] that were first used in [1]. Afterwards, in Section 5.2, we test on the instances proposed by Poikonen et al. [22]. Finally, in Section 5.3, we conclude our experiments using instances from Murray and Chu [20].

Instances proposed by Bouman et al.
In this section, we describe our computational experiments on the instances proposed by Bouman et al. [4], which have been used in, for example, [1,3,22]. We begin in Section 5.1.1 by reporting the experimental setup. Afterwards, the numerical results are discussed in Section 5.1.2.

Experimental setup
Bouman et al. [4] provide a wide range of instances with different numbers of vertices. For our experiments, we use instances with n ∈{9, 19} customers (i.e., n + 1 ∈ {10, 20} vertices), where the coordinates were drawn uniformly and independently from {0, …, 100}. It is assumed that both vehicles follow the Euclidean distance metric and the drone's endurance is limited by flight distance. In particular, the velocity of the drone is ∈ {1, 2, 3} times that of the truck (which moves at a velocity of one distance unit per time unit). Moreover, the overhead times are assumed to be negligible, that is, t l = t r = 0. Regarding the endurance, let  = arg max i,j∈V d ij be the edge with maximum length in the graph G. Using this value of , for each instance, the drone's range of operation is restricted by a parameter R ∈ {20%, 40%, 60%, 100%, 150%, 200%} that is multiplied by .
In order to incorporate this parameter into our models, for MILP-A and MILP-BC, constraints (10) are replaced as follows: Moreover, for MILP-B, we adjust constraints (28) as follows: Finally, for all formulations, the endurance for round trips defined in (11) is instead restricted as follows:

Numerical results
The resulting set of 360 instances was solved using formulations MILP-A, MILP-B (with their respective valid inequalities) as well as with MILP-BC and MILP-BC-V I, yielding a total of 1440 experiments. Table 5 provides detailed results on the performance of each approach. Based on the instance size n, the relative velocity , and relative endurance R, this table shows the MIP gap and runtime (averaged over 10 instances) as well as the share of instances solved to optimality by each approach. Moreover, for MILP-BC and MILP-BC-VI, the average number of cuts of types (38) and (39), which are added during the search, are shown. Based on the results presented in Table 5, we can make the following observations. The formulation MILP-A can be used to effectively solve instances with n = 9 customers, requiring no more than 2 seconds on average. In contrast, the alternative formulation MILP-B is noticeably worse in terms of average runtime. Beyond this size, that is, for n = 19, the performance of MILP-A is noticeably better: 16.1% of these instances can be solved optimally with an average remaining MIP gap of 16.1%. Overall, the branch-and-cut algorithm that utilizes MILP-BC shows promising performance. While MILP-A is on average slightly faster for n = 9 customers, the branch-and-cut approach shows a slightly superior performance in tackling instances with n = 19 customers. To be more precise, through MILP-BC we can solve a slightly larger share (22.2%) to proven optimality with an average remaining MIP gap of 16.2%. In contrast, MILP-BC-VI solves fewer instances to optimality (19.2%); however, with an improved MIP gap of 15.7%. A closer look at these two formulations reveals that the cuts of type (38) and (39), which are added during the search, are about half if MILP-BC-VI is used. Table 5 also displays the quality of the relative root relaxation % r associated with each formulation. We provide this information by using the following formula (52) as a performance metric: % r ≔ root relaxation objective value objective value of the best feasible solution after runtime (found through any formulation) Overall, the quality of the root relaxation associated with MILP-B is noticeably worse. In fact, on average, the relative root relaxation of MILP-A is about 13% better which also helps to explain why MILP-A performs significantly better. When we look at the formulations MILP-BC and MILP-BC-VI, we notice that their relative root relaxations are just slightly better (about 1%) than those associated with MILP-A. Table 5 Average MIP gap, average runtime (in seconds), the share of instances solved to optimality (opt.), and the average relative root relaxation % r depending on the number of customers n, the relative velocity , and the relative endurance R on the Bouman et al. [4] instance set (10 instances per row)  In Figure 5, we provide an insight into the savings (w.r.t. the TSP solution) that are calculated as follows:

MILP-
Objective value of the feasible solution after runtime Optimal objective value of the TSP solution (53) Moreover, Figure 6 highlights the share of customers served by the drone. These figures are plotted for the results obtained through MILP-BC-VI. Regarding the savings shown in Figure 5, we can make the following observations. As might be expected, the savings generally increase as the drone's velocity and endurance are increased. However, for a given value of , no further gains can be obtained once a certain threshold of the endurance parameter R is reached. In any case, savings of about 35% w.r.t. the TSP solution are possible. This magnitude of savings is consistent with previous observations (see, e.g., [1,27]). In Figure 6, overall, the number of customers served by the drone is significant and grows as the relative velocity of the drone is increased.
What is interesting about these figures is that the shares of operations do not grow monotonic with and R. In particular, we can observe that as these values change, round trips tend to be substituted with multileg sorties. Finally, let us investigate the makespan components according to the objective function (34). For this purpose, we use the results obtained through the formulation MILP-BC-VI. Table 6 provides a breakdown of the makespan (recall that t l = t r = 0) and is best viewed in conjunction with Figure 6.
The time spent traveling on arcs has the largest contribution with more than 97% on average. Moreover, the time spent waiting by the truck for the drone to join from a multileg is less than 1.5% on average (similar observations were made in [11]). However, there is a strong dependence on the relative velocity as a faster drone (and a drone with an increased relative endurance R) typically also causes the (nonsynchronized) waiting time to increase.

Instances proposed by Poikonen et al.
If we assume that a drone can make an autonomous delivery to a customer, it might need to land at the customer's location. After landing, if we ask that the drone can delay its departure for a sufficient amount of time, only the time in-flight or the distance covered (given that the velocity is constant) are decision-relevant (for a detailed discussion, refer to [27]). This assumption is present in several papers (see, e.g., [1,20]) and was investigated in Section 5.1. However, in some situations, for example, for Table 6 The proportion of the time spent moving, waiting for the drone to arrive from a multileg (nonsynchronized waiting time), and the time spent waiting while the drone is doing round trips  the sake of operational or safety reasons, it might be required that the drone must depart immediately; therefore, the drone only performs a flyover at the customer's location. In such a scenario, if the drone arrives at the retrieval location before the truck and is not permitted to land it will need to hover. While hovering, battery charge is consumed at approximately the same rate as during straight flight (refer to [9]). Such a case is investigated by Poikonen et al. [22] and in this section, we experiment using the instances provided in [22]. We begin by describing the experimental setup in Section 5.1.1. Afterwards, we present the numerical results in Section 5.1.2.

Experimental setup
Poikonen et al. [22] propose a large set of instances where coordinates are distributed uniformly on a 50 by 50 grid. From this set, we test on 25 instances for each size of n ∈ {9, 14, 19} customers. In these instances, as opposed to Section 5.1, it is generally assumed that the truck follows the Manhattan metric instead. The drone is still modelled with a relative velocity of ∈ {1, 2, 3} w.r.t. the truck and the overhead times are furthermore assumed to be negligible, that is, t l = t r = 0. However, in contrast to Section 5.1, the drone has a battery life of  ∈ {10, 20, 30} time units and must be recovered by the truck within a time window of length  after each launch.
Due to the relatively poor performance of MILP-B in Section 5.1 and the computational overhead, for the remainder of this paper, we only consider MILP-A, MILP-BC, and MILP-BC-VI for solving the resulting set of 675 problems; hence, a total number of 2025 experiments were performed. In order to match our models to the changed endurance assumption, some adjustments are necessary. For this purpose, in MILP-A, we should replace the endurance constraints (10) with the following ones: These constraints state that if a drone performs a sortie (i, w, j) then the total flight and hovering time for that sortie may not exceed . Preliminary experiments have shown that, to enable a fairer comparison, it is in fact appropriate to keep constraints (10) in MILP-A as valid inequalities, that is,  is a valid upper bound on the time in motion (excluding hovering). As constraints (10) are not affected by the M-coefficient, this has a positive impact on the effectiveness of Gurobi in solving MILP-A. In addition, for MILP-BC, we propose the following endurance constraints: These constraints state that the flyover time and hovering (nonsynchronized waiting) time at the retrieval location may not exceed the maximum endurance. Note that, round trips do by their nature not include a drone hovering time. Hence, the endurance constraints for round trips displayed in (9) remain unaffected for either formulation.

Numerical results
The detailed numerical results are presented in Tables 7 and 8. To be more precise, Table 7 shows the MIP gap, runtime, the share of optimal solutions, and the relative root relaxation as average values. In addition, for formulations MILP-BC and MILP-BC-V I, the average number of cuts of type (38) and (39) is shown. Moreover, Table 8 provides an insight in the composition of the makespan and the share of customers served by the truck and drone, respectively. Compared to the observations made in Section 5.1, as Table 7 reveals, the change in the assumptions has a noticeable influence on the solver. In more detail, small instances with n = 9 customers are still solved in less than 2 seconds on average by either formulation. For n = 14, almost all instances can be solved to proven optimality within 1 hour. In particular, MILP-BC-V I is more effective than MILP-BC and solves the same amount of instances (a share of 98.2%) to optimality as MILP-A in about half the time (237.5 seconds vs 383.1 seconds) on average. In contrast, MILP-BC solves a fraction of instances less to optimality (a share of 95.6%) with an average runtime that is similar to MILP-A (393.7 seconds). However, the difference in performance is especially visible for n = 19. Here, MILP-BC-V I dominates the other approaches and we can solve about 63.1% of all instances to proven optimality within 1 hour (in 1836.1 seconds with a remaining gap of just 5% on average). This is about 21% (respectively, 13%) more than can be solved by MILP-A (respectively, MILP-BC) in much shorter time. In general, the relative root relaxations are of a similar order of magnitude for all formulations with slight differences in favor of MILP-BC-V I. In particular, the relative velocity and endurance R have a noticeable influence on the quality of the root relaxation. When we compare the cuts inserted through Algorithm 1, we notice that MILP-BC-VI generates noticeably less cuts than MILP-BC.
Noteworthy, for several cases in Table 7 where the relative velocity ∈ {2, 3}, instances with  = 20 appear to be more difficult to solve (based on increased runtimes or gaps) than instances with  = 30. Intuitively, one might argue that a reduced endurance should also make it easier to solve the TSP-D, as the resulting solution space is more restricted. However, a change in endurance often results in structural changes to the nature of optimal solutions. Indeed, when the endurance is large, the number of round trips is more limited, many long-range sorties are performed, and the drone is rarely taken along by the truck. As an example, Figure 7 shows two sample solutions. Figure 8 provides insights into the savings Δ (refer to formula (53)). At first glance, the strong dependence on both the endurance R and relative velocity is visible. Overall, for these instances where the truck follows the Manhattan instead of the Euclidean metric, we see that the savings are slightly larger than those observed in Figure 5.
Finally, Table 8 reveals that the time in motion has the largest contribution to the makespan. Moreover, waiting times are generally small and increase as R and increase-that also causes an increase in the share of customers served through multilegs. Similar observations apply to round trips, even though, their share is generally larger compared to Table 6. Table 7 Average MIP gap, average runtime (in seconds), the share of instances solved to optimality (opt.), and the average relative root relaxation % r depending on the number of customers n, the relative velocity of the drone , and the endurance R on the Poikonen et al. [22] instance set (25 instances per row)   Note: Moreover, the share of customers served by the truck and drone (through multilegs and round trips, respectively) is shown. The presented results were obtained through MILP-BC-V I and have been averaged over 25 instances per entry.

Instances proposed by Murray and Chu
In this section, we complete our numerical study by testing on the instances proposed by Murray and Chu [20]. We describe the experimental setup in Section 5.3.1. Afterwards, the numerical results will be presented in Section 5.3.2.

Experimental setup
Murray and Chu [20] propose small instances with n = 10 customers for the FSTSP and some larger instances with n = 20 customers for the parallel drone scheduling problem (refer to [20]) that we adapt. In these instances, customers are uniformly and independently distributed across an 8-mile square region. Moreover, the depot location corresponds either to the center of gravity, the average of the customer's x-coordinate with y-coordinate of zero (edge), or at the southwest corner of the region (origin). For these instances, it is assumed that the truck moves at 25 miles per hour on the Manhattan metric. When n = 10, a drone flying with 15, 25, or 35 miles per hour on the Euclidean metric is considered. In contrast, for n = 20, the drone is fixed at 25 miles per hour [20]. Furthermore, the endurance is either 20 or 40 minutes. Finally, for these instances, it is also assumed that the overhead times are t l = t r = 1 minute. We follow the assumption of Murray and Chu [20] and do not permit round trips, that is, we set z iw = 0, ∀i ∈ V L , w ∈ V N , i ≠ w. Additionally, in these instances, only a limited subset V D ⊂ V N of customers (that contains 80% to 90% of all customers) Figure 8 The average savings Δ w.r.t. the TSP solution depending on the drone's relative velocity and its endurance  for the results obtained through MILP-BC-VI on the instances proposed in [22] [Color figure can be viewed at wileyonlinelibrary.com] Table 9 Average MIP gap, average runtime (in seconds), and the average relative root relaxation % r depending on the drone velocity v d and endurance  on the Murray and Chu [20] instance set (12 instances per entry) is eligible to be served by the drone. Hence, we must add the following restrictions to the models:

MILP-A MILP-BC-VI
In total, Murray and Chu [20] provide 72 instances with n = 10 customers and 240 instances with n = 20 customers. To limit the number of experiments, we only solve these instances through MILP-A and MILP-BC-VI, as they have shown the best performance.

Numerical results
Tables 9 and 10 present the results for the instances with n = 10 and n = 20 customers, respectively. In these tables, we present the MIP gaps, runtime, and relative root relaxation (as well as the cuts for MILP-BC-VI) as average values. Moreover, we also indicate the components that contribute to the makespan as well as the share of customers served through multilegs.
In Table 9, we observe that both formulations can solve instances with n = 10 customers to optimality in short computation time. As in the previous sections, here, MILP-BC-VI also outperforms MILP-A and can solve all small instances in 16 seconds on average. However, on these instances, the time to compute the optimal solution is almost one order of magnitude larger compared to the small instances (n = 9) considered in Sections 5.1 and 5.2.
In Table 10, we present the results for n = 20 customers. As the drone speed is fixed to 25 miles per hour on these instances, the table highlights performance w.r.t. the depot location. Generally, we observe that MILP-BC-VI also performs better than MILP-A on these instances: we observe improved MIP gaps, runtimes, and a larger share of instances solved to optimality. In particular, we observe a strong impact of the depot location and endurance on the difficulty of the problem. Generally, going from  = 20 to  = 40 makes the problem more difficult. Moreover, instances where the depot is located along the center of an edge appear to be the most difficult.
Finally, it is worth highlighting that the share of customers served by the drone (through multilegs) is generally smaller in both tables (i.e., Tables 9 and 10) than what we observed in previous sections. The reason might be related to the presence of Table 10 Average MIP gap, average runtime (in seconds), the share of optimal solutions (opt.), and the average relative root relaxation % r depending on the depot location and endurance  on the Murray and Chu [20] instance set (40 instances per entry) overhead times. Even though their value is relatively small, summed up over all operations, they can account for up to 12.7% of the total makespan (refer to Table 9).

CONCLUSION
In this paper, we investigated the TSP-D. More precisely, in Section 2, we reviewed existing (M)ILPs and other exact methods that are used for solving variants of the problem. Given the current state of research, only small-sized instances can be solved to proven optimality-even for the special case of serving every customer exactly once that is the most common one studied in the literature. Motivated by this fact, we proposed two new MILP formulations in Section 3. As an alternative, attempting to leverage the respective strengths of existing MILP and ILP approaches in the literature and motivated by our valid inequalities, in Section 4, we proposed a branch-and-cut algorithm for solving the TSP-D. Afterwards, in Section 5, we conducted an extensive computational study in order to analyze the quality of the different formulations in finding optimal TSP-D solutions on various benchmark instances from the literature. As our detailed numerical study reveals, the performance of our formulations in solving the different benchmark instances varies. However, we have confirmed that the branch-and-cut approach MILP-BC-VI showed strong performance throughout the different instance sets. Indeed, using this approach, we can solve small instances with 9 or 10 customers in just a few seconds and several larger instances with 19 or 20 customers within an hour. Thus, this branch-and-cut algorithm challenges the existing state-of-the-art in solving the TSP-D. From a modelling perspective, naturally, we believe that our formulation might also be used to study variants that consider, for example, multiple trucks or drones (refer to [23,30]). Moreover, the additional variables in Section 4, that model the waiting time of the truck and drone, could be helpful in cases where these times might be constrained or associated with some cost (as it is the case in, e.g., [15]). We also believe that this formulation might be adjusted in a way to achieve the possibility of allowing each vertex to be visited more than once by the truck (refer to [1]).
From a computational perspective, using formulation MILP-BC-VI and a customized branch-and-cut scheme lead to the best performance overall. However, we also showed that performance varied over the different benchmark sets. Therefore, it might be worth to further investigate how the different assumptions affect the difficulty of the problem. Finally, in order to obtain a further boost, it might be worthwhile to investigate if the cutting scheme proposed in this work can be enhanced by merging it with other type of cuts proposed in, for example, [7,8]. That way, one might integrate a separation procedure into the branch-and-cut framework in order to cut off infeasible subtours as required instead of using explicit MTZ constraints.