### Abstract

- Abstract
- 1. Introduction
- 2. Optimal routes from loading plan via DP
- 3. The DP algorithm
- 4. The core heuristic algorithm
- 5. New starting solution from path relinking
- 6. Computational results
- 7. Conclusions and future work
- Acknowledgments
- References

The double traveling salesman problem with multiple stacks consists in determining a pair of routes (pickup and delivery) for a single vehicle in two different and disjoint networks. It models a realistic transportation problem with loading/unloading constraints imposed by a set of last-in-first-out (LIFO) stacks used for storing the goods being transported. The arrangement of the items in the container determines the loading plan, which in turn constrains both routes. In this paper, we propose a novel local search approach. The local search heuristic is applied to the loading plan instead of working directly on the routes. A dynamic programming algorithm is used to map the loading plan solution into corresponding optimal routes. Computational results show that the proposed approach is competitive with state-of-the-art heuristics for the problem.

### 1. Introduction

The double traveling salesman problem with multiple stacks (DTSPMS) models a realistic routing and packing problem. It consists in determining a pair of routes (pickup and delivery) for a single vehicle in two different and disjoint networks. Items are collected in the pickup route, stored in one of several identical stacks in the vehicle, and then delivered in the other network using the delivery route. Each network has a depot where the corresponding route starts and ends. The stacks have a limited size. When items in the stacks are delivered, they must obey the last-in-first-out order. Therefore, given an assignment to a stack for every collected item, a pair of pickup and delivery routes is feasible if and only if, for every pair of items stored in the same stack, the relative order in which they appear in the pickup route is reversed in the delivery route.
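The feasibility condition can be illustrated with a short check. This is a minimal sketch, not code from the paper: the function name, the dictionary `stack_of`, and the list representation of routes are assumptions made for illustration, and stack capacities are not checked here.

```python
from typing import Dict, List


def is_lifo_feasible(pickup: List[int], delivery: List[int],
                     stack_of: Dict[int, int]) -> bool:
    """Return True if every two clients sharing a stack are delivered in
    the reverse of the order in which they were picked up."""
    delivery_pos = {client: i for i, client in enumerate(delivery)}
    for a_idx, a in enumerate(pickup):
        # Every b below is picked up after a.
        for b in pickup[a_idx + 1:]:
            if stack_of[a] != stack_of[b]:
                continue  # different stacks impose no ordering
            # Same stack: b lies on top of a, so b must be delivered first.
            if delivery_pos[a] < delivery_pos[b]:
                return False
    return True
```

For example, with clients 1 and 2 in one stack and 3 and 4 in another, the pickup route 1, 2, 3, 4 is compatible with the delivery route 2, 4, 3, 1 but not with 1, 2, 3, 4.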

The problem was first introduced by Petersen and Madsen (2009) and, since it has the traveling salesman problem as a special case (when there are at least as many stacks as there are clients), is NP-hard.

The DTSPMS models a realistic problem where transportation of goods has to be performed between two cities. A vehicle is used to collect the items in a container in the supplying city. The container is then transported (in a ship, plane, train, etc.) to the demanding city where the items are distributed by a different vehicle respecting the package policy applied in the pickup route, that is, always delivering items from the top of one of the stacks.

DTSPMS instances consist of a given number of clients, two distance matrices specifying the distances between every pair of clients as well as the distance of every client to the depot in the pickup and delivery networks and two integer numbers specifying the number of available stacks and their capacity. Solutions to the problem may be seen as two permutations of the clients representing the pickup and delivery routes and a stack assignment for each client.

Three sets of benchmark instances were introduced in Petersen and Madsen (2009). Each set is composed of 20 instances with, respectively, 12, 33, and 66 clients. All instances have three stacks, and the capacity of every stack is such that all of them are full when the pickup route is complete. That is, the capacity of each stack is 4 for the 12-client instances, 11 for the 33-client instances, and 22 for the 66-client instances. A normal semitrailer in Norway loads 3 × 11 pallets, which makes the medium-size test instances realistic. Some trucks may stack the pallets in two levels, giving room for 66 pallets, but this yields a slightly different configuration from that of the larger test cases. In the rest of the paper, we assume that the stacks are full after the pickup phase for every instance.

In Petersen and Madsen (2009), besides proposing the problem, the authors describe four heuristic algorithms based on iterated local search, tabu search, simulated annealing, and large neighborhood search. Two simple route operators are proposed to create neighborhoods that are used to explore the solution space. Four complex route operators (and their associated neighborhoods) as well as a variable neighborhood search heuristic are introduced in Felipe et al. (2009a, 2009b). A new approach by the same authors is presented in Felipe et al. (2011), in which solutions that overflow the stacks are temporarily permitted in order to diversify the search process. In Urrutia et al. (2012), the stack loading/unloading constraints were temporarily relaxed in order to use fast route operators borrowed from the classical traveling salesman problem. In Casazza et al. (2012), a set of problem properties is analyzed and a fast heuristic is developed using polynomial algorithms derived from those properties. In Côté et al. (2012), the pickup and delivery traveling salesman problem with multiple stacks is proposed. The authors approach the problem heuristically by large neighborhood search. Since this problem generalizes the DTSPMS, they were able to test their approach on benchmark DTSPMS instances, obtaining competitive results.

Exact algorithms have also been proposed for the problem (Carrabs et al., 2013; Lusby and Larsen, 2011; Lusby et al., 2010; Alba Martínez et al., 2013; Petersen et al., 2010). The branch and cut in Alba Martínez et al. (2013) is, to the best of our knowledge, the best exact algorithm for the problem at the time of writing. Concerning instances with three stacks, this algorithm was able to solve instances with up to 25 clients and stack sizes up to 9 in one hour of computation. On the other hand, it was unable to solve to optimality some instances with 21 clients in one hour of execution.

In the rest of the paper, *n* is the number of clients; *s*, the number of stacks; and *c*, their capacity. For the instances considered in this work, *n* = *s* × *c* holds, since all stacks are full after the pickup phase. Informally, we do not distinguish between a client and the demand of the client, so we may say that a client is stored in a given stack.

Let us define a loading plan as a matrix with *s* rows and *c* columns. The loading plan determines the storage arrangement of the clients once the pickup tour is finished. The matrix contains a permutation of the *n* clients and each row represents the state of each stack. Figure 1 shows a loading plan and a pair of compatible routes for that loading plan.

In this work, we propose a new heuristic for the DTSPMS. The new heuristic is based on a local search in the loading plan. A dynamic programming (DP) algorithm then computes the optimal pickup and delivery routes for the resulting loading plan. This novel approach differs from the rest of the literature that works directly on the pickup and delivery routes.

The rest of this paper is organized as follows. In the next section, we describe our approach and analyze the solution space of the problem in terms of loading plans. Section 'The DP algorithm' describes the DP algorithm and discusses its integration with a local search heuristic. Section 'The core heuristic algorithm' describes the core heuristic being proposed and all its components. Section 'New starting solution from path relinking' introduces the path-relinking enhancement and its usage in the core heuristic. In Section 'Computational results', we show the computational results and in the last section we provide some concluding remarks.

### 2. Optimal routes from loading plan via DP

On the other hand, such an approach has a set of advantages when compared with approaches working directly on the routes. First, the size of the solution space is *n*!, corresponding to the number of permutations of clients in the loading plan. This is much smaller than the size of the solution space in route approaches, which equals (*n*!)^{2}: for each of the *n*! permutations of clients in the pickup route, there are *n*! permutations (not all of them feasible) in the delivery route.

Second, contrary to any route approach, there are no infeasibility issues in the loading plan approach. In fact, any loading plan has a pair of associated optimal feasible routes. In the route-based approaches, depending on the number of stacks, the major part of the solution space is composed of infeasible solutions. This fact leads to the development of complex move operators to avoid the evaluation of infeasible solutions (Felipe et al., 2009a, 2009b).

Finally, the average quality of the solutions in a loading plan approach is expected to be much better than in the route approaches. Each pair of routes produced by the DP algorithm is optimal among all pairs of routes that can be obtained from that loading plan.

The proposed approach falls into the math-heuristic paradigm (Maniezzo et al., 2009) in the sense that it uses an exact mathematical programming tool such as DP inside a metaheuristic algorithm. It has some similarities with the corridor method (Caserta et al., 2010; Caserta and Voß, 2010, 2012; Sniedovich and Voß, 2006), in which subproblems are solved exactly by restricting the search for solutions to a small portion of the search space. In the corridor method, an exact approach for the problem at hand is adapted to search for the optimal solution inside a search space consisting of candidate solutions that are not too different from a given incumbent solution. By redefining how much a solution can differ from the incumbent, the “corridor” can be widened or narrowed. In our approach, instead of having an incumbent solution we have a loading plan. In this case, the “corridor” has constant size and consists of all pairs of feasible routes that can be obtained from the given loading plan.

### 3. The DP algorithm

The base step of the DP algorithm consists in computing the optimal costs of states in which just one item was picked up (resp. delivered), namely, *S*(1, 0, 0, 0), *S*(0, 1, 0, 1), and *S*(0, 0, 1, 2). This is done by computing the distance from the depot to the first client on each stack:

Then, the rest of *S* can be computed as follows:

The optimal cost associated with the loading plan *L* can then be computed as the minimum sum over all three stacks of the cost corresponding to picking up (resp. delivering) the overall last item at the last client in that stack and the distance from that client to the depot:
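The three steps above (base states, recurrence, and closing trip to the depot) can be sketched in code. This is a minimal illustration under stated assumptions, not the authors' implementation: a state is taken to be the number of clients already served in each stack together with the stack of the last served client, which is consistent with the base states *S*(1, 0, 0, 0), *S*(0, 1, 0, 1), and *S*(0, 0, 1, 2); each row of the loading plan is assumed to list a stack from bottom to top; index 0 of each distance matrix is assumed to be the depot. All function and parameter names are hypothetical.

```python
from itertools import product
from typing import Sequence


def optimal_route_cost(plan: Sequence[Sequence[int]],
                       dist: Sequence[Sequence[float]],
                       depot: int = 0) -> float:
    """Cheapest depot-to-depot route that serves the clients of every row
    of `plan` in left-to-right order (assumes at least one client)."""
    s = len(plan)
    sizes = [len(row) for row in plan]
    cost = {}  # (counts, t) -> best cost with counts served, last from row t
    # product() enumerates count vectors in lexicographic order, so the
    # predecessor of every state is solved before the state itself.
    for counts in product(*(range(size + 1) for size in sizes)):
        served = sum(counts)
        for t in range(s):
            if counts[t] == 0:
                continue
            here = plan[t][counts[t] - 1]
            if served == 1:  # base step: depot to first client of row t
                cost[(counts, t)] = dist[depot][here]
                continue
            prev = counts[:t] + (counts[t] - 1,) + counts[t + 1:]
            cost[(counts, t)] = min(
                cost[(prev, u)] + dist[plan[u][prev[u] - 1]][here]
                for u in range(s) if prev[u] > 0)
    full = tuple(sizes)
    return min(cost[(full, t)] + dist[plan[t][-1]][depot]
               for t in range(s) if sizes[t] > 0)


def loading_plan_cost(plan, pickup_dist, delivery_dist) -> float:
    """Pickup serves each stack bottom to top; delivery, top to bottom."""
    reversed_plan = [list(reversed(row)) for row in plan]
    return (optimal_route_cost(plan, pickup_dist)
            + optimal_route_cost(reversed_plan, delivery_dist))
```

The dictionary-based table favors readability; an implementation aiming at speed would more likely use preallocated arrays and fixed loop nests.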

#### 3.1. Working with the DP algorithm in a local search algorithm

As mentioned in Section 'Optimal routes from loading plan via DP', the DP algorithm evaluates neighbor solutions inside a local search procedure. Given the complexity of the DP algorithm, its execution is naturally expected to be the bottleneck of the algorithm in terms of computational time. To tackle this issue, we explore here four possible courses of action:

- Implement the DP algorithm as efficiently as possible. We comment on our actual implementation in this section and discuss some further possible improvements in Section 'Conclusions and future work' as future work.
- Try to take advantage of the similarity between the loading plans for which the optimal routes are to be computed. We discuss Δ-evaluation (Hoos and Stützle, 2004) for a swap neighborhood in this section.
- Solve the problem of obtaining the optimal routes heuristically instead of using the DP algorithm while evaluating neighbors, and use the DP algorithm only once per iteration of the search after the new current solution is selected. We comment on this option in this section.
- Evaluate only promising neighbors during the search. This idea is discussed in Section 'Estimating the cost of a move' and is a fundamental part of the proposed algorithm.

On the benchmark instances used in this work (see Section 'Computational results'), the current implementation of the DP algorithm is around five times faster than a naive implementation developed with a first version of the heuristic algorithm. There are two main reasons for that difference in performance.

First, instead of creating or updating states using a transition function while browsing already solved states, our DP implementation browses the yet unsolved DP states in such a way that, when a given state is browsed, all state costs needed for its computation are already available. This allows us to solve the great majority of the states with a single min operation on three summations, which is computed much faster than the conditional branching instructions needed for verifying the existence of a given state and the need to update its cost.

Therefore, our DP algorithm consists only of a series of cycles, all of them performing min and summation operations without the use of any conditional branching instructions besides the (inevitable) cycle termination conditions. Browsing states instead of creating them accelerates the DP algorithm execution but, on the other hand, precludes the fathoming of suboptimal states (Morin and Marsten, 1976). Note that the described implementation of the DP algorithm does not compute the optimal routes but only their costs. Whenever the optimal routes need to be computed, a slightly more time-consuming version of the DP algorithm that maintains state-to-state pointers has to be used.

Now, suppose that the DP algorithm has been run for a given loading plan, and consider a new loading plan constructed by swapping the contents of any two positions of the original loading plan. Clearly, the cost of states in which neither swapped client has yet been served does not change. Likewise, the cost of reaching the final states from states in which both swapped clients have already been served does not change. The costs of reaching the final states can be computed for all states by executing the DP algorithm in an inverse fashion (from the final states to the initial states).

Therefore, once the DP algorithm has been run in a direct and inverse fashion for a given loading plan, it is possible to compute the cost of the optimal routes of a loading plan in which two clients have been swapped by recomputing only the cost of states where one of the swapped clients has already been served and the other has not. Note that this strategy is general and can be applied to moves other than swap. In the instances considered in this paper (with 33 clients and three stacks), the DP algorithm has 4752 states. Only 8.59% of those states must be recomputed in the best case (the two clients being swapped are stored consecutively in the same stack). On the other hand, in the worst case (the two clients are stored in the first and last positions of different stacks), 86.83% of the states must be recomputed. The average percentage of states that have to be recomputed over the whole swap neighborhood is 45.11%. Therefore, one can argue that this method would be able to reduce the computational time associated with the computation of the cost of the whole swap neighborhood by a little more than half.
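The state count and the best-case and worst-case percentages can be reproduced with a short enumeration. The sketch below assumes that a state records how many clients of each stack have been served plus the stack of the last served client, and that a client at depth *p* of a stack (counting from 1 at the first loaded item) is served once at least *p* clients of that stack have been served. The function names are hypothetical.

```python
from itertools import product
from typing import Tuple


def count_states(s: int, c: int) -> int:
    """States (counts, last_stack) in which last_stack holds a served client."""
    return sum(1 for counts in product(range(c + 1), repeat=s)
               for t in range(s) if counts[t] >= 1)


def states_to_recompute(s: int, c: int,
                        pos_a: Tuple[int, int],
                        pos_b: Tuple[int, int]) -> int:
    """States in which exactly one of the two swapped positions is served.
    A position is a pair (stack, depth) with depth starting at 1."""
    total = 0
    for counts in product(range(c + 1), repeat=s):
        a_served = counts[pos_a[0]] >= pos_a[1]
        b_served = counts[pos_b[0]] >= pos_b[1]
        if a_served != b_served:
            total += sum(1 for t in range(s) if counts[t] >= 1)
    return total


total = count_states(3, 11)                                # 4752
best = states_to_recompute(3, 11, (0, 5), (0, 6))          # 408
worst = states_to_recompute(3, 11, (0, 1), (1, 11))        # 4126
print(round(100 * best / total, 2), round(100 * worst / total, 2))
```

Under these assumptions the enumeration yields 408 of 4752 states (8.59%) for two consecutive positions of one stack, and 4126 of 4752 states (86.83%) for the first position of one stack and the last position of another.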

On the other hand, as noted before, the computational time of the DP algorithm greatly depends on implementation issues. The need to maintain two state matrices for the current solution (with the cost of reaching each state from the initial and the final states), plus the need to update some of the states for the solution being evaluated, implies some degree of copying and conditional branching. As a consequence, in our implementation, the Δ-evaluation underperforms the DP algorithm executed from scratch when computing the cost of each candidate solution in the whole swap neighborhood. Also note that for more complex neighborhoods, with more clients being moved, the percentages given in the previous paragraph would increase, making the Δ-evaluation less appealing.

Another way to deal with this high computational cost would be to obtain the costs of the routes associated with the neighbor loading plans heuristically. This idea was actually implemented in the context of this work by replacing the DP algorithm with a 2-exchange local search algorithm developed for the problem in Urrutia et al. (2012). In this approach, every time a neighbor solution is evaluated, the move is performed in both routes instead of in the loading plan. Then, the local search is applied to both routes, and the cost of the locally optimal solution found is returned as the cost of the neighbor. The neighbor solution that is finally selected in each iteration has its optimal routes computed via DP. Note that, due to the restrictions of the problem and the fixed loading plan, the size of the 2-exchange neighborhood used in the local search is linear in the number of clients, as opposed to the quadratic size of the same neighborhood in the context of the traveling salesman problem. In consequence, the local search is faster than the DP algorithm. Computational results showed a decrease of as much as 50% in cost evaluation time when applying this idea. However, the less accurate selection of the next current solution usually deteriorated the quality of the best solution found in a fixed period of time.

### 4. The core heuristic algorithm

Tabu search is a metaheuristic based on neighborhood search (Glover and Laguna, 1997). It starts from an initial feasible solution and iteratively moves to the best solution in its neighborhood that is not considered tabu. The tabu status is used to avoid cycling and guide the search through different parts of the solution space.

Our proposal is a multistart algorithm using tabu search followed by a single-step neighborhood search procedure to improve the initial solutions. Both searches use different neighborhoods and are combined to improve the solution as much as possible before a new initial solution is constructed. Algorithm 1 shows a pseudocode of the proposed heuristic. Both search procedures are described later in this section.

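**Algorithm 1:** The core heuristic |
---|
**Input:** DTSPMS instance |
**Output:** best solution found before the stopping criterion is met |
1 **while** *stopping criterion not met* **do** |
2 construct an initial solution |
3 **repeat** |
4 improve the solution with the tabu search procedure |
5 improve the solution with one step of swap neighborhood search |
6 **until** *both improving procedures fail to improve the solution* |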

In the rest of this section, we first describe the randomized constructive heuristic used to obtain feasible initial solutions. Then, we introduce two move operators to be used on the loading plan. Next, the two improving procedures are described. In the next section, we describe a path-relinking approach to be used instead of the constructive heuristic for creating initial solutions once a set of elite solutions is established. Parameter settings are detailed in Section 'Computational results'.
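The multistart structure of Algorithm 1 can be sketched as a small driver that receives its components as callables. This is an illustrative skeleton under assumptions: the names `build_initial`, `tabu_search`, `swap_search`, and `cost` are hypothetical, and a fixed number of restarts stands in for the stopping criterion.

```python
from typing import Callable, Optional, TypeVar

Solution = TypeVar("Solution")


def core_heuristic(build_initial: Callable[[], Solution],
                   tabu_search: Callable[[Solution], Solution],
                   swap_search: Callable[[Solution], Solution],
                   cost: Callable[[Solution], float],
                   restarts: int) -> Optional[Solution]:
    """Multistart loop: alternate the two improving procedures on each
    initial solution until neither of them improves it."""
    best = None
    for _ in range(restarts):
        sol = build_initial()
        improved = True
        while improved:
            improved = False
            for procedure in (tabu_search, swap_search):
                candidate = procedure(sol)
                if cost(candidate) < cost(sol):
                    sol, improved = candidate, True
        if best is None or cost(sol) < cost(best):
            best = sol
    return best
```

Passing the procedures as arguments keeps the loop independent of how initial solutions are built, so a constructive heuristic can later be replaced by another generator without touching the driver.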

#### 4.1. Initial solution

As noted in Petersen and Madsen (2009), a solution for a one-stack instance of the DTSPMS can be extended to a feasible solution for the same instance with multiple stacks. In such a single-stack instance, the delivery route is always equal to the reversed pickup route. In multistack instances, considering such routes, any stack assignment for the clients would yield a feasible solution with exactly the same cost. In consequence, one possible way to obtain reasonable feasible solutions for the DTSPMS is to consider the sum of the pickup and delivery matrices and apply any constructive heuristic for the traveling salesman problem, considering the depot as a client. Then, the obtained solution is used as the pickup route and its reverse as the delivery route. Finally, the customers are assigned to the stacks sequentially, respecting their capacity.

In our approach we used a randomized version of a simple TSP insertion heuristic. First, we form a partial cycle including only the two clients farthest apart. Then, while there are clients outside the route under construction, we randomly choose a client not in the route and insert it in the best possible position in the current route. Finally, we assign a stack to every client randomly, observing the stack capacities.
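A possible shape for this constructive step is sketched below. It is an illustration under assumptions rather than the authors' code: `dist` stands for the sum of the pickup and delivery matrices, the depot is treated as one more node, and all names are hypothetical.

```python
import random
from itertools import combinations
from typing import Dict, List, Sequence


def randomized_insertion_tour(nodes: Sequence[int],
                              dist: Sequence[Sequence[float]],
                              rng: random.Random) -> List[int]:
    """Build a cyclic tour over `nodes` by random-order cheapest insertion."""
    nodes = list(nodes)
    if len(nodes) < 3:
        return nodes
    # Seed the partial cycle with the two nodes that are farthest apart.
    a, b = max(combinations(nodes, 2), key=lambda p: dist[p[0]][p[1]])
    tour = [a, b]
    remaining = [v for v in nodes if v != a and v != b]
    rng.shuffle(remaining)  # random choice of the next node to insert
    for v in remaining:
        best_pos, best_delta = 1, float("inf")
        for i in range(len(tour)):
            u, w = tour[i], tour[(i + 1) % len(tour)]
            delta = dist[u][v] + dist[v][w] - dist[u][w]
            if delta < best_delta:
                best_pos, best_delta = i + 1, delta
        tour.insert(best_pos, v)
    return tour


def random_stack_assignment(clients: Sequence[int], s: int, c: int,
                            rng: random.Random) -> Dict[int, int]:
    """Assign each client a random stack without exceeding capacity c."""
    clients = list(clients)
    if len(clients) > s * c:
        raise ValueError("not enough stack capacity")
    slots = [t for t in range(s) for _ in range(c)]
    rng.shuffle(slots)
    return dict(zip(clients, slots))
```

Rotating the resulting cycle so that it starts at the depot gives the pickup route, and its reverse gives the delivery route.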

#### 4.2. Neighborhoods and local search

Two simple neighborhoods are proposed in this work to explore the loading plan solution space. In the swap neighborhood, given a current solution, the clients in two different positions of the loading plan are swapped to obtain a neighbor solution. Observe that we do not distinguish between interstack and intrastack swaps; all of them are part of the swap neighborhood. The size of this simple neighborhood is *n*(*n* − 1)/2, that is, *O*(*n*^{2}).

In the 3-exchange neighborhood, given a current solution, three clients in three different positions of the loading plan exchange their positions. Either the first client moves to the position of the second one, the second one moves to the position of the third one, and the third one goes to the position of the first one, or the first client moves to the position of the third one, the second one moves to the position of the first one, and the third one goes to the position of the second one. Observe again that no distinction is made between interstack and intrastack moves. Since each set of three positions admits these two rotations, the size of the neighborhood is *n*(*n* − 1)(*n* − 2)/3, that is, *O*(*n*^{3}).
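Both neighborhoods can be enumerated with a few lines of code, which also makes their sizes easy to check: one move per pair of positions for swap, and two rotations per triple of positions for 3-exchange. The sketch assumes the loading plan is flattened into a list of *n* cells (position *p* corresponds to stack `p // c` and depth `p % c`); the function names are hypothetical.

```python
from itertools import combinations
from typing import Iterator, List, Sequence, Tuple


def swap_moves(n: int) -> Iterator[Tuple[int, int]]:
    """All unordered pairs of loading plan positions."""
    return combinations(range(n), 2)


def three_exchange_moves(n: int) -> Iterator[Tuple[int, int, int]]:
    """Both rotations of every triple of positions. A move (x, y, z) sends
    the client at x to y, the client at y to z, and the client at z to x."""
    for a, b, c in combinations(range(n), 3):
        yield (a, b, c)  # first -> second, second -> third, third -> first
        yield (a, c, b)  # first -> third, third -> second, second -> first


def apply_cycle(cells: Sequence[int], cycle: Tuple[int, ...]) -> List[int]:
    """Return a copy of `cells` with the clients rotated along `cycle`."""
    new = list(cells)
    for src, dst in zip(cycle, cycle[1:] + cycle[:1]):
        new[dst] = cells[src]
    return new


print(sum(1 for _ in swap_moves(33)))            # 528
print(sum(1 for _ in three_exchange_moves(33)))  # 10912
```

A swap is simply a cycle of length two, so `apply_cycle` serves both neighborhoods.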

#### 4.3. Estimating the cost of a move

As seen, computing the cost of all the solutions in the whole 3-exchange neighborhood can be extremely expensive in terms of computational cost. In order to tackle this, we propose a neighbor cost estimator function to assist the local search procedure in identifying promising neighbors.

An *O*(1) move estimator may be implemented by simply assuming that the loading and unloading sequences from the stacks are unchanged after the move is performed. That is, if the client in position *p* of the stack *t* is placed in the *k*th position of the pickup route (resp. delivery route), after the move is performed, the client in position *p* of the stack *t* is still placed in the *k*th position of the pickup route (resp. delivery route) even if the client in that position of the stack has changed.

Following this idea, the move is performed not only in the loading plan but also in both the pickup and delivery routes. The resulting routes are feasible but not necessarily optimal for the obtained loading plan. In fact, the new cost of the routes is an upper bound on the cost of the optimal routes that can be obtained via DP using the obtained loading plan. Observe that the estimator can be computed in *O*(1) per neighbor, since it only requires recomputing the client-to-client trips that are modified (up to 6 in the 3-exchange neighborhood) in both routes.
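For a swap, the constant-time estimate amounts to recomputing only the trips adjacent to the two exchanged route positions. The sketch below shows this for a single route; it assumes the route is stored with the depot at both ends, and the names `route_cost` and `swap_delta` are hypothetical. The full estimate would add the deltas of the pickup and delivery routes.

```python
from typing import List, Sequence


def route_cost(full: Sequence[int], dist: Sequence[Sequence[float]]) -> float:
    """Cost of a route given with the depot as first and last element."""
    return sum(dist[a][b] for a, b in zip(full, full[1:]))


def swap_delta(full: List[int], dist: Sequence[Sequence[float]],
               i: int, j: int) -> float:
    """Cost change of exchanging the clients at route positions i and j
    (1 <= i, j <= len(full) - 2), touching at most four trips."""
    # Trip k links positions k and k + 1; a set merges the shared trip
    # when i and j are adjacent.
    trips = {i - 1, i, j - 1, j}
    before = sum(dist[full[k]][full[k + 1]] for k in trips)
    full[i], full[j] = full[j], full[i]
    after = sum(dist[full[k]][full[k + 1]] for k in trips)
    full[i], full[j] = full[j], full[i]  # restore the original route
    return after - before
```

Because the trips keep their direction, the same computation works for asymmetric distance matrices.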

As an example, consider client 8 in Fig. 1. There are three alternative positions for that client in the pickup route that do not generate any infeasibility. The positions are those currently used by other clients that are in between the positions of clients 3 and 5, predecessor and successor of client 8 in the second stack.

In our improved move cost estimator, after performing the move in the routes, we evaluate the cost of moving every client involved in the move to every alternative feasible position in both routes and return the best cost found. This cost still is an upper bound on the cost of the optimal routes for the modified loading plan and it is usually much better than the one first proposed.

#### 4.4. Tabu search

Tabu search heuristics are iterative procedures that, starting from an initial current solution, find in each iteration the best nontabu solution in a given neighborhood. The best solution found before the stopping criterion is met is returned. The tabu criterion, which determines which neighbor solutions are considered tabu, is defined by the heuristic developer and is meant to avoid cycling and to guide the heuristic to different areas of the solution space.

In this work, we design a tabu search procedure using the 3-exchange neighborhood. Since evaluating each solution of the 3-exchange neighborhood via DP is extremely expensive in terms of computational cost ( for the three stack instances of the work), we use the move estimator described in Section 'Estimating the cost of a move' to evaluate only the most promising part of the neighborhood. Each iteration of our tabu search is divided into three steps.

###### Estimation step

###### Move step

In this step, the move selected in the previous step is performed and the tabu status is updated. In our heuristic, the tabu criterion is the position of a client in the loading plan, that is, any solution in which a client involved in the move is back in the position of the loading plan it just left is considered tabu for a number of iterations plus a small number of iterations chosen at random from 0 to . Observe that, in this way, a client can be moved in consecutive iterations as long as it does not return to a position of the loading plan where it has recently been unless the aspiration criterion is applied, that is, the solution found is better than the best known so far.
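A position-based tabu list of this kind can be sketched as follows. The parameter names `base_tenure` and `max_extra` are placeholders for the two tenure parameters, whose values are not fixed here, and the class is an illustrative assumption rather than the authors' data structure.

```python
import random
from typing import Dict, Iterable, Tuple


class PositionTabuList:
    """Forbids a client from returning to a loading plan position it has
    recently left, unless the aspiration criterion applies."""

    def __init__(self, base_tenure: int, max_extra: int,
                 rng: random.Random) -> None:
        self.base_tenure = base_tenure
        self.max_extra = max_extra
        self.rng = rng
        # (client, position) -> first iteration at which returning is allowed
        self.expires: Dict[Tuple[int, int], int] = {}

    def forbid_return(self, client: int, position: int,
                      iteration: int) -> None:
        """Record that `client` has just left `position`."""
        tenure = self.base_tenure + self.rng.randint(0, self.max_extra)
        self.expires[(client, position)] = iteration + tenure

    def is_tabu(self, placements: Iterable[Tuple[int, int]], iteration: int,
                cost: float, best_cost: float) -> bool:
        """`placements` lists the (client, new position) pairs of a move."""
        if cost < best_cost:  # aspiration: better than the best known
            return False
        return any(self.expires.get(pair, 0) > iteration
                   for pair in placements)
```

Storing expiry iterations instead of counters avoids updating every entry at each iteration.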

The tabu search stops after a number of iterations without improving the best-known solution. This parameter is set to the first time the tabu search is called. Then, the parameter is increased by one after each restart. In this way, each restart is expected to be more time consuming than the previous ones but, on the other hand, it is expected to search more intensively in the solution space and have a greater chance of finding better solutions. Algorithm 2 depicts the tabu search procedure.

**Algorithm 2:** The tabu search procedure |
---|

**input :** sol: an initial solution, limit: maximum number of iterations without improving the best-known solution |

**output:** best solution found |

1 |

2 **while** *iterations without improvement < limit* **do** |

3 **if** *complete re-estimation iteration* **then** |

4 estimate the cost of all moves |

5 **else** |

6 estimate the cost of the moves with the best estimated cost |

7 sort all moves by estimated cost and consider them in that order |

8 **while** *less than* *nontabu moves has been evaluated and no solution better than* *was found* **do** |

9 **if** *move is not tabu or less than* *tabu moves have been evaluated* **then** |

10 evaluate the move by calling dynamic programming |

11 **if** *solution better than* *was found* **then** |

12 perform the best of the evaluated moves in sol |

13 **else** |

14 perform the best of the nontabu evaluated moves in sol |

15 update tabu status |

The final solution of the tabu search is not necessarily locally optimal for the swap neighborhood. Therefore, after the tabu search heuristic stops, an improving solution is sought in the swap neighborhood of the solution it returned. If an improving solution is found, this solution is used as a new initial solution for a new execution of the tabu search.

### 5. New starting solution from path relinking

Path relinking (Resende et al., 2010) is a technique that explores the solution space between an initial and a guiding solution. By using a given neighborhood, path relinking creates a path of solutions connecting the initial and the guiding solutions. Starting from the initial solution, path relinking evaluates all moves in the neighborhood that decrease the distance from the current solution to the guiding solution. The best of those moves is selected, and the process is repeated until the guiding solution is reached. If both the initial and the guiding solutions are of good quality, it is expected that the solutions in the path, which share properties of both, will also be of good quality.

Path relinking is sometimes used as a postoptimization procedure, see for example Ribeiro and Resende (2012). In our approach, after a set of elite solutions is constructed, path relinking is used for the construction of initial solutions in replacement of the randomized greedy heuristic introduced in Section 'Initial solution'.

#### 5.1. Elite pool management

#### 5.2. Integration of the path relinking with the core heuristic

Finally, steps of path relinking are executed starting from the initial solution toward the guide solution using the swap neighborhood. In each step, each swap movement decreasing the number of differences between the current and the guide solution is evaluated via DP and the best among them is selected. The solution obtained after steps is then used as the initial solution for the current iteration of the core heuristic.
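The relinking steps can be sketched for loading plans stored as flat lists of clients. This is a minimal illustration under assumptions: `cost` stands for the DP evaluation of a loading plan, `steps` for the number of relinking steps, and both names are hypothetical.

```python
from typing import Callable, List, Sequence


def path_relinking_steps(start: Sequence[int], guide: Sequence[int],
                         cost: Callable[[List[int]], float],
                         steps: int) -> List[int]:
    """Move from `start` toward `guide` with swaps; each step applies the
    cheapest swap that places at least one client where `guide` has it."""
    current = list(start)
    for _ in range(steps):
        differing = [p for p in range(len(current)) if current[p] != guide[p]]
        if not differing:
            break  # the guide solution has been reached
        where = {client: p for p, client in enumerate(current)}
        best_value, best_plan = None, None
        for p in differing:
            q = where[guide[p]]  # swapping p and q fixes position p
            candidate = list(current)
            candidate[p], candidate[q] = candidate[q], candidate[p]
            value = cost(candidate)
            if best_value is None or value < best_value:
                best_value, best_plan = value, candidate
        current = best_plan
    return current
```

Every swap that reduces the number of differences must place at least one client in its guide position, so iterating over the differing positions covers all such swaps (a swap that fixes two positions is simply evaluated twice).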

### 6. Computational results

The proposed heuristic was coded in C++ and executed on an Intel Core i5 2.3 GHz with 4 GB of RAM. The experiments were conducted on the 20 instances with 33 clients proposed in Petersen and Madsen (2009). Parameters were experimentally tuned to: .

Table 1 shows the computational results. The first two columns show the instance name and the cost of its best-known solution. The next column considers the core heuristic introduced in this paper without the path-relinking approach. It shows the quality of the best solution found in 10 executions of 3 minutes of computation compared with the best-known cost for that instance (). The last column shows the same information for the core heuristic with the path-relinking approach.

Table 1. Computational results

Instance | Best known | TS min | TS-PR min |
---|---|---|---|
R00 | 1063 | 1.002 | **1.000** |
R01 | 1032 | **1.000** | **1.000** |
R02 | 1065 | **1.000** | **1.000** |
R03 | 1100 | 1.001 | **1.000** |
R04 | 1052 | 1.005 | **1.000** |
R05 | 1008 | **1.000** | **1.000** |
R06 | 1110 | **1.000** | **1.000** |
R07 | 1105 | **1.000** | **1.000** |
R08 | 1109 | **1.000** | **1.000** |
R09 | 1091 | **1.000** | **1.000** |
R10 | 1016 | **1.000** | **1.000** |
R11 | 1001 | **1.000** | **1.000** |
R12 | 1109 | 1.002 | **1.000** |
R13 | 1084 | **1.000** | **1.000** |
R14 | 1034 | **1.000** | **1.000** |
R15 | 1142 | **1.000** | **1.000** |
R16 | 1093 | **1.000** | **1.000** |
R17 | 1073 | 1.007 | 1.004 |
R18 | 1118 | 1.018 | **1.000** |
R19 | 1089 | 1.004 | **1.000** |

Considering the heuristic without the path-relinking approach, a solution as good as the best-known solution was found for 13 of the 20 instances when executing the heuristic 10 times for each instance. The largest difference in quality between the best solution found and the best-known solution was 1.8%, for instance R18. The average cost of the solutions found by the core heuristic without path relinking in one execution of at most 3 minutes was 0.8% above the best-known cost. With the path-relinking approach, the results are much better. A solution as good as the best-known solution was found for 19 of the 20 instances. The largest difference in quality between the best solution found and the best-known solution was 0.4%, for instance R17. The average cost of the solutions found by the core heuristic with path relinking in one execution of at most 3 minutes was 0.17% above the best-known cost.

Although the results obtained are not as good as those of the current state-of-the-art heuristic (Felipe et al., 2009a, 2009b), they are competitive. The results compare favorably with other heuristic approaches (see, e.g., Petersen and Madsen, 2009; Urrutia et al., 2012).

With the current parameter setting, the proposed heuristic spends the great majority of its computation time in the tabu search procedure. The path relinking and the one step swap neighborhood search together account for less than 10% of the time. Inside the tabu search procedure, most of the time is spent computing the cost of optimal routes for neighbor loading plans via DP.

### 7. Conclusions and future work

This paper introduced a novel local search approach to the DTSPMS. Instead of working directly on the routes, the local search works on the loading plan and a DP algorithm is used to construct optimal solutions for each loading plan.

The main drawback of the proposed approach is the high computational cost associated with the evaluation of each move via DP. To overcome this problem, a move cost estimator was introduced. The estimator, combined with a partial neighbor evaluation policy, made the approach feasible and competitive with the best heuristics in the literature. Our approach does not outperform the best route-search-based heuristic algorithms. It does, however, obtain good-quality solutions in a short period of time for instances for which, currently, no exact algorithm is able to obtain proven optimal solutions and no heuristic can consistently obtain the best-known solution cost over the same time period.
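The general idea of pairing a cheap estimator with partial neighborhood evaluation can be sketched as follows. This is a generic illustration under stated assumptions, not the estimator or the policy used in this paper: `estimate`, `exact_cost`, and `max_exact` are hypothetical names, and the rule "evaluate exactly only the `max_exact` most promising moves" is one simple choice among many.

```python
from typing import Callable, Iterable, Optional, Tuple, TypeVar

Move = TypeVar("Move")


def best_move_partial(
    moves: Iterable[Move],
    estimate: Callable[[Move], float],    # cheap, approximate move cost (hypothetical)
    exact_cost: Callable[[Move], float],  # expensive evaluation, e.g. via DP (hypothetical)
    max_exact: int,                       # budget of exact evaluations per iteration
) -> Optional[Tuple[Move, float]]:
    """Rank all moves by the cheap estimate, then evaluate only the
    max_exact most promising ones exactly and return the best of those."""
    ranked = sorted(moves, key=estimate)
    best: Optional[Tuple[Move, float]] = None
    for move in ranked[:max_exact]:
        cost = exact_cost(move)
        if best is None or cost < best[1]:
            best = (move, cost)
    return best
```

The budget `max_exact` trades solution quality for speed: a larger value protects against a misleading estimate, at the price of more exact evaluations per iteration.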

Being a restart algorithm, the core heuristic developed in this work suffers from the absence of a long-term memory policy. With this in mind, a path-relinking approach was proposed to enable the heuristic to use previously obtained good-quality solutions in the construction of new initial solutions. Computational results showed that this approach enhanced the performance of the heuristic.

Despite the competitive computational results obtained for medium-size instances with 33 clients, the heuristic, in its current form, is not suitable for much larger instances. This is due to the high computational cost of the DP algorithm used to evaluate move costs during local search.

Future work should aim at overcoming this last issue. As noted in Section 'Working with the DP algorithm in a local search algorithm', there are several ways to reduce the computational cost of evaluating neighbor solutions via DP. Some of them were studied in this work and one of them, the partial evaluation of the neighborhood using a cost estimator, is a fundamental part of the proposed heuristic. Fathoming dynamic programming states that cannot influence the optimal cost of final states, as proposed in Morin and Marsten (1976), is a strategy that should be investigated in the context of the DTSPMS. The Δ-evaluation strategy mentioned in Section 'Working with the DP algorithm in a local search algorithm' can also be further investigated and incorporated into the algorithm. In addition, a memory approach could be used to keep track of loading plans already solved. In this way, before calling the DP algorithm, the heuristic could check whether the same or an equivalent loading plan has already been solved. Finally, the use of parallel processing may also reduce the computational time of the proposed heuristic.
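The memory approach suggested above could take the form of a cache keyed by a canonical representation of the loading plan. The sketch below is one possible design, not something implemented in this work. It assumes that a loading plan is a list of stacks, each listing item identifiers from bottom to top, and that two plans are equivalent when they differ only in the order of the (identical) stacks. The names `canonical_key`, `PlanCostCache`, and `solve_dp` are hypothetical.

```python
from typing import Callable, Dict, Sequence, Tuple

# Assumed representation: one sequence of item ids per stack, bottom to top.
LoadingPlan = Sequence[Sequence[int]]
PlanKey = Tuple[Tuple[int, ...], ...]


def canonical_key(plan: LoadingPlan) -> PlanKey:
    """Sort the stacks so that plans differing only in stack order share a key."""
    return tuple(sorted(tuple(stack) for stack in plan))


class PlanCostCache:
    """Remember the optimal route cost of every loading plan already solved."""

    def __init__(self, solve_dp: Callable[[LoadingPlan], float]) -> None:
        self._solve_dp = solve_dp  # stand-in for the (expensive) DP algorithm
        self._costs: Dict[PlanKey, float] = {}
        self.dp_calls = 0          # number of times the DP was actually run

    def cost(self, plan: LoadingPlan) -> float:
        key = canonical_key(plan)
        if key not in self._costs:
            self.dp_calls += 1
            self._costs[key] = self._solve_dp(plan)
        return self._costs[key]
```

With such a cache, evaluating `[[1, 2], [3]]` and then `[[3], [1, 2]]` would trigger a single DP call, whereas `[[2, 1], [3]]` is a different plan and would be solved separately. Memory use grows with the number of distinct plans visited, so a bounded cache may be preferable in long runs.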