Avoiding unnecessary demerging and remerging of multi‐commodity integer flows

Resource flows may merge and demerge at a network node. Sometimes several demerged flows may be immediately merged again, but in different combinations compared to before they were demerged. However, the demerging is unnecessary in the first place if the total resources at each of the network nodes involved remains unchanged. We describe this situation as “unnecessary demerging and remerging (UDR)” of flows, which would incur unnecessary operations and costs in practice. Multi‐commodity integer flows in particular will be considered in this paper. This deficiency could be theoretically overcome by means of fixed‐charge variables, but the practicality of this approach is restricted by the difficulty in solving the corresponding integer linear program (ILP). Moreover, in a problem where the objective function has many cost elements, it would be helpful if such operational costs are optimized implicitly. This paper presents a heuristic branching method within an ILP solver for removing UDR without the use of fixed‐charge variables. We use the concept of “flow potentials” (different from “flow residuals” for max‐flows) guided by which underutilized arcs are heuristically banned, thus reducing occurrences of UDR. Flow connection bigraphs and flow connection groups (FCGs) are introduced. We prove that if certain conditions are met, fully utilizing an arc will guarantee an improvement within an FCG. Moreover, a location sub‐model is given when the former cannot guarantee an improvement. More importantly, the heuristic approach can significantly enhance the full fixed‐charge model by warm‐starting. Computational experiments based on real‐world instances have shown the usefulness of the proposed methods.


INTRODUCTION
Many routing and logistics optimization problems can be modeled by multi-commodity flows. When flows merge, peak demands could be satisfied; and demerging allows more efficient use of resources in periods of lower demands. Optimization models may need additional controls over how multi-commodity flows merge/demerge. Examples (A) and (B) of Figure 1 illustrate what flow merging and demerging are.
In particular, an unacceptable situation in practice is the remerging of some demerged flows over a group of nodes. The flows could be recombined into combinations that are equivalent to the case without any demerging/remerging. We call this phenomenon unnecessary demerging and remerging (UDR). Figure 1C gives such an example of flow demerging/remerging over four nodes p, q, r, and s. The additional controls may be modeled by means of binary fixed-charge variables, but they would increase the computational burden tremendously. Examples of such multi-commodity flow problems with potential UDR include (with different characteristics in how flows merge/demerge) Train Unit Assignment [5], Train Unit Shunting [11,17], Rolling Stock Rescheduling during Disruptions [24], and there are some closely related classic problems such as the Multivehicle Tanker Scheduling Problem [4] and the Fixed-charge Transportation Problem [25]. This paper is inspired from the Train Unit Scheduling Problem (TUSP) [21], where unit coupling/decoupling operations correspond to flow merging/demerging. Typically, we focus on a kind of integer multi-commodity flow problem based on a directed acyclic graph (DAG) G = (N, A) where for some nodes j ∈ N, there is a demand r j to be satisfied by the flows, and a capacity u j . Moreover: i. Each arc a ∈ A has an attribute l a ∈ L G derived from G, which can be a location, a time slot, or a person in charge, and so on. We use location to refer to the attribute hereafter. ii. For all arcs associated with location l ∈ L G , denoted by a ∈ A l , we can collect the nodes of the arcs in set N l = {j 1 , j 2 , …, j n }, which is partitioned into two groups I l and J l where the former contains the tails of all arcs in A l and the latter the heads, that is, A l ⊆ I l × J l . iii. I l , J l and A l form a bipartite graph ("bigraph" hereafter), denoted by B l = (I l , J l , A l ). From the perspective of a bigraph, resources flow from I l to J l in a single direction. iv. Often |L G | > 1. When there is only one location, the problem is reduced to a variant of the transportation problem (with possible forbidden arcs).
More than one flow can be merged together to satisfy the demand of a node, and the reverse process is the demerging of flows from a node. The term "merging/demerging" will sometimes be shortened as "M/D" in this paper. Flow M/D activities are often essential for satisfying high demands at nodes by providing more flows there. In addition, flow M/D can also be used as a way of redistributing resources across the overall network regardless of demands.
The control of M/D activities is not trivial in producing a realistic and operable flow solution from a mathematical programming based model. UDR in a flow solution occurs when an alternative solution exists that can achieve the same or similar cost and effect while having a fewer number of M/D activities. Theoretically by using "fixed-charge" variables, the costs due to M/D operations can be minimized explicitly in the objective, for example, in [22] (where M/D corresponds to train unit coupling/decoupling). However, in practice, the difficulty of solving the model with fixed-charge variables prevents its application to medium/large instances. Processing UDR at a later stage is an alternative approach that is shown to be workable and effective for real-world instances [20]. However, it cannot guarantee global optimality. This paper proposes a heuristic branch-and-bound (BB) method for removing UDR without using fixed-charge variables. We also propose a warm-start framework that will significantly improve the performance of the exact fixed-charge flow model. This method partly relies on the fact that the number of M/D operations can be expressed in terms of flow potentials of a feasible solution, and by favoring full arc utilization heuristically on the BB tree, the occurrences of UDR would be reduced to near-optimal. It should be noted that here "flow potential" is a new concept and should not be confused with "flow residuals" used for finding maximum flows, and the calculation processes of the two are different. It will be shown that the exact method based on the fixed-charge flow model would significantly benefit from this heuristic by warm-starting strategies and it even enables the exact fixed-charge model to solve instances that were once intractable due to their sizes.
To the best knowledge of the authors, the problem investigated in this paper is almost an uncharted area and little existing literature can be found. A first attempt is tried in [21] to remove UDR (as train unit coupling/decoupling operations), where the concept of flow potential and flow connection groups (FCGs) are introduced and implemented into the BB process to heuristically remove UDR. A sufficient condition for guaranteeing an improvement on the M/D count is also proposed, which is used for guiding heuristic branching. Nonetheless, we point out certain limitations of [21]: • The sufficient condition is restricted to cases with interchangeable commodity types.
• The sufficient condition and branching scheme only consider fully utilizing balanced arcs, while unbalanced underutilized arcs are ignored. • FCGs, due to their local nature, are unable to handle more complex UDR involving several connection groups at a location.
In this paper, to deal with the above three limitations, we give a sufficient condition guaranteeing an improvement for non-interchangeable commodity types, a generalized sufficient condition for unbalanced arcs, and a location sub-model as a pure fixed-charge integer multi-commodity transportation problem that will consider the entire location. We also provide theoretical background about the difficulty in determining the existence and possibility of improvement for UDR in general, and thus justify the rationale of the location sub-model. More importantly, we propose a warm-start framework using sub-optimal solutions given by the heuristic model that enables the fixed-charge model to solve medium instances exactly with high-quality solutions. Finally, the problem described in [21] is limited within train unit scheduling. We realize that it can be generalized into a broader scope of application areas where multi-commodity flows are used. Therefore, this paper presents the problem and its solution approaches in a generic multi-commodity flow perspective and will only come to real-world instances in the computational experiment section with results from the TUSP. This paper is organized as follows. Section 2 surveys relevant literature. Section 3 describes the issue of UDR yielded from certain network flow formulations. Section 4 introduces an enhanced formulation with "fixed-charge" variables included, and discusses its pros and cons. Section 5 proposes a significantly upgraded heuristic branching method (compared with [21]) for removing UDR, including the concept of flow potentials and their relations to the number of M/D operations, the idea of FCGs, arc full utilization, and two alternative ways for validity checking of arc full utilization. The details of the heuristic branching scheme, Arc Full Utilization Branching (AFUB), will also be presented. Section 6 reports experiments conducted based on real-world instances of some UK rail networks. Finally, Section 7 concludes this paper and discusses future research.

Multi-commodity flow problems and their fixed-charge variants
The multi-commodity flow problem can be traced back to the pioneering work by Ford and Fulkerson [10]. In spite of many researches on the continuous and binary versions of the problem, general integer multi-commodity flow problems where variables can take general integral values have received relatively less attention, as reported in [2]. In Barnhart et al. [3], an origin-destination integer multi-commodity flow problem is solved by branch-and-price-and-cut, where the path variables are binary and each commodity can only use one path. In Alvelos [2], an approach based on branch-and-price for general integer multi-commodity flow problems is proposed, where invalid columns from subproblems are prevented explicitly by adding branching constraints, and customized branching techniques are used taking advantage of the problem structure.
The fixed-charge single−/multi-commodity flow problem is characterized by adding extra fixed-charge variables. For instance, a description of a single commodity fixed-charge flow problem can be found in [16]. Different terms referring to the same or similar problems with "fixed-charge" and "multi-commodity" features exist, for instance, the Network Design Problem [1,12,19]. Similar to multi-commodity flows, a mixed integer version and an all-integer version can be further categorized for the fixed-charge variants [15]. The fixed-charge multi-commodity flow (FCMF) problems are more challenging compared with a standard multi-commodity flow problem, even for the mixed integer version. Most of the current solution methods for FCMF problems are (meta)heuristic-based. Moreover, while there are plenty of researches on the mixed integer FCMF, the all-integer FCMF version has received less attention. In [15], Hewitt et al. propose a hybridization method combining exact algorithms with heuristic search techniques that produces provably high-quality solutions quickly for the all-integer FCMF problem with binary flow variables. The method searches over neighborhoods involving carefully chosen integer programming models derived from the arc formulation of the FCMF. To get lower bounds, the linear programming relaxation of the path formulation of FCMF is used and strengthened with cuts derived the neighborhood search. Later in Hewitt et al. [14], an algorithm for fixed-charge FCMF with binary flow variables uses restrictions of the problem to produce high-quality solutions quickly. Column generation is used both for generating problem restrictions and for producing bounds on the value of the optimal solution.
The focus of this paper is not on solving a generic all-integer FCMF problem, but only on integer multi-commodity flow problems with potential unnecessary flow demerging/remerging issues, which could be formulated as a special kind of the integer FCMF problem with the weights on the fixed-charge variables uniquely determined. (see Section 4).

Unnecessary flow demerging/remerging problem
Previous literature solely dedicated to UDR is very scarce, and [21] is likely to be the only one. There are relevant researches where similar observations are reported, and all of them are from the field of rolling stock scheduling. Coupling/decoupling (C/D) operations among train multiple units precisely correspond to flow merging/demerging when train unit scheduling is formulated by multi-commodity flow models. As far as the authors are aware, the only closely relevant discussions are found in [11,17] in train unit shunting at the station level, and in [24] in rolling stock rescheduling.
Coupling/decoupling costs are considered in shunting of train units overnight or during long idling time in the day. Freling et al. [11] propose a two-stage integer linear programming approach for the Train Unit Shunting Problem at a station within 24 hours (mainly overnight). The main focus is on shunting train units that are temporarily unused between two trips to sidings (shunt yard). Instances for the station with up to 80 train units to be parked at 19 sidings have been tested where near-optimal solutions can be found. The definition of a unit block as "an entity of one or more train units that remain together for the entire planning period" is given in [11], and this idea will be also used in this paper. At the first stage of matching supply to demand, the issue of "keeping train units together as much as possible…the number of blocks has to be minimized" is discussed. This is essentially a special case in dealing with unnecessary coupling/decoupling for train units with long idling time and shunting movements. At the first stage, each shunting movement from an arrival trip to a departure trip is assigned with a weight to be minimized in the objective, thus realizing the goal of keeping units together as much as possible. Kroon et al. [17] later propose an integrated approach for the same Train Unit Shunting Problem as in [11] where unit matching and shunting assignment are processed together, hence global optimality is guaranteed. In both papers, since the problem studied are different from our work, it requires only binary variables to formulate, and thus unnecessary coupling/decoupling operations can be removed by directly adding relevant binary variables to the objective.
How to best reschedule a fleet of rolling stock units during a disruption is an optimization problem regularly faced by railway operators. Lusby et al. [24] propose a branch-and-price method based on a path formulation over a time-space network. Near-optimal solutions can be found within a few seconds on real-world instances. Coupling/decoupling is included as a kind of cost in the objective. Instead of the exact number of coupling/decoupling operations, an approximated number is used that the cost due to C/D for each unit's workload (represented by a path) is estimated in advance by the number of times a unit enters and leaves the (sub)trips it covers (or equivalently the times a unit returns to depots implied by its path).
The unnecessary C/D investigated in this paper's experiment section for real-world TUSP covers a broader range of trip connections compared with [11,17]. In their papers, only the C/D operations from long dwelling connections requiring shunting activities are considered, while in this research C/D operations from all kinds of connections in a train unit schedule are considered. Note that in the UK, C/D operations can be done at platforms straightforwardly without extra shunting activities in a relatively short time duration, for example, 5 minutes. In contrast to [24], the exact number of C/D operations is used in this paper, and we do not assume that depot-return activities are necessarily associated with C/D operations.
An earlier short conference paper [21] by the authors firstly discussed the unnecessary coupling/decoupling problem as a new kind of network flow problem. Nonetheless, as is already mentioned in Section 1, [21] has several limitations to be overcome by this paper. It also has not got the warm-start framework that is shown to be pivotal in getting high-quality solutions compared with the manual results in the TUSP.

PROBLEM DESCRIPTION
Assume that an integer multi-commodity flow problem is based on a DAG  = ( , ). The commodities are denoted by k ∈ K.  = N ∪ {s, t} represents the nodes where s, t are the source and sink with a flow of amount b k to be routed from s to t for each commodity k ∈ K. 1 N is the set of intermediate nodes j each with a demand r j and a capacity u j .  = A ∪ A 0 is the arc set where A = {(i, j) | i, j ∈ N} contains connection arcs and A 0 = {(s, j) | j ∈ N} ∪ {(j, t) | j ∈ N} contains source/sink arcs. A path is defined restrictively as an s-t path from the source to the sink. Let P k be the set of paths associated with commodity k (i.e., the paths can be used to route commodity k) and x p be the flow amount in p ∈ P k . An integer multi-commodity flow model based on path variables can be formulated as: x p ∈ Z + , ∀p ∈ P k , ∀k ∈ K.
In Objective (1), x p gives the flow amount along path p and c p is the cost of using a unit of the flow represented by p. Constraints (2) require the used flows to be no more than the total amount b k for each commodity k. Constraints (3) ask the flow provision (q k being the unit provision contribution of commodity k) for each node to be no less than the demand r j . Constraints (4) limit the used flow resource at j to be no more the capacity u j , where v k is the flow resource consumed by a unit flow of commodity k. Finally Constraints (5) give the variable domain.
The above formulation can have flow demerging/remerging at nodes. If no special control is taken into account, UDR may incur additional costs and result in impractical solutions unacceptable for real-world operations. Let , and x has UDR.
When planning is done by manual practitioners and M/D activities incur costs, the UDR as in Figure 2A is unlikely to appear, as it is simply too clumsy. However, when an optimization model such as (M x ) is used, without special control, UDR is possible to appear. This is because the path flow variables x p (or equivalently, arc flow variables x a if an arc based formulation is used) alone are difficult to measure the exact number of M/D operations. In real-world cases, UDR in optimization models such as (M x ) may raise serious problems on the practicality of the results. For example, Figure 3 shows a scheduling result of how trains are connected at Leeds Station given by a TUSP model similar to (M x ) that does not consider UDR. The left side shows the arrival trains, and the right shows the departure trains. Each link from an arrival train to a departure train corresponds to an arc in A, and the used arcs are marked in yellow. Extensive UDR operations can be observed. Figure 3 also illustrates the five characteristics mentioned in Section 1, where the location l is a train station and the nodes are trips to be covered by train unit vehicles as flows. It can be observed that a bigraph (I l , J l , A l ) is formed by the arrival trains as I l and the departure trains as J l , with single directional arcs from the former to the latter.
To solve the above problem of UDR, we discuss two alternative remedies. The first one is to introduce additional fixed-charge variables; the second one is to use heuristic branching based on the concept of flow potential at the BB stage. Finally, we propose a newly developed framework by warm-starting the fixed-charge model with an initial feasible solution produced from the single variable model with branching heuristics.

FIXED-CHARGE VARIABLE MODEL
This section briefly discusses the integer FCMF model proposed in [22]. We call the arcs in  having no commodity attribute omni-arcs 2 and the arcs in  k , k ∈ K having commodity attribute commodity-arcs. To minimize UDR, based on the original integer multi-commodity flow model (M x ), fixed-charge variables y a ∈ {0, 1}, a ∈  are added to indicate whether an omni-arc a is used (y a = 1) or not (y a = 0). The number of M/D operations can be straightforwardly calculated from the fixed-charge variables, and the number of M/D operations can be minimized as an additional term in the objective.
Let x k a be the flow amount over the commodity-arc with commodity k at arc a, and let x a be the flow amount over omni-arc a, that is, x a = ∑ k∈K x k a . Note that the fixed-charge variables are defined such that y a = 1 if x a > 0 and y a = 0 if x a = 0. Let − (j) and + (j) be the omni-arcs terminated at and originated from of j, respectively. For a node j ∈ N, the number of merging operations can be calculated by # C j = ∑ a∈ − (j) y a − 1, and similarly the number of demerging operations is # D j = ∑ a∈ + (j) y a − 1. Therefore, the total number of M/D operations over the entire DAG is When used as a term in the objective, to avoid possible negative values, it is sufficient to only minimize ∑ a∈A 0 y a + 2 ∑ a∈A y a . Then the integer FCMF formulation as an enhanced version of model (M x ) is given as, x p ∈ Z + , ∀p ∈ P k , ∀k ∈ K; The newly added Constraints (11) are used to force y a to be binary where u a is the largest value the flow amount over arc a can take as implied by x a = ∑ k∈K ∑ p∈P k a x p , where P k a is the set of paths associated with commodity k containing arc a. The term ∑ a∈A 0 y a + 2 ∑ a∈A y a can accurately capture the total number of M/D operations in a feasible solution, making the optimal solution in (M xy ) exact by minimizing both the original operational cost in the first term and the cost due to M/D operations in the second term. By setting z(x) = ∑ k∈K ∑ p∈P k c p x p and by slightly violating the notational convention and setting where y is determined afterwards based on x, model (M xy ) is able to remove UDR operations as in Definition 1.
Nonetheless, the additional variables/constraints with respect to y a in (M xy ) can significantly increase the difficulty in solving the model. For instance, high-quality solutions are difficult to find using variants of (M xy ) proposed in [22] when the number of nodes is over 200.

A HEURISTIC BRANCHING SCHEME FOR REMOVING UNNECESSARY DEMERGING/REMERGING FOR MODEL (M X )
Recall Definition 1 that UDR in a feasible x happens when there exists another feasible In addition, each node is associated with a location, and a bigraph can be formed at each location. We show that arcs with "positive flow potentials" have strong correlations with UDR. Given a feasible solution, by heuristically detecting such arcs with positive flow potentials and by enforcing full utilization where appropriate, UDR operations can be significantly reduced. Verification on whether fully utilizing an arc leads to a reduction of M/D is important and two approaches are thus proposed: (i) Conducting a verification within a subset of nodes of the location bigraph, that is, a "FCG," and (ii) Conducting a verification within all nodes of a location using a location sub-model. Being heuristic though, this process of arc full utilization is embedded into the BB tree of the customized branch-and-price solver designed for models similar to (M x ) to produce high-quality solutions by taking advantage of BB.

Flow potential and total number of M/D operations
The concept of flow potential is proposed in [21]. Let w j be the flow amount on j ∈ N in a feasible solution x. For each intermediate arc a = (i, j) ∈ A, let a = min(w i , w j ) and M a = max(w i , w j ). For each source/sink arc a ∈ (s, j) or a ∈ (j, t), j ∈ N, define a = w j . M a is not needed for source/sink arcs.

Definition 2.
Given a feasible solution x over  = ( , ), the flow potential over arc a ∈  is defined as Figure 4 gives an example of the flow potential over arc (i,j). The flow amount over (i,j) is x ij = 1. The flow amounts on the two endpoints are w i = ij = 2, and w j = M ij = 3. This means that one additional flow over (i,j) could have been exploited, such that the flow potential on(i,j) is ij = ij − x ij = 2 − 1 = 1. The definition of "flow potential," although sharing some similarity, should not be confused with the commonly used "flow residual" as in Augmenting Path Algorithm for the max-flow problem [1]. The two have different definitions and purposes where regarding flow potential, we only focus on unused capacity over used arcs, not all arcs (as unused arcs all have zero potentials), and we only use flow potential for identifying UDR operations, rather than finding a max-flow.
A used arc with a positive flow potential is called underutilized and a used arc with a zero flow potential is called fully utilized. Intuitively, a positive flow potential over an arc indicates that the arc's capacity is not fully exploited, and the flows that cannot use the residual capacity (i.e., the potential a ) may have to use other arcs, implying a possible increase in the number  Table 1, where a potential relationship between UDR and positive flow potential can be observed. Based on Definition 2, we can represent fixed-charge variables in terms of a , x a and a , that is y a = a +x a a . Therefore, by employing (6), the total number of M/D operations, Σ#, can be expressed in terms of flow potentials: See the Appendix for a detailed explanation on how node flows (w j ) are replaced by two endpoint flows a and M a using an illustrative example.
An arc (i,j) in a feasible flow solution is called balanced if w i = w j and unbalanced if w i ≠ w j . From (14), it can be seen that the total number of M/D operations consists of two parts. The first part ∑ a∈A 0 a a + 2 ∑ a∈A a a relates to underutilized arcs and the second part x a relates to unbalanced arcs. If an arc is unbalanced, M/D operations are inevitable. If we assume that w j , ∀ j ∈ N is already the best provision for all nodes, then M/D caused by unbalanced arcs may be regarded as "reasonable." On the other hand, underutilized arcs tend to be more responsible for UDR where better alternatives can be constructed to retain exactly the same flow provision for the relevant nodes. Figure 2 and Table 1 give such an example.
Equation (14) does not guarantee that fully utilizing an arbitrary underutilized arc will definitely reduce the number of M/D operations, since a decrease on the flow amount over one arc may cause an increase on another arc's flow. Therefore, it is important to further explore under which conditions such an action will ensure a decrease on the number of UDR.

Flow connection group
Recall that in our problem, each arc a has an associated location l a ∈ L G , and a bigraph B l can be correspondingly derived at each location l. A flow connection bigraph derived from a used connection arc a = (i, j) ∈ A in solution x (x a > 0), denoted by B(x a ) = (I, J, A IJ ) as a subset of the location bigraph B l a , is a bipartite graph formed by all arcs and nodes "having relevant flows" with x a at the location associated with arc a that can be separated from flows of other nodes and arcs. Additionally, information on node flow amount of the relevant nodes w i , w j , i ∈ I, j ∈ J is also included in a flow connection bigraph. I,J are called sending nodes and receiving nodes, respectively, and Finally from Algorithm 1, it can be verified that for any flow connection bigraph (I, J, A IJ ), Figure 5 gives an example of how two flow connection bigraphs are derived from a flow solution at a location associated with eight nodes i, j, k, l, m, n, p, q, and the arcs among them. Note that since a source arc (s,k) is used (x sk = 1), s should be included into the sending nodes of B(x ik ) with w s = 1.
When there is no s and t contained in an FCG, the above can be further simplified into Figure 6 gives an example of Equation (16). Note that in (15) and (16), only M/D operations within the current location, that is, those yielded by using the arcs in A IJ , are concerned in the above definition. There can be M/D operations prior to I or beyond J in x, but they are not included in calculating #[x(B)]. We call an FCG without s and t as the case in (16) an interior FCG.
In a feasible flow solution x, let w k j be the flow amount achieved by commodity k at j and let w j be the flow amount of all commodities achieved at node j, that is, w j = ∑ k∈K w k j . The flow contribution in satisfying the demand at node j achieved by x is denoted by j ≔ ∑ k∈K q k w k j , where q k is the unit contribution of commodity k. Two flow solutions x and x ′ are said to have equivalent flow contribution on j, denoted by w j ≅ w j ′ , if (i) w j = w j ′ , (ii) j ≥ r j and j ′ ≥ r j , where r j is the flow demand on j as defined in (M x ). A special case of equivalent flow contribution is that all the commodities involved are "interchangeable" within an FCG in providing enough flow contribution (i.e., flow amount multiplied by q k ) while keeping the flow amount per node unchanged. In other words, condition w j ≅ w j ′ can be simplified to w j = w j ′ if commodities are interchangeable. This special case is referred to as "commodity interchangeability condition." If further requirement w k j = (w k j ) ′ , ∀k ∈ K is imposed, then x and x ′ are said to have identical flow contribution on j, denoted by w j ≡ w j ′ . Indeed this is a stronger condition since (14) that arcs with positive flow potentials are a major source of UDR in terms of excessive M/D. If appropriate underutilized arcs (a, a > 0) in a feasible integer solution of (M x ) are identified and modified to be fully utilized (x a = a ), UDR is likely to be reduced. FCGs are useful for the above process, where in each iteration we focus on an individual FCG and fully utilize one of its underutilized arcs while keeping all the other nodes' flow contribution still equivalently provided. The advantage of restricting the scope to an FCG is that it is possible to obtain some sufficient conditions for guaranteeing an improvement. Before going into details of such sufficient conditions, we first show that an improvement cannot be always guaranteed by randomly forcing one arc to be fully utilized within an FCG.
Let us consider a case where the "commodity interchangeability condition" is met, such that it is sufficient to only keep the flow amount over the nodes unchanged. Given an FCG x(B) based on B = (I, J, A IJ ) with a > 0, and let x ′ (B) be a resulting FCG by fully utilizing a, that is, let x ′ a = a . When comparing the two FCGs, we assume that the parts outside B are not affected since the node flow amount w j , ∀ j ∈ I ∪ J are kept unchanged. We have the following observations that a single operation by fully utilizing a may not guarantee:

Sufficient conditions of guaranteed improvements by full utilization within an interior FCG
For denotational convenience, in this section w j can be written as w(j), j ∈ N, and x ij can be written as x(i, j), (i, j) ∈ A. Recall the classic result from the Transportation Problem with m sources and n sinks [8], we have the following Lemma 1. Given an underutilized balanced arc (u, v) ∈ A IJ in an FCG x(B), B = (I, J, A IJ ), let I uv ≔ {i ∈ I\{u}| x iv > 0}, J uv ≔ {j ∈ J\{v}| x uj > 0} and A uv ≔ {(i, j) ∈ A IJ \{(u, v)} | i ∈ I uv , j ∈ J uv }. Letw i ≔ x , ∀i ∈ I andw j ≔ x , ∀j ∈ J . The resulting bipartite subgraph B uv ≔ (I uv , J uv , A uv ) with the adjusted node demandw i ,w j is called the potential graph of x(B) derived from (u,v). Figure 8 gives an example where B uv (middle) is the potential graph of x(B) derived from (u,v) (left). 3 To make x ′ (B) feasible without affecting the flow contributions on the nodes outside B, source arc (s, k) and sink arc (j, t) can be used as depicted, which is generally undesirable as they indicate an increase in the number of used commodity flows by one.

Fully utilizing balanced arcs
The following Theorem 1 is proposed in [21] as a sufficient condition for commodities that are interchangeable and for an underutilized arc that is balanced. (u,v). Assume that the "commodity interchangeability condition" is satisfied. There exists at least one x ′ (B) by fully utilizing (u,v) such that x ′ (B) is feasible,

Theorem 1. Let x(B) be an interior FCG with a balanced underutilized arc
, if the potential graph B uv = (I uv , J uv , A uv ) is complete.  When an interior FCG does not satisfy the commodity interchangeability condition, which is more often seen in real practices, Theorem 1 cannot be applied. Here we give a sufficient condition that can be used in the FCGs where (i) commodities are not necessarily interchangeable, (ii) the component of each node strictly remains the same in the new FCG. An underutilized arc (i,j) that is not only balanced (w i = w j ) but also has exactly the same component on its two end points (i.e., w i ≡ w j ) is called strongly balanced. Suppose we want to fully utilize a strongly balanced arc within an FCG while strictly retaining the component of each node. Let C max denote the maximum flow amount allowed at nodes (which is implicitly included in Constraints (4)). We have the following conclusions. 2. When C max ≥ 5, the above property (1) no longer holds.
Proof. Similar reasoning as in the proof for Theorem 1 [21] can be used, such that an improvement can be guaranteed if there exists an arc assignment y ′ a , a ∈ A where the number of used omni-arcs ∑ a∈A y ′ a in the potential graph is less than the number of removed omni-arcs as a result of separating (u,v) from the rest parts of x(B). This is equivalent to ∑ a∈A y ′ a <| I | + | J |. LetW = ∑ i∈Iw i = ∑ j∈Jw j be the total number of flows in I uv (or J uv ). Note that 1 ≤| I |=| J |≤W, x ≥ 1, and it is sufficient to always assume w u = w v = x +W = C max .
1. The first conclusion can be verified by enumerating all possible scenarios. Note that since w u ≡ w v and the potential graph B uv is complete, for any flow in I uv with its own commodity type, it can always be matched to at least one unit in J uv .
i. When C max = 2,W = 1, | I |=| J |= 1, ∑ a∈A y ′ a = 1 is the only possibility, which satisfies ii. When C max = 3, it only needs to considerW = 2 (W = 1 is similar to (i)), such that 1 ≤ | I uv | = | J uv | ≤ 2. When at least one of |I uv | or |J uv | is 1 (without loss of generality, assume |I uv | = 1), the only possible flow assignment on y is to use exactly |J uv | omni-arcs, such that ∑ a∈A y ′ a =| I | + | J | −1. The last option is |I uv | = | J uv | = 2, giving ∑ a∈A y ′ a = 2. iii. When C max = 4, it only needs to considerW = 3, such that 1 ≤ | I uv | = | J uv | ≤ 3. When at least one of |I uv | or |J uv | is 1, similarly we have ∑ a∈A y ′ a =| I | + | J | −1. When at least one of |I uv | or |J uv | is 3, then we have ∑ a∈A y ′ a = 3 such that ∑ a∈A y ′ a <| I | + | J |. The last option is |I uv | = | J uv | = 2. In this case the number of used omni-arcs must satisfy ∑ a∈A y ′ a ≤ 3 since it is bounded by the fact thatW = 3. Therefore, for all possibilities when C max ∈ {2,3,4}, 2. For C max ≥ 5, we prove that a counter example giving no improvement can always be found. Consider the special case of an x(B) where u and v are consisted of C max different non-interchangeable commodities k 1 , k 2 , … , k C max , each having a flow amount of one. Underutilized arc (u,v) is filled with one flow amount of commodity k 1 where uv = C max − 1. Therefore in B uv , both I uv and J uv have a total flow amount ofW = C max − 1, made up of commodities from k 2 , k 3 , … , k C max each occurring once and only once in I uv and J uv , respectively. To find a feasible flow in B uv constructed as above, since all commodities are non-interchangeable, the only way to allocate one flow of commodity k r , r = 2, …, C max from I uv is to match it to the corresponding flow of commodity k r in J uv .
Additionally we require that |I uv | = | J uv | = 2 where I = {i 1 , i 2 }, J uv = {j 1 , j 2 } in a way that , that is, the flow amounts in w i 1 and w i 2 (w j 1 and w j 2 ) are as close as possible. SinceW ≥ 4, ⌊W 2 ⌋ ≥ 2, meaning any node in B uv should have a flow amount of at least two. Figure 9 gives an example of such an FCG.
Since |I uv | + | J uv | = 4, only a flow assignment with ∑ a∈A y ′ a < 4 can guarantee an improvement. However this is only possible if w i 1 ≡ w j 1 (which also implies w i 2 ≡ w j 2 ), where ∑ a∈A y ′ a = 2 can be achieved as shown on the left of Figure 10. As long as the condition w i 1 ≡ w j 1 (w i 2 ≡ w j 2 ) is not satisfied, since each node contains at least two unit flows ( ≥ 2), all the four arcs in A IJ will have to be used since any node needs to be connected with both the two nodes in the opposite, giving ∑ a∈A y ′ a =| I | + | J |= 4. Such an example is given on the right of Figure 10, where k 2 and k C max are swapped between j 1 and j 2 compared with the potential graph on the left. Therefore, when C max ≥ 5, the property as in (1) for C max ∈ {2,3,4} will no longer hold. ▪ In the TUSP, C max is the maximum number of units that can be coupled at a trip. As far as the authors are aware, C max = 3 is set for all the train operators we have collaborated with so far. Therefore, for real-world TUSPs, it generally only needs to consider Property (1) in Theorem 2 when unit types are not interchangeable within an interior FCG.

Fully utilizing unbalanced arcs
Both Theorems 1 and 2 only consider cases where the focused arc is balanced. Next we further extend the conclusions from the two theorems into general cases such that unbalanced arcs can also be considered appropriately. First let us observe an example as illustrated in Figure 11. In this FCG made with two sending nodes u,v and four receiving nodes v 1 , v 2 , j 1 , j 2 , all arcs involved are unbalanced, and in the original solution on the left, arc (u, v 1 ) is underutilized with a potential of two. If we fully utilize (u, v 1 ) and keep x(u, v 2 ) = 1 unchanged, we will get a new FCG given on the right where the M/D number is reduced from four to two.
In fact, we can generalize the results in Theorem 1 into the following Theorem 3, which provides a useful way to detect such cases as in Figure 11 where fully utilizing an unbalanced arc will give an improvement. To formally state this theorem, we need to introduce the definition of a balanced underutilized arc set solution. Let x(B) be an interior FCG based on B = (I,

J, A IJ ). Suppose there is an arc set
, and is called underutilized if ∑ a ∈ U × V x a < UV , that is, at least one arc in U × V is underutilized (which can be either balanced or unbalanced). A flow potential for the arc set U × V can also be defined accordingly as = We can also define the potential graph of x(B) derived from x(U × V). Let I ≔ {i ∈ I\{U} | x q > 0, q = 1, … , n}, J ≔ {j ∈ J\{V} | x u p j > 0, p = 1, … , m} and A UV ≔ {(i, j) ∈ A IJ \{U × V} | i ∈ I UV , j ∈ J UV }. Letw i ≔ x q , ∀i ∈ I andw j ≔ x u p j , ∀j ∈ J . The resulting bipartite subgraph B UV ≔ (I UV , J UV , A UV ) with the adjusted node demandw i ,w j is called the potential graph of x(B) derived from U × V. For example, B {u}{v 1 ,v 2 } = ({i}, {j 1 , j 2 }, {(i, j 1 ), (i, j 2 )}) is the potential graph of the FCG on the left of Figure 11 derived from the balanced underutilized arc set solution x({u} × {v 1 , v 2 }), with adjusted node demandsw i = 2,w j 1 =w j 2 = 1.

Theorem 3. Let x(B) be an interior FCG with a balanced underutilized arc set solution x(U × V). Assume that the "commodity interchangeability condition" is satisfied. There exists at least one x ′ (B) by fully utilizing all underutilized arcs in
is complete.

Proof.
The proof is similar to the proof for Theorem 1 in [21], by noticing that the number of removed omni-arcs, |I UV | + | J UV |, is always greater than the possibly largest number of newly used arcs in B UV = (I UV , J UV , A UV ), which is |I UV | + | J UV | − 1, since the potential graph is complete and the result given by Lemma 1. ▪

Remark .
Note that if an underutilized arc in x(U × V) is balanced, then the arc itself can make an independent balanced underutilized arc set and can be excluded from x(U × V). Therefore, eventually we can exclude all balanced underutilized arcs and only keep the unbalanced arcs in a balanced underutilized arc set.
An example of the application of Theorem 3 has been already given in Figure 11, where we have a balanced arc set solution that is complete, and therefore an improvement can be made by fully utilizing arc (u, v 1 ).

FIGURE 12 Local model example
We will not present a theorem analogous to Theorem 2 for general cases (including unbalanced underutilized arcs) with non-interchangeable unit types, since the possible scenarios will be rather problem-specific and it is more appropriate to remove UDR by the location pure fixed-charge model proposed in Section 5.2.3, which takes a holistic view over the location.

A pure fixed-charge model for checking arc full utilization at a location
The sufficient conditions given in Theorems 1 to 3 can only guarantee an improvement within an interior FCG itself as a result of fully utilizing an underutilized arc. Nonetheless, from the real-world instances of the TUSP tested by us, cases corresponding to the above conditions are responsible for a large proportion of UDR. Despite of this, Theorems 1 to 3 have certain limitations in dealing with other scenarios. First, they are all sufficient conditions only for certain scenarios. Second, they can only handle interior FCGs, that is, without the source, sink and any source/sink arcs. Finally, even for interior FCGs, there are still cases where these conditions are not satisfied within an individual FCG, but a fewer number of M/D operations can still be achieved if more than one FCGs are considered at the same time.
Such an example is given in Figure 12. Consider the case given in Figure 7, where no resolution can be found for x(B). If another FCG x(B ′ ), B ′ = ({p}, {q}, {(p, q)}) within the same location is available (see Figure 12A), then by fully utilizing (i, l), a reduction on M/D number can still be achieved as illustrated in Figure 12B. The above example indicates that even fully utilizing an arc within an FCG that is not guaranteed by Theorems 1 to 3, by incorporating other FCGs, an improvement is still possible. However due to the "online" nature of the problem, it is unknown beforehand, among all the nodes at the same location, which other FCGs should be considered together. The remedy is to directly use a multi-commodity variant of the pure fixed-charge transportation problem based on an underutilized arc e and the location l(e) associated with e, denoted by (MS e ), and solved directly by a generic integer solver embedded into the BB process. It takes account of all nodes associated with l(e) to determine whether an improvement can be made by fully utilizing arc e.
Before we go to the details of this model based on individual locations, we first briefly give a theoretical result on the computational complexity of both detecting the existence of improvement and minimizing the M/D without changing the flow amount and node components in an interior FCG, which justifies the use of (MS e ), as no "efficient" method exists.
It is worth mentioning the results given by [26] about the constant pure fixed-charge transportation problem, where based on a standard transportation problem [8] with m supplies and n demands, fixed-charge variables y ij , i = 1, …, m, j = 1, …, n are also introduced to indicate whether an arc is used (= 1) or not (= 0). In the objective, only the fixed-charge variables with the same cost are considered (e.g., min ∑ n j=1 f ⋅ y ). The authors have proved that the decision problem of the existence of degeneracy (i.e., if a fewer number of arcs than the size of standard basis, m + n − 1, take positive values in the solution of the transportation problem) is NP-complete, and further have shown that the pure fixed-charge transportation problem in minimizing the number of used arcs is NP-hard.
The problem of minimizing the number of used arcs in an interior FCG, as what Theorems 1 to 3 try to achieve, is not strictly a pure constant fixed-charge transportation problem, since certain arcs are forbidden to be used (not all trips can be connected in the DAG ). Little knowledge about the transportation problems with forbidden arcs is known, even the necessary and sufficient condition for the existence of a feasible solution [9,27]. Therefore, we need to regard the problem within an interior FCG as a generic fixed-charge network flow problem over an incomplete bipartite graph. The objective in this case is concave (with only the fixed-charge variables) and according to the result from [13], this problem with a single commodity is already NP-hard, mainly due to the concavity of the objective, and thus the NP-hardness of our problem, which is with multiple commodities. Therefore, no "easy" approach exists for finding the minimum number of used arcs in an interior FCG, and here we choose to use a generic integer linear program model (MS e ) over the entire location to deal with the more difficult cases where Theorems 1 to 3 cannot be applied to.
Let ,  be the set of origin and destination nodes on the bigraph implied by the location with s and t also included in  and  , respectively.
denotes the involved arcs. Two alternatives can be considered when designing model (MS e ): whether to equivalently keep the flow contributions (i.e., w j ≅ w ′ j , ∀j ∈  ∪  ) or to identically keep the them (i.e., w j ≡ w ′ j , ∀j ∈  ∪  ). We choose the latter to ensure the nodes' compositions are retained as before, such that theoretically it is possible to keep the flow distribution in other parts of  unaffected. Model (MS e ) is given in the following: x k a ∈ Z + , ∀a ∈ A  , ∀k ∈ K; (23)  (23) and (24) give the domains on the decision variables, where x a gives the flow amount of commodity k over arc a and y a is the omni-arc variable over arc a. If it is shown by (MS e ) that fully utilizing e within l(e) will give a new feasible station solution with a fewer number of M/D operations (i.e., #[l(e)] < #[l(e)]), then this full utilization is regarded as successful and will be approved. Otherwise, fully utilizing e will be disapproved by (MS e ).
Using model (MS e ) to verify arc full utilization might be less efficient than Theorems 1 to 3. Nonetheless, it verifies the utilization over an arc exactly within the entire location. To reach a trade-off between efficiency and accuracy, model (MS e ) will only be called if the three flow potential graph based methods cannot guarantee an improvement.

AFUB for (M x )
With the above two alternatives based on potential graphs and the local model for verifying arc full utilization, we propose a BB scheme to heuristically remove UDR operations, namely AFUB. It is a significant enhancement over the heuristic branching method in [21]. AFUB is designed for model (M x ) which does not have fixed-charge variables. Just like fractional arcs are to be converted into integral ones in a standard BB process, here in AFUB, underutilized arcs are regarded as undesirable and will be branched to be fully utilized when there is evidence showing an improvement by doing so. An option is set on whether it is allowed to branch on "non-guaranteed" underutilized arcs, that is, arcs that have been rejected (if attempted) or have not been verified (if not attempted) by Theorems 1 to 3 or model (MS e ). Suppose an underutilized arc a is found and the strategy allows to further fully utilize it, then two branches will be formed under AFUB: • On the left branch, a will be fully utilized.
• The right is a "back-up" branch, retaining everything as in the parent but with a labeled as "failed" one more time.
For the right branch, an arc can be "failed" for at most F max > 0 times, and after that it will no longer be considered for branching even if it is underutilized. This option is set to limit the BB tree to a reasonable size, and to also avoid the search falling into a repeated pattern or a small local neighborhood. The left full utilization branch always has a higher priority to be visited than the right back-up branch, regardless of the specific node selection strategies used (e.g., best-first, depth-first or hybridized).
AFUB will only be activated in the BB process after all the variables are integral. A solution will not be regarded as feasible, or "operable," until all variables are integral and all underutilized arcs are either fully utilized or failed for more than F max times. Normal termination criteria for BB are still used, such as the global optimality gap, the maximum number of tree nodes and maximum time.
The final goal of AFUB is not to find a solution with the theoretically minimum number of M/D, which often leads to a poor quality solution in the ∑ c p x p part (An extreme case would be to serve each node j with flow(s) over the path made by s → j → t, such that overall no M/D exists at all in the graph.). To prevent the objective value from being excessively compromised by AFUB, another upper bound is used to make the optimal solution consistent with Definition 1. Letẑ int = ∑ c pxp be the incumbent best integral (but not necessarily operable) objective value in terms of flow variables, and let z n a be the objective value as a result of branching underutilized a at BB tree node n. Besides the standard cut off by bound criterion based on the global upper boundẑ opr from the incumbent best operable solution, if an extra quality control condition z n a >ẑ int + Q is met, tree node n will also be cut off, that is, the upper bound to cut off a node by bound is set as min{ẑ opr ,ẑ int + Q }. Note that this additional upper bound only applies at the AFUB stage. This extra level of quality control is important for ensuring the balance between z(x) = ∑ c p x p , the original objective and (x) = # (x), the number of M/D operations, as controlled by Q in Definition 1. Unlike a standard BB, AFUB cannot guarantee the exactness in finding an optimal solution as a balance between z(x) and (x) with respect to Q . However, the BB tree's structure will provide wide search space, in contrast to a heuristic method that tries to fixed up underutilized arcs in a uni-directional or local search manner.
Similar to selecting an appropriate fractional variable to branch on in standard BB, for AFUB, it is important to decide which underutilized arc to branch on among a large number of candidates. The following lists two verification methods based on the Theorems 1 to 3, model (MS e ), and a third option to allow to branch on "non-guaranteed" arcs: i. PG: Find guaranteed arcs by complete potential graphs as in Theorems 1 to 3. ii. MS: Find guaranteed arcs by the local model (MS e ). iii. NG: Branch on randomly selected underutilized arcs without guaranteed feasibility or improvement.
When designing a branching variable selection strategy for the AFUB, it is possible to choose a combination of the methods among "PG," "MS," and "NG." When more than one of them are in use, the following priority order is enforced: PG > MS > NG, that is, first the potential graph will be used, and if it fails, location sub-model will be called, and if it fails too, the arc will be used without any guaranteeing. In fact, this priority order gradually widens the scope to be concerned from a small FCG (PG), to a further widened scope, location (MS), and finally to the entire DAG (NG). The last criterion may be occasionally needed in conjunction with (PG) and/or (MS), since a locally unproven underutilized arc may still make a global improvement over the DAG. Algorithm 2 summarizes the major steps of AFUB. Branching on a (strongly) balanced underutilized arc (i,j) does not require adding an explicit constraint such as x ij = ij . It can be realized in a more efficient way by removing all arcs in + (i) ∪ − (j)\{(i, j)}, leaving (i,j) the only choice for sending flows from i and for receiving flows to (j). Similarly, branching on an unbalanced arc (i, j), w i > w j = ij can be realized by removing all arcs in − (j)\{(i, j)}.

A warm-start based solution approach for model (M xy )
A warm-start based approach is proposed for solving the integer FCMF model (M xy ), which is known to be difficult to solve for medium to large scaled instances due to its fixed-charge/concavity nature [13], and most literature uses heuristics instead, for example, [6,7,15]. In our problem, since the fixed-charge term in the objective has its unique form solely used for minimizing the number of M/D operations, and we have designed the AFUB to heuristically reduce the M/D number in the "partial" model (M x ), this local knowledge of our problem is thus used for a two-step warm-start framework introduced here to speed up the process for exactly solving the "full" model (M xy ) to satisfactory optimality.
The principle of this approach is to first solve model (M x ) with AFUB heuristics, obtain its (sub-)optimal solution x † and (sub-)optimal objective z † . Next, solve model (M xy ) with x † as an initial feasible solution to get the final optimal solution (x * , y * ) of model (M xy ). A global upper bound z † + (x † ) on the optimal objective value of (M xy ) will also be derived from x † to be used at the beginning for the BB process in (M xy ). Note that the quality control tolerance Q should be appropriately set in the AFUB to be consistent with W, the weights of M/D operations in (M xy ). Figure 13 illustrates the basic ideas of this warm-start scheme.
In Section 6, experiment results of this warm-starting approach will be discussed following those for the application of AFUB. The contribution of the AFUB method is in particular demonstrated by this warm-start process. As shown by the computational experiments, instances without warm-start were often unable to be solved by the full model (M xy ). On the other hand, with the aid of an initial solution provided by (M x ) with AFUB, these instances were successfully solved by (M xy ) with high-quality solutions.

COMPUTATIONAL EXPERIMENTS ON TUSP
We use the branch-and-price approach in [20] to solve an enhanced version of model (M x ). Note that (M x ) is a much simplified version used to ease the interpretation of the TUSP. In real-world instances, the requirements are more complicated and instead of using (M x ), the model proposed in [20] (denoted by (M ′ x )) is used in our experiments, where Constraints 3 and 4 are replaced by the train convex hull constraints to strengthen LP relaxation bounds and realize the combination-specific coupling upper bounds [20,23]. Moreover, to follow the requirements on train-family compatibility and locations banned for C/D operations, train-family branching and banned location branching are also used besides the conventional branching on variables. Apart from the above three kinds of branching, AFUB is also included. Similarly, the actual model used for the integer FCMF formulation   ), except that the newly added fixed-charge variables y are also branched after all the flow variables x are integral.
We will compare the experiment results with our previous solver (i.e., the one without AFUB in [20]) and the schedules produced by practitioners. As far as we know, there is no other equivalent existing research for making comparisons.
The experiments were conducted on a 64 bit Xpress-MP suite (latest version 8.0) on a Dell workstation with 8G RAM and an Intel Xeon E31225 CPU. The solvers used are two upgraded versions of the branch-and-price solver used in [20], one for solving (M ′ x ) and the other for (M ′ ). In both, only the simplex solver from Xpress-MP were applied to solve LP relaxations during the column generation process without employing the default integer programming solver from Xpress-MP itself.

Applying AFUB on model (M ′ x )
To test how effective the proposed AFUB scheme is for solving model (M ′ x ), we conducted experiments based on four real-world instances of the rail network of East Anglia, TransPennine Express (TPE) and Northern. The details of the four instances are summarized in Table 2. The fleet sizes refer to the number of units used in the manual schedules if available. GE-360 and Northern-156 only contain a single type of unit, and TPE and TPE2 contain multiple unit types. Except TPE2, all the other instances are from previous timetables where the corresponding manual unit schedules are available for comparison. TPE2 on the other hand, is derived from a new timetable without any manual schedule produced. Therefore the fleet size information is not available for TPE2.
The parameter settings for model (M ′ x ) are as follows. The minimum and maximum connection time durations were set to be 4 and 1000 minutes, respectively, plus some manually determined special turnaround rules at certain time and locations. To make fair and clear comparisons with manual solutions focusing on the number of C/D operations, only the fleet size was minimized and therefore Q = 0 (most strict) was set. Also for making fair comparisons, we strictly use the manually scheduled train capacities as the input demand. Note that this restricts the solver from getting better solution qualities such as a fewer number of units used. Only the existing empty-running movements in the manual schedules were allowed. The maximum number of column iterations was set to be 2000 and an iteration will be prematurely stopped if the relative gap between the upper and lower bound is less than 0.1%, and the lower bound will be used as the tree node's LP relaxation value. A maximum total number of 1000 BB tree nodes and 12 hours of running time were set. The branch-and-price was set to terminate as soon as the relative gap between the BB tree's global lower bound and global upper bound is less 0.1%. The maximum failure count for fully utilizing an arc was set as F max = 30. Different strategies were considered (see Section 5.3) when implementing AFUB. For instance, one can only use potential graph (PG), or use both PG and the local model (MS). When non-guaranteed arcs are allowed to be branched (NG), they are selected randomly. Finally we also would like to compare the results between using PG and/or MS and not using them at all by only branching on arbitrary underutilized arcs. Since unnecessary C/D can rarely happen in manually constructed schedules, we use the C/D counts in manual schedules as the benchmark for accessing the quality given by the AFUB in the experiments. Table 3 gives the results on the four instances with different strategies indicated by Yes (N) or No (N) regarding PG, MS, and NG. When non-guaranteed arcs are allowed to be branched, because of the randomness in selecting underutilized arcs, the average values over 100 runs are reported. The column "rule%" gives the percentage of other branching rules (i.e., train-family, banned location and fractional-to-integral) and AFUB in the BB process. "BB#" gives the number of nodes in the BB tree. "cd#" gives the number of C/D operations in the solution, followed by its SD if the average value is taken using NG.
The GE-360 instance is a small instance with 137 trips and only a single unit type Class 360. From Table 3, it can be seen that the number of C/D operations is significantly improved from 50 to 28 even if only the potential graph (PG) is used. If PG and MS are both used, this number can be improved to 24, the same as in the manual solution. Moreover, allowing branching  This implies the usefulness of the verification methods PG and MS such that they are able to cover most underutilized arcs that can guarantee an improvement. Similar phenomena also happened in the experiments for the other instances. If PG and MS are totally abandoned and AFUB only selects non-guaranteed underutilized arcs randomly, the quality on the number of C/D will decline significantly, not to mention the time consumed and the number of BB nodes will be larger by several magnitudes, as shown in Test 5. Figure 14 further illustrates the C/D number and computational time from different strategies, where the advantage of applying AFUB can be observed. The Northern-156 instance has 385 trips and a single unit type Class 156. Figure 15 illustrates the C/D number and computational time from different strategies. Without any AFUB, the number of C/D operations is large (65), and by applying PG or PG + MS, this number is reduced to 35. The earlier Figure 3 is taken from the solution in Test 1 without any AFUB applied. As a comparison, Figure 16 gives the improved results at Leeds Station given by Test 2 (PG only). One can see that all the unnecessary C/D operations in Figure 3 have been removed. Unfortunately, the heuristic AFUB still cannot obtain a very close cd# to the manual schedule's 23. After comparing the schedule in Test 2 and the manual solution, it is found that if the capacity provisions have to be kept as in Test 2, there will be almost no scope to further reduce the C/D#. However, the manual solution achieves a better one due to different capacity provisions from the ones in Test 2, where a small number of unbalanced arcs in Test 2 are replaced by balanced arcs in the manual solution. From Equation (14), flows over unbalanced arcs are the other reason for increasing C/D operations besides underutilized arcs. AFUB, which only focuses on underutilized arcs, is thus difficult to make improvements in the case of unbalanced arcs. However it remains an open question whether a similar scheme as the AFUB is still workable for unbalanced arcs. A more detailed discussion over this issue is given in Section 7.
In instances TPE and TPE2, multiple unit types are involved. Similar patterns in different strategies as in instances GE-360 and Northern-156 can still be found. In instance TPE, the number of C/D can be reduced from 47 to 36, very close to 34, as in the manual schedule. In instance TPE2, the number of C/D can be reduced from 31 to 18 by prioritizing verification rules and to 17 on average if using randomly selected underutilized arcs. Since this is a new timetable, no manual schedules are available for comparison. Figures 17 and 18 further illustrates the above results.

Warm-start for model (M ′ )
Experiments were conducted based on the same instances of GE-360, Northern-156, TPE and TPE2. The branching strategy (e.g., variable selection and node selection) for model (M ′ ) were all inherited from the branch-and-price solver for model (M ′ x ) including the AFUB with GP + MS + NG(F max = 3), and additionally block-arc variables y are also branched when all the other branching targets are satisfied. Because of the difficulty in solving model (M ′ ), we were only interested in finding near-optimal operable solutions in reasonable time. Thus, besides the same stopping criteria of a maximum of 1000 BB tree nodes and 12 hours running time, only a maximum of 800 column generation iterations per tree node were set and the branch-and-price process would also terminate as soon as two operable solutions were found. The weight on C/D operation costs was set as W = 0.01. All the other settings were kept the same as in Section 6.1. x † , the optimal solution from model (M ′ x ) used for warm-starting (M ′ ), was derived by the AFUB strategy of GP + MS. Table 4 summarizes the results from running instances GE-360 (GE), Northen-156 (North), TPE and TPE2 within each a comparison between two runs with and without warm-start (WS) is made. Information on the manual solutions or best solutions from (M x ) (if no manual solution is available) is also provided. In "r%," the percentage of variable branching on block-arc variables is reported as "block." The percentage of solution relative gaps (gap%) and the percentage of the warm-start arcs (i.e., the ones in x † ) found in the final solution (WS%) are also reported.
As for GE-360, due to the exactness of model (M xy ), solutions with fewer numbers of C/D operations and the same fleet size compared with the manual schedule can be obtained, and the one warm-started by x † from using AFUB in (M ′ x ) yields an even fewer number of C/D and a smaller gap percentage while taking only 23.4% of computational time for the one without warm-start. A high percentage of 92.9% of the arcs in x † were found in the final solution given by (M ′ ), which made an important contribution to the success of warm-start for this run. Figure 19 further illustrates the C/D number and computational time from the two runs.
The instance Northern-156 is a difficult one. Compared with 35, the best C/D number that can be yielded by model (M ′ x ), model (M ′ ) can reduce it to 31 with warm-start while retaining the same fleet size, which is still not close enough to the manual On the other hand, the run without warm-start failed within 1000 BB nodes without getting any feasible solution.
As for the results from running instance TPE with three types of units, compared with 34, the manual C/D number and 36, the best C/D number given by model (M ′ x ), model (M ′ ) can further reduce the number to 28 with warm-start while retaining the same fleet size. 83.41% of the arcs in x † were found in the final solution given by (M ′ ). The other run without warm-start failed within 1000 BB nodes without any feasible solution.
Finally as for the results from running instance TPE2, compared with 18, the best C/D number found by model (M x ), model (M xy ) can reduce this number to 8 with warm-start and to 13 without warm-start while retaining the same fleet size. The solution without warm-start gives a gap that is almost 10 times larger than the solution with warm-start. 77.7% of the arcs in x † were found in the final solution given by (M ′ ). Figure 20 further illustrates the C/D number and computational time from different strategies.

CONCLUSIONS AND FUTURE RESEARCH
Unnecessary flow demerging and remerging appears in some network flow problems where additional controls are needed and are often achieved by extra fixed-charge binary variables, which may affect the computational efficiency. In order to better deal with this problem, we have upgraded and significantly enhanced a heuristic branching scheme named AFUB proposed in [21].
The AFUB scheme is based on the observation that the number of M/D operations is closely related to the number of underutilized arcs, and by heuristically detecting and removing such arcs, there is a potential to reduce UDR. It is important to verify whether full utilization on an arc can guarantee an improvement, since applying AFUB to randomly selected underutilized arcs is ineffective and time consuming. Two methods for verifying the validity of fully utilizing a chosen arc are proposed. The first one (PG) is based on the idea of potential graphs and is suitable for an FCG where either the focused arc is balanced and commodities are interchangeable, or the focused arc is strongly balanced and the maximum flow amount for the nodes is up to 4. The use of potential graphs have been also extended to unbalanced cases if commodities are interchangeable. The second one (MS) is based on an integer programming model that tries to find a holistic improvement over the trips of a single location. Note that (MS) is not limited by the arc balance or commodity interchangeability conditions as for (PG).
An example of UDR in practice is unnecessary C/D operations from the TUSP. Experiments based on real-world instances from several UK train operating companies were conducted using model (M x ) plus AFUB. The C/D counts from manual schedules were used as benchmarks for evaluating the quality of AFUB. We found that PG and MS are effective in removing unnecessary C/D compared with the runs without any AFUB or the runs where AFUB was only arbitrarily applied to non-guaranteed underutilized arcs (NG). For some instances, AFUB with PG and MS can find solutions with the same or very close C/D count as in the manual solution (with the same number of units used). Only one instance failed in getting a C/D count close to the manual solutions, although there was still a great improvement compared with the run without AFUB. Experiments for warm-starting the integer FCMF model (M ′ ) by an optimal solution from the model (M ′ x ) plus AFUB were also carried out, and the warm-start scheme is proved to be vitally important in deriving high-quality solutions in model (M xy ) in much shorter time. In conclusion, AUFB can give a step-change in efficiently obtaining train unit schedules with most unnecessary C/D removed either by being solely used in model (M x ) or to assist the solution process in model (M xy ) by warm-starting.
As for future research, the focus will be on unbalanced arcs in relation to the number of C/D operations. As shown by Equation (14), the total number of C/D operations can be decomposed into one part due to underutilized arcs and the other part due to unbalanced arcs. Changing unbalanced arcs into balanced ones might also reduce the C/D count, while our research has not considered this aspect yet. As shown in the experiments on Northern-156, there is still scope for further improvement by changing unbalanced arcs into balanced ones (without violating the mandatory constraints such as passenger demands or coupling upper bounds). Nonetheless, the proportion of unbalanced arcs over all arcs is often higher by magnitudes than the proportion of underutilized arcs, and most of them are solved to correct flows to provide satisfactory capacities. This makes the design of a corresponding algorithm more difficult than AFUB. There seems to be no obvious way, such as the potential graph or model (MS e ) as in the case of underutilized arcs, to verify the validity of balancing an unbalanced arc before actually enforcing it in the next LP relaxation. Moreover, forcing an unbalanced arc into a balanced one will have a much greater effect over the entire network. It is also very likely to be conflicting with other operating costs such as mileage. Therefore, it remains an open question whether it is needed to design a heuristic approach for changing some unbalanced arcs into balanced ones.
Another important direction is to further investigate the warm-start strategy as presented in Section 5.4 in conjunction with other techniques to speed up the solution process for model (M xy ). The experiments in Section 6.2 show that the effectiveness of AFUB, but currently only small to medium instances were tested. The future research in this aspect will look for more effective branching strategies combining both the x and y variables, valid inequalities and a hybridized heuristic framework SLIM [18] designed for large scale TUSP instances.  (14). For every incoming and outgoing arcs associated with node j ∈ N, that is, a ∈ ±(j), the arc is either a source/sink arc (a ∈ A 0 : w j = a ), or a balanced connection arc (a ∈ A : w j = a = M a ), or an unbalanced connection arc (a ∈ A : w j = a ≠ M a or w j = M a ≠ a ). Therefore, we can rewrite ∑ a∈ ± (j) x a w j to, ∑ a∈ ± (j) x a w j = ∑ a∈ ± (j)∩A 0 When summing up the terms as in (A1) for all the intermediate nodes j ∈ N, note that each source/sink arc will only be counted once, while for each connection arc, it will be counted twice. We further have, For instance, after summing up the five summation terms as given in Figure A1, we have ∑ h∈{i,j,k,l,m} ∑ a∈ ± (h) By rearranging the terms in Equation (A2) and continue the calculation process, it will finally lead to the result given in (14).