Inner entanglements: Narrowing the search in classical planning by problem reformulation

In the field of automated planning, the central research focus is on domain‐independent planning engines that accept planning tasks (domain models and problem descriptions) in a description language, such as the Planning Domain Definition Language, and return solution plans. The performance of planning engines can be improved by gathering additional knowledge about specific planning domain models/tasks (such as control rules) that can narrow the search for a solution plan. Such knowledge is often learned from training plans and solutions of simple tasks. Using techniques to reformulate the given planning task to incorporate additional knowledge, while keeping to the same input language, makes it possible to exploit off‐the‐shelf planning engines. In this paper, we present inner entanglements, relations between pairs of operators and predicates that represent the exclusivity of predicate achievement or requirement between the given operators. Inner entanglements can be encoded into a planner's input language by transforming the original planning task; hence, planning engines can exploit them. The contribution of this paper is to provide an in‐depth analysis and evaluation of inner entanglements, covering theoretical aspects such as complexity results, and an extensive empirical study using International Planning Competition benchmarks and state‐of‐the‐art planning engines.


INTRODUCTION
Automated planning is an important research area of artificial intelligence (AI), where an autonomous entity (eg, a robot) reasons about the way it can act in order to achieve its goals. AI planning therefore has great potential for applications where a certain level of autonomy is required, such as in the Deep Space 1 mission. 1 Classical planning is a subarea of AI planning that deals with a static and fully observable environment and where actions have deterministic and instantaneous effects. Classical planning is, however, intractable (PSPACE-complete). 2 In the last few decades, there has been a great deal of activity in the research community designing planning techniques and planning engines. In 1998, the International Planning Competition (IPC) * was organized and has since increasingly attracted the attention of the AI planning community. Due to the IPC, we have the Planning Domain Definition Language (PDDL), 3 which is a widely used language for describing planning tasks, and a wide range of benchmarks that can be used for measuring planners' performance. Currently, PDDL is supported by a large number of advanced planning engines. Along with those planning engines, many novel planning techniques have been proposed, such as heuristic search 4 and translating planning tasks into SAT, 5 to mention just a few.
The performance of planning engines can be improved by restricting the search space, ie, by introducing pruning techniques that "cut off" branches that are unnecessary or redundant. Commutativity pruning eliminates all but one permutation of commutative actions (actions that can be applied in any order). 6 Symmetry breaking reuses information about one object for its symmetric "twin" in such a way that "bad" states of one object can be avoided for its symmetric "twin." 7 Reachability analysis can determine whether the goal is unreachable from a current state. 4 Another way in which the performance of planning engines can be improved is by gathering domain control knowledge (DCK), ie, additional knowledge about planning tasks indicating what solution plans look like. DCK can be expressed, for instance, in the form of control rules, 8 temporal logic formulas, 9 or decision trees. 10 With growing interest in extracting DCK automatically, emphasis was placed on exploiting machine learning techniques that can acquire useful DCK, usually by analyzing "training plans," which are solutions of simple planning tasks. This motivated the foundation of the learning track in the IPC, which has been organized since 2008. It should be noted that an approach that learns DCK from relaxed plans (obtained by solving planning tasks while omitting negative effects of actions) 11 won the best learner award at IPC 2008. However, such types of knowledge often require specific planning engines such as TALplanner 12 in the case of control rules. Alternatively, DCK can be directly encoded into the domain and problem descriptions (usually in PDDL). Such an approach is planner independent; hence, a standard planning engine can straightforwardly exploit it.
The best-known planning task reformulation technique, macro-operators ("macros"), which encapsulate sequences of PDDL operators, can be encoded as normal planning operators; hence, they can be easily added into domain models. [13][14][15][16] Abstracting planning tasks by reformulating them in order to reveal their hierarchical structures can mitigate the "accidental complexity" of their domain models. †17,18 Apart from macros, another type of domain-independent DCK is entanglements, 19,20 which represent relations between planning operators and predicates, aiming at eliminating unpromising alternatives in a planning engine's search space. Technically speaking, entanglements are task specific, ie, the relations described by entanglements hold in at least one solution plan of a given task. Entanglements usually generalize well, that is, a set of entanglements holds for a class of planning tasks with the same domain model.
*http://ipc.icaps-conference.org
Outer entanglements 19 are relations between planning operators and predicates whose instances are present in the initial state or the goal. Inner entanglements, 20,21 on which we focus in this paper, are relations of the exclusivity of predicate achievement or requirement between pairs of operators. Inner entanglements can be encoded in planning tasks, effectively reformulating them; thus, they are planner independent. Deciding whether a given inner entanglement holds in a given planning task is generally intractable (PSPACE-complete) and, thus, as hard as solving the planning task itself. This theoretical result indicates the practical infeasibility of enumerating entanglements for a given task prior to solving it. Since inner entanglements generalize well, as reported in the literature, 20,21 ie, they are domain specific rather than task specific, we extract them by following the "learning for planning" paradigm, which identifies DCK from a set of "training" planning tasks. Inner entanglements can therefore be learned on simple training tasks, which are easy to solve, and then used to speed up plan generation for more complex tasks in the same class. Our initial work on inner entanglements has been reported in a couple of shorter papers detailing their discovery, use, and effectiveness. 20,21
In this paper, we integrate and extend our previous work with
• a detailed description of the encodings of inner entanglements, including formal proofs of their correctness;
• a collected summary of the known complexity results and trivial cases where inner entanglements hold;
• case studies in which we investigate the knowledge engineering aspects of (re)using inner entanglements;
• an analysis of the potential impact of inner entanglements on the planning process;
• an approximation method for extracting entanglements, enriched by filtering of unpromising inner entanglements; and
• an extensive empirical study of the impact of inner entanglements on the planning process using all the domains from the 7th IPC's learning track ‡ and seven state-of-the-art planning engines based on very different principles.
Although our approximation method for learning inner entanglements does not theoretically guarantee that the reformulated tasks remain solvable, the main empirical finding of this paper is that the use of inner entanglements generally improves the planning process across the considered planner and domain model combinations. In addition, in the experimental scenarios we used, the potential for identification of incorrect inner entanglements stemming from our approximation method for their extraction did not explicitly manifest itself in the results. Using the "learning for planning" paradigm, ie, learning domain-specific knowledge on a small set of training tasks, has demonstrated its usefulness in the inner-entanglement case. Notably, the issue of making some reformulated tasks unsolvable can be alleviated (i) by running the planner on the original task if the reformulated task was not solved, (ii) by domain engineers who can verify the correctness of learned inner entanglements, or (iii) by incorporating reformulated tasks along with the original ones into portfolios such as PbP. 22 This paper is organized as follows. After discussing related work, basic terminology is provided. Then, inner entanglements are introduced. After that, the reformulation of planning tasks in order to enforce inner entanglements is presented. Then, a theoretical analysis of inner entanglements is provided, and an approximation algorithm for extracting inner entanglements is presented (including the filtering technique for unpromising inner entanglements). After that, an empirical analysis of the impact of inner entanglements on the planning process is provided. Finally, we give conclusions and present future avenues of research.
† "Accidental complexity of domain models" means that their inefficient encodings decrease the performance of planning engines.
‡ Learning track benchmarks are more natural here, since the inner-entanglement extraction phase can be understood as a learning process.

RELATED WORK
Generating DCK that can be exploited by planning engines dates back to the 1970s, when systems such as REFLECT 23 were developed. Macros are one of the best-known types of DCK in classical planning, because they can be encoded as normal planning operators and, thus, easily added into planning domain models. 14 Macro-FF CA-ED version, 24 which learns macros through an analysis of relations between static predicates; Wizard, 25 which learns macros by genetic algorithms; and BLOMA, 26 which exploits a block decomposition technique 27 to learn "long" macros, are good examples of planner-independent macro learning systems. Although macros and inner entanglements are based on a similar idea, ie, enforcing (primitive) operators to be applied in a certain order, inner entanglements do not require the affected operators to be applied strictly consecutively, and inner entanglements can be represented in such a way that the number of operator instances (after grounding) is not higher than in the original models. The relation between inner entanglements and macros, and how inner entanglements can be exploited for macro learning, has been studied in the work of Chrpa et al. 28 A general technique, called commutativity pruning, is used to discard all but one permutation of commutative (or independent) actions, which do not influence each other and, thus, can be executed in any order. 6 Graphplan, 29 which is one of the best-known planning algorithms, allows the execution of commutative actions in parallel (in one step). Symmetry breaking is a well-known technique for pruning unneeded alternatives in the search space. In planning, some objects might be symmetric, which can be exploited for avoiding alternatives concerning one object that have already been tried with the object's symmetric "twin." 7 In the spirit of the works of Emerson and Sistla 30 and Rintanen, 31 Pochter et al 32 present a pruning technique that identifies symmetries by exploring automorphisms in state-transition systems. This approach has recently been extended to cost-optimal planning. 33 Motivated by the idea of partial-order-based reduction used in model checking, 34 Chen and Yao 35 introduce an Expansion Core method, focusing on cost-optimal SAS+ planning, 36 which, in the node expansion phase (of the A * search), restricts attention to relevant domain transition graphs rather than all of them. The idea of "expansion cores" is extended into strong stubborn sets, which guarantee stronger pruning than "expansion cores." 37 In contrast, inner entanglements prune asymmetrical alternatives. Outer entanglements 19 are relations between operators and the initial or goal atoms that aim to prune unpromising instances of these operators. Outer and inner entanglements are complementary, as has already been demonstrated in the literature. 38 A recent work that is, to some extent, similar to inner entanglements proposes a method to learn "bad" causal links in order to generate plans of better quality. 39 In contrast to this work, inner entanglements aim to capture possibly "good" causal links that are enforced in the planning process. In addition, "bad" causal links are learned by exploring the differences between (different) plans solving a single planning task, whereas entanglements are learned by exploring similarities in the structures of solution plans of several planning tasks.

PRELIMINARIES
This section is devoted to introducing the terminology that will be used throughout this paper.

Classical planning
Classical planning is concerned with finding a (partially or totally ordered) sequence of actions transforming the static, deterministic, and fully observable environment from the given initial state to a desired goal state. 40,41 In the classical representation, a planning task consists of a planning domain model and a planning problem, where the planning domain model describes the environment and defines planning operators, whereas the planning problem defines concrete objects, an initial state, and a set of goals. The environment is described by predicates that are specified via a unique identifier and terms (variable symbols or constants). For example, a predicate at(?t ?p), where at is a unique identifier and ?t and ?p are variable symbols, denotes that a truck ?t is in a location ?p. Predicates thus capture general relations between objects.

Definition 1.
A planning task is a pair Π = (Dom Π , Prob Π ), where a planning domain model Dom Π = (P Π , Ops Π ) is a pair consisting of a finite set of predicates P Π and planning operators Ops Π , and a planning problem Prob Π = (Objs Π , I Π , G Π ) is a triple consisting of a finite set of objects Objs Π , initial state I Π , and goal G Π .
Let ats Π be the set of all atoms that are formed from the predicates P Π by substituting the objects Objs Π for the predicates' arguments. In other words, an atom is an instance of a predicate (in the rest of this paper, when we use the term instance, we mean an instance that is fully ground). A state is a subset of ats Π , and the initial state I Π is a distinguished state. The goal G Π is a nonempty subset of ats Π , and a goal state is any state that contains the goal G Π .
Notice that the semantics of a state reflects the full observability of the environment. That is, that for a state s, atoms present in s are assumed to be true in s, whereas atoms not present in s are assumed to be false in s.
Planning operators are "modifiers" of the environment. They consist of preconditions, ie, what must hold prior to an operator's application, and effects, ie, what is changed after an operator's application. Specifically, we distinguish between negative effects, ie, what becomes false, and positive effects, ie, what becomes true after an operator's application. Actions are instances of planning operators, ie, operators' arguments as well as the corresponding variable symbols in operators' preconditions and effects are substituted by objects (constants). Planning operators capture general types of activities that can be performed. Planning operators can be instantiated to actions in order to capture given activities between concrete objects.
Definition 2.
A planning operator o is specified as o = (name(o), pre(o), eff − (o), eff + (o)), where name(o) = op_name(x 1 , … , x k ), op_name is a unique identifier and x 1 , … , x k are all the variable symbols (arguments) appearing in the operator, pre(o) is a set of predicates representing the operator's precondition, and eff − (o) and eff + (o) are sets of predicates representing the operator's negative and positive effects, respectively. Actions are instances of planning operators that are formed by substituting objects, which are defined in a planning problem, for operators' arguments as well as for the corresponding variable symbols in operators' preconditions and effects. An action a = (pre(a), eff − (a), eff + (a)) is applicable in a state s if and only if pre(a) ⊆ s. The application of a in s, if possible, results in the state (s∖eff − (a)) ∪ eff + (a).
A solution of a planning task is a sequence of actions transforming the environment from the given initial state into a goal state.

Definition 3.
A plan is a sequence of actions. A plan is a solution of a planning task Π (in other words, a solution plan of Π) if and only if the consecutive application of the actions from the plan, starting in the initial state of Π, results in a goal state of Π.
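The applicability, application, and solution-plan conditions above admit a direct set-based reading. The following Python sketch is our own illustrative encoding (the BlocksWorld-style atoms and the pickup_a/putdown_a action triples are assumptions for the example, not the paper's notation):

```python
# A state is a set of ground atoms; an action is a triple
# (precondition, negative effects, positive effects), each a set of atoms.

def applicable(action, state):
    pre, neg, pos = action
    return pre <= state  # pre(a) ⊆ s

def apply_action(action, state):
    pre, neg, pos = action
    assert applicable(action, state)
    return (state - neg) | pos  # (s \ eff−(a)) ∪ eff+(a)

def is_solution(plan, init, goal):
    """Consecutive application from the initial state must reach a goal state."""
    state = set(init)
    for action in plan:
        if not applicable(action, state):
            return False
        state = apply_action(action, state)
    return goal <= state  # a goal state is any state containing the goal

# Example: picking up block a from the table, then putting it back down.
pickup_a = ({"ontable(a)", "clear(a)", "handempty"},
            {"ontable(a)", "clear(a)", "handempty"},
            {"holding(a)"})
putdown_a = ({"holding(a)"},
             {"holding(a)"},
             {"ontable(a)", "clear(a)", "handempty"})
s0 = {"ontable(a)", "clear(a)", "handempty"}
s1 = apply_action(pickup_a, s0)
```

Note that applying an action is a purely syntactic operation on sets, which is what makes the task reformulations discussed later planner independent.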
Determining equality of predicates (needed for set operations such as intersection) is done such that predicates are equal if they have the same name and their arguments (including their order) are identical. Hence, an expression p ∈ X ∩ Y, where X and Y are sets of predicates, means that p has the same name and arguments (in the same order) in both X and Y. A predicate p is a variant of a predicate q § if, by renaming p's variable symbols (arguments), we get a predicate equal to q.

Relations between actions and operators
By analyzing the preconditions and effects of actions or operators, we can identify how these influence each other. As discussed in Chapman's earlier work, 42 an action having some atom in its positive effects is a possible achiever of that atom for some other action having that atom in its precondition. The counterpart of being a possible achiever is being a possible clobberer (below referred to simply as a clobberer), which means that an action a i deletes atom(s) that a j has in its precondition. Note that being a clobberer refers to the notion of "threat" in plan-space planning. 43

Definition 4.
Let a i and a j be actions. We say that a i possibly achieves an atom p for a j if and only if p ∈ eff + (a i ) ∩ pre(a j ). We say that a i is a possible clobberer for a j if and only if eff − (a i ) ∩ pre(a j ) ≠ ∅.
Notions of a possible achiever and clobberer can be easily extended for planning operators.

Definition 5.
Let o i and o j be planning operators and p be a predicate. We say that o i possibly achieves a predicate p for o j if and only if there exist a i , a j , and p g , instances of o i , o j , and p, respectively, such that a i possibly achieves p g for a j , ie, p g ∈ eff + (a i ) ∩ pre(a j ). Similarly, we say that o i is a possible clobberer for o j if and only if there exist a i and a j , instances of o i and o j , respectively, such that eff − (a i ) ∩ pre(a j ) ≠ ∅.
In every solution plan, every atom in a precondition of an action a j is (necessarily) achieved in the sense that there exists a possible achiever action a i for the atom before a j and that there is no action in between a i and a j that deletes the atom (here, the initial state can be viewed as the initial action that only adds atoms, and the goal can be viewed as the final action that only has precondition atoms). Notice that being an achiever relates to the notion of "causal link" in plan-space planning.

Definition 6.
Let ⟨a 1 , a 2 , … , a n ⟩ be a solution plan of some planning task. We say that an action a i achieves an atom p for an action a j if and only if i < j, p ∈ eff + (a i ) ∩ pre(a j ), and p ∉ eff − (a k ) for every k ∈ {i + 1, … , j − 1}.
§ We can also say that p is unifiable with q.
Of course, an action can achieve an atom that then appears in preconditions of several following actions. Likewise, several actions can achieve an atom for one action. For the purpose of defining inner entanglements, we have to introduce special cases of the achiever relation. If a i achieves an atom required by a j and no action in between them also achieves the atom, then a i is the primary achiever of the atom. In another case, where an action a i achieves an atom for another action a j and no other action in between has that atom in its precondition or its positive effects, we say that a i first achieves the atom required by a j . If a i first achieves an atom, it follows that it is also the primary achiever of it.
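The three achiever relations can be checked mechanically over a given plan. The Python sketch below is our own illustrative encoding (actions are (precondition, negative effects, positive effects) triples of ground atoms; the tiny example plan is an assumption in the style of BlocksWorld):

```python
# Achiever relations over a plan of ground actions, where each action
# is a triple (precondition, negative effects, positive effects).

def achieves(plan, i, j, p):
    """a_i achieves atom p for a_j: p is added by a_i, required by a_j,
    and not deleted by any action strictly in between."""
    if not (i < j and p in plan[i][2] and p in plan[j][0]):
        return False
    return all(p not in plan[k][1] for k in range(i + 1, j))

def primary_achiever(plan, i, j, p):
    """Additionally, no action in between re-achieves p."""
    return achieves(plan, i, j, p) and all(
        p not in plan[k][2] for k in range(i + 1, j))

def first_achieves(plan, i, j, p):
    """Additionally, no action in between requires or achieves p."""
    return achieves(plan, i, j, p) and all(
        p not in plan[k][0] and p not in plan[k][2] for k in range(i + 1, j))

# A small example plan: pickup(a), an unrelated no-op, stack(a b).
pickup = ({"handempty"}, {"handempty"}, {"holding(a)"})
noop = (set(), set(), set())
stack = ({"holding(a)"}, {"holding(a)"}, {"on(a b)"})
plan = [pickup, noop, stack]
```

In the example, pickup both first achieves and is the primary achiever of holding(a) for stack, illustrating that first achieving implies being the primary achiever.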

Definition 7.
Let ⟨a 1 , a 2 , … , a n ⟩ be a solution plan of some planning task. We say that an action a i is the primary achiever of an atom p for an action a j if and only if a i achieves p for a j and p ∉ eff + (a k ) for every k ∈ {i + 1, … , j − 1}. We also say that an action a i first achieves an atom p required by an action a j if and only if a i achieves p for a j and p ∉ pre(a k ) ∪ eff + (a k ) for every k ∈ {i + 1, … , j − 1}.

BlocksWorld domain
We briefly introduce the BlocksWorld domain, 44,45 one of the best-known planning domains, which will be used as a running example in this paper.
The BlocksWorld domain describes an environment where we have a finite number of blocks, one table with unlimited space, and one robotic hand. A block can be either stacked on another block, placed on the table, or held by the robotic hand. No block can be stacked on more than one block at the same time, and no more than one block can be stacked on a given block at the same time. The robotic hand can hold, at most, one block. The BlocksWorld domain consists of four operators: pickup(?x) refers to a situation when the robotic hand picks up a block ?x from the table, putdown(?x) refers to a situation when the robotic hand puts down the block ?x it is holding on the table, unstack(?x ?y) refers to a situation when the robotic hand unstacks a "clear" block ?x from a block ?y, and stack(?x ?y) refers to a situation when the robotic hand stacks the block ?x it is holding onto a "clear" block ?y. As mentioned before, planning operators are instantiated by substituting constants (objects) for the variable symbols that appear in the operators' definitions. For example, putdown(?x) can be instantiated by substituting a, which refers to a concrete block "a," for ?x. We then obtain an action putdown(a) that requires the robotic hand to hold the block a, and the effect is that the block a is placed on the table, the block a is clear (no other block is stacked on it), and the hand no longer holds it.

INNER ENTANGLEMENTS
Inner entanglements are relations between pairs of planning operators and predicates. Inner entanglements, informally speaking, represent the exclusivity of "achieving" or "requiring" predicates between operators. That is, for a given planning task, there exists at least one solution plan where a given inner entanglement holds. In other words, considering that inner entanglement while solving the task will not prune all possible solution plans. Typically, a predicate can be achieved by more than one operator, and more than one operator might require the same predicate. However, it is often the case that some achiever-requirer combinations are not useful.
Specifically, we have two types of inner entanglements, entanglements by succeeding and entanglements by preceding. An entanglement by succeeding represents the exclusivity of achievement of a predicate p by an operator o i for an operator o j . For a planning task where such an entanglement holds, there exists a solution plan such that instances of o i first achieve instances of p exclusively for instances of o j . An entanglement by preceding, on the other hand, represents the exclusivity of requirement of a predicate p by an operator o j from an operator o i . For a planning task where such an entanglement holds, there exists a solution plan such that only instances of o i are the exclusive primary achievers of instances of p for instances of o j .
For example, in the BlocksWorld domain, it may be observed that operator pickup(?x) possibly achieves predicate holding(?x) for operators stack(?x ?y) and putdown(?x). Similarly, it may be observed that predicate holding(?x) is possibly achieved for operator putdown(?x) by operators unstack(?x ?y) and pickup(?x). We may require that every instance of pickup(?x) first achieves an instance of holding(?x) exclusively for a corresponding instance of stack(?x ?y) since putdown(?x) would just reverse the effects of pickup(?x) (see Figure 1, right). In other words, pickup(?x) is entangled by succeeding stack(?x ?y) with holding(?x). Analogously, we may require that for every instance of putdown(?x), a corresponding instance of unstack(?x ?y) is the exclusive primary achiever of an instance of holding(?x) because, again, putdown(?x) would just reverse the effects of pickup(?x) (see Figure 1, left). In other words, putdown(?x) is entangled by preceding unstack(?x ?y) with holding(?x).
Roughly speaking, inner entanglements provide restrictions to the plan generation process since they allow only some combinations of action sequences while not affecting the solvability of considered planning tasks. Whereas the BlocksWorld example (see Figure 1) indicates one possible nature of inner entanglements, in the general case, the reason why given inner entanglements hold in a given domain model might vary. Hence, our definition of inner entanglements does not explicitly capture their nature and "maintains" only solvability of considered planning tasks.
We distinguish two variants of inner entanglements, namely, strict and nonstrict. The strict variant captures the exclusivity of predicate achievement strictly between the involved operators, whereas the nonstrict variant allows situations where some instances of the predicates are present in the initial state or can be present in the goal state. For example, if the initial state of some planning task contains an atom holding(a), then the strict version of the above entanglement by preceding prevents applying putdown(a) in the initial state, whereas the nonstrict variant of the entanglement allows putdown(a) to be applied in the initial state. Both strict and nonstrict variants of inner entanglements are defined as follows. Notice that we assume that operators o 1 and o 2 share the arguments that are relevant to p. For example, pickup(?x) and stack(?x ?y) share the argument ?x, since it is relevant for holding(?x).

Definition 8.
Let Π be a planning task. Let o 1 and o 2 be planning operators and p be a predicate (o 1 , o 2 , and p are defined in the planning domain model of Π) such that p ∈ eff + (o 1 ) ∩ pre(o 2 ). We say that o 1 is strictly entangled by succeeding o 2 with p in Π if and only if there exists a solution plan π of Π, and for each a 1 ∈ π being an instance of o 1 , there exists a 2 ∈ π being an instance of o 2 such that a 1 first achieves an atom p gnd , where p gnd is an instance of p, required by a 2 . We also say that o 2 is strictly entangled by preceding o 1 with p in Π if and only if there exists a solution plan π of Π, and for each a 2 ∈ π being an instance of o 2 , there exists a 1 ∈ π being an instance of o 1 such that a 1 is the primary achiever of an atom p gnd , where p gnd is an instance of p, for a 2 .
Henceforth, strict entanglements by preceding and succeeding are denoted as strict inner entanglements.

Definition 9.
Let Π be a planning task. Let o 1 and o 2 be planning operators and p be a predicate (o 1 , o 2 , and p are defined in the planning domain model of Π) such that p ∈ eff + (o 1 ) ∩ pre(o 2 ). We say that o 1 is nonstrictly entangled by succeeding o 2 with p in Π if and only if there exists a solution plan π of Π such that for every a 1 , a 2 ∈ π such that a 1 first achieves an atom p gnd , where p gnd is an instance of p, required by a 2 , it holds that if a 1 is an instance of o 1 , then a 2 is an instance of o 2 .
We also say that o 2 is nonstrictly entangled by preceding o 1 with p in Π if and only if there exists a solution plan π of Π such that for every a 1 , a 2 ∈ π such that a 1 is the primary achiever of an atom p gnd , where p gnd is an instance of p, for a 2 , it holds that if a 2 is an instance of o 2 , then a 1 is an instance of o 1 .
Henceforth, nonstrict entanglements by preceding and succeeding are denoted as nonstrict inner entanglements.
Inner entanglements (both strict and nonstrict) can be used for pruning some unpromising alternatives in the search space, in other words, reducing the branching factor. Notice that a predicate involved in some inner entanglement relation might be true for some time after it is achieved; in other words, the predicate does not have to be "used" immediately after being achieved. Since the previous example of BlocksWorld might be confusing in this sense (the predicate holding(?x) is immediately "used" after being achieved), we provide another example in a modification of the BlocksWorld domain that considers more than one robotic hand. Let pickup(?h ?x) be strictly entangled by succeeding stack(?h ?x ?y) with holding(?h ?x) in some planning task. If action pickup(h1 a) is applied at step i, then action stack(h1 a ?y) (any other block than a can be substituted for ?y) must be applied at step j such that j > i. The entanglement prohibits applying action putdown(h1 a) at step k such that i < k < j. On the other hand, other actions that utilize different robotic hands than h1 can be applied in between the ith and the jth step.
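Whether a plan respects a given entanglement by succeeding can be verified by a straightforward scan. The sketch below is our own illustrative formulation (actions are (name, precondition, negative effects, positive effects) tuples; operator names are recovered from action-name prefixes, and the strict/nonstrict distinction regarding initial-state and goal atoms is ignored):

```python
# Check whether every occurrence of operator o1 first-achieves instances
# of predicate pred only for occurrences of operator o2 in a given plan.

def op_of(action):
    # Operator name, e.g. "pickup" from the action name "pickup(h1 a)".
    return action[0].split("(")[0]

def is_first_achiever(plan, i, j, atom):
    """a_i first achieves atom for a_j: no deletion, requirement,
    or re-achievement of the atom strictly in between."""
    if not (i < j and atom in plan[i][3] and atom in plan[j][1]):
        return False
    return all(atom not in plan[k][1] and atom not in plan[k][2]
               and atom not in plan[k][3] for k in range(i + 1, j))

def succeeding_holds(plan, o1, o2, pred):
    for i, a in enumerate(plan):
        if op_of(a) != o1:
            continue
        for atom in a[3]:
            if not atom.startswith(pred + "("):
                continue
            for j in range(i + 1, len(plan)):
                if is_first_achiever(plan, i, j, atom) and op_of(plan[j]) != o2:
                    return False  # a wrong operator "used" the atom first
    return True

# pickup(a) followed by stack(a b) respects the entanglement ...
plan_good = [("pickup(a)", {"handempty"}, {"handempty"}, {"holding(a)"}),
             ("stack(a b)", {"holding(a)"}, {"holding(a)"}, {"on(a b)"})]
# ... whereas pickup(a) followed by putdown(a) violates it.
plan_bad = [("pickup(a)", {"handempty"}, {"handempty"}, {"holding(a)"}),
            ("putdown(a)", {"holding(a)"}, {"holding(a)"}, {"ontable(a)"})]
```

Such a plan-level check is the building block of the approximation method described later: an entanglement candidate is kept only if it is respected by the training plans.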
A single inner entanglement requires only the existence of one solution plan of the given planning task where the entanglement conditions are met. However, different entanglements might hold in different solution plans. To consider multiple (different) inner entanglements rather than a single one, there must exist a solution plan in which all the considered entanglements hold. Moreover, in practice, inner entanglements are domain specific or problem-class specific rather than problem specific. The above definitions can be extended to reflect these aspects.

Definition 10.
Let Π be a planning task. Let ENT Π be a set of inner entanglements, where each element of ENT Π is specified by the type of the inner-entanglement relation, an involved pair of planning operators, and a predicate. We say that a set of inner entanglements ENT Π holds for Π if and only if there exists a solution plan of Π in which all the entanglements from ENT Π hold.
Similarly, ENT  holds for a set of planning tasks  sharing the same planning domain model if and only if ENT  = ⋂ Π∈ ENT Π . Both the BlocksWorld related entanglements hold for every BlocksWorld planning task. By adding two more inner entanglements, namely, unstack(?x ?y) to be (strictly) entangled by succeeding putdown(?X) and stack(?x ?y) to be (strictly) entangled by preceding pickup(?X), we restrict to solution plans where blocks are always put down on the table after being unstacked from other blocks and, eventually, picked up from the table and stacked on some other blocks. This might be useful since it introduces more restrictions on decisions the planner has to take during the search. With unlimited table space, these inner entanglements hold for every task.

REFORMULATING PLANNING TASKS
To exploit inner entanglements during the planning process, we have to develop a specific planner, modify an existing one, or reformulate the planning task in such a way that the entanglements hold in every solution plan retrieved by a planner. The last option is planner independent: in fact, it involves reformulating the domain and problem models using features of the PDDL (actually, STRIPS) language (see Section 3).
Hence, after inner entanglements are identified, we encode them directly into the planning task. The reformulated planning task is passed to a generic planning engine in order to generate a solution plan, which is also a solution plan of the original planning task. Encoding inner entanglements, as we show in this section, prevents planning engines from exploring branches of the search space that violate these entanglements. In other words, reformulated tasks "narrow" the search space for planning engines, improving their performance.
Encoding inner entanglements is done by introducing supplementary predicates, "locks," that ensure that certain instances of operators cannot be applied at some stage of the planning process, in order to enforce the inner entanglements. Let Π be a planning task and Ops be the set of operators defined in the domain model of Π. Let an operator o 1 ∈ Ops be (strictly or nonstrictly) entangled by a succeeding operator o 2 ∈ Ops with a predicate p (defined in the domain model of Π) in Π. Then, Π is reformulated as follows.
(1) Create a predicate p ′ (not defined in the domain model of Π) having the same arguments as p and add p ′ to the domain model of Π.
(2) Add all the possible instances of p ′ to the initial state of Π and, in the case of the strict entanglement, also to the goal of Π.
(3) Modify o 1 by adding p ′ to its negative effects.
(4) Add p ′ to the precondition of every operator other than o 2 that has p in its precondition.
(5) Add p ′ to the positive effects of o 2 and of every operator other than o 1 that has p in its positive effects.
Let Π ′ be the planning task obtained by this reformulation. Then, π ′ is a solution plan of Π ′ if and only if π ′ is a solution plan of Π in which the entanglement holds.
Proof. Hereinafter, the modified operators o 1 and o 2 will be denoted as o ′ 1 and o ′ 2 . The strict entanglement by succeeding (see Definition 8) says that if an instance of o 1 that achieves an atom p gnd , an instance of p, is applied at step i and the corresponding instance of o 2 that requires p gnd is applied at step j (or never, in the case of the nonstrict entanglement, so j = ∞), then no corresponding instance of any operator other than o 2 having p gnd in its precondition can be applied at a step k (i < k < j) unless p gnd is re-achieved by an operator different from o 1 at some step l (i < l < k).
Applying an instance of o′1 removes an atom p′_gnd, the instance of p′ having the same arguments as p_gnd (notice that all the possible instances of p′ are present in the initial state of Π′). From step 4 of the reformulation, p′ is put into the precondition of every operator that has p in its precondition (both p and p′ have the same arguments) except o2. Hence, only instances of o′2 having p_gnd in their precondition can be applied, since actions having p_gnd in their precondition that are not instances of o2 have p′_gnd in their preconditions as well. If o′2 is applied or p is re-achieved by any (modified) operator other than o′1 (see step 5 of the reformulation), then the corresponding instance of p′ is re-achieved as well.
For the strict version of the entanglement, all the instances of p′ must be present in the goal state; hence, o′2 must be applied at some point after o′1. For the nonstrict version of the entanglement, there is no need to re-achieve all the instances of p′; hence, o′2 does not have to be applied at some point after o′1 in order to "use" the corresponding instance of p achieved by o′1; however, no other operator can "use" it. Straightforwardly, if π′ is a solution plan of Π′, then π′ is a solution plan of Π that satisfies the entanglement conditions. The provided reformulation prevents only the application of operators in Ops∖{o2} having p in their preconditions after o1 has achieved p. Therefore, if π′ is a solution plan of Π that satisfies the entanglement conditions, then π′ is a solution plan of Π′.
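As an illustration, the lock-based encoding for an entanglement by succeeding can be sketched over a minimal ground (propositional) STRIPS representation in Python. The data structures, function name, and the apostrophe-suffixed lock name are our illustrative choices, not the paper's implementation:

```python
def encode_ent_succeeding(task, o1, o2, p, strict=False):
    """Encode 'o1 is entangled by succeeding o2 with p' via a lock atom p'.

    task = {'operators': {name: {'pre': set, 'add': set, 'del': set}},
            'init': set, 'goal': set}; everything is ground/propositional.
    """
    lock = p + "'"
    # deep-copy the operators so the original task stays untouched
    ops = {n: {k: set(v) for k, v in o.items()} for n, o in task['operators'].items()}
    ops[o1]['del'].add(lock)                 # o1 consumes the lock
    for name, o in ops.items():
        if name != o2 and p in o['pre']:
            o['pre'].add(lock)               # other requirers of p need the lock
        if name != o1 and p in o['add']:
            o['add'].add(lock)               # re-achievers of p restore the lock
    ops[o2]['add'].add(lock)                 # applying o2 restores the lock, too
    return {'operators': ops,
            'init': set(task['init']) | {lock},                          # lock initially present
            'goal': set(task['goal']) | ({lock} if strict else set())}   # strict: lock in the goal
```

After the transformation, any action other than an instance of o2 that requires p is blocked between an application of o1 and the next application of o2 (or a re-achievement of p).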
Similarly, we use supplementary predicates, "locks," to enforce entanglements by preceding. Let Π be a planning task and Ops be the set of operators defined in the domain model of Π. Let an operator o 2 ∈ Ops be (strictly or nonstrictly) entangled by a preceding operator o 1 ∈ Ops with a predicate p (defined in the domain model of Π) in Π. Then, Π is reformulated as follows.
(1) Create a predicate p′ (not defined in the domain model of Π) having the same arguments as p and add p′ to the domain model of Π.
(2) Add p′ into the positive effects of o1.
(3) Add p′ into the precondition of o2.
(4) Add p′ into the negative effects of every operator other than o1 having p in its positive effects.
(5) If the entanglement is nonstrict, add all possible instances of p′ into the initial state of Π.
Let Π′ be a planning task obtained by reformulating Π using the previous approach.
π′ is a solution plan of Π′ if and only if π′ is a solution plan of Π that satisfies the entanglement conditions (see Definitions 8 and 9).
Proof. Hereinafter, the modified operators o1 and o2 will be denoted as o′1 and o′2. The strict version of the entanglement by preceding (see Definition 8) says that if an instance of o2 requiring an atom p_gnd, an instance of p, is applied at step j and a corresponding instance of o1 achieving p_gnd is applied at step i (i < j), then no corresponding instance of any operator other than o1 having p_gnd in its positive effects can be applied at step k such that i < k < j.
Adding p′ into o2's precondition results in the situation that any instance of o′2 can be applied only after the corresponding instance of o′1, since p′ is in o′1's positive effects. In particular, an instance of o′1 that achieves an atom p_gnd (an instance of p) also achieves p′_gnd, the instance of p′ having the same arguments as p_gnd. The instance of o′2 that requires p_gnd requires p′_gnd as well. If p_gnd is re-achieved by an instance of a (modified) operator other than o′1, then p′_gnd is removed (step 4 of the reformulation). Then, o′2 requiring p_gnd cannot be applied, since p′_gnd will not be true. For the strict version of the entanglement, no instance of p′ is present in the initial state; hence, o′1 must be applied at some point before o′2. For the nonstrict version of the entanglement, all the instances of p′ are present in the initial state; hence, o′2 does not have to be applied after o′1; however, no other (modified) operator can re-achieve an instance of p in between since, otherwise, the corresponding instance of p′ is removed.
Straightforwardly, if π′ is a solution plan of Π′, then π′ is a solution plan of Π that satisfies the entanglement conditions. The provided reformulation prevents only the application of o2, which has p in its precondition, unless o1 has achieved p. Therefore, if π′ is a solution plan of Π that satisfies the entanglement conditions, then π′ is a solution plan of Π′. There are also situations where both the (strict) entanglements by preceding and succeeding hold for operators o1 and o2 and a predicate p. Of course, we can reformulate the task according to the previous reformulation approaches. On the other hand, this requires two supplementary predicates, and thus, the process might not be very efficient. Given that the exclusivity of achievement and requirement of p is mutual between o1 and o2, we can replace p by its "twin" in the positive effects of o1 and the precondition of o2. Therefore, we introduce a more compact reformulation that exploits this property.
Formally, let Π be a planning task and Ops be the set of operators defined in the domain model of Π. Let o1 ∈ Ops be nonstrictly entangled by succeeding o2 ∈ Ops with p (p is defined in the domain model of Π) in Π, and let o2 be strictly entangled by preceding o1 with p in Π. Then, Π is reformulated as follows.
(1) Create a predicate p′ (not defined in the domain model of Π) having the same arguments as p and add p′ to the domain model of Π.
(2) Replace p by p′ in the positive effects of o1 and add p into the negative effects of o1.
(3) Replace p by p′ in the precondition (and, if present, in the negative effects) of o2.
(4) Add p′ into the negative effects of every operator other than o1 having p in its positive effects.
For illustration, in the BlocksWorld domain, pick-up(?x) is nonstrictly entangled by succeeding stack(?x ?y) with holding(?x) and stack(?x ?y) is strictly entangled by preceding pick-up(?x) with holding(?x). In our terminology, pick-up(?x) refers to o1, stack(?x ?y) refers to o2, holding(?x) refers to p, and stack_pick-up_both_holding(?x) refers to p′. The correctness of the reformulation is proved as follows.
We can observe that o′1 is the only operator achieving p′ and no longer achieves p. Similarly, o′2 is the only operator requiring p′ (having it in its precondition) and no longer requires p. The entanglement by preceding cannot be violated by applying any (modified) operator o achieving p, since such an operator removes p′ as well (see step 4 of the reformulation), thus making o′2 inapplicable. Similarly, if p is true before o′1 is applied, it is removed after o′1 is applied, and hence, any operator requiring p becomes inapplicable. The strict entanglement by preceding is met since no instance of p′ is in the initial state of Π′. There is no restriction that prevents occurrences of p′ in any of the goal states; therefore, the entanglement by succeeding is nonstrict.
Hence, if π′ is a solution plan of Π′, then π′ is a solution plan of Π that satisfies the conditions of both entanglements. The provided reformulation prevents only the application of o2, which has p in its precondition, unless o1 has achieved p, as well as the application of any operator other than o2 having p in its precondition after o1 has achieved p. Therefore, if π′ is a solution plan of Π that satisfies the entanglement conditions, then π′ is a solution plan of Π′.
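The compact encoding can be sketched analogously; the twin's name below mimics stack_pick-up_both_holding from the example, and the ground propositional representation is again an illustrative simplification rather than the paper's implementation:

```python
def encode_ent_both(task, o1, o2, p):
    """Compact encoding when o1 is entangled by succeeding o2 with p and
    o2 is entangled by preceding o1 with p: p is replaced by a 'twin'
    predicate between o1 and o2."""
    twin = f"{o2}_{o1}_both_{p}"            # eg, stack_pickup_both_holding
    ops = {n: {k: set(v) for k, v in o.items()} for n, o in task['operators'].items()}
    ops[o1]['add'].discard(p)
    ops[o1]['add'].add(twin)                # o1 now achieves only the twin ...
    ops[o1]['del'].add(p)                   # ... and removes a pre-existing p
    ops[o2]['pre'].discard(p)
    ops[o2]['pre'].add(twin)                # o2 now requires only the twin
    if p in ops[o2]['del']:
        ops[o2]['del'].discard(p)
        ops[o2]['del'].add(twin)            # o2 consumes the twin instead of p
    for name, o in ops.items():
        if name != o1 and p in o['add']:
            o['del'].add(twin)              # re-achieving p breaks the twin
    return {'operators': ops, 'init': set(task['init']), 'goal': set(task['goal'])}
```

Only one supplementary predicate is introduced, and o′1 and o′2 communicate through it exclusively, which is exactly the mutual exclusivity argued above.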

THEORETICAL FOUNDATIONS OF INNER ENTANGLEMENTS
This section is devoted to the theoretical properties of inner entanglements such as complexity results as well as their expected impact on planners.

Landmark theory
Landmark theory 46 is a useful framework for studying structures of planning tasks. We will use a fragment of the landmark theory to prove intractability (PSPACE-completeness) of deciding whether a given inner entanglement holds. The notions we will use are briefly introduced in the following lines (for more details, see the work of Hoffmann et al 46 ).
Landmarks are atoms that must be achieved at some point in every solution plan of a given planning task. Deciding whether atoms are landmarks is PSPACE-complete. 46 Ordering landmarks is useful for computing heuristics. 47 Landmarks p and q are greedily necessarily ordered (we denote it as p→ g q) if, for every solution plan of a given planning task, p is achieved before q is achieved for the first time. Deciding greedy necessary ordering of landmarks is also PSPACE-complete. 46

Intractability of entanglements
The intractability (PSPACE-completeness) of deciding whether a given inner entanglement holds in a given task is proved by the following theorem. Let Π′ be a planning task, o_p′ and o_q′ be planning operators, and p′′ be a predicate defined in the domain model of Π′. Then, the problem of deciding whether o_p′ is strictly entangled by succeeding o_q′ with p′′ in Π′, as well as the problem of deciding whether o_q′ is strictly entangled by preceding o_p′ with p′′ in Π′, is PSPACE-complete. Proof. First, we show that both decision problems belong to the PSPACE class. To do this, we reformulate Π′ by encoding the given inner entanglement, as described in Section 5. Hence, the decision problem of whether the given inner entanglement holds can be encoded as a planning task, ie, the entanglement holds if and only if the reformulated task is solvable. Since planning tasks can be solved in polynomial space, both decision problems belong to PSPACE.
We reduce, in polynomial time, the problem of deciding whether landmarks p and q are greedily necessarily ordered, ie, p→g q, in some planning task Π, which is PSPACE-complete, to the problem of deciding the strict entanglement by succeeding or by preceding between o_p′, o_q′, and p′′ in Π′. Without loss of generality, we assume that p and q are nullary predicates (atoms) defined in the domain model of Π.
We create a planning task Π′ by modifying Π as follows. Let Ops be the set of planning operators defined in the domain model of Π. Let Ops_p = {o | o ∈ Ops, p ∈ eff+(o)} be the set of operators achieving p and Ops_q = {o | o ∈ Ops, q ∈ eff+(o)} be the set of operators achieving q. We extend the domain model of Π by adding atoms (nullary predicates) r, p′, p′′, q′, and q′′ (without loss of generality, we assume that none of these are defined in the domain model of Π). Then, we add r into the precondition of every operator from Ops. Then, we modify the operators in Ops_p and Ops_q as follows. For every o ∈ Ops_p: replace p by p′ in eff+(o) and add r into eff−(o). For every o ∈ Ops_q: add q′ into eff+(o) and add q′′ into eff−(o). The initial state I of Π is modified as follows. If p ∈ I, then replace p by p′. If p ∉ I, then add r. If q ∈ I, then add q′; otherwise (if q ∉ I), add q′′. Notice that q′ becomes and remains true once q has been achieved and that q′′ is true only before q is achieved (if q is true in the initial state, q′′ is never true). Notice that name(o_p′), name(o_q′), and name(o_q′′) contain only unique operator identifiers (and no variable symbols).
We can observe that if o_p′ is strictly entangled by succeeding o_q′ with p′′ in Π′ (the modification of Π), then q must be true before or at the same time p is achieved. This is because q′ becomes true after q is achieved (as mentioned before), and, according to the entanglement, there is a solution plan π′ of Π′ such that o_p′ always achieves p′′ for o_q′. Removing the instances of o_p′ and o_q′ from π′ gives us a plan π, which is a solution plan of Π. Given the modification of all operators from Ops_p, p becomes true in π at the same time as p′ becomes true and r becomes false in π′. Then, only o_p′ and o_q′ can be applied (in this order) in π′, because the other operators have r in their preconditions, and r can be re-achieved by o_q′.
From this, we get that q′ must be achieved before o_p′ is applied in π′. Therefore, q is achieved before or at the same time as p in π, which is a solution plan of Π, and thus, p→g q does not hold in Π. Hence, o_p′ is strictly entangled by succeeding o_q′ with p′′ in Π′ (the modification of Π) if and only if p→g q does not hold in Π. Analogously to the previous case, we can observe that if o_q′ is strictly entangled by preceding o_p′ with p′′ in Π′ (the modification of Π), then q must be true before or at the same time p is achieved. Therefore, there exists π′, a solution plan of Π′, in which the entanglement holds. Again, removing the instances of o_p′ and o_q′ from π′ gives us a plan π, which is a solution plan of Π. Analogously to the previous case, after a modified operator from Ops_p is applied in π′, only o_p′ and o_q′ (in this order) can be applied before any other operator. Therefore, q′ must be achieved before o_p′ is applied in π′, and thus, q is achieved before or at the same time as p in π; hence, p→g q does not hold in Π. Hence, o_q′ is strictly entangled by preceding o_p′ with p′′ in Π′ (the modification of Π) if and only if p→g q does not hold in Π.
Clearly, the modification of Π is, in both cases, done in polynomial time. Hence, since the problem of deciding whether landmarks p and q are greedily necessarily ordered in Π is PSPACE-complete, the problem of deciding whether o_p′ is strictly entangled by succeeding o_q′ with p′′ in Π′, as well as the problem of deciding whether o_q′ is strictly entangled by preceding o_p′ with p′′ in Π′, both of which belong to PSPACE, is PSPACE-complete as well.

Corollary 1. Let Π′ be a planning task, o_p′ and o_q′ be planning operators, and p′′ be a predicate defined in the domain model of Π′. The problem of deciding whether o_p′ is nonstrictly entangled by succeeding o_q′ with p′′ in Π′ is PSPACE-complete. The problem of deciding whether o_q′ is nonstrictly entangled by preceding o_p′ with p′′ in Π′ is PSPACE-complete as well.
Proof. The problem of deciding either of the nonstrict inner entanglements can be encoded as a planning task (Π′ is reformulated as described in Section 5); hence, it belongs to PSPACE. Since the strict version of inner entanglements is a special case of the nonstrict version, the problem is PSPACE-complete.
Intractability of deciding whether a single inner entanglement holds for a given planning task implies intractability of deciding whether a set of inner entanglements holds for that task. Proof. Without loss of generality, let Π_e1 be a planning task obtained by reformulating Π considering e1. Then, the problem of deciding whether {e1, e2} holds in Π is equivalent to the problem of deciding whether e2 holds in Π_e1, which is PSPACE-complete.
The presented theoretical results say that deciding whether a set of inner entanglements holds in a planning task is (theoretically) as hard as solving the task itself. Hence, in order to benefit from inner entanglements, we have to spend (much) less time on generating them than they save during planning. Learning them from simple planning tasks is a viable option, since such tasks can usually be solved and analyzed very quickly.

Trivial entanglements
Despite the complexity results, there are some cases where inner entanglements can be identified trivially (hereinafter referred to as trivial inner entanglements). The following situations refer to special cases in which there is no way to violate the inner entanglements in the planning process. However, trivial inner entanglements do not provide any new domain-specific information, and hence, we do not have to consider them in the reformulation.
We can observe that having only one achiever or "requirer" of some predicate trivially satisfies the conditions of exclusivity. In other words, if only one operator achieves a certain predicate, then it is the exclusive achiever for all the operators that require this predicate. Similarly, if only one operator requires a certain predicate, then it is the exclusive "requirer" for all the operators that achieve this predicate.

Lemma 1. Let Π be a planning task, Ops be the set of planning operators, and p be a predicate defined in the domain model of Π.
If there exists exactly one o i ∈ Ops such that p ∈ eff + (o i ), then, for every o k ∈ Ops such that p ∈ pre(o k ), it holds that o k is nonstrictly entangled by preceding o i with p in Π.

Lemma 2.
Let Π be a planning task, Ops be the set of planning operators, and p be a predicate defined in the domain model of Π. If there exists exactly one o i ∈ Ops such that p ∈ pre(o i ), then, for every o k ∈ Ops such that p ∈ eff + (o k ), it holds that o k is nonstrictly entangled by succeeding o i with p in Π.
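Both lemmas amount to a purely syntactic check of the domain model. The following sketch uses an illustrative operator representation (sets of precondition and positive-effect predicate names); the function name and output format are our choices:

```python
def trivial_entanglements(operators):
    """List trivial inner entanglements per Lemmas 1 and 2.

    operators: {name: {'pre': set of predicates, 'add': set of predicates}}
    Returns tuples (entangled_op, kind, partner_op, predicate).
    """
    preds = {p for o in operators.values() for p in o['pre'] | o['add']}
    found = set()
    for p in sorted(preds):
        achievers = [n for n, o in operators.items() if p in o['add']]
        requirers = [n for n, o in operators.items() if p in o['pre']]
        if len(achievers) == 1:        # Lemma 1: sole achiever of p
            found |= {(r, 'preceding', achievers[0], p) for r in requirers}
        if len(requirers) == 1:        # Lemma 2: sole requirer of p
            found |= {(a, 'succeeding', requirers[0], p) for a in achievers}
    return found
```

Since such entanglements follow directly from the domain model, they carry no new information and can be skipped during reformulation, as noted above.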

Identifying inner entanglements: case studies
This section is devoted to investigating, identifying, and (re)using inner entanglements from a knowledge engineering perspective. Although it is usually reasonable to consider inner entanglements as domain specific rather than task specific, even small modifications of a domain model can invalidate some of the entanglements and, possibly, introduce others.
An illustrative example used earlier in the text identified two inner entanglements in the BlocksWorld domain: the operator putdown is entangled by the preceding operator unstack with the predicate holding, and the operator pickup is entangled by the succeeding operator stack with the predicate holding. Whether the entanglements are strict or nonstrict depends on whether a block is initially held by the robotic hand or whether the same is required in the goal state. The entanglements, in fact, prevent applying the operators pickup and putdown consecutively, since they just reverse each other's effects; doing so is clearly meaningless. Extending the BlocksWorld domain by introducing an operator paint, which paints a block while it is held by the robotic hand, might invalidate the entanglements in some cases. pickup can then achieve holding for both stack and paint; thus, the exclusivity required by the entanglement by succeeding is not met. putdown can meaningfully use the predicate holding achieved by pickup, since we can paint the block (apply the paint operator) in between; thus, the entanglement by preceding might not be met either.
The Depots domain is a combination of the BlocksWorld and Logistics domains: crates are arranged in stacks and operated by hoists in the same way as blocks in BlocksWorld but can also be transported by trucks between different locations. The lift and drop operators correspond to BlocksWorld's unstack and stack operators, respectively. The load and unload operators are variants of BlocksWorld's putdown and pickup operators: instead of putting crates on and picking them up from the table, they load crates onto or unload crates from trucks, respectively. In Depots, we may observe, for instance, that the operator lift is entangled by the succeeding operator load with the predicate lifting, load is entangled by preceding lift with lifting, the operator drop is entangled by the preceding operator unload with lifting, and unload is entangled by succeeding drop with lifting. If no instance of lifting is present in the initial state or the goal, then the entanglements are strict. If no truck is defined in the problem, then we cannot apply load or unload; hence, the entanglements do not hold (otherwise, it would not be possible to apply lift and drop consecutively). Modifying the domain model in such a way that particular trucks can move only between some locations might introduce the necessity of reloading crates from one truck to another. This will certainly affect two of the entanglements; in particular, load will no longer be entangled by preceding lift with lifting, and unload will no longer be entangled by succeeding drop with lifting. However, tasks in which some crate(s) have to be reloaded can be easily identified.
It should be noted that the aforementioned examples indicate that the nature of inner entanglements varies per domain model. Therefore, in our opinion, devising general rules for identifying inner entanglements that are not overly restrictive might not be feasible. Domain model engineers can either identify inner entanglements by hand or exploit our method based on the "learning in planning" paradigm, which is presented in Section 7.

Expected impact of inner entanglements on the planning process
Inner entanglements eliminate unpromising alternatives in the search space, which reduces the branching factor of the search. However, the supplementary predicates required for encoding inner entanglements introduce additional facts (atoms) that planners have to deal with during search; moreover, memory requirements might therefore be higher. Hence, the impact of inner entanglements is determined by whether the potential benefits of reducing the branching factor outweigh the overheads caused by handling the supplementary predicates. An analogy can be seen in determining whether a macro-operator is useful, which is also referred to as the utility problem in the literature. 48 Taking a closer look at how inner entanglements are encoded provides insights into how they may influence delete-relaxed heuristics, a common technique used in planning engines.
Having an operator o2 strictly entangled by a preceding operator o1 with a predicate p captures a situation where an instance of o2 can be applied only if a corresponding instance of p is achieved by an instance of o1. This is enforced by putting a supplementary predicate p′ into o1's positive effects and into o2's precondition. In delete-relaxed plans, o1 must also be applied at some point before o2. However, an operator o ≠ o2 achieving p (and thus removing p′) can be placed between o1 and o2 in delete-relaxed plans, which does not correspond with the entanglement conditions. Entanglements by preceding are therefore only partially taken into account while computing delete-relaxed heuristics.
Having an operator o1 strictly entangled by a succeeding operator o2 with a predicate p captures a situation where an instance of o1 achieves a corresponding instance of p exclusively for an instance of o2. This is enforced by putting a supplementary predicate p′ into o1's negative effects and into the preconditions of operators other than o2 that have p in their preconditions. However, in delete-relaxed plans, applying o1 does not prevent applying any other operator having p in its precondition. Therefore, entanglements by succeeding are not taken into account while computing delete-relaxed heuristics. Intuitively, only entanglements by preceding might be beneficial for planners based on delete-relaxed heuristics (eg, FF).
However, recent empirical results do not confirm this intuition; they show that, in some cases, entanglements by succeeding can be very beneficial even for planners based on delete-relaxed heuristics. 21 To understand the potential benefits of entanglements by succeeding, we have to take a different view. A heuristic may suggest applying an operator o ≠ o2 that requires p achieved by o1. However, after the actual application of o1, it becomes impossible to apply o (due to the entanglement conditions), since o2 is enforced. Although this might cause planners to be "trapped" in a local minimum of the heuristic, it might also prevent planners from getting into "deeper" local minima, which might eventually happen if o is applied instead of o2.
If both (strict) entanglements by preceding and succeeding hold between o1, o2, and p, the compact encoding involves replacing p with p′ in o1's positive effects and o2's precondition. In delete-relaxed plans, o2 cannot be applied unless o1 is, similarly to the entanglements by preceding case; moreover, o1 cannot achieve p for any operator other than o2 (because p is replaced by p′). Although, as in the entanglements by preceding case, an operator achieving p can be placed between o1 and o2 in delete-relaxed plans, which does not correspond to the entanglement conditions, both entanglements are taken into account to a reasonable extent while computing delete-relaxed heuristics.
The compact encoding (when both entanglements by preceding and succeeding hold between a pair of operators and a predicate) is thus intuitively beneficial for planners. The potential impact of inner entanglements seems to be correlated with the shape of the search space, in other words, with whether inner entanglements can prevent planners from ending up in undesirable states (eg, dead ends or "deep" local minima). We believe that maximizing sets of compatible inner entanglements does not imply maximizing planners' performance, because some of the entanglements might, in fact, have a negative impact, for instance, by introducing supplementary predicates planners have to deal with or by introducing local minima in the heuristic landscape. Possible examples of "bad" inner entanglements are those involving operators whose instances appear sporadically in plans, because such inner entanglements bring only little information for possibly high overheads. Moreover, if an inner entanglement prunes only a few alternatives, then the overheads it introduces might be higher than its possible benefit.
For example, after picking up a block, we might either put it down or stack it on some other clear block. Clearly, the number of clear blocks might be up to n − 1, where n is the number of all blocks. If pickup(?x) is (strictly) entangled by succeeding stack(?x ?y) with holding(?x), then we cannot apply putdown(?x) after pickup(?x). Hence, we prune one alternative, keeping n − 1 alternatives in the worst case. Similarly, after unstacking a block from another block, we can either put it down or stack it on some clear block. If putdown(?x) is (strictly) entangled by preceding unstack(?x ?y) with holding(?x), then we cannot apply stack(?x ?z) after unstack(?x ?y). Hence, we keep only one alternative, pruning n − 1 alternatives in the best case. Given this observation, the latter entanglement is much more informative than the former one.
Intuitively, the former entanglement is not helpful and will very likely worsen the planning process. The latter entanglement, on the other hand, seems helpful and should improve the planning process.

EXTRACTING INNER ENTANGLEMENTS
Deciding whether a given set of inner entanglements holds in a given task is generally PSPACE-complete (as discussed in Section 6). Moreover, trivial entanglements (see Section 6.3) are not informative and, thus, not considered for task reformulation. Therefore, we have to devise an effective approximation technique for extracting sets of inner entanglements. We assume that tasks having the same domain model have a similar structure and, thus, that the same set of inner entanglements holds in all of them. We can then select, for each domain model, a representative set of simple tasks as training tasks, which can be solved easily by standard planning engines. The generated training plans, ie, the solutions of these training tasks, are then explored in order to find which inner entanglements hold in them.
The above approach can be formalized as follows. Let 𝒫 be a class of planning tasks sharing the same domain model, and let 𝒫_T ⊂ 𝒫 be a set of training tasks. In our approximation method, we assume that ENT(𝒫_T) = ENT(𝒫); in other words, a set of inner entanglements holding on the training planning tasks also holds on the whole class of planning tasks. This assumption is, of course, a source of incompleteness, since enforcing incorrect entanglements may cause some tasks to become unsolvable. On the other hand, planning tasks having the same domain model are of similar structure (eg, they differ only by the number of objects), which is the case for most of the IPC benchmarks. Hence, we believe that selecting a small set of these tasks such that the selected tasks are easy but not trivial can alleviate the incompleteness issue and, thus, support the assumption. Our empirical study, which also explores these issues, is provided in Section 8.
The method for extracting inner entanglements from (training) plans works as follows. For every action, we check which actions achieved atoms for it, and vice versa. This information is used to determine the cases where the exclusivity of a predicate's achievement or requirement between a pair of operators applies. The concept is elaborated in Algorithm 1. For this purpose, we define an array counter, which stores how many instances of the given operators occur in the training plans, and 3D arrays entP and entS, which count how many times a given operator achieves/requires a predicate to/from another operator. The function is_inst(arg) returns either the operator that arg (an action) is an instance of or the predicate that arg (an atom) is an instance of. The function last_achiever(p, ⟨a1, … , ak⟩) returns the last action in the sequence ⟨a1, … , ak⟩ that has p in its positive effects, or NULL if no such action exists (ie, p is an initial atom).
Algorithm 1 requires linear time with respect to the lengths of the given training plans, provided that the number of atoms in actions' preconditions and effects is much lower than the lengths of the training plans and can thus be bounded by a constant. Notice that the information retrieved by the last_achiever function can be stored in a hash table; hence, it can be retrieved in constant time.
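The counting pass at the core of Algorithm 1 can be sketched as follows for ground plans (an illustrative propositional simplification, not the paper's pseudocode; a dictionary plays the role of the last_achiever hash table, giving constant-time lookups):

```python
from collections import Counter

def count_entanglement_evidence(plans):
    """For each action in each plan, record which earlier action last
    achieved each of its precondition atoms; entP counts (requirer,
    achiever, atom) links and entS the (achiever, requirer, atom) links.

    plans: lists of actions, each action a dict {'name', 'pre', 'add'}.
    """
    counter, entP, entS = Counter(), Counter(), Counter()
    for plan in plans:
        last_achiever = {}                      # atom -> name of its last achiever
        for act in plan:
            counter[act['name']] += 1
            for atom in act['pre']:
                ach = last_achiever.get(atom)   # None: atom comes from the initial state
                if ach is not None:
                    entP[(act['name'], ach, atom)] += 1
                    entS[(ach, act['name'], atom)] += 1
            for atom in act['add']:
                last_achiever[atom] = act['name']
    return counter, entP, entS
```

A single pass over every plan suffices, which matches the linear-time bound stated above.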

Flaw ratio
From Algorithm 1, it is easy to determine whether a given set of inner entanglements holds in all the training plans. However, this is often not a very efficient way to determine a useful set of inner entanglements, for two main reasons. First, training plans might contain redundant actions or very suboptimal subplans, which can prevent detecting some useful entanglements. Second, there might be several strategies for solving a task, only some of which lead to the discovery of some useful entanglements. For example, in BlocksWorld, we might "put aside" blocks in two different ways: put them on the table or stack them on other blocks. Only the former leads to the discovery of two useful inner entanglements, ie, unstack is (strictly) entangled by succeeding putdown with holding, and stack is (strictly) entangled by preceding pickup with holding.
Introducing a flaw ratio φ ∈ [0, 1], a parameter referring to the allowed percentage of "flaws" in the training plans, makes it possible to identify inner entanglements that can be discovered in plans that are "close" to the training plans. In other words, the exclusivity of predicate achievement or requirement between a pair of operators might be satisfied only to some extent in the training plans, whereas in some other solution plans, the exclusivity can be fully satisfied. For example, in BlocksWorld, blocks might occasionally be "put aside" onto other blocks in the training plans, thus preventing the useful inner entanglements (as above) from being detected. By considering the flaw ratio, these inner entanglements can still be found.
Let φ be the flaw ratio. An inner entanglement is then considered to hold if it is violated in at most a fraction φ of the relevant occurrences counted in the training plans; this criterion is applied to both the strict and nonstrict versions of entanglements by preceding and succeeding.
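The acceptance criterion can be sketched as follows (an illustrative simplification: the exact conditions are defined over the entP/entS counters, whereas the function below merely tolerates the allowed fraction of violations):

```python
def accept_with_flaw_ratio(exclusive_count, total_count, flaw_ratio):
    """Accept an exclusivity relation observed exclusive_count times out of
    total_count relevant occurrences if the violations (the remaining
    occurrences) stay within the allowed flaw ratio."""
    return exclusive_count >= (1.0 - flaw_ratio) * total_count
```

With a flaw ratio of 0, this degenerates to requiring the exclusivity in every occurrence found in the training plans.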

Filtering unpromising inner entanglements
Following the discussion in Section 6.5, we can derive that the pruning power of inner entanglements is crucial for a positive impact on the planning process. In other words, inner entanglements are more likely to be beneficial if they can prune a relatively large number of search alternatives. Otherwise, inner entanglements might have a detrimental effect on the performance of planning engines because of the overhead caused by their representation. We identified two main cases in which inner entanglements do not have a strong pruning power.
The first case involves inner entanglements with operators that are rarely applied in plans. Training plans can provide a good indication of "rare" operators; therefore, we can assume that if an operator appears rarely in the training plans, then it will also be used rarely in other planning problems of the given domain. Hence, we define a threshold and filter out those inner entanglements in which any of the involved operators (o1 and o2) has fewer instances in the training plans than the threshold.
The second case concerns the numbers of arguments of the "entangled" and "prohibited" operators. Recall the example from Section 6.5, where pickup(?x) is (strictly) entangled by succeeding stack(?x ?y) with holding(?x). The entanglement prohibits applying putdown(?x) after pickup(?x). In other words, stack(?x ?y) is the "entangled" operator, and putdown(?x) is the "prohibited" operator. Clearly, only one alternative is pruned (only one instance of putdown(?x) can be applied after pickup(?x)), whereas up to n − 1 alternatives are allowed (up to n − 1 instances of stack(?x ?y) can be applied after pickup(?x)); hence, the pruning power of the entanglement is poor. The number of operators' arguments is thus a good indicator for estimating the number of pruned search alternatives.
Hence, if the number of arguments of the "entangled" operator is higher than that of all the "prohibited" operators, then the entanglement is unpromising. Formally, let arg(o) denote the number of arguments of an operator o. Let an operator o1 be (strictly) entangled by a succeeding operator o2 with a predicate p; then, the entanglement is considered as unpromising if arg(o2) > arg(o′) for every "prohibited" operator o′. Analogously, let an operator o2 be (strictly) entangled by a preceding operator o1 with a predicate p; then, the entanglement is considered as unpromising if arg(o1) > arg(o′) for every "prohibited" operator o′. Unpromising inner entanglements are filtered out, except in cases where both types of inner entanglements hold for the operators o1 and o2 and the predicate p and only one of the entanglements is unpromising. This exception follows the observation discussed in Section 6.5 that the compact encoding of such entanglements does not introduce more overhead than the encoding of a single (inner) entanglement.
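The two filters just described can be sketched as follows. This is an illustrative sketch, not the paper's implementation: the data structures (a `counter` map of operator occurrences, an `arg` map of argument counts) and the function names are assumptions.

```python
# Sketch of the two filters for unpromising inner entanglements.
# `counter`, `arg`, and the function names are illustrative assumptions.

THETA = 20  # frequency threshold used in the experiments (Section 8)

def frequent_enough(counter, o1, o2, theta=THETA):
    """Filter 1: both involved operators must appear at least `theta`
    times in the training plans; otherwise, the entanglement involves
    'rare' operators and is discarded."""
    return counter[o1] >= theta and counter[o2] >= theta

def promising_by_args(arg, entangled_op, prohibited_ops):
    """Filter 2: the entanglement is unpromising if the 'entangled'
    operator has more arguments than every 'prohibited' operator
    (more allowed instances than pruned ones)."""
    return not all(arg[entangled_op] > arg[p] for p in prohibited_ops)

# Blocksworld example from the text: stack(?x ?y) is the "entangled"
# operator and putdown(?x) the "prohibited" one, so the entanglement
# prunes fewer alternatives than it allows and is unpromising.
arg = {"stack": 2, "putdown": 1, "pickup": 1}
assert not promising_by_args(arg, "stack", ["putdown"])
```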

Inner-entanglement extraction
Algorithm 2 wraps up the method for extracting inner entanglements. Given the generated training plans, we can fill the arrays entP, entS, and counter by running Algorithm 1. An initial value init-fr of the flaw ratio is assigned. The main loop (Lines 4-12) iteratively validates that using the given flaw ratio does not lead to the extraction of entanglements that do not hold in the training tasks. The validation is done by extracting the nontrivial inner entanglements using the current flaw ratio (Line 5), filtering out unpromising inner entanglements (Line 6), generating reformulated training problems that incorporate the extracted entanglements (Line 7), and running a planner on these reformulated problems (Line 8). Introducing the flaw ratio may cause the set of extracted inner entanglements not to hold even for the training problems. If such a situation occurs, the flaw ratio is decreased by step (Line 11), and the process (from Line 4) is repeated. Clearly, if the flaw ratio is 0, then the set of extracted inner entanglements holds for the training tasks.
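The validation loop of Algorithm 2 can be sketched roughly as follows. The helpers (`extract`, `filter_unpromising`, `reformulate`, `solvable`) are placeholders for the steps named in the text, not actual APIs; the default initial value and step match those used in the experiments (0.2 and 0.05).

```python
def extract_inner_entanglements(training_tasks, training_plans,
                                extract, filter_unpromising,
                                reformulate, solvable,
                                init_fr=0.2, step=0.05):
    """Sketch of Algorithm 2: decrease the flaw ratio until the
    extracted (and filtered) inner entanglements keep all training
    tasks solvable. The four callables are hypothetical placeholders
    for the steps described in the text."""
    fr = init_fr
    while True:
        # Lines 5-6: extract nontrivial entanglements, filter unpromising ones
        ents = filter_unpromising(extract(training_plans, fr))
        # Lines 7-8: reformulate the training tasks and try to solve them
        reformed = [reformulate(t, ents) for t in training_tasks]
        if fr <= 0.0 or all(solvable(t) for t in reformed):
            # with flaw ratio 0, the set trivially holds for the training tasks
            return ents
        fr = round(fr - step, 10)  # Line 11: e.g. 0.2 -> 0.15 -> ... -> 0.0
```

As a design note, the loop always terminates: the flaw ratio strictly decreases until it reaches 0, where the extracted set holds by construction.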

EXPERIMENTAL EVALUATION
This section is devoted to the empirical evaluation of the impact of entanglements on the plan generation process. The aims of the experiments are to analyze the impact of inner entanglements on state-of-the-art planning engines and how the quality of training plans influences the detection and extraction of inner entanglements. For empirical evaluation purposes, we used all the domains from the learning track of IPC-7; since inner entanglements are automatically extracted domain-specific knowledge, the learning track benchmarks seem to be appropriate. This test set is thus independent and open and gives a relatively wide coverage. In each domain, the planning tasks have the same domain model and, thus, differ only in the planning problem specifications. Henceforth, training problems denote tasks that are used for learning entanglements, and testing problems denote tasks that are used as benchmarks.

Benchmark planners
In order to perform our analysis, we selected a number of planners according to (i) their performance in the IPCs and (ii) the variety of techniques they exploit. The selected planners are Metric-FF, 49 LPG-td, 50 LAMA, 47,51 Probe, 52,53 MpC, 54,55 Yahsp3, 56 and Mercury. 57 Metric-FF 49 is an extension of the well-known FF planner 58 that won the 2nd IPC. The FF's search strategy is a variation of hill-climbing over the space of the world states, and in FF, the goal distance is estimated by solving a relaxed task for each successor world state. Compared to the first version of FF, Metric-FF is enhanced with the goal ordering pruning technique and with the ordering knowledge provided by a goal agenda.
LPG-td won the 3rd IPC. It uses stochastic local search in a space of partial plans represented through linear action graphs, which are variants of the very well-known planning graph. 29 The search steps are graph modifications, transforming an action graph into a different one.
LAMA 47,51 won the 6th and 7th IPC (sequential satisficing track). LAMA translates the PDDL problem specification into a multivalued state variable representation ("SAS+") and searches for a plan in the space of the world states using a heuristic derived from the causal graph, which is a particular graph representing the causal dependencies of SAS+ variables. Its core feature is the use of a pseudo-heuristic derived from landmarks.
Probe 52,53 was successful in IPC-7 and IPC-8. It implements a dual-search architecture for planning, which is based on the idea of probes: single-action sequences computed without search from a given state that can quickly go deep into the state space, terminating either in the goal or in failure.
MpC 54,55 was a runner-up in the agile track of IPC-8. MpC is a SAT-based planner that exploits an extremely compact SAT representation of planning tasks and an integrated SAT solver.
Yahsp3 56 won the agile track of IPC-8. Yahsp is a heuristic search-based planner that exploits information obtained from the computation of the heuristics, which is similar to the heuristic used in FF. Such information is used to find "lookahead states" that are reachable but "far" from the current state.
Mercury 57 was a runner-up in the satisficing track of IPC-8. Similarly to LAMA, Mercury translates the PDDL representation into a SAS+ multivalued state variable representation. It then exploits the red-black heuristic, which uses only a partial delete relaxation.

Experimental setup
In machine learning, it is important to have a good-quality training set in order to maximize the outcome of the learning process. From the planning perspective, training plans should capture well the important structural aspects that generalize to the whole class of planning tasks. If training plans are too short, their structure might be over-constrained, and thus, we might extract some inner entanglements that do not hold for many typical tasks of a given class. On the other hand, planning is computationally very expensive, and thus, obtaining long training plans might be too time consuming or even impossible. Hence, we have observed that a training problem has a reasonable size when the length of its solution plan is between 20 and 100 actions, depending on the number of operators defined in the domain model (having more operators yields longer solution plans). Moreover, the number of training problems does not have to be high. This follows the observation made by Chrpa et al 59 that the set of extracted entanglements often does not change, or changes only very little, with an increasing number of training problems. Similar observations have been made when configuring portfolios of planners. 60 On the other hand, using very few training problems increases the risk of extracting inner entanglements that do not hold (we might be "lucky" enough to have a very atypical problem as a training one). Following these observations, five training problems per domain were used. Notice that in the learning track of IPC-7, 61 a set of training problems is not explicitly provided, and thus, the training problems were generated by existing problem generators.
Strict versions of inner entanglements were learned. # The benchmark planners were used to generate training plans. The flaw ratio was initially set to 0.2 and, in cases where any of the training problems became unsolvable after incorporating entanglements, ‖ it was iteratively reduced by 0.05 until either all the training problems became solvable with the entanglements considered or the flaw ratio dropped to 0.0 (for details, see Algorithm 2). Although, in the previous work, 20 the flaw ratio was set to 0.1, we observed in some preliminary experiments, performed on a small set of benchmarks (not included in the rest of this experimental analysis), that such a value is too conservative. On the other hand, setting the value above 0.2 led to the extraction of inner entanglements that often did not hold in the training problems. The threshold (see Section 7.2) was set to 20, which means that each operator must be used, on average, at least four times in each training plan.
A CPU-time cutoff of 900 seconds (15 minutes, as in the learning tracks of the IPC) was used for both learning and testing runs. All the experiments were run on a quad-core 2.8-GHz CPU machine with 4 GB of RAM. In this experimental analysis, IPC scores as defined in IPC-7 are used. For a planner P and a problem p, Time(P, p) is 0 if p is unsolved and 1∕(1 + log 10 (T p (P)∕T * p )) otherwise, where T p (P) is the CPU time needed by planner P to solve problem p and T * p is the CPU time needed by the best considered planner. Similarly, Qual(P, p) is 0 if p is unsolved and N * p ∕N p (P) otherwise, where N p (P) is the cost of the solution plan of p obtained by P and N * p is the minimal cost of a solution plan of p among all the considered planners. The IPC score on a set of problems is given by the sum of the scores achieved on each considered problem.
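For concreteness, the IPC-7 scoring scheme just described can be computed as in the following sketch; the runtimes and plan costs are made-up values for illustration only.

```python
import math

def time_score(t, t_best):
    """IPC-7 speed score for one solved problem: 1/(1 + log10(T/T*))."""
    return 1.0 / (1.0 + math.log10(t / t_best))

def quality_score(cost, best_cost):
    """IPC-7 quality score for one solved problem: N*/N."""
    return best_cost / cost

# Hypothetical example: three planners solve the same problem.
runtimes = {"A": 10.0, "B": 100.0, "C": 10.0}
t_best = min(runtimes.values())
speed = {p: time_score(t, t_best) for p, t in runtimes.items()}
assert speed["A"] == 1.0  # the fastest planner scores 1
assert speed["B"] == 0.5  # 10x slower: 1/(1 + log10(10)) = 0.5

costs = {"A": 20, "B": 25}
assert quality_score(costs["B"], min(costs.values())) == 0.8  # 20/25
```

An unsolved problem simply contributes 0, and a planner's total IPC score is the sum over all problems.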

Experimental results: the learning phase
As discussed in the literature, 59 the structure of solution plans might differ according to a planner that generated them, and hence, the set of inner entanglements extracted from such plans can differ as well. In order to improve sets of extracted inner entanglements (ie, maximize the number of useful entanglements and minimize the number of "peculiar" entanglements), we selected, for each training problem, the best-quality (shortest) plan from those produced by all the considered planners. These best-quality training plans were then used in the entanglement extraction method (see Section 7). Hereinafter, a set of inner entanglements extracted by exploiting this approach will be denoted as the "best-plan set" of inner entanglements.
Intuitively, using good-quality training plans leads to extracting good-quality DCK (inner entanglements in this case). To test this intuition, we also considered the worst-quality plans from those produced by all the considered planners (hereinafter denoted as "worst-plan set" of inner entanglements).
The results of the learning phase are as follows.
• In Gripper, Rovers, Satellite, and Spanner, no inner entanglements have been extracted, ie, both the best-plan and worst-plan sets are empty.
• In Depots, Parking, and TPP, the best-plan and worst-plan sets are the same.
• In BlocksWorld (Bw), the worst-plan set is empty, whereas the best-plan set is not empty.
• In Barman, both the best-plan and worst-plan sets are not empty but different.

# Although the compact encoding for situations where both types of inner entanglements are involved requires the nonstrict version of entanglements by succeeding, correctness is not compromised, since the strict versions of inner entanglements are special cases of the nonstrict versions.
‖ By "unsolvable," we mean those problems where the planner did not find a solution in the time limit of 600 seconds.
In the first case, the structure of the domain models prevents capturing any nontrivial inner entanglements. In the second case, the quality of training plans does not make any difference. This is due to the fact that the "important" part of the training plan structure changes much less than the quality of these training plans does, and by using the flaw ratio, small structural changes of training plans are "absorbed." The third case refers to the situation where good-quality plans usually follow the strategy of putting blocks on the table, whereas bad-quality plans usually temporarily stack blocks on other blocks. In the last case, the worst-plan set is a superset of the best-plan set. Although such a result is counterintuitive, we observed that in the Barman domain, drinks can be prepared by using clean shots or by reusing "dirty" shots if the same ingredient is put into them. Always using clean shots provides a "narrower" structure of solution plans; however, these plans are of worse quality, since the shots always have to be cleaned. It should be noted that the best-plan and worst-plan sets differed only in two out of nine domains. Although the work of Chrpa et al 59 indicates that the differences should be larger, incorporating the filtering technique for unpromising inner entanglements (see Section 7.2) into the learning method "absorbs" some of these differences.

Experimental results: the testing phase
The results shown in Table 1 demonstrate the positive impact of inner entanglements on the planning process. It can be seen that only Probe solved all the original testing problems in Bw and Depots. When inner entanglements (best-plan sets) were considered, some planners were able to solve all the testing problems in Bw, Depots, and TPP. In Parking and Barman, the results are mixed. In Parking, the overall results are rather negative; in Barman, Probe (best-plan set) and Lama (worst-plan set) benefit from inner entanglements, whereas Mercury, on the other hand, performs much worse on the inner entanglement-enhanced problems. Assuming that we can run all planners with the original and inner entanglement-enhanced domain models in parallel, then, by using inner entanglements, we can solve two more problems in Parking and three more problems in Barman. In addition, nine problems in Barman can be solved faster when inner entanglements are considered.
Although the results generally support the claim that inner entanglements can effectively prune the search space by eliminating unpromising alternatives, some results require more attention. Lama does not perform well with the best-plan set in Bw, whereas it performs considerably well with the worst-plan set in Barman. This might suggest that Lama generally performs better with worst-plan sets than with best-plan sets. We, however, believe that this observation is of a domain- and planner-specific nature and, thus, cannot be generalized. The reason for Lama's good performance with the worst-plan set in Barman lies in the fact that forcing the planner to use only clean shots makes the landmark-based heuristics more informative. On the other hand, the best-plan set in Bw forces the planner to put blocks on the table before stacking them in their goal positions. Lama, however, already performs well in the original setting, since its heuristics is well informed. Inner entanglements, in this case, might introduce some suboptimalities (as the quality results indicate) and, thus, slow down the planning process of Lama. In Mercury's case, we can observe that it already performs well on the original Barman problems. Inner entanglements, however, seem to introduce overheads and possibly make Mercury's heuristics less informative.
We have also observed that using good-quality training plans is useful for the learning process since the structure of the plans has less noise (eg, redundant actions). Despite some results of Lama that contradict this observation, we believe that the "best plan" strategy will be useful also in other learning-based techniques (eg, generating macros). Table 2 provides a comparison of the impact of the heuristics (flaw ratio, filtering) on the "quality" of the learned set of inner entanglements; it reports, for the best-plan sets of inner-entanglement encodings, the results of (F)iltering only and of (A)ll (flaw ratio and filtering), where Δ IPC Score refers to the difference in the International Planning Competition (IPC) score between the reformulated and original encodings (positive values indicate a higher score for the reformulated encoding) and "-" means that no inner entanglements were produced. Only the best-plan sets were considered for this comparison. Noticeably, in Parking and TPP, the sets are the same regardless of which heuristics are used, and thus, these domains are not listed in Table 2. Moreover, planners that did not solve any task in any of the encodings in a given domain are not listed in Table 2. The results provide clear evidence, mainly in Barman and Bw, that both heuristics, flaw ratio and filtering, are useful when applied together. Technically speaking, when only the flaw ratio is considered, the set of inner entanglements is a superset of or equal to the set of inner entanglements obtained without considering the flaw ratio. Filtering, on the other hand, removes possibly unpromising inner entanglements from the learned set. In other words, the flaw ratio and filtering heuristics provide a useful synergy for maximizing the potential of inner entanglements.

Discussion of results
This subsection is devoted to discussing the interesting aspects of the experimental analysis results.

Summary of performance improvement
Inner entanglements eliminate unpromising alternatives in the search space. As already discussed in the paper, the pruning power of inner entanglements is a key factor for their usefulness. Therefore, we proposed a method for filtering out inner entanglements whose pruning power is small (see Section 7.2). Our experiments confirmed that the filtering method often manages to filter out unpromising inner entanglements while keeping the promising ones. Inner entanglements are efficient if the exclusivity of both predicate achievement and requirement between a pair of operators holds. The reason mainly lies in the compact and informative encoding (see Section 5). Such inner entanglements were extracted in Bw (ie, putting a block on the table always after it is unstacked), in Depots (ie, loading a crate always after it is lifted), and in TPP (ie, loading goods always after buying them). Our experiments showed a performance improvement among the planners in these domains. Such results indicate that inner entanglements have a good potential for improvement. We have also identified a few cases where inner entanglements have a detrimental effect on planners (eg, Mercury in Barman). As discussed in Section 6.5, the representation of inner entanglements has an impact on the computation of heuristics. Generally speaking, despite pruning the search space, the representation of inner entanglements might introduce local minima of heuristic functions, which, in consequence, might have a detrimental effect on planning engines, since they need to search more nodes to escape such minima.

Completeness issues
As discussed earlier, our method for extracting inner entanglements follows the assumption that a set of inner entanglements that holds for a set of training planning tasks also holds for the whole class of planning tasks (ie, the testing ones). If this assumption does not hold for some tasks in the class, these tasks become unsolvable when the inner entanglements are enforced. We observed in our experiments that the majority of the reformulated tasks (obtained by encoding inner entanglements) were solved by at least one of the planners. In Barman and Parking, 5 and 16 reformulated tasks, respectively, were not solved by any of the planners. However, we obtained no evidence as to whether this was caused by their unsolvability or by these tasks being too hard for the planners; in other words, the planners ran out of time or memory on these tasks.
To alleviate the incompleteness issue, we can try to solve the original task after the reformulated one has failed. Specifically, we run the planner on the reformulated task, and if the task is shown to be unsolvable before the time limit is reached, then we run the planner on the original task. Theoretically, the unsolvability of a planning task can be identified in finite time if a complete planning engine is used. In practice, we can identify some unsolvable tasks in little time if the reachability analysis reveals that the goal cannot be reached. 4 Since we have not identified any unsolvable reformulated task within the given time limit, the same results as for the best-plan or worst-plan sets of entanglements would have applied to this approach. Alternatively, we can alleviate the incompleteness issue by manually verifying the correctness of the extracted inner entanglements or by incorporating reformulated tasks along with original tasks into planning portfolios such as PbP. 22
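The fallback scheme described above can be sketched as follows; `run_planner` is a hypothetical interface that returns a plan, an "unsolvable" verdict, or a timeout, and is not part of the paper's tooling.

```python
# Hypothetical planner outcomes (the interface is an illustrative assumption).
UNSOLVABLE, TIMEOUT = "unsolvable", "timeout"

def solve_with_fallback(run_planner, reformulated_task, original_task,
                        time_limit):
    """Run the reformulated (entanglement-enhanced) task first; fall back
    to the original task only if the reformulated one is *proved*
    unsolvable within the limit. This restores completeness while keeping
    the pruning benefit in the common case."""
    result = run_planner(reformulated_task, time_limit)
    if result == UNSOLVABLE:
        # incompleteness introduced by the entanglements: retry unreformulated
        return run_planner(original_task, time_limit)
    return result  # a plan, or TIMEOUT (no verdict on solvability)
```

Note that a timeout gives no verdict, so no fallback is triggered in that case, which matches the observation in the text that timeouts and memory exhaustion leave the solvability question open.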

Improvement to the quality of plans generated
In general, inner entanglements do not guarantee the optimality of solution plans. Strengthening the definitions of inner entanglements to guarantee plan optimality is, of course, theoretically possible. Given the complexity results for "normal" inner entanglements, we can expect the same for "optimal" inner entanglements. Using the approximation algorithm for extracting inner entanglements on optimal training plans with a zero flaw ratio might extract some useful "optimal" inner entanglements. However, we believe that there is a high risk of extracting incorrect "optimal" inner entanglements. For example, the previously mentioned inner entanglements in the Depots domain are "optimal" for problems where each crate must be delivered to a different location. If, in some problem, a crate must be stacked on a different pallet but within the same location, such inner entanglements will force the planner (even an optimal one) to produce suboptimal plans. Regarding satisficing planning, these entanglements will prevent planners from finding a plan only if no truck is available. Such a problem is very atypical. Hence, there is a very low risk of extracting incorrect "normal" inner entanglements, and similar observations can be made in other domains. Our experimental results have not indicated any case in which the extracted set of inner entanglements did not hold.

Relationship to other pruning or problem reformulation techniques
Although there are several techniques based on pruning or problem reformulation (discussed in the Related Work section), inner entanglements are complementary to them. Pruning techniques are often an inseparable part of advanced planning engines. We used several such planning engines, successful in past IPCs, in our experiments and demonstrated that inner entanglements can often significantly improve their performance. Outer entanglements 19,20 prune unpromising instances of planning operators according to their relations with initial or goal atoms. Inner entanglements are complementary to outer entanglements, as has already been demonstrated in previous work. 20 Another well-known technique for reformulating domain models is learning macros. A recent work introducing ASAP, a planner based on the algorithm selection approach that selects the best (planner, encoding) pair for a given domain, has shown that inner entanglements and the combination of outer and inner entanglements often outperform macros. 38 Exploiting a natural property of inner entanglements, ie, the exclusivity of predicate achievement or requirement, has also been used for generating macros. 28 Such macros can be, in some cases, beneficial; however, this approach cannot be used in cases where operators in an inner-entanglement relation cannot be applied consecutively. Inner entanglements can also support other learning techniques used in planning. Roller 10 is a system that learns decision trees that are then used to guide depth-first search. Combining Roller with entanglements (both inner and outer) in a system called Rollent also brought promising results. 62

CONCLUSIONS AND FUTURE WORK
In this paper, we have presented inner entanglements, which are relations between pairs of planning operators and predicates such that an operator exclusively achieves a predicate for another operator or an operator exclusively requires a predicate from another operator. To deal with the intractability of deciding whether a given inner entanglement holds for a given planning task (see Section 6), we used an approximation method for extracting "domain-specific" sets of inner entanglements from training plans, ie, solution plans of simple tasks. Inner entanglements can be encoded into domain models without extending the input language of a planner (see Section 5), and therefore, they can be understood and exploited as planner-independent knowledge.
Inner entanglements can considerably improve the planning process, as our experiments demonstrated. In Bw, Depots, and TPP, a considerable performance improvement was observed for almost all the planners. As discussed before, inner entanglements are especially powerful if the exclusivity of both predicate achievement and requirement between a given pair of operators holds, which is the case in Bw, Depots, and TPP. Generally, inner entanglements have a good potential for performance improvement if they do not "clash" with a given planning technique, as demonstrated in Barman (Mercury) and Bw (Lama).
The pruning power of inner entanglements is a crucial aspect of their success. In particular, we need to avoid creating them with rarely used operators and in cases where the argument count of an entangled operator is higher than that of certain other operators in the domain model (as explained in Section 7.2). Incorporating the aforementioned filtering technique into the inner-entanglement learning method alleviated most of the performance concerns raised in previous works. 20,21 We identified several avenues for future research. First, we believe that inner entanglements can be considered directly in heuristics, rather than being encoded in PDDL, by, for instance, penalizing possibilities that violate these entanglements. Second, we believe that inner entanglements can be encoded, for instance, in decision trees or control rules. This might improve the performance of related planners, ie, Roller 10 or TALplanner. 12 Third, we will investigate in which cases deciding nontrivial inner entanglements is tractable. Given the insights in this paper (see Section 6.4), we believe that, by analyzing the domain structure, we can identify some useful inner entanglements in polynomial time. Finally, given the encouraging spread of results among sets of planners and domains, we intend to work toward including an inner entanglement-generating facility as part of a knowledge engineering workbench.