How Intractability Spans the Cognitive and Evolutionary Levels of Explanation

Abstract The challenge of explaining how cognition can be tractably realized is widely recognized. Classical rationality is thought to be intractable due to its assumptions of optimization and/or domain generality, and proposed solutions therefore drop one or both of these assumptions. We consider three such proposals: Resource‐Rationality, the Adaptive Toolbox theory, and Massive Modularity. All three seek to ensure the tractability of cognition by shifting part of the explanation from the cognitive to the evolutionary level: Evolution is responsible for producing the tractable architecture. We consider the three proposals and show that, in each case, the intractability challenge is not thereby resolved, but only relocated from the cognitive level to the evolutionary level. We explain how non‐classical accounts do not currently have the upper hand on the new playing field.

The class P is the class of decision problems that can be solved using a polynomial-time algorithm, i.e., an algorithm that takes on the order of $n^c$ basic computational steps for some constant $c$ (also denoted as $O(n^c)$), where $n$ is the size of the input. The class NP is the class of decision problems with the property that if the answer is Yes, then there exists a proof that the answer is Yes (called a 'witness' or 'certificate') that can be verified in polynomial time. It is known that P ⊆ NP, and conjectured that P ≠ NP (Fortnow, 2009). Since this conjecture is widely believed by computer scientists and mathematicians, and to the best of our knowledge none of our intended interlocutors question it, we assume it here and throughout the main text of the article.
Using the P ≠ NP conjecture it can be proven that some problems in (or outside) NP are not in P, and hence not polynomial-time solvable. Among them are the NP-hard problems. NP-hard problems have the property that if any one of them were polynomial-time solvable, then P = NP (which would contradict the P ≠ NP conjecture). Hence, assuming P ≠ NP, a problem can be shown to be intractable (i.e., not polynomial-time solvable) by proving it NP-hard.
A problem can be proven NP-hard using the technique of polynomial-time reduction. It works as follows: Let $F : I \to O$ be a known intractable (NP-hard) problem, and let $F' : I' \to O'$ be a new problem of interest. Then we say $F$ polynomial-time reduces to $F'$ if there exist two tractable (i.e., polynomial-time) algorithms A and B, where A can transform any input $i \in I$ into an input $A(i) = i' \in I'$ such that $B(F'(A(i))) = F(i)$. Observe that if $F'$ were tractable, then algorithms A and B could be used to tractably solve $F$. Since $F$ is known to be intractable, a polynomial-time reduction from $F$ to $F'$ proves that $F'$ must be intractable too (see Figure 1 for an illustration).
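The observation that a tractable algorithm for the new problem would yield a tractable algorithm for the known hard problem can be made explicit with a short running-time bound. This is a sketch under assumptions not stated in the text: $p_A$, $p_B$ and $p_{F'}$ are hypothetical nondecreasing polynomials bounding the running times of A, B and a supposed algorithm for $F'$, and output sizes are bounded by running times.

```latex
% Input i of size n; A, B and the hypothetical solver for F' run in time p_A, p_B, p_{F'}.
\begin{align*}
  |A(i)| &\le p_A(n), \\
  \text{time to compute } F'(A(i)) &\le p_{F'}\bigl(p_A(n)\bigr), \\
  \text{time to compute } B\bigl(F'(A(i))\bigr) &\le p_B\bigl(p_{F'}(p_A(n))\bigr), \\
  T(n) &\le p_A(n) + p_{F'}\bigl(p_A(n)\bigr) + p_B\bigl(p_{F'}(p_A(n))\bigr).
\end{align*}
% A composition of polynomials is a polynomial, so T(n) is polynomial in n.
```

Since $T(n)$ would be polynomial, $F$ would be polynomial-time solvable, contradicting its assumed intractability.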
Problems that are both NP-hard and in NP are called NP-complete. The example decision problem that we presented, Exact Cover by 3-Sets (X3C), is a known NP-complete problem (Garey & Johnson, 1979). We will use this decision problem as a starting point for our proofs. In a sense, it does not matter which problem we use, since it is known that every NP-complete problem can be polynomial-time reduced to every other NP-complete problem.
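Membership in NP, in the sense of having witnesses that can be checked in polynomial time, can be illustrated for Exact Cover by 3-Sets. The sketch below is illustrative only: the function name `verify_x3c_certificate` and the encoding of a witness as a list of indices into the family are assumptions made here, not notation from the article.

```python
from typing import FrozenSet, Sequence, Set


def verify_x3c_certificate(
    universe: Set[str],
    family: Sequence[FrozenSet[str]],
    chosen: Sequence[int],
) -> bool:
    """Check a claimed exact cover in time polynomial in the input size.

    `chosen` lists 0-based indices into `family` (a hypothetical encoding).
    The certificate is accepted iff the chosen 3-sets are pairwise disjoint
    and their union equals the universe.
    """
    covered: Set[str] = set()
    for j in chosen:
        if not 0 <= j < len(family):
            return False  # index does not refer to a set in the family
        triple = family[j]
        if len(triple) != 3 or not triple <= universe:
            return False  # not a 3-element subset of the universe
        if covered & triple:
            return False  # overlap: some element would be covered twice
        covered |= triple
    return covered == universe


universe = {"u1", "u2", "u3", "u4", "u5", "u6"}
family = [
    frozenset({"u1", "u2", "u3"}),
    frozenset({"u4", "u5", "u6"}),
    frozenset({"u3", "u4", "u5"}),
]
print(verify_x3c_certificate(universe, family, [0, 1]))
print(verify_x3c_certificate(universe, family, [0, 2]))
```

Checking a witness takes one pass over the chosen sets, whereas no polynomial-time method for finding one is known.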
In the main text of the paper, we consider search problems rather than decision problems. That is, the problems are not concerned with answering a binary Yes-No question for a given input, but with computing a different output. For example, C-Architecture Adaptation deals with the problem of computing an architecture $C \in \mathcal{C}$ with certain properties, if one exists.
Notions of intractability that are typically used in computational complexity theory (e.g., NP-hardness) are based on decision problems. In the proofs that follow in this supplementary material (in Section 2), we use these hardness notions to show that search problems are not tractably solvable. We do this by showing that if (1) we could solve the search problems tractably, then (2) we could also solve some NP-hard decision problem in polynomial time. Formally, we show this implication by using what is called a polynomial-time Turing reduction. This gives us as a result the statement that we cannot solve these search problems tractably, unless P = NP.
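The step from a search algorithm to a decision procedure can be sketched generically: call the search algorithm once, then check its output in polynomial time. The example below uses Subset Sum purely as a stand-in NP-hard decision problem (it is not the problem used in the proofs), and the brute-force `brute_force_search` is a hypothetical placeholder for a tractable search algorithm that is assumed, for the sake of argument, to exist.

```python
from itertools import combinations
from typing import Callable, Optional, Sequence, Tuple

Instance = Tuple[Sequence[int], int]  # (numbers, target)
Solution = Tuple[int, ...]            # indices of the chosen numbers


def brute_force_search(instance: Instance) -> Optional[Solution]:
    """Placeholder oracle: exponential time, used only to make the sketch run."""
    numbers, target = instance
    for size in range(len(numbers) + 1):
        for idx in combinations(range(len(numbers)), size):
            if sum(numbers[i] for i in idx) == target:
                return idx
    return None


def is_valid(instance: Instance, idx: Solution) -> bool:
    """Polynomial-time check that `idx` really solves the instance."""
    numbers, target = instance
    return (
        len(set(idx)) == len(idx)
        and all(0 <= i < len(numbers) for i in idx)
        and sum(numbers[i] for i in idx) == target
    )


def decide_via_search(
    instance: Instance,
    search_solver: Callable[[Instance], Optional[Solution]],
) -> bool:
    """Answer the Yes-No question with one oracle call plus one cheap check."""
    candidate = search_solver(instance)
    return candidate is not None and is_valid(instance, candidate)


print(decide_via_search(([3, 5, 8], 13), brute_force_search))
print(decide_via_search(([3, 5, 8], 4), brute_force_search))
```

Apart from the oracle call, `decide_via_search` does only polynomial work, so a polynomial-time search algorithm would give a polynomial-time decision procedure.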

Proofs
Theorem 1. C-Architecture Adaptation is NP-hard.
Proof. We show NP-hardness by a polynomial-time reduction from X3C that works as follows. We take an input of X3C and use it to construct an input of C-Architecture Adaptation. Then we show that any polynomial-time algorithm for C-Architecture Adaptation can be used to decide in polynomial time whether the answer for the original input of X3C is "yes" or "no." Let $(U, \mathcal{F})$ be an instance of X3C, where $U = \{u_1, \dots, u_n\}$, where $\mathcal{F} = \{F_1, \dots, F_m\}$, and where $|F_j| = 3$ and $F_j \subseteq U$ for each $1 \leq j \leq m$. We construct an input of C-Architecture Adaptation as follows.
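We take the set $S = \{s_0, s_1, \dots, s_n\}$ of relevant situations, and we take the set $A = \{a_1, a_2\}$ of actions. All representations in $R$ are binary strings of length $m$. The perception function $p : S \to R$ is defined as follows. We let $p(s_0) = 000 \dots 0$, i.e., the string consisting of $m$ zeroes. For each $1 \leq i \leq n$, we let $p(s_i)$ be the string $b_1 b_2 \dots b_m$, where $b_j = 1$ if $u_i \in F_j$ and $b_j = 0$ otherwise. The value function $m : 2^A \times S \to [0, 1]$ is defined by $m(B, s_0) = 1$ if $B = \{a_2\}$ and $m(B, s_0) = 0$ otherwise, and, for each $1 \leq i \leq n$, by $m(B, s_i) = 1$ if $B = \{a_1\}$ and $m(B, s_i) = 0$ otherwise. In other words, for situation $s_0$, only the singleton set $\{a_2\}$ gives value 1, and all other sets of actions give value 0, and for all other situations $s_i$, only the singleton set $\{a_1\}$ gives value 1, and all other sets of actions give value 0. This construction is illustrated for an example instance of X3C in Figure 2.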
[Figure 2: the construction for an example instance of X3C. In this example, the architecture $C_{\{1,5\}}$ achieves a maximal cumulative value for $m$.]
As the class $\mathcal{C}$ of architectures, we consider the set of all singleton sets $C_L = \{c_L\}$, where $L \subseteq \{1, \dots, m\}$ is a set of size exactly $n/3$. Each such $c_L : R \to A$ is the function that is defined as follows. For $L = \{i_1, \dots, i_{n/3}\}$, the function $c_L$ takes as input a binary string $r \in R$ of length $m$, and it returns $a_1$ if there is some $i_j \in L$ such that the $i_j$-th bit of $r$ equals 1, and it returns $a_2$ otherwise. Moreover, each such $C_L = \{c_L\}$ has a fixed constant cost, say, $\mathrm{cost}(C_L) = 0$. Finally, we let $v_{\min} = 1$ and $d_{\max} = 0$.
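The construction can be made concrete with a small script. This is a minimal sketch under a stated assumption: the $j$-th bit of $p(s_i)$ is taken to be 1 exactly when $u_i \in F_j$, which is the reading under which $c_L$ returns $a_1$ in $s_i$ precisely when some selected set covers $u_i$. All function names and the example instance are hypothetical and are not the instance of Figure 2.

```python
from typing import Callable, FrozenSet, Iterable, List, Sequence, Set


def perception_strings(universe: Sequence[str], family: Sequence[Set[str]]) -> List[str]:
    """Return [p(s_0), p(s_1), ..., p(s_n)] as bit strings of length m.

    Assumption: bit j of p(s_i) is '1' iff u_i is an element of F_j.
    """
    strings = ["0" * len(family)]  # p(s_0) is the all-zero string
    for u in universe:
        strings.append("".join("1" if u in f_j else "0" for f_j in family))
    return strings


def value(actions: FrozenSet[str], situation_index: int) -> int:
    """m(B, s): only {a2} scores in s_0, only {a1} scores in every other s_i."""
    wanted = frozenset({"a2"}) if situation_index == 0 else frozenset({"a1"})
    return 1 if actions == wanted else 0


def make_c_L(L: Iterable[int]) -> Callable[[str], str]:
    """Build c_L for a set L of 1-based bit positions."""
    positions = sorted(L)

    def c_L(r: str) -> str:
        return "a1" if any(r[j - 1] == "1" for j in positions) else "a2"

    return c_L


def average_value(L: Iterable[int], strings: Sequence[str]) -> float:
    """Average of m(C_L(p(s)), s) over all situations s."""
    c_L = make_c_L(L)
    total = sum(value(frozenset({c_L(r)}), i) for i, r in enumerate(strings))
    return total / len(strings)


universe = ["u1", "u2", "u3", "u4", "u5", "u6"]
family = [{"u1", "u2", "u3"}, {"u3", "u4", "u5"}, {"u4", "u5", "u6"}]
strings = perception_strings(universe, family)
print(strings)
print(average_value({1, 3}, strings))  # F_1 and F_3 cover U
print(average_value({1, 2}, strings))  # u6 is left uncovered
```

In this toy instance the set $L = \{1, 3\}$ reaches $v_{\min} = 1$, while $L = \{1, 2\}$ does not, because the wrong action is chosen in the situation for $u_6$.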
We now show that we can use any polynomial-time algorithm A for C-Architecture Adaptation to decide whether there exists a subset $\mathcal{F}' \subseteq \mathcal{F}$ of size $|\mathcal{F}'| = n/3$ such that $\bigcup \mathcal{F}' = U$. We do so by running the algorithm A on the instance of C-Architecture Adaptation that we constructed. This algorithm then either (i) outputs some $C_L \in \mathcal{C}$, or (ii) outputs something else, e.g., the string "none found." We consider these two cases separately.
In case (i), we check whether $\sum_{s \in S} m(C_L(p(s)), s)/|S| \geq v_{\min} = 1$. If this is the case (i.a), we know that in each situation $s \in S$, the architecture $C_L$ outputs a set $B$ of actions such that $m(B, s) = 1$. Then, by construction of the input, this can only be the case if $\mathcal{F}' = \{F_j \mid j \in L\}$ has the property that $\bigcup \mathcal{F}' = U$. Moreover, we also know that $|\mathcal{F}'| = n/3$. Thus, we can conclude that the answer to the original input for X3C is "yes."
Next, we show that in case (i.b), where $\sum_{s \in S} m(C_L(p(s)), s)/|S| < 1$, and in case (ii), where the algorithm A outputs something that is not in $\mathcal{C}$, the answer to the original input for X3C is "no." In both cases, the algorithm does not output an architecture $C$ with the property that $\sum_{s \in S} m(C(p(s)), s)/|S| \geq v_{\min} = 1$. We show that if the answer to the original input for X3C were "yes," then there would exist an architecture $C$ with the property that $\sum_{s \in S} m(C(p(s)), s)/|S| \geq v_{\min} = 1$, which contradicts our assumption that the algorithm A works correctly to solve C-Architecture Adaptation.
Suppose that there is some $\mathcal{F}' \subseteq \mathcal{F}$ such that $|\mathcal{F}'| = n/3$ and $\bigcup \mathcal{F}' = U$. Then take the architecture $C_L$, where $L$ contains all $1 \leq j \leq m$ such that $F_j \in \mathcal{F}'$. Then $|L| = n/3$, since $|\mathcal{F}'| = n/3$. Moreover, one can straightforwardly verify that in each $s \in S$ the architecture $C_L$ outputs a set $B$ of actions such that $m(B, s) = 1$. In other words, $\sum_{s \in S} m(C_L(p(s)), s)/|S| = 1$. This concludes our proof.
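The case analysis of the proof can be sketched as a decision procedure that wraps an arbitrary solver for the constructed instance. Everything below is illustrative: `brute_force_adaptation` is an exponential-time stand-in for the hypothetical polynomial-time algorithm A, an architecture $C_L$ is encoded simply as the set $L$, and the perception strings assume that bit $j$ of $p(s_i)$ is 1 exactly when $u_i \in F_j$.

```python
from itertools import combinations
from typing import Callable, List, Sequence, Set


def perception_strings(universe: Sequence[str], family: Sequence[Set[str]]) -> List[str]:
    """[p(s_0), ..., p(s_n)]; assumed encoding: bit j of p(s_i) is '1' iff u_i in F_j."""
    rows = ["".join("1" if u in f else "0" for f in family) for u in universe]
    return ["0" * len(family)] + rows


def average_value(L, strings: Sequence[str]) -> float:
    """Average value of C_L: s_0 always scores 1 (all bits are 0, so a2 is chosen);
    s_i scores 1 iff some selected bit of p(s_i) is 1 (so a1 is chosen)."""
    hits = 1 + sum(any(r[j - 1] == "1" for j in L) for r in strings[1:])
    return hits / len(strings)


def brute_force_adaptation(universe, family):
    """Stand-in for algorithm A (exponential time): return a set L or 'none found'."""
    strings = perception_strings(universe, family)
    k = len(universe) // 3
    for L in combinations(range(1, len(family) + 1), k):
        if average_value(L, strings) >= 1:
            return set(L)
    return "none found"


def decide_x3c(universe, family, solver: Callable) -> bool:
    """Decide X3C with one call to `solver` and a polynomial-time check."""
    if len(universe) % 3 != 0:
        return False
    output = solver(universe, family)
    k = len(universe) // 3
    in_class = (
        isinstance(output, (set, frozenset))
        and len(output) == k
        and all(isinstance(j, int) and 1 <= j <= len(family) for j in output)
    )
    if not in_class:
        return False  # case (ii): output is not an architecture in the class
    strings = perception_strings(universe, family)
    return average_value(output, strings) >= 1  # case (i.a) yes, case (i.b) no


universe = ["u1", "u2", "u3", "u4", "u5", "u6"]
yes_family = [{"u1", "u2", "u3"}, {"u3", "u4", "u5"}, {"u4", "u5", "u6"}]
no_family = [{"u1", "u2", "u3"}, {"u3", "u4", "u5"}]
print(decide_x3c(universe, yes_family, brute_force_adaptation))
print(decide_x3c(universe, no_family, brute_force_adaptation))
```

Because the output is re-checked, a solver that returns a low-value architecture or an unrelated string leads to the answer "no", mirroring cases (i.b) and (ii).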
Theorem 2. C-Architecture Adaptation is NP-hard for $\mathcal{C} = \{C \mid C \text{ is an adaptive toolbox}\}$.
Proof. To prove this, we can directly use the proof of Theorem 1. The class $\mathcal{C}$ of architectures used in that proof is a subset of all architectures implemented by a toolbox that looks at no more than $n/3$ cues. Moreover, all arguments in the proof carry through if instead we consider the class of all fast-and-frugal trees of size at most $n/3$ (each of which has constant cost). Therefore, this proof also shows that C-Architecture Adaptation is NP-hard for $\mathcal{C} = \{C \mid C \text{ is an adaptive toolbox}\}$.
Note that the number $k = n/3$ grows linearly in the number $|S|$ of relevant situations. However, by making copies of situation $s_0$ in the proof of Theorem 1, we can make the number $k$ (i.e., the number of bits any given heuristic can access) much smaller than the number $|S|$ of relevant situations.
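The effect of copying $s_0$ can be illustrated with a short calculation. This is one possible reading of the remark, under assumptions made here: the copies are distinct situations that share the perception string and value function of $s_0$, and the symbol $t$ for the total number of such situations is introduced only for this illustration.

```latex
% t situations behave like s_0 (the original plus t - 1 copies); k is unchanged.
\begin{align*}
  |S| &= n + t, \qquad k = \frac{n}{3}, \\
  t = n^2 - n \;&\Longrightarrow\; |S| = n^2
      \;\Longrightarrow\; k = \frac{\sqrt{|S|}}{3}.
\end{align*}
% More generally, t = n^d - n gives k = |S|^{1/d} / 3 for any constant d >= 2,
% while the constructed instance stays polynomial in the size of the X3C input.
```

Since every copy receives value 1 whenever $s_0$ does, the threshold $v_{\min} = 1$ is met by exactly the same architectures as before.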
Theorem 3. C-Architecture Adaptation is NP-hard for $\mathcal{C} = \{C \mid C \text{ is a massively modular architecture}\}$.
Proof. To prove this, we use a modified version of the proof of Theorem 1. What we change in the proof of Theorem 1 is the following. We consider a set $C_{\text{all}} = \{c_0, c_1, \dots, c_m\}$ of functions, where $c_0$ is the function that looks at the first bit and always outputs $\{a_2\}$ (regardless of the value of the first bit), and where for each $1 \leq j \leq m$, the function $c_j$ looks at the $j$-th bit, outputs the action $a_1$ if the bit equals 1, and outputs the action $a_2$ if the bit equals 0. Then, as the class of architectures, we take $\mathcal{C} = \{C \mid C \subseteq C_{\text{all}}, |C| \leq |S|/3\}$. Each such $C \in \mathcal{C}$ has a fixed constant cost, say, $\mathrm{cost}(C) = 0$.
All arguments in the proof of Theorem 1 carry through (in an analogous form) if we consider this modified construction and the adapted class C of architectures. Moreover, these architectures fit the conditions of massively modular architectures. Therefore, this proof also shows that Massively Modular Architecture Adaptation is NP-hard.
Theorem 4. C-Architecture Adaptation is NP-hard for $\mathcal{C} = \{C \mid C \text{ is a resource-rational architecture}\}$.
Proof. To prove this, we use a modified version of the proof of Theorem 1. What we change from the proof of Theorem 1 is the following. As the class $\mathcal{C}$ of architectures, we consider the set of all singleton sets $C = \{c\}$ such that $c$ is any polynomial-time computable function $c : R \to A$. For each such $C = \{c\}$, the cost $\mathrm{cost}(C)$ is the maximum number of bits of $p(s) \in R$ that $c$ looks at in any situation $s \in S$.
By using arguments that are entirely analogous to the arguments used in the proof of Theorem 1 for this class $\mathcal{C}$ of architectures, we get that the answer to the original input of X3C is "yes" if and only if there is an architecture $C$ of cost $n/3$ that has the property that $\sum_{s \in S} m(C(p(s)), s)/|S| \geq v_{\min} = 1$. Moreover, by construction there is no architecture $C$ of cost less than $n/3$ that has the property that $\sum_{s \in S} m(C(p(s)), s)/|S| \geq v_{\min} = 1$. Therefore, analogously to the arguments in the proof of Theorem 1, we can use any polynomial-time algorithm A for C-Architecture Adaptation to decide X3C in polynomial time.
Since this class $\mathcal{C}$ of architectures fits the conditions of resource-rational architectures, this proof shows that if $\mathcal{C} = \{C \mid C \text{ is a resource-rational architecture}\}$, then C-Architecture Adaptation is NP-hard.
Theorem 5. C-Architecture Adaptation is polynomial-time solvable for $\mathcal{C} = \{C \mid C \text{ is a classically rational architecture}\}$.
Proof. We show that C-Architecture Adaptation is polynomial-time solvable for $\mathcal{C} = \{C \mid C \text{ is a classically rational architecture}\}$ by describing a polynomial-time algorithm A that solves this problem. The algorithm A takes as input a description of the sets $S$, $R$ and $A$. It also takes as input a description of the functions $p : S \to R$ and $m : 2^A \times S \to [0, 1]$. Without loss of generality, we suppose that these functions $p$ and $m$ are given in the form of (a description of) Turing machines that compute these functions.
The algorithm A outputs an architecture $C = \{c\}$ that achieves a maximal value of $\sum_{s \in S} m(C(p(s)), s)/|S|$. The function $c : R \to A$ is computed by an algorithm B, which does not necessarily run in polynomial time. This algorithm B does the following: By doing so, B maximizes the value of $\sum_{s \in S} m(C(p(s)), s)/|S|$.
Given a description of the sets S, R and A, and a description of the algorithms that compute the functions m and p, algorithm A can in polynomial time construct a description of a Turing machine that computes algorithm B as given above. This is the case, because the algorithm consisting of steps (1)-(4) is always the same, modulo some input parameters-only S, R, A and the descriptions of algorithms computing m and p change. These changing parameters can easily be plugged into a pre-defined algorithm in polynomial time. By doing so, A solves the problem C-Architecture Adaptation in polynomial time for C = {C | C is a classically rational architecture}.
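The division of labour between A and B can be sketched in code: A only binds the changing parameters into a fixed template, while all expensive work is deferred to B. The body of `B` below is a hypothetical exhaustive procedure chosen for illustration and is not claimed to be the article's steps (1)-(4); the names `make_B`, `situations`, `actions`, `p` and `m`, and the toy instance, are likewise assumptions.

```python
from itertools import chain, combinations
from typing import Callable, FrozenSet, Sequence, Set


def make_B(
    situations: Sequence[str],
    actions: Set[str],
    p: Callable[[str], str],
    m: Callable[[FrozenSet[str], str], float],
) -> Callable[[str], FrozenSet[str]]:
    """Role of algorithm A: plug the parameters into a fixed template.

    No search happens here; constructing B takes a constant number of steps.
    """

    def B(r: str) -> FrozenSet[str]:
        # Hypothetical exhaustive stand-in for the architecture's computation:
        # among all sets of actions, pick one with maximal total value over
        # the situations that are perceived as r. Not polynomial in general.
        consistent = [s for s in situations if p(s) == r]
        ordered = sorted(actions)
        candidates = chain.from_iterable(
            combinations(ordered, k) for k in range(len(ordered) + 1)
        )
        return max(
            (frozenset(c) for c in candidates),
            key=lambda chosen: sum(m(chosen, s) for s in consistent),
        )

    return B


# Toy instance: s0 is perceived as "00", s1 as "10".
percepts = {"s0": "00", "s1": "10"}
best = {"s0": frozenset({"a2"}), "s1": frozenset({"a1"})}
B = make_B(
    situations=["s0", "s1"],
    actions={"a1", "a2"},
    p=lambda s: percepts[s],
    m=lambda chosen, s: 1.0 if chosen == best[s] else 0.0,
)
print(B("00"))
print(B("10"))
```

Choosing the best set of actions separately for each representation maximizes the total value, because the choice made for one representation does not affect the situations perceived as another.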