Concentration estimates for functions of finite high-dimensional random arrays

Let $\boldsymbol{X}$ be a $d$-dimensional random array on $[n]$ whose entries take values in a finite set $\mathcal{X}$, that is, $\boldsymbol{X}=\langle X_s:s\in \binom{[n]}{d}\rangle$ is an $\mathcal{X}$-valued stochastic process indexed by the set $\binom{[n]}{d}$ of all $d$-element subsets of $[n]:=\{1,\dots,n\}$. We give easily checked conditions on $\boldsymbol{X}$ that ensure, for instance, that for every function $f\colon \mathcal{X}^{\binom{[n]}{d}}\to\mathbb{R}$ that satisfies $\mathbb{E}[f(\boldsymbol{X})]=0$ and $\|f(\boldsymbol{X})\|_{L_p}=1$ for some $p>1$, the random variable $f(\boldsymbol{X})$ becomes concentrated after conditioning it on a large subarray of $\boldsymbol{X}$. These conditions cover several classes of random arrays with not necessarily independent entries. Applications are given in combinatorics, and examples are also presented that show the optimality of various aspects of the results.

1. Introduction

1.1. Motivation. The concentration of measure refers to the powerful phenomenon asserting that a function that depends smoothly on its variables is essentially constant, as long as the number of variables is large enough. There are various ways to quantify this "smooth dependence" (e.g., Lipschitz conditions, bounds for the $L_2$ norm of the gradient, etc.). Detailed expositions can be found in [Le01, BLM13].
It is easy to see that this phenomenon is no longer valid if we drop the smoothness assumption. Nevertheless, one can still obtain some form of concentration under a much milder integrability condition.
Theorem ([DKT16, Theorem 1′]). For every p > 1 and every 0 < ε ⩽ 1, there exists a constant c > 0 with the following property. If n ⩾ 2/c is an integer, X = (X_1, . . . , X_n) is a random vector with independent entries that take values in a measurable space X, and f : X^n → R is a measurable function with E[f(X)] = 0 and ∥f(X)∥_{L_p} = 1, then there exists an interval I of [n] with |I| ⩾ cn such that for every nonempty J ⊆ I we have

(1.1) P(|E[f(X) | F_J]| ⩾ ε) ⩽ ε,

where E[f(X) | F_J] stands for the conditional expectation of f(X) with respect to the σ-algebra F_J := σ({X_i : i ∈ J}).
(Here, and in what follows, [n] denotes the discrete interval {1, . . . , n}.) Roughly speaking, this result asserts that if a function of several variables is sufficiently integrable, then, by integrating out some coordinates, it becomes essentially constant. It was motivated by-and it has found several applications in-problems in combinatorics (see [DK16]).
1.1.1. The goal of this paper is twofold: to develop workable tools in order to extend the conditional concentration estimate (1.1) to functions of random vectors X with not necessarily independent entries, and to present related applications. Of course, to this end some structural property of X is necessary. We focus on high-dimensional random arrays whose distribution is invariant under certain symmetries. Besides their intrinsic analytic and probabilistic interest, our choice to study functions of random arrays is connected to the density polynomial Hales-Jewett conjecture, an important combinatorial conjecture of Bergelson [Ber96]-see Subsection 1.5.
1.2. Random arrays. At this point it is useful to recall the definition of a random array.
Definition 1.1 (Random arrays, and their subarrays/sub-σ-algebras). Let d be a positive integer, and let I be a set with |I| ⩾ d. A d-dimensional random array on I is a stochastic process X = ⟨X_s : s ∈ $\binom{I}{d}$⟩ indexed by the set $\binom{I}{d}$ of all d-element subsets of I. If J is a subset of I with |J| ⩾ d, then the subarray of X determined by J is the d-dimensional random array X_J := ⟨X_s : s ∈ $\binom{J}{d}$⟩; moreover, by F_J we shall denote the σ-algebra σ(⟨X_s : s ∈ $\binom{J}{d}$⟩) generated by X_J.
Of course, one-dimensional random arrays are just random vectors. On the other hand, two-dimensional random arrays are essentially the same as random symmetric matrices, and their subarrays correspond to principal submatrices; more generally, higher-dimensional random arrays correspond to random symmetric tensors. We employ the terminology of random arrays, however, since we are not using linear-algebraic tools.
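The indexing conventions of Definition 1.1 can be mirrored directly in code. The following minimal Python sketch (the function names are ours, purely illustrative) stores a d-dimensional random array as a dictionary keyed by the d-element subsets of [n], and extracts the subarray X_J determined by a subset J:

```python
import random
from itertools import combinations

def sample_random_array(n, d, sample_entry, rng):
    """Sample a d-dimensional random array on [n] = {1, ..., n}:
    one entry X_s for every d-element subset s of [n]."""
    return {s: sample_entry(rng) for s in combinations(range(1, n + 1), d)}

def subarray(X, J):
    """The subarray X_J: keep only the entries indexed by d-subsets of J."""
    J = set(J)
    return {s: x for s, x in X.items() if set(s) <= J}

rng = random.Random(0)
X = sample_random_array(5, 2, lambda r: r.randint(0, 1), rng)  # boolean array, n=5, d=2
Y = subarray(X, {1, 2, 3})  # indexed by the three 2-element subsets of {1, 2, 3}
```

For d = 2 this is exactly the edge-indicator picture used later for graphs on [n].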
1.2.1. Notions of symmetry. The study of random arrays with a symmetric distribution is a classical topic that goes back to the work of de Finetti; see [Au08, Au13, Kal05] for an exposition of this theory and its applications. Arguably, the most well-known notion of symmetry is exchangeability: a d-dimensional random array X on a (possibly infinite) set I is called exchangeable if for every finite permutation π of I, the random arrays X and X_π := ⟨X_{π(s)} : s ∈ $\binom{I}{d}$⟩ have the same distribution. Another well-known notion of symmetry, which is weaker than exchangeability, is spreadability: a d-dimensional random array X on a (possibly infinite) set I is called spreadable if for every pair J, K of finite subsets of I with |J| = |K| ⩾ d, the subarrays X_J and X_K have the same distribution. Infinite, spreadable, two-dimensional random arrays have been studied by Fremlin and Talagrand [FT85], and, in greater generality, by Kallenberg [Kal92].
Beyond these notions, in this paper we will also consider the following approximate form of spreadability, which naturally arises in combinatorial applications.
Definition 1.2 (Approximate spreadability). Let X be a d-dimensional random array on a (possibly infinite) set I, and let η ⩾ 0. We say that X is η-spreadable (or, simply, approximately spreadable if η is clear from the context) provided that for every pair J, K of finite subsets of I with |J| = |K| ⩾ d we have

(1.2) ρ_TV(P_J, P_K) ⩽ η,

where P_J and P_K denote the laws of the random subarrays X_J and X_K respectively, and ρ_TV stands for the total variation distance.
We recall that the total variation distance between two probability measures P and Q on a measurable space (Ω, F) is the quantity ρ_TV(P, Q) := sup{|P(A) − Q(A)| : A ∈ F}. We also note that if Ω is discrete, then the total variation distance is related to the L_1 norm via the identity ρ_TV(P, Q) = (1/2) ∥P − Q∥_{L_1} = 1 − Σ_{ω∈Ω} min{P({ω}), Q({ω})}.

Proposition 1.3. For every triple m, n, d of positive integers with n ⩾ d, and every η > 0, there exists an integer N ⩾ n with the following property. If X is a set with |X| = m and X is an X-valued, d-dimensional random array on a set I with |I| ⩾ N, then there exists a subset J of I with |J| = n such that the random array X_J is η-spreadable.
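For discrete laws, the identity relating ρ_TV to the L_1 norm is a one-liner to check numerically. Here is a small self-contained Python sketch (our own helper, not from the paper):

```python
def tv_distance(P, Q):
    """Total variation distance between two discrete probability
    distributions, given as dicts mapping outcomes to probabilities,
    via the identity rho_TV(P, Q) = (1/2) * sum_x |P(x) - Q(x)|."""
    support = set(P) | set(Q)
    return 0.5 * sum(abs(P.get(x, 0.0) - Q.get(x, 0.0)) for x in support)

# Two Bernoulli laws: the distance equals the gap between success probabilities.
d = tv_distance({0: 0.25, 1: 0.75}, {0: 0.5, 1: 0.5})  # = 0.25
```

One can also verify on such examples that the value agrees with 1 − Σ min{P({ω}), Q({ω})}, as stated above.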
1.3. The concentration estimate. We are ready to state one of the main extensions of (1.1) obtained in this paper; the question of whether (1.1) can hold for random vectors with not necessarily independent entries was asked by an anonymous reviewer of [DKT16], as well as by several colleagues in personal communication. In this introduction we restrict our discussion to boolean two-dimensional random arrays, mainly because this case is easier to grasp, but at the same time it is quite representative of the higher-dimensional case. The general version is presented in Theorem 5.1 in Section 5; further extensions/refinements are given in Section 6.
Theorem 1.4. Let 1 < p ⩽ 2, let 0 < ε ⩽ 1, let k ⩾ 2 be an integer, and let C = C(p, ε, k) be as in (1.3). Also let n ⩾ C be an integer, let X = ⟨X_s : s ∈ $\binom{[n]}{2}$⟩ be a {0, 1}-valued, (1/C)-spreadable, two-dimensional random array on [n], and assume that (1.4) holds. Then for every function f : {0, 1}^{$\binom{[n]}{2}$} → R with E[f(X)] = 0 and ∥f(X)∥_{L_p} = 1 there exists an interval I of [n] with |I| = k such that for every J ⊆ I with |J| ⩾ 2 we have the estimate (1.5).

Recall that F_J denotes the σ-algebra generated by X_J (see Definition 1.1). Thus, Theorem 1.4 asserts that the random variable f(X) becomes concentrated after conditioning it on a subarray of X. Also observe that (1.4) together with the (1/C)-spreadability of X imply that for every i, j, k, ℓ ∈ [n] with i < j < k < ℓ we have the estimate (1.6) (see Figure 1). As we shall shortly see, as the parameter C gets larger, the estimate (1.6) forces the random variables X_{{i,k}}, X_{{i,ℓ}}, X_{{j,k}}, X_{{j,ℓ}} to behave close to independently. (It also implies that the correlation matrix of X is close to the identity.) Therefore, we may view (1.6) as an (approximate) box independence condition for X. We present various examples of spreadable random arrays that satisfy the box independence condition in Section 7.
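To see what an approximate box independence condition asserts, one can compare both sides of the box correlation by simulation. In the following Python sketch (entirely ours, for illustration), an array with genuinely independent Bernoulli entries has discrepancy close to zero, while an array whose entries are all copies of a single coin does not:

```python
import random
from itertools import combinations

def box_discrepancy(sample_array, i, j, k, l, trials, rng):
    """Monte Carlo estimate of E[X_{ik} X_{il} X_{jk} X_{jl}] minus the
    product E[X_{ik}] E[X_{il}] E[X_{jk}] E[X_{jl}], for the box {i,j} x {k,l}.
    sample_array(rng) returns one sample: a dict {2-subset: 0/1}."""
    pairs = [(i, k), (i, l), (j, k), (j, l)]
    prod_mean, means = 0.0, [0.0] * 4
    for _ in range(trials):
        X = sample_array(rng)
        vals = [X[p] for p in pairs]
        if all(vals):          # product of the four 0/1 entries
            prod_mean += 1.0
        for t in range(4):
            means[t] += vals[t]
    prod_mean /= trials
    m = [x / trials for x in means]
    return prod_mean - m[0] * m[1] * m[2] * m[3]

edges = list(combinations(range(1, 5), 2))
iid = lambda r: {s: r.randint(0, 1) for s in edges}        # independent entries
coupled = lambda r: dict.fromkeys(edges, r.randint(0, 1))  # all entries equal
```

For the coupled array the discrepancy is about 1/2 − 1/16, so the box condition fails badly, in line with the discussion above.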
Finally, we point out that (1.6) is essentially an optimal condition, in the sense that for every integer n ⩾ 4 there exist
- a boolean, exchangeable, two-dimensional random array X on [n], and
- a multilinear polynomial f : R^{$\binom{[n]}{2}$} → R of degree 4 with E[f(X)] = 0 and ∥f(X)∥_{L_∞} ⩽ 1,
such that the correlation matrix of X is the identity, and for which (1.6) and (1.5) do not hold (see Proposition A.1; the case "d ⩾ 3" is treated in Proposition A.2).
1.4. Basic steps of the proof. The first step of the proof of Theorem 1.4-which can be loosely described as its analytical part-is to show that the conditional concentration of f (X) is equivalent to an approximate form of the dissociativity of X; this is the content of Theorem 2.2 in Section 2. The proof of this step is based on estimates for martingale difference sequences in L p spaces, and it applies to random arrays with arbitrary distributions (in particular, not necessarily approximately spreadable). The main advantage of this reduction is that it enables us to forget about the function f and focus exclusively on the random array X.
The second, and more substantial, step is the verification of the approximate dissociativity of X. This is a consequence of the following theorem, which is one of the main results of this paper. (As before, at this point we restrict our discussion to boolean two-dimensional random arrays; the general version is given in Theorem 3.2.)

Theorem 1.5 (Propagation of randomness). Let n ⩾ 8 be an integer and 0 < η, ϑ ⩽ 1. Also let X = ⟨X_s : s ∈ $\binom{[n]}{2}$⟩ be a {0, 1}-valued, η-spreadable, two-dimensional random array on [n] such that for every i, j, k, ℓ ∈ [n] with i < j < k < ℓ we have the one-sided estimate (1.7).

Theorem 1.5 shows that the box independence condition propagates and forces all, not too large, subarrays of X to behave close to independently. Its proof is based on combinatorial and probabilistic ideas, and it is analogous to the phenomenon, discovered in the theory of quasirandom graphs [CGW88, CGW89], that a graph G that contains (roughly) the expected number of 4-cycles must also contain the expected number of copies of any other, not too large, graph H. We comment further on the relation between the box independence condition and quasirandomness of graphs and hypergraphs in Subsection 7.1.
1.5. Connection with combinatorics. We proceed to discuss a representative combinatorial application of our main results.
1.5.1. Families of graphs. We start by observing that for every integer n ⩾ 2 we may identify a graph G on [n] with an element of {0, 1}^{$\binom{[n]}{2}$} via its indicator function 1_G. (More generally, for every nonempty finite index set I we identify subsets of I with elements of {0, 1}^I.) Thus, we view the set {0, 1}^{$\binom{[n]}{2}$} as the space of all graphs on n vertices, and we denote by µ the uniform probability measure on {0, 1}^{$\binom{[n]}{2}$}. Our application is related to the following conjecture of Gowers [Go09, Conjecture 4].
Conjecture 1.6. Let 0 < δ ⩽ 1 and assume that n is sufficiently large in terms of δ. Then for every family of graphs
Conjecture 1.6 is a special, but critical, case of the density polynomial Hales-Jewett conjecture [Ber96]; for a detailed discussion of its significance we refer to [Go09] where Conjecture 1.6 was proposed as a polymath project.
Despite considerable interest, there is nearly no information on Conjecture 1.6 in the literature (see, however, the online discussion in [Go09]). This is partly due to the fact that, while the understanding of quasirandom graphs is very satisfactory, it is unclear what a quasirandom family of graphs actually is. Our results point precisely in this direction.
1.5.2. Quasirandom families of graphs. In order to motivate the reader, let us say that a family of graphs A ⊆ {0, 1}^{$\binom{[n]}{2}$} is isomorphic invariant if it is invariant under the action of every permutation π of [n]. If A ⊆ {0, 1}^{$\binom{[n]}{2}$} is an arbitrary isomorphic invariant family of graphs, then denote by γ(A) the unique nonnegative real satisfying the corresponding defining identity. On the other hand, notice that if A ⊆ {0, 1}^{$\binom{[n]}{2}$} is selected uniformly at random, then clearly γ(A) = µ(A)^4 + o_{n→∞}(1).

3 Note that in Theorem 1.5 we only need the one-sided version (1.7) of (1.6). Of course, in retrospect, Theorem 1.5 yields that (1.7) is actually equivalent to (1.6), albeit with a slightly different constant.

4 In fact, this is more than an analogy; indeed, it is easy to see that Theorem 1.5 yields the aforementioned property of quasirandom graphs.

5 Here, it is important to note that this is a rather basic step in the analysis of Conjecture 1.6; indeed, the combinatorial core of almost every problem in density Ramsey theory is to isolate its quasirandom and structured components; see, e.g., [Tao08] for an exposition of this general philosophy.

6 Isomorphic invariant families of graphs are also referred to as graph properties. It may be argued that Conjecture 1.6 is more natural for isomorphic invariant families of graphs, but we do not impose such a restriction in our results.
Keeping these observations in mind, we view as quasirandom those families of graphs A whose parameter γ(A) is not significantly larger than the corresponding parameter of a random family of graphs with the same density. This is, essentially, the content of the following definition.
Definition 1.7 (Quasirandom families of graphs). Let n ⩾ 2 be an integer, let θ > 0, and let A ⊆ {0, 1}^{$\binom{[n]}{2}$} be a (not necessarily isomorphic invariant) family of graphs. We say that A is θ-quasirandom if, denoting by U the set of all U = {i < j < k < ℓ} ∈ $\binom{[n]}{4}$, the corresponding estimate holds.

The reader might have already observed the similarity between Definition 1.7 and the classical 4-cycle condition for the quasirandomness of graphs [CGW88, CGW89].
1.5.3. The following theorem-which relies on both conditional concentration and Theorem 1.5, and whose proof is given in Section 8-shows that Definition 1.7 is indeed a sensible notion.
Theorem 1.8 asserts that every non-negligible quasirandom family A of graphs on sufficiently many vertices contains a graph W for which there is a large set K such that the induced subgraph W[K] of W on K is empty while, at the same time, adding to W any single edge from $\binom{K}{2}$ does not leave the family A. Note, in particular, that Theorem 1.8 yields an affirmative answer to Conjecture 1.6 for quasirandom families of graphs in a strong sense: we can select the graphs G and H so that the difference G \ H is a single edge. Finally, we point out that the proof of Theorem 1.8 is effective; see Remark 8.6 for its quantitative aspects.
1.6. Related work. Although Theorem 1.4 (as well as its higher-dimensional extension, Theorem 5.1) is somewhat distinct from the traditional setting of concentration of smooth functions, it is related to several results that we are about to discuss.
Arguably, the one-dimensional case-that is, the case of random vectors-is the most heavily investigated. It is impossible to give here a comprehensive review; we only mention that concentration estimates for functions of finite exchangeable random vectors have been obtained in [Bob04,Ch06].
The two-dimensional case is also heavily investigated, in particular in the literature around various random matrix models. However, closer to the spirit of this paper is the work of Latała [La06] and the subsequent papers [AdWo15, GSS19, V19], which obtain exponential concentration inequalities for smooth functions (e.g., polynomials) of high-dimensional random arrays whose entries are functions of a random vector (ξ_1, . . . , ξ_n) with independent entries and a well-behaved distribution. Note that all these arrays are dissociated, and they are additionally exchangeable if the random variables ξ_1, . . . , ξ_n are identically distributed. That said, the study of concentration inequalities for functions of more general finite high-dimensional random arrays is hardly developed at all, mainly because the structure of finite high-dimensional random arrays is quite complicated (see, also, [Au13, page 16] for a discussion of this issue). We make a step in this direction in the companion paper [DTV21].

1.7. Organization of the paper. We close this section by giving an outline of the contents of this paper. It is divided into two parts, Part 1 and Part 2, which are largely independent of each other and can be read separately.
Part 1 consists of Sections 2 through 6. The main result in Section 2 is Theorem 2.2, which reduces conditional concentration to approximate dissociativity. The next two sections, Sections 3 and 4, are devoted to the proof of Theorem 1.5 and its higher-dimensional extension, Theorem 3.2. In Section 3 we introduce related definitions and we also present some consequences. The proof of Theorem 3.2 is given in Section 4; this is the most technically demanding part of the paper. In Section 5 we complete the proofs of Theorem 1.4 and its higher-dimensional extension, Theorem 5.1. Lastly, in Section 6 we present extensions/refinements of Theorems 1.4 and 5.1: for dissociated random arrays (Theorem 6.1), for vector-valued functions of random arrays (Theorem 6.3), and a simultaneous conditional concentration result (Theorem 6.4).
Part 2 consists of Sections 7 and 8 and it is entirely devoted to the connection of our results with combinatorics. In Section 7 we give examples of combinatorial structures for which our conditional concentration results are applicable, and in Section 8 we give the proof of Theorem 1.8.
Finally, in Appendix A we present examples that show the optimality of the box independence condition.
Part 1. Proofs of the main results

2. From dissociativity to concentration

2.1. Main result. Let d be a positive integer, and recall that a d-dimensional random array X on a (possibly infinite) subset I of N is called dissociated if for every J, K ⊆ I with |J|, |K| ⩾ d and max(J) < min(K), the σ-algebras F_J and F_K are independent, that is, for every A ∈ F_J and B ∈ F_K we have P(A ∩ B) = P(A) P(B). Dissociativity is a classical concept in probability (see [MS75]); we will need the following approximate version of this notion.
Definition 2.1 (Approximate dissociativity). Let n, ℓ, d be positive integers such that n ⩾ ℓ ⩾ 2d, and let 0 ⩽ β ⩽ 1. We say that a d-dimensional random array X on [n] is (β, ℓ)-dissociated provided that for every J, K ⊆ [n] with |J|, |K| ⩾ d, |J| + |K| ⩽ ℓ and max(J) < min(K), and every pair of events A ∈ F_J and B ∈ F_K, we have

(2.1) |P(A ∩ B) − P(A) P(B)| ⩽ β.

The following theorem, which is the main result in this section, provides the link between conditional concentration and approximate dissociativity.
Theorem 2.2. Let d be a positive integer, let 1 < p ⩽ 2, let 0 < ε ⩽ 1, let k ⩾ d be an integer, and let β = β(p, ε) and ℓ be as in (2.2) and (2.3), respectively. Also let n ⩾ ℓ be an integer, and let X be a (β, ℓ)-dissociated, d-dimensional random array on [n] whose entries take values in a measurable space X. Then for every measurable function f : X^{$\binom{[n]}{d}$} → R with E[f(X)] = 0 and ∥f(X)∥_{L_p} = 1 there exists an interval I of [n] with |I| = k such that for every J ⊆ I with |J| ⩾ d we have the estimate (2.4).

We note that for spreadable random arrays there is a converse of Theorem 2.2; namely, approximate dissociativity is in fact necessary in order to have conditional concentration; see Proposition 2.8 in Subsection 2.6.
2.2. Moment bound. The following moment estimate is the main step of the proof of Theorem 2.2.

9 Notice that this form of dissociativity (as well as the corresponding approximate version in Definition 2.1) is weaker than the standard one in the absence of exchangeability, since we do not require independence of F_J and F_K for all pairs of disjoint sets J and K.

Theorem 2.3. Let d, ℓ, n be positive integers with n ⩾ ℓ ⩾ 2d, let 0 ⩽ β ⩽ 1, and let X be a d-dimensional random array on [n] that is (β, ℓ)-dissociated and whose entries take values in a measurable space X. Then, for every 1 < p ⩽ 2, every measurable function f : X^{$\binom{[n]}{d}$} → R, every integer k with d ⩽ k < ⌊ℓ/2⌋, and every I ∈ $\binom{[n]}{ℓ}$, there exists J ∈ $\binom{I}{k}$ with the following property: for any 1 ⩽ r < p, we have the estimate (2.5), where F_J denotes the σ-algebra generated by the subarray X_J (see Definition 1.1). Moreover, if I is an interval of [n], then J may be chosen to be an interval.
We claim that the interval I_2 is as desired. Indeed, fix a subset J of I_2 with |J| ⩾ d, and observe that F_J ⊆ F_{I_2}. Therefore, by (2.6) and the fact that the conditional expectation is a linear contraction on L_r, we obtain a bound for ∥E[f(X) | F_J]∥_{L_r}. By Markov's inequality, this estimate yields a corresponding probability bound. By (2.7), the choice of r, and the choice of β and ℓ in (2.2) and (2.3) respectively, we conclude an estimate which clearly implies (2.4). The proof of Theorem 2.2 is completed. □

The rest of this section is devoted to the proof of Theorem 2.3, which is based on inequalities for martingales in L_p spaces. Martingales are, of course, standard tools in the proofs of concentration estimates. Typically, one decomposes a given random variable X into martingale increments, and then controls an appropriate norm of X by controlling the norms of the increments. In the proof of Theorem 2.3 we also decompose a given random variable into martingale increments but, in contrast, we seek to find one of the increments that has controlled norm. This method, known as the energy increment strategy, was introduced in the present probabilistic setting by Tao [Tao06] for "p = 2", and then extended to the full range of admissible p's in [DKT16]. Having said that, we also note that the main novelty of the present paper lies in the selection of the filtration.
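The energy increment idea can be seen in miniature in the case "p = 2": decompose f(ε) − E f along the coordinate filtration of uniform signs, check that the increments' squared L_2 norms sum to ∥f − E f∥_{L_2}^2, and note that, by pigeonhole, at least one increment has squared norm at most ∥f∥_{L_2}^2 / m. The following exact-computation Python sketch is our own toy setup (it is not the filtration actually used in the paper):

```python
import itertools

def cond_exp(f, i, n):
    """E[f | first i coordinates], for eps uniform on {-1, 1}^n, exactly."""
    def g(eps):
        rest = itertools.product((-1, 1), repeat=n - i)
        return sum(f(eps[:i] + t) for t in rest) / 2 ** (n - i)
    return g

def increments(f, n):
    """Martingale differences d_i = E[f | eps_1..eps_i] - E[f | eps_1..eps_{i-1}]."""
    return [lambda eps, i=i: cond_exp(f, i, n)(eps) - cond_exp(f, i - 1, n)(eps)
            for i in range(1, n + 1)]

def l2_sq(g, n):
    """Exact squared L_2 norm over the uniform measure on {-1, 1}^n."""
    return sum(g(eps) ** 2 for eps in itertools.product((-1, 1), repeat=n)) / 2 ** n

n = 3
f = lambda eps: eps[0] * eps[1] + eps[2]   # E f = 0
ds = increments(f, n)
energies = [l2_sq(d, n) for d in ds]       # orthogonality: these sum to ||f||_2^2
```

Here the smallest increment has zero energy, so some conditioning step is "cheap"; finding such a step is exactly the selection problem the proof addresses.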
We now briefly describe the contents of the rest of this section. In Subsection 2.3 we present the analytical estimate that is used in the proof of Theorem 2.3. In Subsection 2.4 we prove an orthogonality result for pairs of σ-algebras that satisfy the estimate (2.1). The proof of Theorem 2.3 is completed in Subsection 2.5. Finally, in Subsection 2.6 we show that, for spreadable random arrays, the assumption of approximate dissociativity in Theorem 2.2 is necessary.
2.3. Martingale difference sequences. It is an elementary, though important, fact that martingale difference sequences are orthogonal in L_2. We will need the following extension of this fact.
Proposition 2.4. Let 1 < p ⩽ 2. Then for every martingale difference sequence (d_i)_{i=1}^{m} in L_p we have the estimate (2.8); in particular,

(2.9) ( Σ_{i=1}^{m} ∥d_i∥_{L_p}^2 )^{1/2} ⩽ (p − 1)^{−1/2} ∥ Σ_{i=1}^{m} d_i ∥_{L_p}.

We note that the constant (p − 1)^{−1/2} in (2.9) is optimal; this sharp estimate was proved by Ricard and Xu [RX16], who deduced it from a uniform convexity inequality for L_p spaces; see [Pi11, Lemma 4.32], and also [DKK16, Appendix A] for an exposition.
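At p = 2 the constant (p − 1)^{−1/2} equals 1 and the inequality reduces to the exact orthogonality identity for martingale differences. The following Python sketch (a toy example of ours) verifies this identity exactly for a difference sequence with genuinely dependent increments d_i = ε_i (1 + ε_1 + · · · + ε_{i−1}):

```python
import itertools

def expect(g, n):
    """Exact expectation over eps uniform on {-1, 1}^n."""
    return sum(g(eps) for eps in itertools.product((-1, 1), repeat=n)) / 2 ** n

n = 4
# d_i = eps_i * g_i(eps_1, ..., eps_{i-1}) with predictable multiplier g_i,
# so E[d_i | eps_1, ..., eps_{i-1}] = 0: a martingale difference sequence.
ds = [lambda eps, i=i: eps[i] * (1 + sum(eps[:i])) for i in range(n)]

lhs = expect(lambda eps: sum(d(eps) for d in ds) ** 2, n)       # ||sum d_i||_2^2
rhs = sum(expect(lambda eps, d=d: d(eps) ** 2, n) for d in ds)  # sum ||d_i||_2^2
```

Here the increments are far from independent, yet the squared norms still add up exactly, which is the point of the martingale structure.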

2.4. Mixing and orthogonality. In what follows, it is convenient to introduce the following terminology. Let (Ω, Σ, P) be a probability space, and let 0 ⩽ β ⩽ 1; given two sub-σ-algebras A, B of Σ, we say that A and B are β-mixing provided that for every A ∈ A and every B ∈ B we have

(2.11) |P(A ∩ B) − P(A) P(B)| ⩽ β.

Notice that in the extreme case "β = 0" the estimate (2.11) is equivalent to saying that the σ-algebras A and B are independent, which in turn implies that for every random variable X with E[X] = 0 we have E[E[X | A] | B] = 0. The main result in this subsection (Proposition 2.7 below) is an approximate version of this fact. We start with the following lemma.
Lemma 2.5. Let (Ω, Σ, P) be a probability space, let 0 ⩽ β ⩽ 1, and let A, B be two sub-σ-algebras of Σ that are β-mixing. Then for every real-valued, bounded random variable X and every 1 ⩽ p ⩽ ∞ we have the corresponding estimate.

10 Square-function estimates could also be used, but they do not yield the optimal dependence with respect to the integrability parameter p.
For the proof of Lemma 2.5 we need the following simple fact.
Fact 2.6. Let (X, Σ, µ) be a measure space, and let f : X → R be an integrable function. Then we have the corresponding estimate; in particular, a discrete version holds for all x_1, . . . , x_m ∈ R.

We proceed to the proof of Lemma 2.5.
Proof of Lemma 2.5. We prove the L_1-estimate; the L_p-estimate for p > 1 follows from the L_1–L_∞ bound and the fact that the conditional expectation is a linear contraction on L_∞. Without loss of generality we may assume that E[X] = 0. (If not, then we work with the random variable X − E[X].) If we set x_i := P(A_i ∩ B) − P(A_i) P(B), we obtain the desired bound, where we have also used the pointwise bound |a_i| ⩽ ∥Z∥_{L_∞} and Fact 2.6. Finally, setting A_I := ⋃_{i∈I} A_i for every nonempty I ⊆ [N], we have the analogous estimate, since the sets A_1, . . . , A_N are pairwise disjoint and A_I ∈ A. We conclude that the claimed inequality holds. Since B ∈ B was arbitrary, the result follows. □
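On a small finite space, the least β for which two finitely generated σ-algebras are β-mixing can be computed by brute force over all events. The Python sketch below (ours, for illustration only) does this for the σ-algebras generated by two coordinates with a given joint law:

```python
from itertools import chain, combinations, product

def mixing_coefficient(joint):
    """Smallest beta for which sigma(X) and sigma(Y) are beta-mixing, by brute
    force: joint[(x, y)] = P(X = x, Y = y) on a small finite space, and we take
    the sup of |P(A ∩ B) - P(A) P(B)| over all events A in sigma(X), B in sigma(Y)."""
    xs, ys = sorted({x for x, _ in joint}), sorted({y for _, y in joint})
    def events(vals):  # all subsets of the value set
        return chain.from_iterable(combinations(vals, r) for r in range(len(vals) + 1))
    px = {x: sum(joint.get((x, y), 0.0) for y in ys) for x in xs}
    py = {y: sum(joint.get((x, y), 0.0) for x in xs) for y in ys}
    best = 0.0
    for A, B in product(events(xs), events(ys)):
        pab = sum(joint.get((x, y), 0.0) for x in A for y in B)
        pa, pb = sum(px[x] for x in A), sum(py[y] for y in B)
        best = max(best, abs(pab - pa * pb))
    return best
```

For two independent fair coins the coefficient is 0, while for two perfectly correlated coins it is 1/4, matching the discussion of the extreme case "β = 0" above.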
Proposition 2.7. Let (Ω, Σ, P) be a probability space, let 0 ⩽ β ⩽ 1, and let A, B be two sub-σ-algebras of Σ that are β-mixing. Let 1 ⩽ r < p ⩽ ∞, and let X ∈ L_p. Then the estimate (2.19) holds.

Proof. Notice that (2.19) is straightforward if β = 0; thus, we may assume that β > 0. In this case, we will obtain the estimate by truncating X and employing Lemma 2.5. We lay out the details. As in the proof of Lemma 2.5, we may assume that E[X] = 0. Let t > 0 (to be chosen later) be the truncation level, and set X_t := X 1_{[|X| ⩽ t]}. Markov's inequality yields that P(|X| > t) ⩽ t^{−p} ∥X∥_{L_p}^p; thus, applying Hölder's inequality, we obtain the bound (2.20) for any 1 ⩽ r < p. Therefore, using the contraction property of the conditional expectation, Lemma 2.5 for the random variable X_t, and the fact that E[X] = 0, respectively, and taking into account (2.20), we conclude the desired estimate. It remains to optimize the latter with respect to t; the choice t := β^{−1/p} ∥X∥_{L_p} yields the assertion. □

2.5. Proof of Theorem 2.3. We start by observing that the case "β = 0" follows from the case "β > 0" by taking the limit in (2.5) as β goes to zero. Thus, in what follows, we may assume that β > 0. After normalizing, we may also assume that ∥f(X)∥_{L_p} = 1. Fix an integer k with d ⩽ k < ⌊ℓ/2⌋ and I ∈ $\binom{[n]}{ℓ}$, and let {ι_1 < · · · < ι_ℓ} denote the increasing enumeration of I. Set m := ⌊ℓ/k⌋. Also let K_1, . . . , K_m ∈ $\binom{[ℓ]}{k}$ be successive intervals with min(K_1) = 1, and set J_i := {ι_κ : κ ∈ K_i} for every i ∈ [m]. Thus, the sets J_1, . . . , J_m are successive subsets of I, each of cardinality k; also notice that if I is an interval of [n], then the sets J_1, . . . , J_m are intervals too.
Next, denote by (Ω, Σ, P) the underlying probability space on which the random array X is defined, and for every i ∈ [m] let F_{J_i} be the σ-algebra generated by the subarray X_{J_i}; see Figure 3. We will use variants of this filtration in Section 8.
We claim that the set J := J_{i_0} is as desired.
To this end, fix 1 ⩽ r < p. First observe that, conditioning further on F_{J_{i_0}}, we obtain the first estimate, where we have used the fact that F_{J_{i_0}} ⊆ A_{i_0}, the contractive property of the conditional expectation once more, and (2.26). By the triangle inequality, and taking into account (2.27) and the monotonicity of the L_p-norms, we obtain the intermediate bound. Finally, by (2.24) and our assumption that the random array X is (β, ℓ)-dissociated, we see that the σ-algebras F_{J_{i_0}} and A_{i_0−1} are β-mixing in the sense of Definition 2.1. By Proposition 2.7, we conclude the desired estimate, and the proof is completed. □
2.6. Necessity of approximate dissociativity. We close this section with the following proposition, which shows that the assumption of approximate dissociativity in Theorem 2.2 is necessary.
Proposition 2.8. Let n, d, ℓ be positive integers with n ⩾ ℓ ⩾ d, let 0 < β ⩽ 1, let X be a spreadable, d-dimensional random array on [n] whose entries take values in a measurable space X, and assume that X is not (β, ℓ)-dissociated. Then there exists a measurable function f for which conditional concentration fails.

Proof. Since the random array X is spreadable and not (β, ℓ)-dissociated, we may select events A and B witnessing this failure; we claim that the associated function f is as desired. Indeed, let I ∈ $\binom{[n]}{ℓ}$ be arbitrary. We select L ∈ $\binom{I}{k}$ with min(L) > j. Invoking the spreadability of X and the choice of A and B, we may also select Γ ∈ F_L with the analogous property and, using the fact that Γ ∈ F_L ⊆ F_I, we obtain the desired lower bound. □

Remark 2.9. Notice that if the random array X in Proposition 2.8 is boolean, then the function f defined above is just a polynomial of degree at most $\binom{ℓ}{d}$.

3. The box independence condition propagates
3.1. The main result. We start by introducing some pieces of notation and some terminology.
We proceed with the following definition. Note that the "(ϑ, S)-box independence" condition introduced below is the one-sided version of (1.6); we will work with this slightly weaker version since it is more amenable to an inductive argument.
Definition 3.1. Let n, d be positive integers with n ⩾ 2d, let X be a nonempty finite set, and let X = ⟨X_s : s ∈ $\binom{[n]}{d}$⟩ be an X-valued, d-dimensional random array on [n]. Also let S be a nonempty subset of X.
(i) (Box independence) We say that X is (ϑ, S)-box independent if for every d-dimensional box B of [n] and every a ∈ S we have the corresponding estimate.

(ii) (Approximate independence) Set ℓ := $\binom{⌊n/2⌋}{d}$, and let γ = (γ_k)_{k=1}^{ℓ} be a finite sequence of positive reals. We say that X is (γ, S)-independent if for every nonempty subset F of $\binom{[n]}{d}$ such that ⋃F has cardinality at most n/2, and every collection (a_s)_{s∈F} of elements of S, we have the corresponding estimate.

We are ready to state the main result in this section. It is the higher-dimensional version of Theorem 1.5, and its proof is given in Section 4. (The numerical invariants appearing below are defined in Subsection 4.2, and they are estimated in Lemma 4.4.)

Theorem 3.2. Let d, n be positive integers with n ⩾ 4d, let 0 < η, ϑ ⩽ 1, and set ℓ := $\binom{⌊n/2⌋}{d}$. Then there exists a sequence γ = (γ_k(η, ϑ, d, n))_{k=1}^{ℓ} of positive reals such that (3.5) holds for every k ∈ [ℓ], and satisfying the following property. Let X be a finite set, let S be a nonempty subset of X, and let X be an X-valued, η-spreadable, d-dimensional random array on [n]. If X is (ϑ, S)-box independent, then X is (γ, S)-independent.

Observe that the estimate (3.5) yields that the quantity γ_k(η, ϑ, d, n) tends to zero as n tends to infinity and η, ϑ go to zero.

3.2. Consequences. The rest of this section is devoted to the proof of two consequences of Theorem 3.2. The first consequence shows that the box independence condition implies approximate dissociativity. Specifically, we have the following corollary.
Corollary 3.3. Let d, ℓ, m be positive integers with ℓ ⩾ 2d and m ⩾ 2, and let 0 < β ⩽ 1. Also let n be a positive integer and let 0 < η, ϑ ⩽ 1 satisfy the corresponding numerical condition. Finally, let X be a set with |X| = m, let S be a subset of X with |S| = |X| − 1, and let X be an X-valued, η-spreadable, d-dimensional random array on [n]. If X is (ϑ, S)-box independent, then X is (β, ℓ)-dissociated.

The second consequence of Theorem 3.2 shows that the box independence condition forces all subarrays indexed by d-dimensional boxes to behave independently. More precisely, we have the following corollary.
Corollary 3.4. Let d, m be positive integers with m ⩾ 2, and let 0 < γ ⩽ 1. Also let n be a positive integer and let 0 < η, ϑ ⩽ 1 satisfy the corresponding numerical condition. Finally, let X be a set with |X| = m, let S be a subset of X with |S| = |X| − 1, and let X be an X-valued, η-spreadable, d-dimensional random array on [n]. If X is (ϑ, S)-box independent, then for every d-dimensional box B of [n] and every collection (a_s)_{s∈B} of elements of X we have the estimate (3.8).

Remark 3.5. Although Corollary 3.4 is weaker than Theorem 3.2, a direct proof of the estimate (3.8) is likely to require the whole machinery presented in Section 4.
Corollaries 3.3 and 3.4 follow from the following consequence of Theorem 3.2.
Lemma 3.6. Let d, m, κ be positive integers with m ⩾ 2, and let 0 < γ ⩽ 1. Also let n be a positive integer and let 0 < η, ϑ ⩽ 1 satisfy the corresponding numerical condition. Finally, let X be a set with |X| = m, let S be a subset of X with |S| = |X| − 1, and let X be an X-valued, η-spreadable, d-dimensional random array on [n]. If X is (ϑ, S)-box independent, then for every nonempty subset F of $\binom{[n]}{d}$ with |F| ⩽ κ and every collection (a_s)_{s∈F} of elements of X we have the corresponding estimate.

Notice that the conclusion of Lemma 3.6 is essentially (γ, X)-independence for the constant sequence γ, except that it holds when |F| ⩽ κ instead of |⋃F| ⩽ n/2. We defer the proof of Lemma 3.6 to Subsection 3.3 below. At this point, let us give the proofs of Corollaries 3.3 and 3.4.
Proof of Corollary 3.3. Set κ := $\binom{ℓ}{d}$ and γ := (1/3) m^{−2κ} β. Also let J, K be subsets of [n] with |J|, |K| ⩾ d, |J| + |K| ⩽ ℓ and max(J) < min(K), and let A ∈ F_J and B ∈ F_K. Since A belongs to the σ-algebra generated by X_J, there exists a collection A of maps of the form a : $\binom{J}{d}$ → X such that (3.11) holds. Similarly, there exists a collection B of maps of the form b : $\binom{K}{d}$ → X such that (3.12) holds. For every a ∈ A we set A_a := ⋂_{s∈$\binom{J}{d}$} [X_s = a(s)], and for every b ∈ B we set B_b := ⋂_{s∈$\binom{K}{d}$} [X_s = b(s)]. By Lemma 3.6, for every a ∈ A and every b ∈ B, we have the corresponding estimate. On the other hand, by the identities (3.11) and (3.12), we see that A ∩ B = ⋃_{a∈A, b∈B} A_a ∩ B_b; moreover, the collections ⟨A_a : a ∈ A⟩ and ⟨B_b : b ∈ B⟩ consist of pairwise disjoint events. Thus, we conclude that |P(A ∩ B) − P(A) P(B)| ⩽ β. □

Proof of Corollary 3.4. It follows from Lemma 3.6 applied for "κ = 2^d". □

3.3. Proof of Lemma 3.6. The result follows from Theorem 3.2 and the inclusion-exclusion formula. We start by setting γ′ := m^{−κ} γ. By (3.5) and (3.9), we see that n ⩾ max{4d, dκ} and γ_k(η, ϑ, d, n) ⩽ γ′ for every k ∈ [κ]. Therefore, by Theorem 3.2, for every nonempty F* ⊆ $\binom{[n]}{d}$ with |F*| ⩽ κ and every collection (a_s)_{s∈F*} of elements of S, the corresponding estimate holds. Now fix a nonempty F ⊆ $\binom{[n]}{d}$ with |F| ⩽ κ, and let (a_s)_{s∈F} be a collection of elements of X. Set F′ := {s ∈ F : a_s ∈ S} and G := F \ F′; observe that for every t ∈ G the events ⟨[X_t = a] : a ∈ S⟩ are pairwise disjoint and, moreover, a corresponding identity holds. (For any event E, by E∁ we denote its complement.) Thus, for every t ∈ G we have the analogous identity, with the convention that a product over an empty index-set is equal to 1. Next, observe that for every nonempty subset W of G the events ⟨⋂_{t∈W}[X_t = a(t)] : a : W → S⟩ are pairwise disjoint; hence, the inclusion-exclusion formula applies. Combining the identities (3.21) and (3.23), we see that (3.24) holds, with the convention that an intersection over an empty index-set is equal to the whole sample space. Finally, by the identities (3.20) and (3.24) and the triangle inequality, we conclude that the quantity in question is at most γ. The proof of Lemma 3.6 is completed. □
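The inclusion-exclusion step used in the proof of Lemma 3.6 can be sanity-checked on a toy example. The following Python sketch (our own illustration, not the paper's events) computes P(no event occurs) both directly and via the alternating-sum formula:

```python
from itertools import combinations

def prob_none(joint, events):
    """P(intersection of complements) by inclusion-exclusion:
    sum over W of (-1)^|W| * P(intersection of E_i, i in W).
    joint maps outcomes to probabilities; each event is a set of outcomes."""
    m = len(events)
    total = 0.0
    for r in range(m + 1):
        for W in combinations(range(m), r):
            inter = set(joint)          # W empty: the whole sample space
            for i in W:
                inter &= events[i]
            total += (-1) ** r * sum(joint[w] for w in inter)
    return total

die = {k: 1 / 6 for k in range(1, 7)}
E1, E2 = {2, 4, 6}, {5, 6}          # "even" and "greater than four"
p = prob_none(die, [E1, E2])        # P(odd and at most four) = 1/3
```

The empty-index-set conventions in the code (empty W contributes the whole sample space) mirror the conventions stated in the proof above.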

Proof of Theorem 3.2
This section is devoted to the proof of Theorem 3.2, which proceeds by induction on the dimension d. In a nutshell, the argument is based on repeated averaging, together with an appropriate version of the weak law of large numbers, in order to gradually upgrade the box independence condition. The combinatorial heart of the matter lies in the selection of this averaging.

4.1. Toolbox. We begin by presenting three lemmas that are needed for the proof of Theorem 3.2 but are not directly related to the main argument.
Lemma 4.1. Let m be a positive integer, let δ > 0, and let A_1, . . . , A_m be events in a probability space such that for every i, j ∈ [m] with i ≠ j we have Proof. We have □

Lemma 4.2. Let m be a positive integer, let η, δ > 0, and let E, A_1, . . . , A_m be events in a probability space such that for every i, j ∈ [m] with i ≠ j we have Then for every i ∈ [m] we have

Proof. Notice that, by the triangle inequality, Invoking the triangle inequality again, we have Finally, by the Cauchy–Schwarz inequality, hypothesis (iii) and Lemma 4.1, The estimate (4.3) follows from (4.4)–(4.7). □

Lemma 4.3. Let m ⩾ 1 be an integer, let η > 0, and let (A_i)_{i=1}^m be an η-spreadable sequence^{12} of events in a probability space. Then for every i, j ∈ [m] with i ≠ j,

Proof. Then, by η-spreadability, we have Notice that η-spreadability also implies □

4.2. The numbers γ_k(η, ϑ, d, n). These numbers are defined by recursion on the dimension d; the base case "d = 1" is given by (4.9), for every 0 < η ⩽ 1, every ϑ > 0 and every pair of positive integers k, n with n ⩾ 2 and k ⩽ n/2. Let d ⩾ 2 be an integer, and assume that the numbers γ_k(η, ϑ, d − 1, n) have been defined for every choice of admissible parameters. Fix 0 < η ⩽ 1 and ϑ > 0, and let n be an integer with n ⩾ 4d. We set Moreover, for every positive integer u with u ⩽ n/2 and every choice k_1, . . . , k_u of positive integers with k_1, . . . , k_u ⩽ ⌊(n−2)/2⌋ with the convention that the sum in (4.17) is equal to 0 if u = 1. (Note that the sum above has at most min{u − 1, k_2 + · · · + k_u} terms.) Finally, for every positive integer where the above maximum is taken over all choices of positive integers u, k_1, . . . , k_u satisfying u ⩽ n/2 − d, k_1, . . . , k_u ⩽ ⌊(n−2)/2⌋^{d−1} and k_1 + · · · + k_u = k. (Note that there are at most k^k such choices.) The following lemma provides an estimate for the numbers γ_k(η, ϑ, d, n) introduced above.
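The mechanism behind Lemma 4.1 can be checked concretely: if events are pairwise (nearly) independent, then the average of their indicators has small variance, and this is what drives the averaging argument. The following minimal numerical sketch uses the classical XOR construction of pairwise independent events (the function names are illustrative and not from the paper); here the pairwise independence is exact, i.e., the defect δ of Lemma 4.1 equals 0.

```python
import itertools

def pairwise_independent_events(t):
    """Classic construction: from t i.i.d. fair bits, the 2^t - 1 parity events
    (XOR over a nonempty index set equals 1) are pairwise independent and each
    has probability 1/2."""
    omega = list(itertools.product((0, 1), repeat=t))  # uniform sample space
    subsets = [S for r in range(1, t + 1)
               for S in itertools.combinations(range(t), r)]
    events = [{w for w in omega if sum(w[i] for i in S) % 2 == 1}
              for S in subsets]
    return omega, events

def variance_of_average(omega, events):
    """Exact variance of the average of the indicator functions of the events,
    computed over the (uniform) sample space."""
    m = len(events)
    avg = {w: sum(w in A for A in events) / m for w in omega}
    mean = sum(avg.values()) / len(omega)
    return sum((v - mean) ** 2 for v in avg.values()) / len(omega)
```

With t = 4 one gets m = 15 events; since the covariances vanish, the variance of the average is exactly (1/m) · (1/4) = 1/60, in line with the 1/m-decay that Lemma 4.1 quantifies up to the defect δ.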
Lemma 4.4. For every 0 < η ⩽ 1, every ϑ > 0, every positive integer d, every integer n ⩾ 4d and every positive integer k ⩽ ⌊n/2⌋^d we have

Proof. We start by observing that, for every choice of positive integers d and k, the quantities γ_k(η, ϑ, d, n) are all decreasing with respect to n, and increasing with respect to η and ϑ.
It is also convenient to introduce the following notation. For every pair of positive integers n, ℓ, every 0 < η ⩽ 1 and every ϑ > 0 we set Thus, it suffices to prove that for every pair of positive integers n, d with n ⩾ 4d, every 0 < η ⩽ 1, every ϑ > 0, and every positive integer k ⩽ ⌊n/2⌋^d. To that end we proceed by induction on d. The base case "d = 1" follows readily from (4.9). Next, let d be a positive integer with d ⩾ 2, and assume that (4.21) holds for d − 1, every integer n ⩾ 4d − 4, every 0 < η ⩽ 1, every ϑ > 0, and every positive integer k ⩽ ⌊n/2⌋^{d−1}. Fix an integer n ⩾ 4d, 0 < η ⩽ 1 and ϑ > 0; by (4.10), (4.11) and (4.12), we have and observe that, by our inductive assumption, Additionally, by (4.13)–(4.16) and the monotonicity properties of γ_k(η, ϑ, d, n), for every positive integer k ⩽ ⌊(n−1)/2⌋ and therefore, invoking the fact that γ_k ⩽ 1, we obtain that By (4.17) and (4.18), and using the linearity of the upper bound in (4.28) with respect to the parameter k, we conclude that for every positive integer k ⩽ ⌊n/2⌋^d

4.3. The inductive hypothesis. For every positive integer d, by P(d) we shall denote the following statement.
By Lemma 4.4, it is clear that Theorem 3.2 follows from the validity of P(d) for every positive integer d.
4.4. The base case "d = 1". The initial step of the induction follows from the following lemma.
Lemma 4.5. Let n, η, ϑ, X and S be as in the statement of P(1), and assume that X = (X_1, . . . , X_n) is an X-valued, η-spreadable random vector. Assume, moreover, that X is (ϑ, S)-box independent, that is, for every i, j ∈ [n] with i ≠ j and every a ∈ S we have Then X is (γ, S)-independent, that is, for every nonempty F ⊆ [n] with |F| ⩽ n/2 and every collection (a_i)_{i∈F} of elements of S, we have where γ := (γ_k(η, ϑ, 1, n))_{k=1}^{⌊n/2⌋} is as in (4.9). In particular, P(1) holds true.
Proof. Observe that, by the η-spreadability of X, it is enough to show that for every k ∈ {1, . . . , ⌊n/2⌋} and every a_1, . . . , a_k ∈ S we have To this end, we proceed by induction on k. The case "k = 1" is straightforward. Let k be a positive integer with k < ⌊n/2⌋, and assume that (4.32) has been verified up to k. Fix a_1, . . . , a_{k+1} ∈ S. Set m := ⌊n/2⌋ and E := Moreover, since a_{k+1} ∈ S, we have Applying Lemma 4.2 for "δ = ϑ" and using the definition of A_1, we see that On the other hand, by our inductive assumptions, we have Combining (4.33) and (4.34), we see that (4.32) is satisfied, as desired. □

4.5. The general inductive step. We now enter the main part of the proof of Theorem 3.2. Specifically, fix an integer d ⩾ 2. Throughout this subsection, we will assume that P(d − 1) has been proved.
We also note that, in what follows, we will estimate the difference of various products in terms of the differences of the factors, the number of factors, and the L_∞ norms of the factors. The reader should keep this remark in mind, as we will use this standard telescoping argument without further notice.

4.5.1.
Step 1: preparatory lemmas. Our goal in this step is to prove two probabilistic lemmas that will be used in the third and fourth steps of the proof, respectively. Strictly speaking, these lemmas are not part of the proof of P(d), since their proofs do not use the inductive assumptions. (In particular, this subsection can be read independently.) The first lemma essentially shows that the reverse inequality of (3.3) always holds true in the presence of approximate spreadability.
Lemma 4.6. Let n be an integer with n ⩾ 2d, let 0 < η < 1, let X be a nonempty finite set, and let X = ⟨X_s : s ∈ $\binom{[n]}{d}$⟩ be an X-valued, η-spreadable, d-dimensional random array on [n]. Then for every t ∈ $\binom{[n-2]}{d-1}$ and every a ∈ X we have

Proof. Observe that the sequence (A_1, . . . , A_{n−d+1}) is η-spreadable^{13}. By Lemma 4.3, we obtain estimate (4.36). By (4.36) and the η-spreadability of X, the estimate (4.35) follows. □

The second lemma shows that the box independence condition (3.3) is inherited by the two-dimensional faces of d-dimensional boxes.
Lemma 4.7. Let n be an integer with n ⩾ 2d, let 0 < η < 1, let ϑ > 0, let X be a nonempty finite set, let S be a nonempty subset of X , and let X = ⟨X s : s ∈ [n] d ⟩ be an X -valued, η-spreadable, d-dimensional random array on [n] that is (ϑ, S)-box independent.

4.5.2.
Step 2: rewriting the inductive assumptions. We proceed with the following lemma, which will enable us to use P(d − 1) in a more convenient form.
Step 3: doubling. The following lemma complements Lemma 4.8. It is also based on the inductive hypothesis P(d − 1), but it will enable us to use it in a rather different form.
Proof. It is clear that X′ is (X × X)-valued and η-spreadable. So, we only need to show that X′ is (ϑ_2(η, ϑ, d, n), S × S)-box independent. By (4.56) and (4.57) and the definition of ϑ_2(η, ϑ, d, n), the result follows. □

The following corollary, which is an immediate consequence of Lemma 4.10 and the inductive assumption P(d − 1), is the analogue of Corollary 4.9.

4.5.4.
Step 4: gluing. This is the main step of the proof. Specifically, our goal is to prove the following proposition.
Proposition 4.12 (Gluing). Let n ⩾ 2d + 2 be an integer, let η, ϑ, X, S be as in the statement of P(d), and assume that X = ⟨X_s : s ∈ $\binom{[n]}{d}$⟩ is an X-valued, η-spreadable, d-dimensional random array on [n] that is (ϑ, S)-box independent. Finally, let r be an integer with d < r ⩽ n/2, let G be a nonempty subset of $\binom{[r-1]}{d-1}$, let (a_t)_{t∈G} be a collection of elements of S, let F be a nonempty subset of $\binom{[r-1]}{d}$, and let (b_s)_{s∈F} be a collection of elements of S. Then we have an estimate with error term ϑ_{|G|}(η, ϑ, d, n) + ⌊n/2⌋^{−1} + (2|G| + 1)η^{1/2} + 2η, where ϑ_{|G|}(η, ϑ, d, n) is as in (4.16).
Proposition 4.12 follows by carefully selecting a sequence of events and then applying the averaging argument presented in Lemma 4.2. In order to do so, we need to control the variances of the corresponding averages; this is the content of the following lemma.

Proof. Let G be a subset of $\binom{[n-2]}{d-1}$ with |G| ⩽ (n − 2)/2, and let (a_t)_{t∈G} be a collection of elements of S. By Corollary 4.11, we have Moreover, by Lemma 4.7, Finally, by Corollary 4.9, we see that □

We are now ready to give the proof of Proposition 4.12.

4.5.5.
Step 5: completion of the proof. This is the last step of the proof. Recall that we need to prove that the statement P(d) holds true or, equivalently, that the estimate (3.4) is satisfied for the sequence γ = (γ_k(η, ϑ, d, n))_{k=1}^{ℓ} defined in Subsection 4.2. As expected, the verification of this estimate will be reduced to Proposition 4.12. To this end, we will decompose an arbitrary nonempty subset F of $\binom{[n]}{d}$ into several components that are easier to handle. The details of this decomposition are presented in the following definition.
We have the following lemma.
Lemma 4.16. Let n, η, ϑ, X, S be as in P(d). Let X = ⟨X_s : s ∈ $\binom{[n]}{d}$⟩ be an X-valued, η-spreadable, d-dimensional random array on [n] that is (ϑ, S)-box independent. Also let u ⩽ (n/2) − d + 1 be a positive integer, and let F and its slicing be given. Then for every collection (a_s)_{s∈F} of elements of S, we have

Proof. We proceed by induction on u. The case "u = 1" follows from Corollary 4.9. Let u ⩽ (n/2) − d be a positive integer, and assume that (4.65) has been proved up to u. Let the slicing of F be denoted as above, and decompose F as F_1 ∪ F_2, where (4.66) and |G_{u+1}| = k_{u+1}. By Proposition 4.12 applied for "r = r_{u+1}", "G = G_{u+1}", "(a_t)_{t∈G} = (a_{t∪{r_{u+1}}})_{t∈G_{u+1}}", "F = F_1" and "(b_s)_{s∈F} = (a_s)_{s∈F_1}", we have On the other hand, by our inductive assumptions, we obtain that Moreover, since |G_{u+1}| = k_{u+1}, by Corollary 4.9, The inductive step is completed by combining (4.68) and (4.69) and using the definition of the constant γ^{(5)}(η, ϑ, d, n, (k_i)_{i=1}^{u+1}) in (4.17). □

It is clear that Lemma 4.16 implies that P(d) holds true. This completes the proof of the general inductive step, and so the entire proof of Theorem 3.2 is completed.

Proof of Theorem 1.4 and its higher-dimensional version
The following theorem is the higher-dimensional version of Theorem 1.4. (Also note that the case "d = 1" corresponds to random vectors.)

Theorem 5.1. Let d, m be two positive integers with m ⩾ 2, let 1 < p ⩽ 2, let 0 < ε ⩽ 1, let k ⩾ d be an integer, and set Also let n ⩾ C be an integer, let X be a set with |X| = m, and let X = ⟨X_s : s ∈ $\binom{[n]}{d}$⟩ be an X-valued, (1/C)-spreadable, d-dimensional random array on [n]. Assume that there exists S ⊆ X with |S| = |X| − 1 such that for every a ∈ S we have where Box(d) denotes the d-dimensional box defined in (3.2). Then for every function f : X^{$\binom{[n]}{d}$} → R with E[f(X)] = 0 and ∥f(X)∥_{L_p} = 1 there exists an interval I of [n] with |I| = k such that for every J ⊆ I with |J| ⩾ d we have

Proof. Fix ε and k, let β = β(p, ε) = ε 10 10 p−1 and ℓ = ℓ(p, ε, k) = 4 ε 4 (p−1) k be as in (2.2) and (2.3), respectively, and set (5.4)

Claim 5.2. We have C_1(2 + 2^d) ⩽ C.

Extensions/Refinements
6.1. Dissociated random arrays. The following theorem is a version of Theorem 5.1 for the case of dissociated random arrays.
Theorem 6.1. Let 1 < p ⩽ 2, let 0 < ε ⩽ 1, and set c = c(ε, p) := (1/4) ε^{2(p+1)/p} (p − 1). (6.1) Also let n, d be positive integers with n ⩾ 2d/c, and let X be a dissociated, d-dimensional random array on [n] whose entries take values in a measurable space X. Then for every measurable function f : X^{$\binom{[n]}{d}$} → R with E[f(X)] = 0 and ∥f(X)∥_{L_p} = 1 there exists an interval I of [n] with |I| ⩾ cn such that for every J ⊆ I with |J| ⩾ d we have

Proof. Set k := ⌈cn⌉ and ℓ := n, and note that d ⩽ k ⩽ ⌊ℓ/2⌋. Using the continuity of the L_p-norms and the fact that the random array X is (β, ℓ)-dissociated for every 0 < β ⩽ 1, by Theorem 2.3, and taking the limit in (2.5) first as β goes to zero and then as r → p^−, there exists I ∈ $\binom{[n]}{k}$ such that By the contractive property of conditional expectation, this in turn implies that for every J ⊆ I with |J| ⩾ d we have The result follows from (6.4) and Markov's inequality. □

Note that Theorem 6.1 improves upon Theorem 5.1 in two ways. Firstly, in Theorem 6.1 no restriction is imposed on the distributions of the entries of X. Secondly, the random variable f(X) becomes concentrated after conditioning it on a subarray whose size is proportional to n.
An important class of random arrays for which Theorem 6.1 is applicable, especially from the point of view of applications, consists of those random arrays whose entries are of the form (1.12), where (ξ_1, . . . , ξ_n) is a random vector with independent (but not necessarily identically distributed) entries.
Remark 6.2. Observe that the lower bound on the cardinality of the set I obtained by Theorem 6.1 depends polynomially on the parameter ε and, in particular, it becomes smaller as ε gets smaller. We note that this sort of dependence is actually necessary. This can be seen by considering (appropriately normalized) linear functions of i.i.d. Bernoulli random variables and invoking the Berry-Esseen theorem.
6.2. Vector-valued functions of random arrays. Recall that a Banach space E is called uniformly convex if for every ε > 0 there exists δ > 0 such that for every x, y ∈ E with ∥x∥_E = ∥y∥_E = 1 and ∥x − y∥_E ⩾ ε we have ∥(x + y)/2∥_E ⩽ 1 − δ. It is a classical fact (see [Ja72, GG71]) that for every uniformly convex Banach space E and every p > 1 there exist an exponent q ⩾ 2 and a constant C > 0 such that for every (see also [Pi11, Pi16] for a proof and a detailed presentation of related material). Using (6.5) instead of Proposition 2.4 and arguing precisely as in Section 2, we obtain the following vector-valued version of Theorem 5.1.
Theorem 6.3. For every uniformly convex Banach space E, every pair d, m of positive integers with m ⩾ 2, every p > 1, every 0 < ε ⩽ 1 and every integer k ⩾ d, there exists a constant C > 0 with the following property.
Let n ⩾ C be an integer, let X be a set with |X| = m, and let X = ⟨X_s : s ∈ $\binom{[n]}{d}$⟩ be an X-valued, (1/C)-spreadable, d-dimensional random array on [n]. Assume that there exists S ⊆ X with |S| = |X| − 1 such that for every a ∈ S we have where Box(d) denotes the d-dimensional box defined in (3.2). Then for every function f : X^{$\binom{[n]}{d}$} → E with E[f(X)] = 0 and ∥f(X)∥_{L_p(E)} = 1 there exists an interval I of [n] with |I| = k such that for every J ⊆ I with |J| ⩾ d we have

6.3. Simultaneous conditional concentration. Our last result in this section can be loosely described as "simultaneous conditional concentration"; it asserts that we can achieve concentration by conditioning on the same subarray for almost all members of a given family of approximately spreadable random arrays that satisfy the box independence condition.
Next, set m′ := ⌊ℓ′/k⌋ and observe that, by the choices of ℓ′, C′ and (2.3) and (5.1), Let v ∈ V be arbitrary. For every i ∈ [m′] set J_i := {k(i − 1) + j : j ∈ [k]}, and let F^v_{J_i} be the σ-algebra generated by the subarray of X_v determined by J_i (see Definition 1.1). As in (2.24), we define a filtration (A^v_i)_{i=0}^{m′} by setting A^v_0 := {∅, Ω} and (6.12) By Proposition 2.4 and our assumptions, and so there exists i_0 ∈ [m′] such that Set J := J_{i_0} and r := (p+1)/2. By Markov's inequality, there exists G ⊆ V with where we have used the monotonicity of the L_p-norms, the contractive property of conditional expectation, and the fact that 1 < r < p ⩽ 2. Next observe that since Moreover, the fact that the random array X_v is (β′, ℓ′)-dissociated implies that the σ-algebras A_{i_0−1} and F_J are β′-mixing in the sense of (2.11). Thus, by Proposition 2.7, (6.16), the triangle inequality and the choice of β′, we obtain that for every v ∈ G, The proof is completed by (6.17) and Markov's inequality. □

Remark 6.5. We note that there is also an extension of Theorem 6.1 in the spirit of Theorem 6.4. More precisely, if we assume in Theorem 6.4 that for every v ∈ V the random array X_v is dissociated (not necessarily finite-valued), then the interval I can be selected so that |I| ⩾ c′n, where c′ := (1/4) ε^{2(2p+1)/p} (p − 1).
Part 2. Connection with combinatorics

Random arrays arising from combinatorial structures
In this section we present examples of boolean, spreadable, high-dimensional random arrays that arise from combinatorial structures and satisfy the box independence condition and/or are approximately dissociated.

7.1. From graphs and hypergraphs to spreadable random arrays. Let d ⩾ 2 be an integer, and let V be a finite set with |V| ⩾ d. With every subset A of V^d we associate a boolean, spreadable, d-dimensional random array X^A = ⟨X^A_s : s ∈ $\binom{\mathbb{N}}{d}$⟩ on N defined by setting for every s = {i_1 < · · · < i_d} ∈ $\binom{\mathbb{N}}{d}$, where (ξ_i) is a sequence of i.i.d. random variables uniformly distributed on V. (Notice that if A is a nonempty proper subset of V^d, then the entries of X^A are not independent.) A special case of this construction, which is relevant in the ensuing discussion, is obtained by considering a d-uniform hypergraph on V. Specifically, given a d-uniform hypergraph G on V, we identify G with a subset G of V^d via the rule and we define X^G = ⟨X^G_s : s ∈ $\binom{\mathbb{N}}{d}$⟩ to be the random array in (7.1) that corresponds to the set G. Note that this definition is canonical, in the sense that various combinatorial parameters of G can be expressed as functions of the finite subarrays of X^G. For instance, let n ⩾ d be an integer, and let F be a d-uniform hypergraph on [n]; then, denoting by t(F, G) the homomorphism density of F in G (see [Lov12, Chapter 5]), we have and X^{G,n} denotes the subarray of X^G determined by [n] (see Definition 1.1). Of course, similar identities are valid for weighted uniform hypergraphs.
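The construction (7.1) is easy to simulate. The sketch below (function names are illustrative, not from the paper) samples the subarray X^{G,n} from a hypergraph given by its indicator function on V^d, and estimates the edge density E[X_s] = |G|/|V|^d by Monte Carlo; the exact identities such as (7.3) are not reproduced here.

```python
import itertools
import random

def sample_subarray(G_ind, V, n, d, rng):
    """Sample X^{G,n}: draw i.i.d. uniform xi_1, ..., xi_n from V and set
    X_s = 1_G(xi_{i_1}, ..., xi_{i_d}) for every d-subset s of {0, ..., n-1}."""
    xi = [rng.choice(V) for _ in range(n)]
    return {s: bool(G_ind(tuple(xi[i] for i in s)))
            for s in itertools.combinations(range(n), d)}

def estimate_density(G_ind, V, n, d, trials, seed=0):
    """Estimate E[X_s] = |G| / |V|^d by averaging one fixed entry over samples."""
    rng = random.Random(seed)
    s0 = tuple(range(d))
    hits = sum(sample_subarray(G_ind, V, n, d, rng)[s0] for _ in range(trials))
    return hits / trials
```

For instance, with V = {0, 1}, d = 2 and G the "inequality" graph 1_G(x, y) = [x ≠ y], each entry has mean P(ξ_i ≠ ξ_j) = 1/2, which the estimator recovers.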
As we shall see shortly in Proposition 7.2 below, in this framework the box independence condition for the random array X^G is in fact equivalent to a well-known combinatorial property of G, namely its quasirandomness. That said, we point out that the connection between quasirandomness and random arrays with a symmetric distribution has been observed in much greater generality in the theory of limits of combinatorial structures; see, e.g., [Au08, CR20, DJ08, ES12, J11, Lov12, Ra07, To17].

7.1.1. Quasirandom graphs and hypergraphs. Quasirandom objects are deterministic discrete structures that behave like random ones for most practical purposes. The phenomenon was first discovered in the context of graphs by Chung, Graham and Wilson [CGW88, CGW89], who built upon previous work of Thomason [Tho87]. In the last twenty years the theory has also been extended to hypergraphs, and it has found numerous significant applications in number theory and theoretical computer science (see, e.g., [Rő15]).

7.1.1.1. Much of the modern theory of quasirandomness is developed using the box norms introduced by Gowers [Go07]. Specifically, let d ⩾ 2 be an integer, let (Ω, Σ, µ) be a probability space, and let Ω^d be equipped with the product measure. For every integrable random variable f : Ω^d → R we define its box norm ∥f∥_□ by the rule where µ denotes the product measure on Ω^{2d} and, for every ω = (ω^0_1, ω^1_1, . . . , ω^0_d, ω^1_d) ∈ Ω^{2d} and every ϵ = (ϵ_1, . . . , ϵ_d) ∈ {0, 1}^d, we set ω^ϵ := (ω^{ϵ_1}_1, . . . , ω^{ϵ_d}_d) ∈ Ω^d; by convention, we set ∥f∥_□ := +∞ if the integral in (7.5) does not exist.
The quantity ∥ · ∥_□ is a norm on the vector space {f ∈ L_1 : ∥f∥_□ < +∞}, and it satisfies the following inequality, known as the Gowers–Cauchy–Schwarz inequality: for every collection ⟨f_ϵ : ϵ ∈ {0, 1}^d⟩ of integrable random variables on Ω^d we have For proofs of these basic facts, as well as a more complete presentation of related material, we refer to [GT10, Appendix B].

7.1.1.2. The link between the box norms and quasirandomness is given in the following definition.
Definition 7.1 (Box uniformity). Let d ⩾ 2, let V be a finite set with |V| ⩾ d, and let ϱ > 0. We say that a d-uniform hypergraph G on V is ϱ-box uniform (or, simply, box uniform if ϱ is clear from the context) provided that where G is as in (7.2). (Here, we view V as a discrete probability space equipped with the uniform probability measure.)

Of course, Definition 7.1 is interesting when the parameter ϱ is much smaller than E[1_G]. We also note that, although box uniformity is defined analytically, it has a number of equivalent combinatorial formulations. For instance, it is easy to see that a graph G is box uniform if and only if it has roughly the expected number of 4-cycles; see, e.g., [ACHPS18, CGW88, CGW89, CG90, KRS02, LM15, Rő15, To17] for more information on quasirandomness properties of graphs and hypergraphs and their relation with analytical properties of box norms.

7.1.2. The box independence condition via quasirandomness. We have the following proposition (see part (i) of Definition 3.1 for the definition of box independence).
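For d = 2 the box norm of a kernel M on V × V (with the uniform measure) can be computed exactly: ∥M∥_□^4 averages over the weighted 4-cycles of M, and a direct computation collapses the fourfold sum to ∥MM^T∥_F^2 / n^4. A minimal numerical sketch (the function name is illustrative):

```python
import numpy as np

def box_norm_2d(M):
    """Box norm (d = 2) of a square kernel M on V x V, |V| = n, uniform measure:
    ||M||_box^4 = (1/n^4) * sum_{x0,x1,y0,y1} M[x0,y0] M[x0,y1] M[x1,y0] M[x1,y1]
                = ||M @ M.T||_F^2 / n^4,
    since summing over y0, y1 first yields (M @ M.T)[x0, x1] squared."""
    n = M.shape[0]
    P = M @ M.T
    return (np.sum(P * P) / n**4) ** 0.25
```

As a sanity check in the spirit of Definition 7.1: for the complete bipartite graph on two equal parts, the balanced function f = 1_G − 1/2 factors as −s(x)s(y)/2 for a ±1 sign vector s, so ∥f∥_□ = 1/2 exactly, i.e., the graph is far from box uniform, whereas a typical density-1/2 graph has ∥1_G − 1/2∥_□ of order n^{−1/4}.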
Proposition 7.2. Let d ⩾ 2 be an integer, and let V be a finite set with |V| ⩾ d. Also let G be a d-uniform hypergraph on V, let X^G = ⟨X^G_s : s ∈ $\binom{\mathbb{N}}{d}$⟩ be the random array associated with G via (7.1), and for every integer n ⩾ d let X^{G,n} denote the subarray of X^G determined by [n]. Finally, let ϱ, ϑ > 0. Then the following hold.
Proof. We start with the following observation, which follows readily from (7.1). Let Box(d) be the d-dimensional box defined in (3.2), and let F be a nonempty subset of Box(d). Then there exists a subset^{14} H of {0, 1}^d with |F| = |H| and such that (Here, by µ we denote the uniform probability measure on V^{2d}, and we follow the conventions described right after (7.5).)

We proceed to the proof of part (i). Notice that ∥1_G∥_□ ⩽ ∥1_G∥_{L_∞} ⩽ 1 and, moreover, Taking into account these observations and using our assumption, identity (7.8), a telescoping argument and the Gowers–Cauchy–Schwarz inequality (7.6), we obtain that Since the random array X^G is spreadable, by Definition 3.1 and (7.9), we see that For the proof of part (ii) we will need the following fact.
Fact 7.3. Let the notation and assumptions be as in part (ii) of Proposition 7.2. Then for every nonempty subset F of Box(d) we have

Proof of Fact 7.3. If ϑ > 1, then (7.10) is straightforward; thus, we may assume that 0 < ϑ ⩽ 1. Let n ⩾ 4d be arbitrary. Notice that the random array X^{G,n} is η-spreadable for every 0 < η ⩽ 1 and (ϑ, {1})-box independent. Therefore, the result follows by applying Theorem 3.2 and taking the limit in the left-hand side of (3.5) as η goes to zero and n tends to infinity. □

Using Fact 7.3, we shall estimate the quantity (Here, as in the proof of Lemma 3.6, we use the convention that the product over an empty index-set is equal to 1.) Fix a nonempty subset H of {0, 1}^d, and let F be the subset of Box(d) with |F| = |H| and such that (7.8) is satisfied; since E[X^G_s] = E[1_G] for every s ∈ $\binom{\mathbb{N}}{d}$, by Fact 7.3, we have By (7.11), (7.12) and the fact that the resulting sum vanishes, we conclude that G is (12 ϑ 1/8 d)-box uniform, as desired. □

14 Note that this subset is essentially unique.

7.2.
Mixtures. An important property of the class of boolean, spreadable random arrays is that it is closed under mixtures. More precisely, let n, d, J be positive integers with n ⩾ d ⩾ 2, and let X_1 = ⟨X^1_s : s ∈ $\binom{[n]}{d}$⟩, . . . , X_J = ⟨X^J_s : s ∈ $\binom{[n]}{d}$⟩ be boolean, spreadable, d-dimensional random arrays on [n]. Then, for any choice λ_1, . . . , λ_J of convex coefficients, there exists a boolean, spreadable, d-dimensional random array X = ⟨X_s : s ∈ $\binom{[n]}{d}$⟩ on [n] that satisfies It turns out that boolean, spreadable random arrays that satisfy the box independence condition are also closed under mixtures under suitable conditions. In particular, we have the following proposition (its proof follows from a direct computation).
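Operationally, the mixture can be realized by first sampling a component index j with probability λ_j and then sampling an entire array from X_j; mixing whole arrays, rather than individual entries, is what preserves spreadability. A minimal sketch, with the component samplers supplied by the caller (all names are illustrative):

```python
import random

def sample_mixture(samplers, weights, rng=None):
    """Sample from the mixture of the laws of the given arrays: pick the j-th
    component with probability weights[j] (convex coefficients), then draw one
    whole array from that component's sampler."""
    rng = rng or random.Random()
    (sampler,) = rng.choices(samplers, weights=weights, k=1)
    return sampler(rng)
```

Each `sampler` is expected to be a function `rng -> array` (for instance, `sample_subarray` specialized to a fixed hypergraph).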
Proposition 7.4. Let n, d, J be positive integers with n ⩾ d ⩾ 2, and let δ, ϑ > 0. For Observe that, by Propositions 7.2 and 7.4, if G_1, . . . , G_J are quasirandom, d-uniform hypergraphs with the same edge density, then any mixture of the finite subarrays of X^{G_1}, . . . , X^{G_J} satisfies the box independence condition. We note that this fact essentially characterizes the box independence condition. Specifically, it follows from [DTV21, Propositions 8.3 and 3.1] that for every boolean, spreadable, d-dimensional random array X that satisfies the box independence condition, there exist quasirandom, d-uniform hypergraphs G_1, . . . , G_J with the same edge density such that the law of X is close, in the total variation distance, to the law of a mixture of the finite subarrays of X^{G_1}, . . . , X^{G_J}.

7.3. Further combinatorial structures. Let n, k, d be positive integers with n ⩾ d ⩾ 2 and k ⩽ $\binom{n}{d}$, and let Ξ = ⟨ξ_e : e ∈ $\binom{[n]}{d}$⟩ be a d-dimensional random array with boolean entries that are uniformly distributed on the set of all x ∈ {0, 1}^{$\binom{[n]}{d}$} that have exactly k ones. (In particular, Ξ is exchangeable.) The random array Ξ generates the classical fixed-size Erdős–Rényi random graph/hypergraph, and it is clear that it satisfies the box independence condition. By taking products^{15} of the entries of Ξ as in (1.12), one also obtains exchangeable random arrays that are approximately dissociated.
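Sampling the array Ξ amounts to choosing uniformly which k of the $\binom{n}{d}$ entries equal 1. A sketch (names illustrative):

```python
import itertools
import random

def sample_fixed_size_array(n, d, k, rng=None):
    """Sample Xi: a boolean array on the d-subsets of {0, ..., n-1} with exactly
    k ones, chosen uniformly at random. The resulting array is exchangeable,
    since every permutation of [n] preserves the uniform distribution on
    k-subsets of the index set."""
    rng = rng or random.Random()
    entries = list(itertools.combinations(range(n), d))
    assert k <= len(entries)
    ones = set(rng.sample(entries, k))
    return {e: int(e in ones) for e in entries}
```

For d = 2 this is exactly the fixed-size Erdős–Rényi random graph G(n, k) with k edges.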
Spreadable random arrays, and in particular spreadable random arrays that satisfy the box independence condition, are also closely related to a class of stochastic processes introduced by Furstenberg and Katznelson [FK91] in their proof of the density Hales–Jewett theorem (see also [Au11, DT21]). Unfortunately, this relation is not as transparent as in the case of graphs and hypergraphs, and we shall refrain from discussing it further, since several probabilistic and Ramsey-theoretic tools are required in order to expose it properly.

15 These products have a natural combinatorial interpretation; e.g., they can be used to count subgraphs of random graphs.
8. Quasirandom families of graphs: proof of Theorem 1.8

We start with some preparatory material that will be used throughout this section. If K ⊆ I are two nonempty finite sets, then for every z ∈ {0, 1}^I by z ↾ K ∈ {0, 1}^K we shall denote the restriction of z to K. Moreover, for every subset A of {0, 1}^I and every x ∈ {0, 1}^K by A_x := {y ∈ {0, 1}^{I\K} : x ∪ y ∈ A} we shall denote the section of A at x. We will need the following lemma.
Lemma 8.1 is a typical combinatorial application of conditional concentration, and it follows from [DKT16, Theorem 1′]. That said, for the convenience of the reader, we shall briefly recall the argument, which also gives slightly better estimates in this special case. Let (d_i)_{i=1}^ℓ be the martingale difference sequence of the Doob martingale for 1_A with respect to the filtration (F_i)_{i=0}^ℓ. Since ∥d_1∥²_{L_2} + · · · + ∥d_ℓ∥²_{L_2} = ∥d_1 + · · · + d_ℓ∥²_{L_2} ⩽ 1, there exists i_0 ∈ [ℓ] such that ∥d_{i_0}∥²_{L_2} ⩽ 1/ℓ, which further implies, by the contractive property of conditional expectation, that by Chebyshev's inequality, we obtain that We have that G belongs to A only if every isomorphic copy of G belongs to A; see also (1.9). This subsection is devoted to the proof of the following proposition.
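The orthogonality of martingale differences used above (∥d_1 + · · · + d_ℓ∥²_{L_2} = ∥d_1∥²_{L_2} + · · · + ∥d_ℓ∥²_{L_2}) can be verified exactly on a small discrete cube. The sketch below computes the Doob martingale of a function on the uniform cube {0, 1}^ℓ with respect to the coordinate filtration (names are illustrative):

```python
import itertools

def doob_differences(f, ell):
    """Exact Doob martingale differences of f on the uniform cube {0,1}^ell,
    with F_i generated by the first i coordinates. Returns the list of maps
    x -> d_i(x) = E[f | F_i](x) - E[f | F_{i-1}](x), plus the sample space."""
    cube = list(itertools.product((0, 1), repeat=ell))

    def cond_exp(x, i):
        # E[f | F_i](x): average f over the last ell - i coordinates.
        tails = itertools.product((0, 1), repeat=ell - i)
        vals = [f(x[:i] + t) for t in tails]
        return sum(vals) / len(vals)

    diffs = [{x: cond_exp(x, i) - cond_exp(x, i - 1) for x in cube}
             for i in range(1, ell + 1)]
    return diffs, cube
```

Since the differences telescope to f − E[f] and are pairwise orthogonal in L_2, the sum of their squared L_2-norms equals the variance of f, which is exactly the Pythagorean identity invoked in the pigeonhole step.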
To this end, let z_1, z_2, z_3, z_4 ∈ {0, 1}^{$\binom{[4]}{2}$}, where µ_2 denotes the uniform probability measure on {0, 1}^{$\binom{[4]}{2}$}, and let u_0 ∈ {0, 1}^{$\binom{[4]}{2}$} be such that u_0^{−1}({1}) = ∅. Then, by the definitions of S and δ_4 and the isomorphic invariance of A, we see that µ_2(S_{u_0}) = δ_4 and consequently, by (8.8), (8.15) and (8.16) and the fact that k ⩾ 5, we conclude that

8.2. Proof of Theorem 1.8. The main goal of the proof is to extract from the quasirandom family A a boolean, two-dimensional, approximately spreadable random array X that satisfies the box independence condition; once this is done, the proof will be completed by an application of Theorem 1.5.
8.2.1. Preliminary tools. We start with a more precise, quantitative version of Proposition 1.3 for boolean, two-dimensional random arrays. Specifically, let ℓ, m, r ⩾ 2 be integers with ℓ ⩽ m, and recall that the multicolor hypergraph Ramsey number R_ℓ(m, r) is the least integer N ⩾ m such that for every set X with |X| ⩾ N and every coloring c : $\binom{X}{ℓ}$ → [r] there exists Y ∈ $\binom{X}{m}$ such that c is constant on $\binom{Y}{ℓ}$. It is a classical result due to Erdős and Rado [ER52] that the numbers R_ℓ(m, r) have (at most) a tower-type dependence on the parameters ℓ, m, r. The following fact is the promised quantitative version of Proposition 1.3.
Fact 8.3. Let 0 < η ⩽ 1, let ℓ ⩾ 2 be an integer, and let N be an integer such that Then for every boolean, two-dimensional random array X on [N] there exists L ∈ $\binom{[N]}{ℓ}$ such that the random subarray X_L of X is η-spreadable (see Definition 1.1).
Proof. Fix X and, for notational convenience, set k := 2^{$\binom{ℓ}{2}$}. Observe that there exists a partition of the positive cone of the unit ball of (R^k, ∥ · ∥_{ℓ_1}) into ⌈k/η⌉^k parts, each of ∥ · ∥_{ℓ_1}-diameter at most η. This partition induces, naturally, a coloring c of $\binom{[N]}{ℓ}$ with ⌈k/η⌉^k colors: color L ∈ $\binom{[N]}{ℓ}$ according to the part of the partition that contains the law of X_L. Notice, in particular, that for every L, K ∈ $\binom{[N]}{ℓ}$ with c(L) = c(K) we have ρ_TV(P_L, P_K) ⩽ η, where P_L and P_K denote the laws of the subarrays X_L and X_K, respectively (recall that ρ_TV stands for the total variation distance). By (8.18), there exists M ∈ $\binom{[N]}{2ℓ}$ such that the coloring c is constant on $\binom{M}{ℓ}$. Let L denote the set of the first ℓ elements of M. We claim that X_L is η-spreadable. Indeed, let r ∈ {2, . . . , ℓ} be an integer, and let Q, R ∈ $\binom{L}{r}$. We select J, K ∈ $\binom{M}{ℓ}$ such that Q and R are the sets of the first r elements of J and K, respectively. Then ρ_TV(P_Q, P_R) ⩽ ρ_TV(P_J, P_K) ⩽ η, where P_Q, P_R, P_J, P_K denote the laws of the subarrays X_Q, X_R, X_J, X_K, respectively. □

We proceed by introducing some terminology and some pieces of notation. Let m ⩾ ℓ be positive integers and let F ∈ $\binom{[m]}{ℓ}$; given two subsets L ⊆ M of N with |L| = ℓ and |M| = m, we say that the relative position of L inside M is F if, denoting by {i_1 < · · · < i_m} the increasing enumeration of M, we have that L = {i_j : j ∈ F}.
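The Ramsey-theoretic step used in the proof of Fact 8.3 (find a set on which a coloring of ℓ-subsets is constant) can be illustrated by brute force for tiny parameters; this is only a toy search, with no attempt at the Erdős–Rado bounds:

```python
from itertools import combinations

def monochromatic_subset(N, ell, m, color):
    """Brute-force search for a set M in binom({0,...,N-1}, m) on which the
    coloring `color` of ell-subsets is constant. Feasible only for tiny
    N, ell, m; returns the lexicographically first such M, or None."""
    for M in combinations(range(N), m):
        colors = {color(e) for e in combinations(M, ell)}
        if len(colors) == 1:
            return M
    return None
```

For instance, with N = 6, ℓ = 2 and r = 2 colors, a monochromatic triangle always exists (R_2(3, 2) = 6); coloring the pair {i, j} by the parity of i + j, the first monochromatic triple is {0, 2, 4}.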
if L ∈ $\binom{M}{ℓ}$ is the unique subset of M whose relative position inside M is F, then the two-dimensional random array ⟨1_{A_{x(e,M)}} : e ∈ $\binom{L}{2}$⟩ is η-spreadable. (Here, we view {0, 1}^{$\binom{[n]}{2}$\$\binom{M}{2}$} as a discrete probability space equipped with ν, and we denote by A_{x(e,M)} the section of A at x(e, M).) We have the following lemma.
Lemma 8.5. Let 0 < η ⩽ 1, let ℓ ⩾ 2 be an integer, and set Also let p be an integer with p ⩾ m and set In other words, property (P1) in Definition 8.4 will be satisfied as long as the desired set P is contained in Q_1.
For property (P2) we argue as follows. Let M ∈ $\binom{Q_1}{m}$ be arbitrary; by the choice of the constant m in (8.19) and Fact 8.3 applied to the boolean, two-dimensional random array ⟨1_{A_{x(e,M)}} : e ∈ $\binom{M}{2}$⟩, there exists F_M ∈ $\binom{[m]}{ℓ}$ such that if L ∈ $\binom{M}{ℓ}$ is the unique subset of M whose relative position inside M is F_M, then the random array ⟨1_{A_{x(e,M)}} : e ∈ $\binom{L}{2}$⟩ is η-spreadable. By the choice of q_1 in (8.20) and another application of Ramsey's theorem, there exist P ∈ $\binom{Q_1}{p}$ and F ∈ $\binom{[m]}{ℓ}$ such that F_M = F for every M ∈ $\binom{P}{m}$. That is, property (P2) is satisfied for P, as desired. □

8.2.2. Numerical parameters. Our next step is to introduce some numerical parameters. We fix 0 < δ ⩽ 1 and an integer k ⩾ 2, and we begin by selecting 0 < η, θ_0 ⩽ 1 and an integer ℓ ⩾ 4k such that Finally, we set q_0 := q_1 · 2^{q_1+1} η^{−2} and θ := (1/2) min{θ_0, $\binom{q_0}{4}^{-1}$}.

8.2.3. Completion of the proof. We are ready for the main part of the argument. As above, let $0<\delta\leqslant 1$ and let $k\geqslant 2$ be an integer. Also let $n\geqslant q_0$ be an integer and let $A\subseteq\{0,1\}^{\binom{[n]}{2}}$ be a $\theta$-quasirandom family of graphs with $\mu(A)\geqslant\delta$, where $q_0,\theta$ are as in (8.27).
By Lemma 8.5, for every $Q\in\binom{[n]}{q_0}$ we fix $P_Q\in\binom{Q}{p}$ and $F_Q\in\binom{[m]}{\ell}$ such that $P_Q$ is $(A,\eta,F_Q)$-admissible in the sense of Definition 8.4. Moreover, denoting by $\{r_1<\cdots<r_p\}$ the increasing enumeration of $P_Q$, we set $U_Q := \{r_{j(m-4)J+j} : j\in\{1,2,3,4\}\}\in\binom{Q}{4}$. Then observe that (8.28) is satisfied. In order to see that the first inequality in (8.28) holds, notice that the uniform probability measure on $\binom{[n]}{4}$ can be obtained by first sampling a set $Q\in\binom{[n]}{q_0}$ uniformly at random, and then sampling a set $U\in\binom{Q}{4}$ also uniformly at random; this applies to every $A\subseteq\binom{[n]}{4}$, recalling that $U_Q\in\binom{Q}{4}$ for all $Q\in\binom{[n]}{q_0}$. By (8.28) and the fact that the family $A$ is $\theta$-quasirandom in the sense of Definition 1.7, we may select $Q_0\in\binom{[n]}{q_0}$ such that, writing $U_{Q_0}=\{u_1<u_2<u_3<u_4\}$, we have the corresponding estimate, where $\mathbf{P}$ is the uniform probability measure on $\{0,1\}^{\binom{[n]}{2}\setminus\binom{U_{Q_0}}{2}}$.$^{16}$ Next observe that, by the choice of $U_{Q_0}$, the set $P_{Q_0}$ has $J(m-4)$ elements between any two consecutive $u_j$'s, $J(m-4)$ elements before $u_1$, and $J(m-4)$ elements after $u_4$. Therefore, we may select $M_1,\dots,M_J\in\binom{P_{Q_0}}{m}$ such that, for every $i\in[J]$, denoting by $L_i$ the unique element of $\binom{M_i}{\ell}$ whose relative position inside $M_i$ is $F_{Q_0}$, we have $U_{Q_0}\subseteq L_i$. On the other hand, recall that $P_{Q_0}$ is $(A,\eta,F_{Q_0})$-admissible and that each $L_i$ is the unique subset of $M_i\in\binom{P_{Q_0}}{m}$ whose relative position inside $M_i$ is $F_{Q_0}$. Taking into account these observations and using properties (P1) and (P2) in Definition 8.4, we see that each boolean random array $\langle \mathbf{1}_{A_{x(e,M_i)}} : e\in\binom{L_i}{2}\rangle$ is $\eta$-spreadable and satisfies (1.7) with $\vartheta = 6\eta+\theta$. Let $\{s_1<\cdots<s_m\}$ be the increasing enumeration of $M_i$, and for every $j\in\{1,\dots,\lfloor\ell/k\rfloor\}$ define the corresponding sets. By Theorem 1.5, property (P1) in Definition 8.4 and the previous discussion, for every $j,j'\in\{1,\dots,\lfloor\ell/k\rfloor\}$ with $j\neq j'$ we obtain the required estimates; it follows that some $K\in\binom{[n]}{k}$ is as desired. The proof of Theorem 1.8 is completed.
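The two-step description of the uniform probability measure on $\binom{[n]}{4}$ used for the first inequality in (8.28) can be verified by exact enumeration. The following Python sketch (with toy values $n=7$ and $q_0=5$ of our choosing) confirms that sampling $Q$ uniformly and then $U\subseteq Q$ uniformly yields the uniform law on four-element subsets.

```python
from fractions import Fraction
from itertools import combinations

def two_step_law(n, q0):
    """Law of U obtained by first sampling Q ⊆ [n] with |Q| = q0 uniformly
    at random, and then sampling U ⊆ Q with |U| = 4 uniformly at random;
    returned as a dict mapping each 4-subset to its exact probability."""
    Qs = list(combinations(range(1, n + 1), q0))
    law = {}
    for Q in Qs:
        Us = list(combinations(Q, 4))
        for U in Us:
            law[U] = law.get(U, Fraction(0)) + Fraction(1, len(Qs) * len(Us))
    return law

# Toy check: with n = 7 and q0 = 5, every 4-subset of [7] receives the
# same mass, namely 1/35 = 1/binom(7, 4), i.e. the law of U is uniform.
law = two_step_law(7, 5)
```

The point is simply that each $U$ lies in the same number of sets $Q$, so the mass it accumulates does not depend on $U$.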
Remark 8.6 (Analysis of the bounds). Using the Erdős–Rado theorem [ER52], it is not hard to see that the proof of Theorem 1.8 yields a tower-type dependence of $\theta$ and $\ell_0$ on the parameters $\delta$ and $k$. More precisely, there exists a primitive recursive function $\psi\colon\mathbb{N}\times\mathbb{N}\to\mathbb{N}$ belonging to the class $\mathcal{E}^4$ of Grzegorczyk's hierarchy$^{17}$ such that $\theta^{-1},\ell_0\leqslant\psi(\lceil\delta^{-1}\rceil, k)$ for every $0<\delta\leqslant 1$ and every integer $k\geqslant 2$.
Remark 8.7 (Extensions to families of uniform hypergraphs). …there exists $W\in\{0,1\}^{\binom{[n]}{2}\setminus\binom{K}{2}}$ such that $W\cup H\in A$ for every $H\in S$. With this terminology, Conjecture 1.6 simply asks whether every dense family of graphs smashes some clique, while Theorem 1.8 is equivalent to saying that if the family $A$ is dense and quasirandom (in the sense of Definition 1.7), then it smashes all graphs with at most one edge on some $K\in\binom{[n]}{k}$. It would be interesting to find quasirandomness conditions that ensure that the family $A$ smashes richer families of small graphs. In this direction, the following problem is the most intriguing.
Problem 8.9. Find natural quasirandomness conditions on a family of graphs $A$ that ensure that $A$ smashes all graphs on some $K\in\binom{[n]}{k}$.

Appendix A. Examples
Our goal in this appendix is to present examples showing that the box independence condition in Theorems 1.4 and 5.1 is essentially optimal. We focus on boolean random arrays, as this case already exhibits all the underlying phenomena.

A.1. Boxes and faces. We start by introducing some terminology that will be used throughout this section. Let $d\geqslant 2$ be an integer; we say that a subset $B$ of $\binom{\mathbb{N}}{d}$ is a $d$-dimensional box of $\mathbb{N}$ if it is a $d$-dimensional box of $[n]$ for some integer $n\geqslant 2d$ (see Subsection 3.1). Moreover, we say that a subset $F$ of

A.2. The two-dimensional case. We have the following proposition.
Proposition A.1. There exists a boolean, exchangeable, two-dimensional random array $\boldsymbol{X}=\langle X_s : s\in\binom{\mathbb{N}}{2}\rangle$ on $\mathbb{N}$ with the following properties.

Proof. We will define the random array $\boldsymbol{X}$ by providing an integral representation of its distribution. (Of course, this maneuver is expected by the Aldous–Hoover representation theorem [Ald81, Hoo79].) Specifically, set $V := \{0,1\}$ and $A := \{(0,0),(1,1)\}\subseteq V^2$; we view $V$ as a discrete probability space equipped with the uniform probability measure. We also set $\Omega := \{0,1\}^{\binom{\mathbb{N}}{2}}$ and we equip $\Omega$ with the product $\sigma$-algebra, which we denote by $\Sigma$. Let $\mathbf{P}$ denote the $(1/2,1/2)$-mixture of the uniform distribution on $\{0,1\}^{\binom{\mathbb{N}}{2}}$ and the law of the random array $\boldsymbol{X}^A$ associated with $A$ via (7.1); that is, $\mathbf{P}$ is the unique probability measure on $(\Omega,\Sigma)$ that satisfies, for every nonempty finite subset $F$ of $\binom{\mathbb{N}}{2}$, that where: (i) $\mu$ denotes the product measure on $V^{\mathbb{N}}$ obtained by equipping each factor with the uniform probability measure on $V$, and (ii) for every $v=(v_i)\in V^{\mathbb{N}}$ and every $s\in\binom{\mathbb{N}}{2}$, by $v_s$ we denote the restriction of $v$ to the coordinates determined by $s$. Next, for every $s\in\binom{\mathbb{N}}{2}$ let $X_s\colon\Omega\to\{0,1\}$ denote the projection onto the $s$-th coordinate, that is, $X_s\big((x_t)_{t\in\binom{\mathbb{N}}{2}}\big)=x_s$ for every $(x_t)_{t\in\binom{\mathbb{N}}{2}}\in\Omega$. The fact that the set $A$ is symmetric implies that the random array $\boldsymbol{X}=\langle X_s : s\in\binom{\mathbb{N}}{2}\rangle$ is exchangeable; moreover, for every nonempty finite subset $F$ of $\binom{\mathbb{N}}{2}$ we have Using (A.2), properties (P1)–(P4) follow from a direct computation. In order to verify property (P5) we argue as in the proof of Proposition 2.8. Fix an integer $n\geqslant 8$. Let $\mathrm{Box}(2)$ be the $2$-dimensional box of $\mathbb{N}$ defined in (3.2). We define $f\colon\mathbb{R}^{\binom{[n]}{2}}\to\mathbb{R}$ by setting, for every $x=(x_t)_{t\in\binom{[n]}{2}}\in\mathbb{R}^{\binom{[n]}{2}}$, $f(x) := \prod_{s\in\mathrm{Box}(2)} x_s - \mathbb{E}\big[\prod_{s\in\mathrm{Box}(2)} X_s\big]$. It is clear that $f$ is a multilinear polynomial of degree $4$ that satisfies $\mathbb{E}[f(\boldsymbol{X}_n)]=0$ and $\|f(\boldsymbol{X}_n)\|_{L_\infty}\leqslant 1$. (Recall that $\boldsymbol{X}_n$ denotes the subarray of $\boldsymbol{X}$ determined by $[n]$.) Let $I$ be an arbitrary subset of $[n]$ with $|I|\geqslant 8$.
Since $|I|\geqslant 8$, there exists a $2$-dimensional box $B$ of $\mathbb{N}$ with $B\subseteq\binom{I}{2}$ and such that $\min(s)\geqslant 5$ for every $s\in B$. Set $C := \bigcap_{s\in B}[X_s=1]$ and observe that $C\in\mathcal{F}_I$. Hence, by the exchangeability of $\boldsymbol{X}$, we obtain that $\mathbb{P}\big(\mathbb{E}[f(\boldsymbol{X}_n)\mid\mathcal{F}_I]\geqslant 2^{-11}\big)\geqslant 2^{-11}$. The proof is completed. □

A.3. The higher-dimensional case. The following result is the higher-dimensional analogue of Proposition A.1.
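Before passing to higher dimensions, we note that the mixture $\mathbf{P}$ constructed in the proof of Proposition A.1 can be probed numerically. The Python sketch below assumes the representation $X^A_s=\mathbf{1}_A(v_s)$ for the array associated with $A=\{(0,0),(1,1)\}$ via (7.1) (our reading of the integral representation, so that $X_s$ indicates $v_i=v_j$) and computes exact small moments by enumeration over a triangle of indices, with no sampling.

```python
from fractions import Fraction
from itertools import product

# Pairs over the vertex set {1, 2, 3}; A = {(0,0), (1,1)} means X_s = 1[v_i == v_j].
PAIRS = [(1, 2), (1, 3), (2, 3)]

def mixture_probability(event):
    """P(event) under the (1/2,1/2)-mixture of the uniform law on {0,1}^PAIRS
    and the law of the A-component; exact enumeration."""
    half = Fraction(1, 2)
    # Uniform component: each pair coordinate is an independent fair bit.
    unif = sum(Fraction(1, 2 ** len(PAIRS))
               for bits in product((0, 1), repeat=len(PAIRS))
               if event(dict(zip(PAIRS, bits))))
    # A-component: v_1, v_2, v_3 independent fair bits, X_{ij} = 1[v_i == v_j].
    comp = sum(Fraction(1, 8)
               for v in product((0, 1), repeat=3)
               if event({(i, j): int(v[i - 1] == v[j - 1]) for (i, j) in PAIRS}))
    return half * unif + half * comp

# Each marginal is a fair coin, but the triple is far from independent:
assert mixture_probability(lambda x: x[(1, 2)] == 1) == Fraction(1, 2)
assert mixture_probability(lambda x: all(x[s] == 1 for s in PAIRS)) == Fraction(3, 16)
```

The last assertion (the probability $3/16$ exceeds the value $1/8$ that independence would give) exhibits the kind of correlation that the conditioning argument above exploits.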
Proposition A.2. Let $d\geqslant 3$ be an integer and let $\delta>0$. Then there exists a boolean, exchangeable, $d$-dimensional random array $\boldsymbol{X}=\langle X_s : s\in\binom{\mathbb{N}}{d}\rangle$ on $\mathbb{N}$ with the following properties. Note that (A.5) barely fails to imply that $\boldsymbol{X}$ satisfies the box independence condition.
The examples provided by Proposition A.2 can be roughly described as semi-random, in the sense that they are part random and part deterministic. The following lemma provides us with the random component.
Lemma A.4. Let $d\geqslant 3$ be an integer, and let $\varepsilon>0$. Then there exist a nonempty finite set $V$ and a symmetric$^{18}$ subset $A$ of $V^{d-1}$ such that, denoting by $A^{\complement}$ the complement of $A$, for every pair $F, G$ of disjoint (possibly empty) subsets of $\binom{[2d]}{d-1}$ we have that (A.6) holds, where: (i) $\mu$ denotes the product measure on $V^{\mathbb{N}}$ obtained by equipping each factor with the uniform probability measure on $V$, (ii) for every $v=(v_i)\in V^{\mathbb{N}}$ and every $s=\{i_1<\cdots<i_{d-1}\}\in\binom{\mathbb{N}}{d-1}$ we have $v_s=(v_{i_1},\dots,v_{i_{d-1}})\in V^{d-1}$, and (iii) in (A.6) we use the convention that the product of an empty family of functions is equal to the constant function $1$.
Lemma A.4 follows from a standard random selection argument together with the Azuma–Hoeffding inequality; see, e.g., [DTV21, Fact 3.3 and Lemma 3.4] for a proof.
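For the reader's convenience, here is a minimal numerical illustration (not part of the proof of Lemma A.4) of the Azuma–Hoeffding inequality in its simplest instance: for a sum $S_n$ of $n$ independent uniform $\pm 1$ signs, $\mathbb{P}(S_n\geqslant t)\leqslant e^{-t^2/(2n)}$, and for small $n$ the exact tail can be computed by enumeration and compared with the exponential bound.

```python
from fractions import Fraction
from itertools import product
from math import exp

def exact_tail(n, t):
    """P(S_n >= t) for S_n a sum of n independent uniform ±1 signs,
    computed exactly by enumerating all 2^n sign patterns."""
    hits = sum(1 for signs in product((-1, 1), repeat=n) if sum(signs) >= t)
    return Fraction(hits, 2 ** n)

def hoeffding_bound(n, t):
    """Azuma-Hoeffding bound exp(-t^2/(2n)) for the same tail probability."""
    return exp(-t * t / (2 * n))

# The exact tail is dominated by the exponential bound for every threshold:
assert all(exact_tail(10, t) <= hoeffding_bound(10, t) for t in range(0, 11))
```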
We are ready to proceed to the proof of Proposition A.2.
Proof of Proposition A.2. Let $V$ and $A$ be the sets obtained by applying Lemma A.4 with (A.7) $\varepsilon := \min\{\delta\, 2^{-d2^d},\ 8^{-1}2^{-(d+2)2^d}\}$, and observe that $V$ can be selected so that its cardinality is an even positive integer. We also note that in the rest of the proof we follow the notational conventions of Lemma A.4.
Note that the function $H$ is symmetric.$^{19}$ In the following series of claims we isolate several properties of $H$ that will be used in the proofs of properties (P1)–(P5).
First, we show that $\boldsymbol{X}$ satisfies properties (P1)–(P4). For property (P1), let $s\in\binom{\mathbb{N}}{d}$ be arbitrary and notice that, by the exchangeability of $\boldsymbol{X}$ and (A.28), where, as in Claim A.5, we have $t_1=\{1,\dots,d\}$. By (A.12) and the choice of $\varepsilon$ in (A.7), we obtain that $\big|\mathbb{E}[X_s]-\frac{1}{2}\big|\leqslant 2^{d-2}\varepsilon\leqslant\delta$. For property (P2), let $s,t\in\binom{\mathbb{N}}{d}$ be distinct, and set $k := d-|s\cap t|+1$. Since $\boldsymbol{X}$ is exchangeable, by (A.28), we have where $t_1$ and $t_k$ are as in Claim A.5. By (A.13), (A.30) and invoking again (A.7), we see