Multiplicative chaos measures from thick points of log-correlated fields

We prove that multiplicative chaos measures can be constructed from extreme level sets, or thick points, of the underlying logarithmically correlated field. We develop a method which covers the whole subcritical phase and only requires asymptotics of suitable exponential moments for the field. As an application, we establish that these estimates hold for the logarithm of the absolute value of the characteristic polynomial of a Haar distributed random unitary matrix (CUE), using known asymptotics for Toeplitz determinants with (merging) Fisher-Hartwig singularities. This proves a conjecture of Fyodorov and Keating concerning the fluctuations of the volume of thick points of the CUE characteristic polynomial.

1. Introduction

1.1. Background and motivation. Log-correlated fields are a class of stochastic processes that have recently appeared in various models of probability theory and mathematical physics (see e.g. [2,4,6,8,9,10,12,26,29,33,42] and references therein). Formally, a (Gaussian) log-correlated field X is a centered Gaussian process on a metric space Ω with covariance

(1.1) C_X(x, y) := E[X(x)X(y)] = log dist(x, y)^{-1} + h(x, y),

where h : Ω × Ω → R is continuous. More specifically, we focus on the case where Ω ⊂ R^d is either a bounded open set, in which case dist(x, y) = |x − y| denotes the Euclidean distance, or a d-dimensional (smooth) Riemannian manifold, with or without a boundary, in which case dist denotes the intrinsic metric. We allow this generality because we are interested in the 2d Gaussian free field restricted to the unit circle T = R/2πZ, that is, the Gaussian field with covariance

(1.2) C_X(θ, x) = log |e^{iθ} − e^{ix}|^{-1}, θ, x ∈ T,

and its approximation given by the log characteristic polynomial of the circular unitary ensemble; cf. Theorem 1.2.
Since the covariance (1.1) blows up on the diagonal, C_X(x, x) = ∞, X must be understood as a random generalized function (X is a Gaussian random element in a Sobolev space of negative regularity index; see e.g. [32, Section 2] for a review of precise definitions). Despite log-correlated fields not being honest random functions, some of their geometric properties, such as extrema and extreme level sets, can be understood to a degree. Namely, if X_N is an approximation of X with relevant spatial scale 1/N (e.g. a discretization on a lattice of mesh 1/N, a smoothing at scale 1/N, or possibly a more complicated approximation coming from a model of statistical mechanics), it has been proven in several instances that, given a compact set K ⊂ Ω with non-empty interior, max_{x∈K} X_N(x) = (1 + o(1)) √(2d) log N (as N → ∞) and that for γ ∈ (0, √(2d)),

(1.3) |{x ∈ K : X_N(x) ≥ γ log N}| = N^{−γ²/2 + o(1)} as N → ∞.

The subset {x ∈ K : X_N(x) ≥ γ log N} is known as the set of γ-thick points of the field, and |·| denotes its Lebesgue measure (or volume form if we are on a manifold). A discussion of such claims in a rather general setting can be found in [10, Section 3]. In this context γ = √(2d) is called the critical value and it plays a prominent role in this theory.
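These two asymptotics are easy to observe numerically in a toy approximation of the circle field (1.2): truncating its Fourier series at N modes gives a smooth Gaussian field with pointwise variance Σ_{k≤N} 1/k ≈ log N. The sketch below is purely illustrative (our own choice of model and parameters, not part of the paper's argument); here d = 1, so the critical value is √2.

```python
import numpy as np

rng = np.random.default_rng(0)
n_modes, n_grid = 1000, 2000
k = np.arange(1, n_modes + 1)
grid = np.linspace(0, 2 * np.pi, n_grid, endpoint=False)

# Truncated Fourier series with covariance sum_{k<=N} cos(k(theta-x))/k,
# an approximation of log|e^{i theta} - e^{i x}|^{-1}.
A = rng.standard_normal(n_modes) / np.sqrt(k)
B = rng.standard_normal(n_modes) / np.sqrt(k)
X = np.cos(np.outer(grid, k)) @ A + np.sin(np.outer(grid, k)) @ B

logN = np.log(n_modes)
gamma = 0.8
# Maximum grows like sqrt(2) log N to leading order (d = 1)
print(X.max() / logN)
# Lebesgue measure of the gamma-thick points decays like N^{-gamma^2/2 + o(1)}
vol = 2 * np.pi * np.mean(X >= gamma * logN)
print(vol)
```

At this truncation level the o(1) corrections are still sizeable, but the maximum is visibly of order √2 log N and the thick-point volume is small and positive.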
A successful approach to describing the geometry of log-correlated fields is through studying Gaussian multiplicative chaos (GMC) measures associated to X. More precisely, under some mild assumptions on the approximations X_N (E_N denotes the expectation with respect to the law of X_N), it is known that for γ > 0,

(1.4) µ_{N,γ}(dx) := (e^{γX_N(x)} / E_N[e^{γX_N(x)}]) dx

converges (with respect to the vague topology on Ω) to a limiting measure µ_{X,γ} as N → ∞ (either almost surely, in probability, or in distribution, depending on the type of approximation in question). Here dx denotes the Lebesgue measure if Ω ⊂ R^d, or the volume form if Ω is a d-dimensional Riemannian manifold. We refer to e.g. [4] and [10, Section 2] for general convergence statements and further references. These measures are intimately related to the geometric properties mentioned above: the fact that a non-trivial limit exists only for 0 < γ < √(2d), while for γ ≥ √(2d) the limit is zero, is closely related to the leading order asymptotics of the maximum. It is also known that in the N → ∞ limit, the random measure µ_{N,γ} concentrates on the set of γ-thick points (1.3); see e.g. [10, Section 3] for general claims and [1, Theorem 1.3] or [35, Proposition 1.6] for statements in the case of the circular ensembles.
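For a Gaussian approximation, the normalization in (1.4) is explicit, E_N e^{γX_N(x)} = e^{γ² E_N[X_N(x)²]/2}, so the concentration of µ_{N,γ} on thick points can be observed directly. The following sketch (again with a toy truncated Fourier field on the circle standing in for X_N; illustrative only) checks that the total mass has mean close to 2π and that most of the mass sits where X_N is nearly γ log N:

```python
import numpy as np

rng = np.random.default_rng(1)
n_modes, n_grid, n_samples = 1000, 2000, 20
k = np.arange(1, n_modes + 1)
grid = np.linspace(0, 2 * np.pi, n_grid, endpoint=False)
cos_m, sin_m = np.cos(np.outer(grid, k)), np.sin(np.outer(grid, k))

gamma = 0.7
sigma2 = np.sum(1.0 / k)        # = E X_N(x)^2 for the truncated field, ~ log N
dx = 2 * np.pi / n_grid
totals, fracs = [], []
for _ in range(n_samples):
    A = rng.standard_normal(n_modes) / np.sqrt(k)
    B = rng.standard_normal(n_modes) / np.sqrt(k)
    X = cos_m @ A + sin_m @ B
    mu = np.exp(gamma * X - gamma**2 * sigma2 / 2)   # density of mu_{N,gamma}
    totals.append(mu.sum() * dx)                     # total mass, mean = 2*pi
    thick = X >= (gamma - 0.3) * np.log(n_modes)     # slightly easier threshold
    fracs.append(mu[thick].sum() / mu.sum())         # mass fraction on thick points
print(np.mean(totals), np.mean(fracs))
```

The slightly lowered threshold (γ − 0.3 instead of γ) compensates for the small value of log N at this truncation level; the mass fraction on such points is already close to 1.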
It is expected that multiplicative chaos measures have a deeper connection to fluctuations of thick points, and such results have been obtained in particular cases. In [27,5,30,3], the authors prove such results for, respectively, branching Brownian motion, the discrete two-dimensional free field, local times of Brownian motion, and a log-correlated field which is a probabilistic model for the logarithm of the modulus of the Riemann zeta function on the critical line. These works nevertheless rely heavily on specific properties of the models (e.g. the branching structure in branching Brownian motion, the Markov property of the free field, a martingale structure, etc.). A general approach to this question, working even in the generality of Gaussian approximations of generic log-correlated fields, is lacking, though one expects that a general result holds. Indeed, in the theoretical physics literature, such results have been conjectured to hold universally for any reasonable approximation of a logarithmically correlated random field. As an example, we quote the following conjecture of Fyodorov and Keating [26, Section 2 (d)].

Conjecture 1.1 (Fyodorov and Keating (2014)). Let U_N be a random matrix distributed according to the Haar measure on the group of N × N unitary matrices (known as the circular unitary ensemble or CUE) and let p_N(θ) = det(1 − U_N e^{−iθ}) be the characteristic polynomial of U_N restricted to the unit circle {e^{iθ} : θ ∈ T}. Then as N → ∞, for any 0 < γ < 1, the random variable in (1.5) converges in distribution to a positive random variable with density P_γ. Here G(z), z ∈ C, denotes the Barnes G-function (cf. Lemma 5.3).
There is another related conjecture [26, Section 2.3] for the centering and fluctuations of max_{θ∈T} log |p_N(θ)| as N → ∞. In particular, the limit corresponds to the sum of two independent Gumbel random variables with a specific mean. For circular β-ensembles with general β > 0, the centering term has been obtained in [8] and the fluctuations have been described in [40] as the sum of a Gumbel random variable and an independent random variable expressed in terms of the derivative martingale. Identifying the law of this random variable in terms of GMC is still an open problem.
We return to the normalization of (1.5), the origin of this conjecture and the relationship to GMC after stating our main result for the CUE, Theorem 1.2. Let us point out that if a random variable Ξ has density P_γ, then Ξ^{−1/γ²} is exponentially distributed with rate 1. Indeed, this can be checked by making the change of variables t = x^{−1/γ²} in E[f(Ξ)] for any bounded continuous f : (0, ∞) → R. The main purpose of this article is to develop a robust machinery which allows us to describe the fluctuations of the sets of thick points under some concrete assumptions on the covariance (1.1) of the field X and the asymptotics of exponential moments of its approximations X_N. The relevant notation and assumptions are described in Section 1.3 and our main result, valid for a general approximation scheme, is presented in Section 1.4.
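This observation also gives a quick way to simulate (and sanity-check) the density P_γ: if T is exponential with rate 1, then Ξ = T^{−γ²} has density P_γ, and its distribution function is F(x) = exp(−x^{−1/γ²}). A small Monte Carlo check of the change of variables (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(2)
gamma = 0.6
# If Xi^{-1/gamma^2} ~ Exp(1), then Xi = T^{-gamma^2} with T ~ Exp(1),
# and P(Xi <= x) = P(T >= x^{-1/gamma^2}) = exp(-x^{-1/gamma^2}).
T = rng.exponential(1.0, size=200_000)
Xi = T ** (-gamma**2)
xs = np.array([0.5, 1.0, 2.0, 5.0])
empirical = np.array([(Xi <= x).mean() for x in xs])
analytic = np.exp(-xs ** (-1.0 / gamma**2))
print(np.abs(empirical - analytic).max())   # small Monte Carlo error
```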

1.2. Main results.
In the setting of the previous section, let us introduce, for γ > 0, the measures

(1.6) ν_{N,γ}(dx) := …,

where g : Ω → R is a tunable (continuous) function. Recall that µ_{X,γ} denotes the GMC measure associated with X, that is, the (distributional) limit of the exponential measures (1.4). One of our main results studies this measure ν_{N,γ}(dx) in the setting of the circular unitary ensemble. More precisely, we have the following result.
In relation to Conjecture 1.1, let X_N = √2 log |p_N|, so as to fix the critical value to √2 (the standard value in the literature) instead of 1. It is known (e.g. [21, Theorem 7.5.1]) that for any γ > 0, as N → ∞, …. We offer a proof of this result in Lemma B.2, as our approach involves certain generalizations of these asymptotics. To verify the assumptions of Lemma B.2, see (1.17) for the appearance of the Barnes functions. Hence, as a consequence of Theorem 1.2 with g = 0, for any γ ∈ (0, √2), in distribution as N → ∞, …, where X is the restriction of the 2d Gaussian free field to T (namely, it has the covariance (1.2)). In fact, our method also establishes that the moments of order β ∈ (0, 1] converge; cf. Theorem 1.8 below. In particular, E[µ_{X,γ}(dθ)] = dθ/(2π) with our convention, and this should be compared to (1.5) (upon multiplying by Γ(1 − γ²/2) and replacing γ by √2 γ). It is yet another conjecture from [25] that the probability density function of the random variable Γ(1 − γ²/2) µ_{X,γ}(T) can be computed explicitly, and that it is given by P_γ. This conjecture has been proved independently, using different methods, in [41,9]. Hence, we obtain the following result.

Corollary 1.3. Conjecture 1.1 is true.
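For readers who want to experiment, the field X_N = √2 log |p_N| is easy to simulate: a Haar unitary can be generated by the QR factorization of a complex Ginibre matrix with the standard phase correction. The sketch below (our own illustration, not part of the proofs) checks two known facts at the fixed point θ = 0: E log |p_N(0)| = 0, and, using the exact second moment E|Tr U_N^k|² = min(k, N), Var(√2 log |p_N(0)|) = Σ_{k≥1} min(k, N)/k² ≈ log N + O(1).

```python
import numpy as np

rng = np.random.default_rng(4)

def haar_unitary(n):
    """Haar-distributed U(n) matrix via QR of a complex Ginibre matrix."""
    Z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    Q, R = np.linalg.qr(Z)
    d = np.diag(R)
    return Q * (d / np.abs(d))   # phase fix so the distribution is exactly Haar

N, M = 50, 400
samples = np.empty(M)
for m in range(M):
    U = haar_unitary(N)
    # X_N(0) = sqrt(2) log |p_N(0)| = sqrt(2) log |det(1 - U)|
    _, logabs = np.linalg.slogdet(np.eye(N) - U)
    samples[m] = np.sqrt(2) * logabs

k = np.arange(1, 100 * N)
predicted_var = np.sum(np.minimum(k, N) / k**2)   # ~ log N + O(1)
print(samples.mean(), samples.var(), predicted_var)
```

The empirical mean is close to 0 and the empirical variance matches the prediction, consistent with the convention Var X_N(θ) ∼ log N used throughout the paper.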
It is also of general interest to obtain similar results for smoothings (by convolution) of general Gaussian log-correlated fields. Let Ω ⊂ R^d be an open set, let h : Ω × Ω → R be locally α-Hölder continuous (α ∈ (0, 1]) and let X be a log-correlated field on Ω with covariance kernel (1.1).

Theorem 1.4. Let ρ ∈ C_c(R^d → R_+) be a probability density function (∫_{R^d} ρ(x) dx = 1) and let ρ_δ = δ^{−d} ρ(δ^{−1}·) for δ > 0. Let K ⊂ Ω be a compact set. For a small c = c_{K,ρ} > 0, define a stochastic process on (0, c] × K by (1.8). For any γ ∈ (0, √(2d)) and g ∈ C(Ω → R), the random measure ν_{N,γ} (with X_δ, δ = 1/N, in place of X_N) converges in probability as N → ∞ (with respect to the vague topology) to e^{−γg} µ_{X,γ}.
For δ > 0, (1.8) denotes the convolution of the random Schwartz distribution X with a continuous compactly supported test function, so it is almost surely well-defined (the constant c K,ρ depends only on the support of the mollifier ρ and on dist(K, ∂Ω)). This provides a natural family of approximations of X as δ → 0.
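As a concrete instance of the variance normalization behind such approximations, consider the circle field (1.2), whose covariance is Σ_{k≥1} cos(k(θ − x))/k. Smoothing at scale δ with the Poisson kernel multiplies the k-th mode by e^{−δk}, so the smoothed field has pointwise variance Σ_{k≥1} e^{−2δk}/k = −log(1 − e^{−2δ}) = log δ^{−1} − log 2 + O(δ), i.e. log δ^{−1} + O(1), as one expects for a convolution approximation at scale δ. A numerical confirmation of this closed form (illustrative only):

```python
import numpy as np

delta = 0.01
k = np.arange(1, 2_000_000)
var_num = np.sum(np.exp(-2 * delta * k) / k)   # Var X_delta(theta), numerically
var_exact = -np.log1p(-np.exp(-2 * delta))     # closed form -log(1 - e^{-2 delta})
print(var_num, var_exact, np.log(1 / delta) - np.log(2))
```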
We will prove Theorems 1.2 and 1.4 together by developing a general framework which guarantees that the random measures ν_{N,γ} and the usual GMC approximation µ_{N,γ} of (1.4) have the same limit (in probability, in the whole subcritical phase γ ∈ (0, √(2d))) under general assumptions on the approximation of X; see Theorem 1.8 below. These assumptions are stated in the next section in terms of exponential moments of the field X_N. Importantly, this allows us to consider non-Gaussian approximations such as the log characteristic polynomial of a random matrix model. We demonstrate the applicability of our method for the CUE in Section 5 and obtain Theorem 1.2. Most of the relevant asymptotic estimates are already available in the literature (see e.g. [11,14,15]), based on the relationship with orthogonal polynomials and Riemann-Hilbert problems with Fisher-Hartwig singularities. This method and the appropriate results are reviewed in Section 5, where we carefully emphasize the (technical) modifications which are required to derive Theorem 1.2.
While the CUE is arguably the most basic random matrix ensemble, we believe that our approach can be adapted to a large class of unitary invariant ensembles, such as the one considered in [10] for a related problem, modulo additional technicalities while performing the steepest descent analysis of the corresponding Riemann-Hilbert problems.
1.3. Notation and general assumptions. Throughout this article, the dimension d ∈ N is fixed. Recall that Ω ⊂ R^d is either a bounded open set (as in Theorem 1.4, in which case dist(x, y) = |x − y| denotes the Euclidean distance) or a d-dimensional smooth Riemannian manifold (as in Theorem 1.2, with d = 1, Ω = T and dist(θ, x) = |e^{iθ} − e^{ix}|). In the latter case, we assume that on any compact K ⊂ Ω, the metric tensor G satisfies c_K I_d ≤ G ≤ C_K I_d. In both cases, since we consider the vague topology of convergence of measures, one can always assume (by a partition of unity) that Ω ⊂ R^d is a ball (the previous condition on G guarantees that dist is locally equivalent to the Euclidean distance). Thus, to ease notation, we will from now on write dist(x, y) = |x − y|.
The interpretation of X being a log-correlated field with covariance structure (1.1) is that, for f ∈ C_c^∞(Ω), the pairing ⟨X, f⟩ of (1.9) is a centered Gaussian random variable with variance ∬ C_X(x, y) f(x) f(y) dx dy; cf. (1.1). One can extend (1.9) to more general functions (by density). For instance, this quantity is (almost surely) well-defined for any f ∈ L^∞_c(Ω) (see (1.18) for a definition of this space). In general, we consider two layers of approximation:

• For N ∈ N, X_N : Ω → R is a lower-semicontinuous function such that for any f ∈ C_c^∞(Ω), ⟨X_N, f⟩ → ⟨X, f⟩ in distribution as N → ∞.
One views X_N as an approximation of the field X coming e.g. from a statistical mechanics model at a (microscopic) scale 1/N, and X_{N,δ} as a smoothing of X_N at a scale δ > 1/N. In particular, (X_N)_{N∈N} need not be defined on the same probability space, while the second layer of approximation X_{N,δ} is defined on the same probability space as X_N for every N ∈ N. We assume that X_N has mild regularity (e.g. lower-semicontinuity, as holds for the log characteristic polynomial in the context of Theorem 1.2) so that the level set of γ-thick points as in (1.6) is well-defined. Let us also record that in the context of Theorem 1.4, one works only with the second layer of approximation, and we restrict ourselves to a basic class of mollifiers, ρ_{δ,x} = ρ_δ(· − x), with ρ as in the statement. Instead of a continuous approximation, one could consider a discretization of X on a graph embedded in Ω with mesh size 1/N. The methods of this paper can be adapted to this case under similar assumptions, modulo some rather involved notational changes, so we do not pursue such arguments here.
We define the kernels, for x, z ∈ Ω and ε, δ ∈ (0, 1], by (1.10). In the sequel, we require that ρ_{δ,x} is a suitable smoothing kernel (depending on the covariance structure of the log-correlated field X) in the following sense.

Assumption 1.5. For any compact set K ⊂ Ω, there exists c_K > 0 so that …, where both error terms are uniform for (x, y).

Depending on the model, one is led to consider different smoothings, which is why we work in an abstract setup and formulate general assumptions. However, the reader might keep in mind the following two schemes:

• Let (φ_k)_{k∈N} be an orthonormal basis of L²(Ω) satisfying the following conditions: … 2) The kernel (1.1) satisfies … for x ∈ Ω and δ ∈ (0, 1], in which case we obtain convolution approximations as in Theorem 1.4.

For instance, in the context of Theorem 1.2, X_N = √2 log |det(I − e^{i·} U_N)|, where U_N is a Haar distributed random N × N unitary matrix. We can decompose this field in the Fourier basis,

Moreover,
Tr U_N^k / k → N_k in distribution as N → ∞, where (N_k)_{k∈N} are independent complex Gaussians with variance 1/k; see e.g. [18]. Then, we consider the approximation, for δ ∈ (0, 1], …. These are random trigonometric polynomials. One could also consider other regularization schemes, such as convolution with the Poisson kernel on T, ρ_δ(x) = Σ_{k∈N} e^{−δk+ikx}, in which case one ends up with regularizations of the form ….

Our main assumptions are formulated in terms of exponential moments of X_N and X_{N,δ} = ⟨X_N, ρ_{δ,x}⟩. Since these assumptions are rather elaborate, we need to introduce further notation. For a (large) R > 0 and a (small) c > 0, let …, and for a compact K ⊂ Ω, …. We also write D_∞ for the infinite strip [0, √(2d)] × R. Note that the value √(2d) corresponds to the GMC critical point.
One should interpret this equation as saying that the RHS exists and defines Ψ_N for the relevant choices of parameters. This quantity encodes the joint exponential moments of X_N and its approximations at different scales δ_j, as ⟨X_N, f_{δ,z}⟩ = Σ_{j=1}^q ξ_j X_{N,δ_j}(z_j), where (ξ_j)_{j=1}^q are parameters. We use the following shorthand notation: …. The function Ψ_N controls the effects coming from any non-trivial (or non-Gaussian) behavior of the model on microscopic scales, and we assume that it satisfies the following properties.

Assumption 1.6. The following assumptions hold for Ψ_N, for a fixed q ∈ N_0.
Let us further comment on this assumption. On the one hand, if ζ_1 = ζ_2 = 0, it implies that for z ∈ Ω^q, ξ ∈ C^q and δ_1, …, δ_q ≥ N^{−1+η}, …. This means that on arbitrary mesoscopic scales δ ≥ N^{−1+η}, the regularized process X_{N,δ} is essentially Gaussian with covariance kernel as in (1.10). On the other hand, if ζ_2 = ξ = 0, then it holds locally uniformly for (x, ζ) ∈ Ω × D_∞ that …. Note that it follows from this that Ψ(0, x) = 1 for all x ∈ Ω; in particular, it was not actually necessary to include this in Assumption 1.6. The interpretation we suggest the reader keep in mind is that the function Ψ encodes the non-trivial microscopic structure of X_N. Then, (1.14)-(1.15) imply that, to leading order, X_N has variance log N, correlation kernel C_X(x_1, x_2) for x_1 ≠ x_2, and that this microscopic information decorrelates at any mesoscopic distance N^{−1+η}.
For instance, in the context of Theorem 1.2, by Lemma 5.3, the CUE characteristic polynomials satisfy, for (ζ, θ) ∈ D_∞ × T, as N → ∞, …, where G(z), z ∈ C, denotes the Barnes G-function (G(x) > 0 for x > 0). In the CUE case, these quantities are independent of θ ∈ T because of the (distributional) rotational invariance of the eigenvalues of U_N. Moreover, the function Ψ plays a key role in the moderate deviations of the field X_N, as reflected by (1.7). Our method requires a slightly stronger version of this assumption when the singularities (x_1, x_2) are macroscopically separated. In this regime, we must allow the imaginary parts of ζ_1, ζ_2 to grow like a (small) power of E[X_N²] ∼ log N.

Assumption 1.7. The following assumptions hold for Ψ_N, for a fixed q ∈ N_0. (1) For any (small) c, η > 0, let R_N = (log N)^η and recall (1.12)-(1.13). For fixed ξ ∈ R^q, δ ∈ (0, 1]^q, z ∈ Ω^q, we have, uniformly in ζ_1, ζ_2 ∈ D_{R_N} and (x, y) ∈ A_c, …. (2) For any compact K ⊂ Ω, there exist constants C = C_K and κ = κ_K such that ….

1.4. General results. Recall that in terms of the (log-correlated) field X_N, we define the random measures …. Note that we have dropped the subscript γ, and g ∈ C(Ω → R) is a tunable function. Our main statement is the following.

Theorem 1.8. Fix γ ∈ (0, √(2d)) and g ∈ C(Ω → R). Under Assumptions 1.5, 1.6 and 1.7, for any f ∈ C_c^∞(Ω), ….

Our result shows that, under general hypotheses and in the whole subcritical phase, the fluctuations of the volume of the set of γ-thick points and of the exponential measure (1.4) have the same limit in probability as N → ∞. If it is already known that the exponential measure converges to GMC as N → ∞, then we obtain the following immediate consequence.

Corollary 1.9. Under Assumptions 1.5, 1.6 and 1.7, if µ_N → µ_{X,γ} in distribution (or in probability) for γ ∈ (0, √(2d)) as N → ∞, with respect to the vague topology, then ν_N → e^{−γg} µ_{X,γ} in the same sense.
1.5. Organization of the paper. In Section 2, we present the general strategy of the proof of Theorem 1.8, which is based on a second moment method modified by the introduction of a barrier, and we explain the new ideas underlying this result. In Section 3, we provide the details of the proofs, which are largely based on technical estimates using characteristic functions and Fourier analysis. In Section 4, we verify that a convolution approximation of a Gaussian log-correlated field satisfies the relevant assumptions. This allows us to obtain Theorem 1.4 by applying Corollary 1.9. Finally, in Section 5, we verify that the log characteristic polynomial of the circular unitary ensemble satisfies the assumptions from Section 1.3, and thus we obtain Theorem 1.2. In particular, the (exponential) moment conditions from Assumptions 1.6 and 1.7 are checked based on the determinantal structure of the CUE and the relationship to the Riemann-Hilbert problem for orthogonal polynomials on the unit circle. Finally, in Appendix A, we provide further background on the (uniform) Gaussian approximation of a distribution function in terms of the asymptotics of its characteristic function.
In the sequel, we use the following convention for the Fourier transform: for n ∈ N, …. We will use that the Fourier transform extends to a linear operator on L¹(R^n), L²(R^n) and on Schwartz distributions, with the usual properties.
Recall that T = R/2πZ and that for a function V ∈ C(T → C), we define its Fourier coefficients by …. We also find it convenient to introduce notation for the space of functions that are compactly supported and essentially bounded: ….

2. Main steps of the proof of Theorem 1.8

2.1. Notation. We fix the parameter γ ∈ (0, √(2d)), a continuous function g : Ω → R (cf. (1.6)) and a small parameter η > 0 ("small" depending only on γ < √(2d)). We rely on the conventions from Section 1.3. Let K ⊂ Ω be compact and let C = C_{η,K,R} denote a constant which changes from line to line (this constant also depends on the function h from (1.1) and g from (1.6)). If C depends on another parameter, this will be emphasized.
We define the barrier, for any ℓ, L ∈ N with ℓ < L, by …. We view B_{ℓ,L} both as an event and as a random subset of Ω. When considering it as an event, we write, for x ∈ Ω, B_{ℓ,L}(x) = {X_{N,e^{−k}}(x) ≤ (γ + η)k : k ∈ [ℓ, L]}. We also write B^c_{ℓ,L} for the complementary set (or event). Our proof of Theorem 1.8 is divided into three steps.

2.2. Step 1, Truncation. This step aims to show that the barrier is a typical event, in the sense that the following bounds hold.
Proposition 2.1. There exists a constant C = C_{η,g} such that for all ℓ, N ∈ N, ….

The proof of this proposition is given in Section 3.1. It relies on a simple union bound based on our assumptions. In particular, it is important to set up the barrier so that e^{−L_N} = N^{−1+η} is a mesoscopic scale.
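To see heuristically why the barrier is typical, recall that X_{N,e^{−k}}(x) has variance k + O(1), so the Gaussian (Chernoff) bound gives P(X_{N,e^{−k}}(x) > (γ + η)k) ≲ e^{−(γ+η)²k/2}, and the union bound over scales k ≥ ℓ is a geometric series of size O(e^{−(γ+η)²ℓ/2}). The toy computation below illustrates this decay (a heuristic sketch with exact Gaussian tails, not the proof, which proceeds through Assumptions 1.5 and 1.6):

```python
import numpy as np

gamma, eta = 1.0, 0.1
rate = (gamma + eta) ** 2 / 2   # exponent in P(N(0,k) > (gamma+eta)k) <= e^{-rate*k}

def union_bound(ell, kmax=2000):
    """Sum of Gaussian tail bounds over the scales k = ell, ..., kmax - 1."""
    k = np.arange(ell, kmax)
    return np.sum(np.exp(-rate * k))

for ell in (5, 10, 15):
    print(ell, union_bound(ell))   # geometric decay in ell
```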
By Proposition 2.1 and the Cauchy-Schwarz inequality, it holds for any f ∈ L^∞_c(Ω) that …, where the implied constant is independent of N (but may depend on γ, η, f, g). Moreover, expanding the square, we can write …. The next two steps aim to show that (the integrals of) Θ_{j,N,ℓ} for j ∈ {1, 2, 3} all have the same limit as N → ∞ and then ℓ → ∞.

2.3. Step 2, Excluding the diagonal. Let us fix a compact K ⊂ Ω, e.g. K = supp(f). Let us first record a direct consequence of (1.14) and Assumption 1.6. If we drop the barriers in the expression of Θ_1, then …, so that for x, y ∈ K with |x − y| ≥ N^{−1+η}, …, where h is as in (1.1). This estimate is sharp up to microscopic scales: using (1.16), we verify that E_N[µ_N(x)²] = C_{γ,x} N^{γ²}, which is of the same order as the above estimate if |x − y| ≃ N^{−1}. Moreover, for γ ∈ [√d, √(2d)), the RHS of (2.5) is not locally integrable on Ω², which is why we introduce the barrier in the first place. In fact, using the barrier and Markov's inequality, it holds for any α ≥ 0 that …, where we choose δ = e^{−L_N} = N^{−1+η}. Then, according to Assumption 1.6 (with ζ_1 = 2γ, ζ_2 = 0, q = 1, ξ = −α (in the numerator) or ξ = 0 (in the denominator) and z = x), this implies that for x ∈ K, …. Hence, choosing α = γ − η, we obtain the uniform bound, for x ∈ K, …. Now, by the Cauchy-Schwarz inequality, this estimate implies that for x, y ∈ K, …. Thus, by choosing η > 0 small enough, this quantity converges to 0 as N → ∞.
Similarly, according to Lemma B.2, we have for x ∈ K, …. Hence, by the Cauchy-Schwarz inequality again (ignoring the barrier for the ν_N-terms), we obtain for j ∈ {1, 2, 3} and η > 0 small enough (depending only on γ < √(2d)), …. By using the barrier again, we obtain the following result in Section 3.2.
Proposition 2.2. Let 0 < η < min(γ, 1), ℓ ∈ N and let K ⊂ Ω be a compact set. There exists a constant C = C_{η,γ,K,g} such that for j ∈ {1, 2, 3} and for x, y ∈ K with N^{η−1} ≤ |x − y| ≤ e^{−ℓ}, ….

The proof of Proposition 2.2 relies on Assumption 1.6 and Fourier analytic arguments. In contrast to (2.5), the RHS of (2.8) is integrable near the diagonal (at least for small enough η) and the constant C is independent of the parameter ℓ. Hence, by combining the bounds (2.7) and (2.8), we conclude that if the parameter η is small enough, then for any ℓ, N ∈ N, …. Going back to (2.1)-(2.2), this shows that …. This completes the second step of the proof. The last step consists essentially in computing the pointwise limit of Θ_{j,N,ℓ}(x, y) for j ∈ {1, 2, 3} when the points (x, y) are macroscopically separated.

2.4. Step 3, Macroscopic regime. As already suggested, we expect that the limits of Θ_{j,N,ℓ} for j ∈ {1, 2, 3} are all of the same form. More precisely, we will show that

(2.10) …,

where N(m_{x,y}, Σ_{x,y}) is a multivariate Gaussian measure with mean m_{x,y} ∈ R^{2cℓ}, covariance matrix Σ_{x,y} ∈ R^{2cℓ×2cℓ}, and c ∈ N is a large enough constant (depending only on (γ, η)). If these limits were to hold uniformly over {|x − y| ≥ e^{−ℓ}}, by taking first N → ∞ and then ℓ → ∞, we could conclude that …, which would complete the proof of Theorem 1.8.
While it would be possible to obtain (2.10) with the required uniformity, we proceed differently, since this is technically slightly simpler. More precisely, in Section 3.3, we prove the following results.

Proposition 2.3. Fix ℓ, L ∈ N with L > ℓ. For j ∈ {1, 2}, it holds pointwise for x, y ∈ Ω with x ≠ y that … e^{γ²C_X(x,y)} e^{γg(x)+γg(y)}.
The functions (x, y) → m_{x,y} ∈ R^{2L} and (x, y) → Σ_{x,y} ∈ R^{2L×2L}, the latter taking values in positive-definite matrices, are both continuous on Ω × Ω and are given by Definition 3.5. Moreover, for any D > 0, if L = Rℓ with R sufficiently large (depending on η, γ, D), then …, and it holds pointwise for any x, y ∈ Ω with x ≠ y that ….

The proof of Proposition 2.3 relies on Assumption 1.7 and is the most involved probabilistic argument of this paper. While proving Proposition 2.3, we also obtain the following bounds: for j = 1, 2 and ℓ ∈ N, …. By (2.9) and using the bound (2.12), we obtain …. Moreover, combining (2.11) and (2.13) shows that the pointwise limit (as N → ∞) of the integrand above is ≤ 0. Since it is uniformly bounded from above on the set {|x − y| ≥ e^{−ℓ}}, cf. (2.14), by the reverse Fatou lemma we conclude that …. The LHS is independent of the parameter ℓ, so this quantity converges to 0, and this completes the overall argument. In the next section, we provide the details of the proof.
3. Proof of Theorem 1.8

In this section, we give the complete proof of Theorem 1.8 following the strategy explained in Section 2. We rely on the setting from Section 1.3 as well as on the notation from Section 2; in particular, on (1.6), (1.14), (2.3) and Assumptions 1.5, 1.6 and 1.7.
3.1. Proof of Proposition 2.1. As a size estimate suffices for us here, we can simply bound |f(x)| ≤ ‖f‖_∞ 1_K(x), where K = supp(f), and by scaling we can assume that ‖f‖_∞ = 1; we thus focus on the setting f(x) = 1_K(x) for some compact K ⊂ Ω. Let us begin by obtaining a bound in terms of ℓ. By definition of the barrier and a union bound, …. According to (1.14) and (1.10), for β, γ ∈ [0, √(2d)], α ∈ R and x ∈ Ω, …. Then, as a consequence of Assumptions 1.5 and 1.6, there is a constant C = C_{K,η} so that …. By Markov's inequality, this bound with β = γ implies that for any α > 0, …. Choosing e.g. α = η, this sum converges and we obtain that for any N, ℓ ∈ N, …. This proves the first claim. We can proceed similarly for the second claim: we can use the method from Appendix A.1 to compute every term in this sum. For future purposes, we record a more general proposition. Recall that A_c := {(x, y) ∈ K² : |x − y| ≥ c} for a small c > 0.
Proposition 3.1. There exists a constant C = C_{K,η,g} such that for any c > 0 there is N_c ∈ N so that the following holds for (x, y) ∈ A_c, N ≥ N_c and δ ∈ [N^{η−1}, 1]: …, where β_{N,δ} = γ + α log δ / log N. In case ζ = 0, the previous bound holds for all N ∈ N and x ∈ K.
Proof. Let us write X̃_N = (X_N(x) − γ log N)/√(log N), σ_N = √(log N), and define a new probability measure ….
We want to compute the asymptotics of Q[X_N(x) ≥ γ log N + g], where Q = Q_{N,y,x}, by using Lemma A.1. The characteristic function of the random variable X̃_N under the measure Q biased by e^{βX_N(x)} is given by …, for χ ∈ R. Using (1.14) and (1.10), we can rewrite this characteristic function as …. It is then natural to choose β = β_{N,δ}, and, in the regime where β_{N,δ} → β as N → ∞, …, locally uniformly for z ∈ S = {z ∈ C : |Re z| ≤ W} and uniformly for (x, y) ∈ A_c, upon choosing W > 0 small enough (cf. Assumption 1.6). Thus, the condition (A.1) holds (provided that β_{N,δ} → β and β > 0, by assumption), so by applying Lemma A.1, we conclude that, uniformly for (x, y) ∈ A_c and locally uniformly, …. Hence, by definition of Q, this implies that for (x, y) ∈ A_c, …. Note that by a compactness argument, this bound holds for any possible sequence δ(N) ∈ [N^{η−1}, 1], since we have seen that β_{N,δ} ∈ [γη/2, γ] and the bound holds along any subsequence such that β_{N,δ} → β as N → ∞.
Using the asymptotics of Lemma B.2, this proves the claim for ζ ≠ 0. For ζ = 0, the argument is even simpler: we do not need to worry about the variable y, so c plays no role, and indeed the estimates hold for all N and x ∈ K.
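The upshot of this tilting argument is an estimate of the familiar Gaussian form P(X_N(x) ≥ γ log N) ≈ e^{−γ² log N/2}/(γ√(2π log N)), corrected by the model-dependent factor Ψ. The Gaussian core of this asymptotic is easy to verify directly (a sketch under the simplifying assumption that X_N(x) is exactly N(0, log N); the factor Ψ is ignored):

```python
from math import erfc, exp, log, pi, sqrt

gamma = 0.8
ratios = []
for N in (1e4, 1e8, 1e16):
    s2 = log(N)                                     # variance of X_N(x)
    exact = 0.5 * erfc(gamma * s2 / sqrt(2 * s2))   # P(N(0, s2) >= gamma * s2)
    approx = exp(-gamma**2 * s2 / 2) / (gamma * sqrt(2 * pi * s2))
    ratios.append(exact / approx)
print(ratios)   # the ratio tends to 1 as N grows
```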

3.2. Mesoscopic regime: Proof of Proposition 2.2. The proof is divided into three subsections, where we obtain the relevant bounds for Θ_{j,N,ℓ}, j ∈ {1, 2, 3}, respectively.

3.2.1. Bound for Θ_1. Since g is locally uniformly bounded, the bound for Θ_{1,N,ℓ} we are after will follow from bounding …. Our strategy is to replace the barriers here by a single judiciously chosen bound, which we then estimate with Markov's inequality. More precisely, let us take δ = e^{−k} with k = ⌈log |x − y|^{−1}⌉ and then note that for x, y ∈ K and any α > 0, …. By definition, we rewrite …. From Assumption 1.6, we see that the ratio of the Ψ-functions converges to 1 (uniformly in all parameters), while using Assumption 1.5 and the definition of k, …, with different errors (these errors are bounded functions of (x, y) ∈ K², and the convergence holds uniformly if δ → 0). This implies that for β, γ, α ∈ [0, √(2d)], …. This estimate directly implies (2.8) for Θ_1.

3.2.2. Bound for Θ_2. Recall that our goal is to bound
and g ∈ C(Ω). The strategy is analogous to that of the previous subsection: we replace the barriers by e^{−γ(X_{N,δ}(x)−(γ+η)k)}, with the same choice of k and δ = e^{−k} as before. The required estimate is then provided by the following result.
Proposition 3.2. For any γ, η > 0 and any compact set K ⊂ Ω, there exists a constant C = C_{γ,η,K} such that for x, y ∈ K with |x − y| ≥ N^{−1+η} and k = ⌈log |x − y|^{−1}⌉, ….

Proposition 3.2 provides control of Θ_2 on mesoscopic scales. Just as in the previous subsection, we deduce from this proposition the estimate (2.8) for Θ_2: for x, y ∈ K with e^{−ℓ} ≥ |x − y| ≥ N^{−1+η}, …. In order to prove Proposition 3.2, we rely on the following result.
Lemma 3.3. …, and can be written in the form …. Moreover, let γ_1, γ_2 > 0 (possibly depending on N) be such that γ_1, γ_2 > γ for some fixed γ > 0, and set φ_j = (e^{−γ_j ·} 1_{R_+}) ⋆ χ. Then we have the following bound: for any sequence of strictly positive finite numbers σ_N, ….

Proof. Taking Fourier transforms, we find by Plancherel's formula that the integral in question equals …. Since χ is continuous and has compact support in [−R, R], using our assumption on the form of f_N, we find …. The integrand on the RHS is the unnormalized density of a centered Gaussian distribution with …, which concludes the proof.
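The Plancherel step can be illustrated on a pair of functions of the same shape as those in the lemma: a Gaussian density playing the role of f_N against the exponential profile e^{−x}1_{x>0} playing the role of φ_j (before mollification; here we take γ_j = 1). With the convention f̂(ξ) = ∫ f(x)e^{−iξx} dx, the two transforms are e^{−ξ²/2} and 1/(1 + iξ), and the identity ∫ fg = (2π)^{−1} ∫ f̂ ĝ̄ can be checked numerically against the closed form E[e^{−Z}1_{Z>0}] = e^{1/2} P(Z > 1) for Z ~ N(0, 1) (a toy check, not the paper's computation):

```python
import numpy as np
from math import erfc, exp, sqrt

x = np.linspace(-40.0, 40.0, 400001)
dx = x[1] - x[0]
f = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)   # standard Gaussian density
g = np.where(x > 0, np.exp(-x), 0.0)         # e^{-x} 1_{x > 0}
lhs = np.sum(f * g) * dx                     # \int f g dx

fhat = np.exp(-x**2 / 2)                     # Fourier transform of f
ghat = 1.0 / (1.0 + 1j * x)                  # Fourier transform of g
rhs = np.real(np.sum(fhat * np.conj(ghat))) * dx / (2 * np.pi)

closed_form = exp(0.5) * 0.5 * erfc(1 / sqrt(2))   # = e^{1/2} P(Z > 1)
print(lhs, rhs, closed_form)
```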
where k and δ = e^{−k} are as in the statement of the proposition.
For convenience, we also denote σ_N = √(log N) and (X_1, X_2), where f_N is the law of (X_1, X_2) under Q_{N,x,y}.
Using (1.14), Assumption 1.6 and (3.3), as in Section 3.2.1 (recall also that k = ⌈log |x − y|^{−1}⌉ and δ = e^{−k}), we find that for x, y ∈ K with |x − y| ≥ N^{−1+η}, …. Now, we would like to apply Lemma 3.3 in order to estimate this quantity. This requires the asymptotics of the characteristic function …. Namely, with ζ_j = γ_j − iξ_j σ_N^{−1}, we find from (1.14) and (1.10) that …, and F_N is defined implicitly by the previous equation.
In particular, using (3.3), we have …, where the ratio of the Ψ-functions is uniformly bounded for ξ_1, ξ_2 ∈ [−Rσ_N, Rσ_N] and x, y ∈ K with |x − y| ≥ N^{−1+η} (cf. Assumption 1.6), and the implied constant is uniform in ξ_1, ξ_2. This implies that …; that is, f_N satisfies all the assumptions of Lemma 3.3 (note that 0 < κ_N ≤ 1 − η). Now, let χ : R → R_+ be an even mollifier with compact support such that ∫_R χ = 1. As in Lemma 3.3, set φ_j = (e^{−γ_j ·} 1_{R_+}) ⋆ χ for j = 1, 2, and let … and …. Then, observe that …. Since χ is even, …. Upon rescaling χ, we can assume that this integral is at least 1/3 (recall that we assume that γ_j > c > 0). These trivial bounds imply that …. Hence, by Lemma 3.3, we conclude that …. Going back to the estimate (3.5) and replacing σ_N = √(log N), …. Using the asymptotics from Lemma B.2, we conclude that ….

Proof. For x, y ∈ K, we define a new measure by …, where k is as in the statement of the proposition, and set δ = e^{−k} (in particular, the dependence of this measure on x comes through δ). Then, we can rewrite …. We use the method described in Appendix A.1 to compute the first factor on the RHS of (3.6). To this end, we let … and obtain the asymptotics of the characteristic function of the random variable X̃_N under the measure Q_{N,x,y} biased by e^{βX_N(x)}, that is, for …. This ratio of exponential moments can again be computed using (1.14), since the log singularity cancels (cf. Assumption 1.5 with δ = e^{−⌈log |x−y|^{−1}⌉}). Then, by Assumption 1.6, we have for small enough W > 0 and ζ ∈ S = {z ∈ C : |Re z| < W}, …. Hence, the characteristic function of X̃_N under Q_{N,x,y} satisfies the condition (A.1) with ε_N = 1/σ_N and limit ψ_{x,y}(z) = Ψ^{(γ+z)}(x)/Ψ^{(γ)}(x). We emphasize that this limit is locally uniform in {(x, y) ∈ K² : |x − y| ≥ N^{−1+η}} and in {z ∈ C : |Re z| ≤ W} for a small W > 0.
Hence, we can apply Lemma A.1 with β = γ; we obtain Using also formula (B.3), by (3.6) and the definition of the measure Q N,x,y , this shows that where we used that δ = e −⌈log |x−y| −1 ⌉ and |x − y| ≥ N −1+η . This yields the claim. 3.3.1. Upper-bound for Θ 1 . The goal of this subsection is to obtain (2.11) for Θ 1 . Let us denote by P 1 N,x,y the probability measure with density proportional to e γX N (x)+γX N (y) with respect to P N and observe that by (2.3) and (2.5), it holds for N sufficiently large, for any fixed ℓ, L ∈ N with L > ℓ. This already yields the bound (2.14); for x, y ∈ K with |x − y| ≥ e −ℓ , The next proposition provides the required pointwise limit of the RHS of (3.7). To state the result, we need the following definitions. Note that (x, y) ↦ Σ x,y and (x, y) ↦ m x,y are both continuous functions on Ω × Ω which can be written explicitly in terms of the kernels (1.10).
We will use this notation throughout this section and we let, for x, y ∈ Ω with x = y, We have the following result.
Proposition 3.6. For x, y ∈ Ω with x ≠ y, under P 1 N,x,y , the random vector Y converges in distribution to a multivariate Gaussian law N (m x,y , Σ x,y ).
Proof. By definition, the Laplace transform of Y under the measure P 1 N,x,y is given by where q = 2L, δ ∈ (0, 1) q and z ∈ Ω q are given according to Definition 3.5. Moreover, according to (1.14), we can rewrite From Assumption 1.6, we see that this ratio of the Ψ-functions converges to 1 (for any fixed x, y ∈ Ω with x ≠ y and ξ ∈ R q ). Hence, this implies that for any ξ ∈ R q Since pointwise convergence of Laplace transforms implies convergence in distribution, this completes the proof.
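For orientation, in the exactly Gaussian case the mechanism behind Proposition 3.6 is the classical Cameron–Martin shift. The following sketch is for illustration only (it assumes X is a centered Gaussian field with covariance kernel C; in the text the same conclusion is extracted from the asymptotics of Assumption 1.6):

```latex
% For a centered Gaussian field X with covariance C and a test function g,
\mathbb{E}\,e^{\langle X,g\rangle}
  = \exp\Big(\tfrac12\iint g(u)\,C(u,v)\,g(v)\,du\,dv\Big),
% so under the tilted measure d\mathbb{P}^{1}\propto e^{\gamma X(x)+\gamma X(y)}\,d\mathbb{P},
% the field X is again Gaussian with the SAME covariance and shifted mean
m_{x,y}(\cdot) = \gamma\,C(\cdot,x)+\gamma\,C(\cdot,y),
% which is the Gaussian analogue of the limit law \mathcal{N}(m_{x,y},\Sigma_{x,y}).
```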
By Proposition 3.6, we see that for any fixed L, ℓ ∈ N and x, y ∈ Ω with x ≠ y, as N → ∞, Then, with (3.7), this completes the proof of (2.11) in the case j = 1.

3.3.2.
Upper-bound for Θ 2,N,ℓ . We proceed as in Section 3.3.1 and introduce a new probability measure P 2 N,x,y given by, for x, y ∈ Ω, dP 2 Using this notation, it holds for any fixed L ≥ ℓ (and N sufficiently large). Our next proposition provides the asymptotics of the quantity on the RHS of (3.10). Recall the notation (1.13).
Proposition 3.7. For x ≠ y, under P 2 N,x,y , the random vector (3.8) converges in distribution as N → ∞ to a multivariate Gaussian N (m x,y , Σ x,y ). Moreover, for any c > 0, we have uniformly for (x, y) ∈ A c , By (3.10), Proposition 3.7 immediately implies (2.11) in the case j = 2, that is, Moreover, using the uniformity of the limit (3.11), this also establishes the bound (2.14) in the case j = 2. The proof of Proposition 3.7 is the only step which requires Assumption 1.7 and the results of the Appendix A.2. It is the most technically involved step in this paper.
Proof of Proposition 3.7. Let us introduce some further notation. Let σ N = √ log N and let q = 2L. We also consider another probability measure depending on ξ ∈ R q , Our strategy is to compute the Laplace transform of the random vector Y under the measure P 2 N,x,y , that is, using the previous notation, for ξ ∈ R q , (3.13) where P 1 N,x,y = L N,x,y,0 as in Section 3.3.1. To compute these quantities, we rely on classical Fourier-analytic arguments based on the asymptotics of Assumption 1.7. These arguments are presented in detail in Subsection A.2.
Integrating by parts and making a change of variables, we find To compare with Subsection A.2, we use ǫ N = 1/γσ N . Then, choosing L N := log σ c N for a fixed c > 2/γ, we have (3.14) Now, we can use the uniform approximation of Proposition A.3 to compute the leading term up to an error of order o(σ −2 N ) if we can verify the assumptions (A.7)-(A.8) on the characteristic function of (X x , X y ) under the measure L = L N,x,y,ξ . Using the notation from Definition 3.5, the characteristic function of this random vector is given by, for χ ∈ R 2 , where we set and we have used that We claim that this characteristic function satisfies the assumptions of Subsection A.2 with In particular, Assumption 1.7 guarantees that (A.7) holds for a small η > 0 uniformly for λ = (x, y) ∈ A c ; the parameters ξ ∈ R q and δ ∈ (0, 1) q are fixed here. Its limit is given by Going back to formula (3.14), this implies that uniformly in (x, y) ∈ A c , Integrating by parts again, we obtain By Lebesgue's dominated convergence theorem, we conclude that for any ξ ∈ R q and uniformly for (x, y) ∈ A c , According to formula (3.13) and the asymptotics (3.9), this shows that for any ξ ∈ R q and for x, y ∈ Ω with x ≠ y, This yields the first claim concerning the convergence in distribution of Y under P 2 N,x,y . For the second claim, we can rewrite in a similar way, Using again the asymptotics (3.15) and Lemma B.2, we obtain uniformly in (x, y) ∈ A c ,
On the other hand, by Assumption 1.6, we also have uniformly for (x, y) ∈ A c , Hence we conclude, with the required uniformity, Our goal is now to obtain the lower bound (2.3). We choose L = Rℓ with R > 1 and define Recall (2.3) and that L N = ⌊(1 − η) log N ⌋. Then, by a union bound, we can bound for x ≠ y, We also define for α, δ > 0 and x, y ∈ Ω, By Markov's inequality, using this notation, it holds for any fixed α > 0, In Subsection 3.3.4, we obtain the bound (2.12) for the quantity Υ 2,N,L (provided that R is large enough and α = η). Then, in Subsection 3.3.5, we compute the pointwise limit of Υ 1,N,ℓ . This last step completes the proof of Proposition 2.3. Both subsections are based on the tools presented in Appendix A.1.
Similarly, we can bound Υ 2,δ 2,N in terms of exponential moments; Proof. We use the method described in Section A.1; let .

3.3.5.
Asymptotics of Υ 1,N,ℓ . We consider a new probability measure with density proportional to ν N (x)µ N (y), that is, for x, y ∈ Ω, Recall (3.8); according to (3.16), we have The goal of this section is to obtain the following result. Proposition 3.9. For x, y ∈ Ω, x ≠ y, under P 3 N,x,y , the random vector Y converges in distribution as N → ∞ toward a multivariate Gaussian N (m x,y , Σ x,y ). Moreover, By (3.20), Proposition 3.9 directly implies that the limit of Υ 1,N,ℓ is given by (2.13), concluding the proof of Proposition 2.3.
Proof. First observe that according to Proposition 3.8 with α = 0, we have for x, y ∈ Ω, x ≠ y, Hence, by (2.4)-(2.5), we obtain (3.21). To prove the first claim, we consider the Laplace transform of the random vector Y , that is, for ξ ∈ R q , where we used that ξ · Y = ⟨X N , f δ,z ⟩ according to Definition 3.5. We rely again on the method from the Appendix A.1 to compute this quantity. For this purpose, we consider the Laplace transform of the random variable X N = (X N (x) − γ log N )ǫ N with ǫ N = 1/ √ log N under the biased measure L = L N,x,y,ξ , (3.12), that is, for χ ∈ C, Using (1.14), we can rewrite for ζ ∈ D ∞ , where ϕ(x, y) = γC X (x, y) + ∫ C X (x, u)f δ,z (u)du. By Assumption 1.6, this Laplace transform satisfies the condition (A.1). Hence, applying Lemma A.1 with β = γ, we obtain for fixed ξ ∈ R q and x, y ∈ Ω with x ≠ y, as N → ∞, Going back to (3.22), using this twice (once with f = 0), implies that for x, y ∈ Ω with x ≠ y, We already computed the limit of this quantity in the proof of Proposition 3.6, so this completes the proof of Proposition 3.9.

Verification of assumptions for Gaussian fields
A natural task is to verify that the assumptions from Section 1.3 hold for a large class of convolution approximations for a Gaussian log-correlated field, since this is arguably the most basic way of regularizing it. In addition, this allows us to prove Theorem 1.4.

4.1.
Convention and GMC convergence. Throughout this section, we assume that Ω ⊂ R d is an open set and that X is a (mean-zero) Gaussian log-correlated field with correlation kernel (1.1), where h : Ω × Ω → R is locally α-Hölder continuous for some α ∈ (0, 1]. In this context, for a compact K ⊂ Ω, we define a regularization (X δ (x)) x∈K,δ∈(0,c] by convolution with a smooth mollifier ρ, see (1.8). We also abuse notation and let (4.1) We recall the following classical result from [4] concerning the existence of the GMC measures associated with X in the subcritical phase. There are many other results about the existence of GMC measures (and the convergence of different Gaussian approximations): [4] is very concise, [42] offers a review of the subject, while [43] is perhaps the most general treatment.
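For orientation, the convolution regularization and the associated covariance kernels take the following form; this is a sketch of what is presumably the content of (1.8) and (1.10) (the precise normalization is an assumption here):

```latex
X_\delta(x) \;=\; \int_{\mathbb{R}^d} X(x-\delta u)\,\rho(u)\,du,
\qquad
C_{X,\delta,\epsilon}(x,y) \;:=\; \mathbb{E}\big[X_\delta(x)\,X_\epsilon(y)\big]
\;=\; \iint C_X(x-\delta u,\,y-\epsilon v)\,\rho(u)\,\rho(v)\,du\,dv .
```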
The goal of this section is to verify that this function Ψ N satisfies both Assumptions 1.6 and 1.7. This boils down to precise estimates for the regularized kernels (1.10). In particular, we will also obtain Assumption 1.5 at the end of the proof.

4.2.
Estimates for regularized correlation kernels. Recall that in the context of Theorem 1.4, for x, y ∈ K, where h is α-Hölder continuous for some α ∈ (0, 1]. Recall that ρ(du) is a probability measure on R d with a continuous density with compact support. Without loss of generality, we assume that supp(ρ) ⊂ {u ∈ R d : |u| ≤ 1}. We define for x, z ∈ K and ǫ, δ ∈ [0, c], with the convention that g 0,δ = h δ and h 0 = h. We immediately verify that there is a constant C = C K,h so that for any x, y, z ∈ K and ǫ, δ ∈ [0, c], Lemma 4.2. Assume that α < 1. For any x, z ∈ K and δ ∈ (ǫ, c], we have as ǫ → 0, Proof. By (1.10), for z, x ∈ K and δ ∈ (0, c], Since ρ is uniformly bounded, for x, y, z ∈ K and r > 0, Together with the estimate (4.3), this implies that Now, using the bound |log |1 + θ|| ≤ C|θ| α valid for |θ| ≤ 1/2, we obtain where the implied constant depends only on α. Hence, using that we obtain for z, x ∈ K, which proves the claim.
Proof. To simplify the proof, we assume that the probability measure ρ is rotationally invariant; the argument is straightforward to adapt to the general case.
By definition of C X,δ , (1.10), we have for δ ∈ (0, c] and x, z ∈ K, At the second step, we made a change of variables and used that ρ is rotationally invariant; e 1 denotes the first basis vector in R d . Now, we have max |v|≤1 ∫ log |v + u| ρ(du) < ∞, and for 0 < r < 1, ∫ log |e 1 + ru| ρ(du) = (1/2) ∫ log[(1 + ru 1 ) 2 + r 2 |u ⊥ | 2 ] ρ(du) ≤ Cr 2 , where we decompose u ∈ R d as u = (u 1 , u ⊥ ) and the constant C depends only on ρ. This proves the first claim. The second claim follows by the same argument.
where U N is a Haar distributed random N × N unitary matrix and T = R/(2πZ). As discussed in Section 1.3, we consider the approximation kernels ρ δ,x (θ) = ∑ |k|≤δ −1 e ik(x−θ) for δ ∈ (0, 1], leading to, for x ∈ T, δ ∈ (0, 1] and N ∈ N, the following estimates; Lemma 5.1. Let X be the free field on T, that is, a generalized Gaussian process with covariance kernel (1.2). Then, with ρ δ,x as above, we have for δ, ǫ ∈ (0, 1] with ǫ ≤ δ and x, θ ∈ T, Moreover, Assumption 1.5 holds with K = T. Proof. The expressions for C X,δ (x, θ) and C X,δ,ǫ follow directly from the fact that the kernel (1.2) can be expressed as a generalized Fourier series. Note that C X,δ are convolution kernels on T and where dist(x, θ) = |x − θ| mod 2π is the distance function on T.
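The Fourier expansion underlying this proof is the standard one; the following sketch records it (the normalization of the pairing with ρ_{δ,x} is an assumption for illustration):

```latex
\log\big|e^{i\theta}-e^{ix}\big|^{-1}
= -\operatorname{Re}\log\!\big(1-e^{i(\theta-x)}\big)
= \sum_{k\ge 1}\frac{\cos\!\big(k(\theta-x)\big)}{k}
= \sum_{k\neq 0}\frac{e^{ik(\theta-x)}}{2|k|},
% so pairing with \rho_{\delta,x}(\theta)=\sum_{|k|\le\delta^{-1}}e^{ik(x-\theta)}
% truncates the series:
C_{X,\delta}(x,\theta)
= \sum_{k=1}^{\lfloor\delta^{-1}\rfloor}\frac{\cos\!\big(k(x-\theta)\big)}{k}.
```

This truncated harmonic sum is what makes the estimate log r = ∑ k≤r k −1 + O(1) relevant in the case analysis below.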
To check Assumption 1.5, we separate two cases. Recall that log r = ∑ k≤r k −1 + O(1) as r → ∞; then, if ∆ ≤ δ, This establishes the case where dist(x, θ) ≤ δ.
On the other hand, if ∆ ≥ δ, with L = ⌊∆ −1 ⌋, where we used the first case (in particular, the error is independent of δ). We can control the oscillatory sum by summation by parts; we obtain Using that |Θ k | ≤ C/∆ (for some universal constant C) for any δ > 0 and k ∈ N with k ≤ 1/δ, this implies that Combined with (1.11), this implies that for δ, ǫ ∈ [1/N, 1], we have exactly, for x, θ ∈ T, Note that the last sum is continuous on T 2 and uniformly bounded by 1.
Second, we review the existing literature, in particular the Selberg-Morris integral formula (see e.g. [24, equation (1.18)] and set there a = b = 1 2 ζ, γ = 1); for any ζ ∈ C with Re(ζ) > −1 and θ ∈ T, Note that both sides are independent of θ by rotational invariance of the Haar measure. Similarly, the function Ψ(ζ) = Ψ(ζ, θ) is independent of θ ∈ T in the CUE case. We obtain the following asymptotics for the Laplace transform of the CUE log characteristic polynomial.
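Concretely, with a = b = ζ/2 and γ = 1, the Selberg–Morris formula specializes to the well-known Keating–Snaith form of the moments; the following display is a sketch of what the elided formula presumably states (for Re ζ > −1):

```latex
\mathbb{E}\,\big|\det\!\big(I-e^{-i\theta}U_N\big)\big|^{\zeta}
\;=\;\prod_{k=1}^{N}\frac{\Gamma(k)\,\Gamma(k+\zeta)}{\Gamma(k+\zeta/2)^{2}}
\;=\;\frac{G(N+1)\,G(N+1+\zeta)}{G(N+1+\zeta/2)^{2}}
      \cdot\frac{G(1+\zeta/2)^{2}}{G(1+\zeta)},
```

where G is the Barnes G-function; the second expression follows from G(z+1) = Γ(z)G(z) and matches the combination L_N(ζ) analyzed in the proof of Lemma 5.3 below.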
Proof. By definition, from (5.3), and the basic fact that G(z + 1) = Γ(z)G(z), we verify that In particular, ζ ∈ C → G(ζ + 1) is entire and has no zeros in the region Re ζ > −1. This leads us to consider L N (ζ) := log G(N + 1) + log G(ζ + N + 1) − 2 log G(ζ/2 + N + 1). Note that the conditions Re(ζ) > −1 and ζ = o(N 1/3 ) imply that all the arguments of G here are large (with positive real part of order N and imaginary part o(N 1/3 )). To estimate L N , we use the asymptotic expansion where A is the Glaisher-Kinkelin constant, and the expansion is valid in any sector not containing the negative real axis; see e.g. [22, Theorem 1, Theorem 2, and Theorem 3], where G is called the double gamma function. In particular, we find, and the error is uniform in the domain of ζ we are considering. Using that where the implied constants are universal. The claim follows from these asymptotics using that according to (5 Finally, the bound on Ψ follows immediately from (5.4); for any κ > 2, there exists a constant C κ such that for Re(z) ≥ 0, Remark 5.4. Let us point out that, from the asymptotics in the proof, the condition ζ = o(N 1/3 ) seems to be sharp here, in that one starts getting order-one corrections to the asymptotics when ζ is of order N 1/3 .
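The asymptotic expansion of the Barnes G-function invoked in this proof is standard; a sketch of its leading form (with A the Glaisher–Kinkelin constant, log A = 1/12 − ζ'(−1)):

```latex
\log G(z+1) \;=\; \frac{z^{2}}{2}\log z \;-\;\frac{3z^{2}}{4}
\;+\;\frac{z}{2}\log(2\pi)\;-\;\frac{1}{12}\log z
\;+\;\Big(\frac{1}{12}-\log A\Big)\;+\;O\!\big(z^{-1}\big),
\qquad z\to\infty,
```

valid in any sector excluding the negative real axis. Inserting this into the three G-terms of L_N(ζ) is what produces the N 1/3 threshold noted in Remark 5.4.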
The following consequence of the results of [14] (see also [45,19,6] for closely related results) is also directly relevant 4 .
Theorem 5.5 (Deift, Its, Krasovsky, [14]). Let p ∈ N 0 , V ∈ C ∞ (T → C) with V 0 = 0 and Ψ as in Lemma 5.3. Then, it holds locally uniformly for ζ 1 , . . . , ζ p ∈ {z ∈ C : Re z > −1} and x ∈ {x ∈ T p : dist(x i , x j ) > 0 for 1 ≤ i < j ≤ p}, as N → ∞, Note that according to Lemma 5.1, we have X N,δ (x) = − √ 2 Tr C X,δ (U N , x) for x ∈ T and δ > 0, so that by linearity, ⟨X N , f δ,z ⟩ = Tr V (U N ) where In particular, V ∈ C ∞ (T → R) with V 0 = 0 and we claim that We can view C X , (5.2), as the kernel of a (bounded) integral operator K on L 2 (T) whose action on the Fourier basis is given by K(e ikθ ) = (1/2|k|) e ikθ for k ∈ Z \ {0} and K(1) = 0. Then, the first identity in (5.7) is equivalent to For any function V ∈ C(T → C), we denote Tr V (U N ) = ∑ N k=1 V (ϑ k ), where {e iϑ k } N k=1 denotes the eigenvalues of the random matrix U N .
Hf δ,z where H is the Hilbert transform 5 on T, since K ′ = (1/2) H as a (Fourier) integral operator. Hence, the second identity in (5.7) is equivalent to where we used that the operators K, H commute and H ∗ V ′ = −(1/ √ 2) f δ,z since the Hilbert transform is unitary (on the appropriate subspace of L 2 (T)).
Thus, taking V as above, p = 2, and comparing the asymptotics (5.5) to (1.14), we find that for fixed δ ∈ (0, 1] q and ξ ∈ R q , it holds uniformly for z ∈ T q and ζ 1 , ζ 2 ∈ D R and (x 1 , x 2 ) ∈ A c , see (1.12)-(1.13), Hence, in order for the asymptotics from Assumption 1.6 (2) and Assumption 1.7 (1) to hold for the log characteristic polynomial of the CUE, we need to extend Theorem 5.5 in two crucial ways; for any small η > 0, (1) the asymptotics (5.5) hold for p = 2 in the merging regime, that is, uniformly for x 1 , x 2 ∈ T with |x 1 − x 2 | ≥ N η−1 and for a trigonometric polynomial V = V N with V N (θ) = Re ∑ k≤N 1−η V k e ikθ and | V k | ≤ C/k. This case yields Assumption 1.6. (2) the asymptotics (5.5) hold for p = 2 in case Im ζ j is allowed to grow mildly as N → ∞, that is, uniformly for ζ 1 , ζ 2 ∈ D R N . This case yields Assumption 1.7. We note that these extensions are mostly technical work and, to derive them, we will rely on the method from [14]. In Section 5.3 and Appendix C, we review the basics of this method which relies on the connection between the circular unitary ensembles, orthogonal polynomials on the unit circle and the associated Riemann-Hilbert problems. In Sections 5.4-5.6, we review the steepest descent method for these problems, including the global parametrix and local parametrix (around the singularities). In Section 5.7, we present the small norm problem and the main differences with the case already treated in [14]. Based on these asymptotics, the proofs of Assumption 1.6 (2) and Assumption 1.7 (1) are finalized in Sections 5.8 and 5.9 respectively. 5.3. Toeplitz determinants and differential identities. A specificity of the CUE (and other unitary invariant ensembles) is their determinantal structure and, in particular, the Heine-Szegő identity which relates the Laplace transform of linear statistics of the random matrix U N to Toeplitz determinants (for a proof, see e.g. [7, Theorem 1]).
For n ∈ N, let {e iϑ k } n k=1 be the eigenvalues of U n . The statement is as follows: for any F ∈ L 1 (T → C), This makes a connection with orthogonal polynomials on T that we now review. Note that using the Heine-Szegő formula, the LHS of (1.14) equals where, with V as in (5.6), In particular, V is a trigonometric polynomial of degree ≤ N 1−η for δ 1 , . . . , δ q ≥ N −1+η . Moreover, this symbol F t , as well as Y defined below, depend on the auxiliary parameters ζ 1 , ζ 2 ∈ D ∞ , x 1 , x 2 ∈ T with x 1 ≠ x 2 , δ ∈ (0, 1] q and z ∈ T q which are allowed to vary with the dimension N ∈ N. To analyze the asymptotics of D N (F t ), we rely on the Riemann-Hilbert problem associated with orthogonal polynomials.
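The Heine–Szegő identity referred to here takes the following standard form; this is a sketch of the elided display:

```latex
\mathbb{E}\prod_{k=1}^{N}F\big(e^{i\vartheta_k}\big)
\;=\; D_N(F)\;:=\;\det\Big(\widehat{F}_{j-k}\Big)_{j,k=0}^{N-1},
\qquad
\widehat{F}_m=\frac{1}{2\pi}\int_{\mathbb{T}}F\big(e^{i\theta}\big)\,e^{-im\theta}\,d\theta,
```

so the Laplace transform of a linear statistic Tr V(U_N) is the N × N Toeplitz determinant of the symbol e^{V}.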
The relevant definitions are collected in the Appendix C and this problem reads as follows (the original connection between such problems and orthogonal polynomials is due to [23]); Problem 1. Let F = F t be as in (5.10), let U = {z ∈ C : |z| = 1} be the unit circle, and for n ∈ N and t ∈ [0, 1] let Y = Y n,t solve; Here, we do not assume that this problem has a solution -this will follow from our analysis. However, this issue is directly related to the existence of certain (orthogonal) polynomials and, if it exists, this solution is unique and given explicitly by (C.6). Regardless of this consideration, there are standard methods to derive the asymptotics of the matrix Y n,t as n → ∞ through a steepest descent analysis pioneered by Deift and Zhou [17]. This analysis is based on deforming the jump contour in a suitable neighborhood of U and we review it in the next sections. We have introduced the parameter t ∈ [0, 1] to make an interpolation (one could also consider different schemes, e.g. [15]). This allows us to relate the Toeplitz determinants D N (F 1 ) to D N (F 0 ) (where V = 0) through the following differential identity; Lemma 5.6. Let V be a trigonometric polynomial and F t be given by (5.10). Assume that D n (F t ) ≠ 0 for all n = 1, ..., N . Then, for t ∈ [0, 1], where Y = Y N,t is the unique solution of Problem 1.
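For orientation, in the normalization of [23] (as used in [14]), the conditions of Problem 1 read schematically as follows; the precise boundary-value conditions at the singularities are an assumption here and are spelled out in Appendix C:

```latex
% (1) Y:\mathbb{C}\setminus U\to\mathbb{C}^{2\times 2} is analytic;
% (2) on U (oriented counterclockwise, + denoting the boundary value from inside):
Y_{+}(z) \;=\; Y_{-}(z)\begin{pmatrix} 1 & z^{-n}F_t(z)\\[2pt] 0 & 1 \end{pmatrix},
\qquad z\in U;
% (3) at infinity:
Y(z) \;=\; \big(I+O(z^{-1})\big)\,z^{\,n\sigma_3},
\qquad
\sigma_3=\begin{pmatrix}1&0\\0&-1\end{pmatrix}.
```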
Although it is classical, the proof of Lemma 5.6 is given in the Appendix C for completeness. This relies on the fact that the matrix Y = Y N,t is built from the orthogonal polynomials with respect to the (complex) weight F t .
Remark 5.8. In formula (5.11), it is relevant to note that for Re(ζ j ) > 0, the entries Y 11 and Y 21 are polynomials and there is no issue in evaluating Y 12 and Y 22 at e iθ j since the zero of F 0 at e iθ j fixes the non-integrable singularity; cf. Appendix C and in particular formula (C.6).
We now turn to transforming our Riemann-Hilbert problem into a form that will eventually allow an approximate solution.

5.4.
Transforming the Riemann-Hilbert problems. The idea of the steepest descent analysis of Deift and Zhou is to perform transformations of the Riemann-Hilbert problem by modifying the jump contours so that the jump matrices are close to the identity matrix and the solution is normalized to be the identity matrix at infinity. Then the problem can be solved asymptotically by a suitable Neumann series. In particular, Problem 1 at hand for n = N, N + 1 has an oscillatory jump matrix on {|z| = 1}. Hence, the first step of this transformation procedure consists in moving this contour to the regions {|z| < 1} and {|z| > 1}, where the jump matrix will be exponentially decaying. This is known as "opening lenses" in the Riemann-Hilbert literature; we refer the reader to e.g. [13,14,15] and references therein for further details.
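Schematically, the algebra behind "opening lenses" is the standard factorization of the jump matrix. After the normalization T = Y inside U and T = Y z^{−nσ₃} outside (a convention assumed here for illustration; cf. [14]), the jump of T on U becomes the left-hand side below, which factors as:

```latex
\begin{pmatrix} z^{n} & f(z)\\ 0 & z^{-n}\end{pmatrix}
=\begin{pmatrix}1&0\\ f(z)^{-1}z^{-n}&1\end{pmatrix}
\begin{pmatrix}0&f(z)\\ -f(z)^{-1}&0\end{pmatrix}
\begin{pmatrix}1&0\\ f(z)^{-1}z^{n}&1\end{pmatrix}.
```

The right factor is moved to {|z| < 1}, where |z^{n}| is exponentially small, and the left factor to {|z| > 1}, where |z^{−n}| is exponentially small; only the constant middle factor survives on U and is handled by the global parametrix.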
We are interested in V as in (5.6), but we can allow a more generic (real-valued) potential 6 in (5.10), In the sequel, ∆ is either N 1−η in case of Assumption 1.6 or fixed in case of Assumption 1.7.
Let U ∆,j = {w ∈ C : |w − e ix j | ≤ 1 2∆ } for j ∈ {1, 2} and consider the following domain, see Figure 1. We enlarge this set L by connecting it to the points e ix j suitably. More precisely, we draw certain contours (specifically defined in Section 5.6) from the points {(1 ± ∆ −1 4 )U} ∩ {∪ 2 j=1 U ∆,j } to the points e ix j , this yields the set L.
In order to "open lenses", we must analytically continue the symbol (5.10) to a neighborhood of U. Let V(z) = ∑ 1≤|k|≤∆ V k z k ; this is a continuation of V to C \ {0} as a Laurent polynomial. It then remains to analytically continue the functions |z − e ix j | ζ j into a neighborhood of U, excluding some appropriately chosen branch cuts. We follow the construction from [14,Section 4]. We consider the functions where the branches of the roots are defined as follows: the cut of (z − e ix j ) ζ j /2 is taken to be on the half line e ix j × [1, ∞) and the branch is fixed by requiring that arg(z − e ix j ) = 2π on the half line parallel to the real axis going from e ix j to the right. For z ζ j / √ 2 we choose the cut to be the half line e ix j × [0, ∞) and one fixes the branch by requiring that arg(z) ∈ (x j , x j + 2π).
Then, by the Sokhotski-Plemelj identity, one has This provides the required analytic continuation of F t in a neighborhood of U. Note that we do not emphasize that this function (and all other quantities defined in terms of it, such as S below) depends on the parameters t ∈ [0, 1], ζ 1 , ζ 2 ∈ D ∞ , x 1 , x 2 ∈ T with x 1 ≠ x 2 , and ∆ (which may also depend on N ). Moreover, according to (5.12), uniformly for z ∈ ∂L (and all other relevant parameters), In terms of the analytic function f and the set L, we consider the following Riemann-Hilbert problem; Problem 2. Let Σ S = ∂L ∪ U be oriented as in Figure 1. Let S = S n for n ∈ N solve; (1) S : C \ Σ S → C 2×2 is analytic.
(2) S has continuous boundary values on Σ S \ {e iθ 1 , e iθ 2 }, denoted S + , S − , which satisfy (4) For j = 1, 2, as z → e ix j , These O(·)-terms involve only z and do not require any uniformity in n, t, etc. It is a standard fact (that the reader will have no difficulty verifying) that Problem 1 and Problem 2 are related by the following transformation, z ∈ L and |z| > 1.
In particular, if the solution Y of Problem 1 exists, then it is unique and S given by (5.18) is the unique solution of Problem 2, including the asymptotic conditions (3) and (4). Conversely, these conditions guarantee uniqueness of a solution of Problem 2 (which would not hold otherwise -this is common behavior in Riemann-Hilbert problems with singular symbols; for further discussion, see e.g. [34,Section 5] for the case when the symbol is supported on an interval) and, by solving Problem 2, one can recover Y .
In the next sections, we explain how to construct a solution of Problem 2. Note that if ∆ ≪ N, for n ∈ {N, N + 1}, by (2), the jumps on ∂L are "exponentially small" and can be neglected except in neighborhoods of the singularities {e ix 1 , e ix 2 }. Then, this construction involves two ingredients; • a global parametrix which models the jump across U (cf. Section 5.5).
• local parametrices to adjust for the jumps in neighborhoods of e ix 1 and e ix 2 (cf. Section 5.6). As a final step, one patches together these parametrices into a small norm problem (cf. Section 5.7) to obtain an (asymptotic) solution of Problem 2, including conditions (3) and (4).

5.5.
The global parametrix. Let us ignore the jumps across ∂L in order to find an approximate solution of Problem 2. In terms of (5.16), define (5.19) By construction, this function is analytic off of U, and since lim z→∞ D out (z) = 1, we have that N (z) = I + O(z −1 ) as z → ∞. Moreover, this function has a jump on U \ {e ix 1 , e ix 2 }, by (5.16), which is exactly the same as the jump of S across U. We emphasize again that (5.19) only provides a good approximation for S away from the singularities {e ix 1 , e ix 2 } and we need different parametrices there. 5.6. The local parametrix. We turn now to the approximations at e ix j . There are various equivalent ways to represent the local parametrix. In [36], a solution is constructed in terms of Bessel and Hankel functions, while in [14], one is constructed in terms of confluent hypergeometric functions. In [11], there is a slightly different representation in terms of hypergeometric functions. We will follow [14,Section 4.2], since we will rely heavily on other related results proven in [14,15].
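Returning briefly to the global parametrix of Section 5.5: the functions D_in, D_out entering (5.19) are Szegő-type functions. In one common convention (the splitting of the constant Fourier mode varies between references; this version is chosen so that D_out(∞) = 1, consistent with the text), they read:

```latex
% With (\log f)_k the Fourier coefficients of the (continued) symbol,
D_{\mathrm{in}}(z)=\exp\Big(\sum_{k\ge 0}(\log f)_k\,z^{k}\Big),\ \ |z|<1,
\qquad
D_{\mathrm{out}}(z)=\exp\Big(\sum_{k\le -1}(\log f)_k\,z^{k}\Big),\ \ |z|>1,
% so that D_{\mathrm{in}}(s)\,D_{\mathrm{out}}(s)=f(s) on U
% and D_{\mathrm{out}}(\infty)=1.
```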
Recall that U ∆,j = {w ∈ C : |w − e ix j | ≤ 1 2∆ } for j ∈ {1, 2}. Our goal is to construct a function P : U ∆,j → C 2×2 such that P has the same jumps as S, z → S(z)P (z) −1 is analytic in U ∆,j , and P (z)N (z) −1 = I + o(1) uniformly in z ∈ ∂U ∆,j (and the other relevant parameters; we will be more precise later on). It is this last part which requires extra care compared to the existing literature, e.g. [14,15]. The point is that in [14,15], the analysis is performed for fixed ζ 1 , ζ 2 , while in the context of Assumption 1.7, |Im ζ 1 |, |Im ζ 2 | are allowed to grow mildly with N . Figure 3. The splitting of the complex plane into octants relevant for the function Ψ in the local parametrix, as well as the orientation of the corresponding jump contours.
The first step is to specify how we define the set L, (5.14), inside U ∆,j . For this purpose, let us define inside U ∆,j the function ξ N,j (z) = N log(ze −ix j ), where we consider the principal branch of the logarithm. This variable will serve as a local conformal coordinate in U ∆,j . We require that ∂L, which consists of four (simple) curves connecting the points to e ix j , gets mapped to (parts of) the rays e ikπ/4 × (0, ∞) (with k = 1, 3, 5, 8); see Figure 2. Another important ingredient to define the local parametrix is a function that is in a sense an analytic continuation of f 1/2 off of the unit circle, but different from the factorization (5.16). More precisely, let us write I, II, III, IV, V, V I, V II, V III for the octants of the complex ξ-plane, ordered in the counterclockwise direction and enumerated such that I = {ξ ∈ C : arg(ξ) ∈ (π/2, 3π/4)} and so on (see Figure 3).
Define for z ∈ U ∆,j (5.20) with a suitable choice of the cut (see the discussion around [14, (4.13)] for details).
It is then argued in [14, (4.21) and (4.22)] that We orient Γ 4 , Γ 5 , Γ 6 towards the origin, and the others away from it (see Figure 3). Again, it is argued in [14,Section 4.2] that if we take the +/−-side to be the left/right side of the contour then F N,j has continuous boundary values and satisfies the jump conditions; The next key ingredient in the construction of the local parametrix is a function built from the confluent hypergeometric function of the second kind (or Tricomi confluent hypergeometric function); let us write ψ = ψ(a, c, ξ) for this hypergeometric function (often Ψ or U in the literature). More precisely, we define where M (a, c, ξ) = ∑ ∞ n=0 (a (n) /(c (n) n!)) ξ n , with a (n) = a(a + 1) · · · (a + n − 1), and the branch of the root ξ 1−c is fixed by requiring that arg(ξ) ∈ (0, 2π) (with a cut on the positive real axis). Moreover, a, c are complex numbers and we assume that c is not an integer (integer cases can be dealt with by a limiting procedure). See e.g. [28,Appendix] for a review of the basic theory of these functions. We mention here nevertheless that M is an entire function of ξ, so the only singularity of ψ is the branch cut along the positive real axis. We then define for ξ ∈ I, In other regions, Ψ is constructed to have prescribed jump conditions. For example, in region II, one defines so on Γ 2 , we have the jump condition, For further details about the definition of Ψ(ξ) for all ξ ∈ C and the Riemann-Hilbert problem satisfied by Ψ, we refer to [14,Section 4.2].
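The elided defining expression of ψ in terms of M is presumably the standard connection formula for the Kummer functions (valid precisely when c is not an integer, matching the caveat above; see e.g. [38, Chapter 13]):

```latex
\psi(a,c,\xi)
\;=\;\frac{\Gamma(1-c)}{\Gamma(a-c+1)}\,M(a,c,\xi)
\;+\;\frac{\Gamma(c-1)}{\Gamma(a)}\,\xi^{\,1-c}\,M(a-c+1,\,2-c,\,\xi),
```

with the branch of ξ^{1−c} fixed by arg(ξ) ∈ (0, 2π) as above.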
We are now in a position to define the local parametrix. Once again, we refer the reader to [14,Section 4.2], where it is proven that for z ∈ U ∆,j , the function where z ∈ U ∆,j → E(z) is a certain analytic function we will return to shortly. As a result of our construction, P has the same jumps as S in U ∆,j , and in fact, z → S(z)P (z) −1 is analytic in U ∆,j for all t ∈ [0, 1]. The final ingredient, the function E is relevant for the matching condition -as mentioned, this is the main step where we cannot rely directly on [14], since we allow ζ j to grow with N . Nevertheless, it varies slowly enough that we have essentially the same asymptotics as in the fixed ζ j -case. In particular, we will use the same E-function as [14]. We define We are now concerned with the matching condition, namely we wish to understand asymptotics of P (z)N (z) −1 for z ∈ ∂U ∆,j , where N is as in (5.19). Note that for z ∈ ∂U ∆,j , for some numerical constant c, so we will need to understand large ξ asymptotics of Ψ(ξ). For simplicity, we will only do this in sector I and leave the remaining sectors to the reader (though one must be slightly careful due to the branch cut so the asymptotics are slightly more complicated in regions V I and V II). For a reference on the relevant asymptotics, we refer the reader to [38,Chapter 13.7] and [39, Section 6 -Section 9] (though note that in the latter reference, the results are expressed in terms of Whittaker functions which are readily expressed in terms of ψ -we leave the details of this to the reader). The upshot is that for ξ ∈ I and for |c − 2a| = 1 (bounded would work just as well, but for us, |c − 2a| = 1) where for say |ξ| ≥ 1, for some universal constants C 1 , C 2 > 0.
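The large-ξ behavior alluded to here is, for fixed parameters, the classical asymptotic expansion of the Tricomi function (see e.g. [38, Chapter 13.7]); the displays in the text with the constants C 1 , C 2 encode a quantitative version of this, uniform in the regime |c − 2a| = 1 with a, c growing slowly with N:

```latex
\psi(a,c,\xi)
\;=\;\xi^{-a}\Big(1-\frac{a\,(a-c+1)}{\xi}+O\big(|\xi|^{-2}\big)\Big),
\qquad \xi\to\infty,\ \ |\arg\xi|<\tfrac{3\pi}{2}.
```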
We thus find that in region ∂U ∆,j ∩ I, if we assume that |ζ j | 2 = O(|ξ N,j (z)|) (this is the case for ζ j ∈ D R N if R N = o(N α ) for any α > 0), then where the implied constants are uniform in everything and Recalling (5.16) and (5.22), this can be written as, for z ∈ ∂U j,∆ ∩ ξ −1 N,j (I), where the implied constants are universal in the regime |ζ j | 2 /|ξ N,j (z)| ≤ 1 (we recall that for z ∈ ∂U j,∆ , |ξ N,j (z)| ≥ cN/∆ for a c > 0). We see that the key thing is to estimate D in (z)D out (z)e π √ 2 iζ j for z in the appropriate domain.
In particular, this is where the condition ζ 1 , ζ 2 = o(log N ) comes into play. We summarize the required fact in the following lemma.
Proof. Note that by definition (namely (5.15)), for t ∈ [0, 1], iζ j If z ∈ ∂U j,∆ , (5.13), and ζ 1 , ζ 2 ∈ D R N , we have for some numerical constant iζ j Thus it remains to control the V -term. For this, we note that if we write z = z |z| , then Thus its contribution to the exponential has size 1. We turn to bound for z ∈ U ∆,j , The same argument will work for the sum with k replaced by −k. Then, by (5.12), using that where the implied constant only depends on C in (5.12). This concludes the proof.
In particular, by (5.26) and Lemma 5.9, we conclude that there is a fixed α > 0 (α = η if R N = R is fixed and α < 1 if R N = o(log N ) and ∆ is fixed; see (5.13)) such that uniformly for z ∈ ∪ 2 j=1 ∂U ∆,j , (5.27) uniformly in all the relevant parameters (t ∈ [0, 1], x 1 , x 2 ∈ T with |e ix 1 − e ix 2 | ≥ ∆ −1 , ζ 1 , ζ 2 ∈ D R N with R N = o(log N ) and ∆ ≤ CN η−1 ). While we will need the more precise matching condition (5.26) for parts of our argument (in Section 5.9), this is already sufficient for us to discuss the "small norm analysis". 5.7. The small norm problem. We now briefly review some basic facts about the analysis of "small norm" Riemann-Hilbert problems. For details (which we omit), we refer the reader to e.g. [16,Section 7.2] and [34, Theorem 3.1 and Section 9]. There are several key underlying ideas, and we will not go into detail about them.
Our starting point is to define and the contour Γ R = ∂L ∪ (∪ 2 j=1 ∂U ∆,j ) (see Figure 4), where both circles are oriented in a clockwise manner. By construction, S is a solution of Problem 2 if and only if R satisfies the following Riemann-Hilbert problem; Problem 3. Let Γ • R denote Γ R without the self-intersection points. Let R = R n for n ∈ N solve; (1) R : C \ Γ R → C 2×2 is analytic. (2) R has continuous boundary values on Γ • R , denoted by R + , R − , which satisfy We now argue (mainly by referring to the literature) that this problem can be solved (uniquely) by a Neumann series if N is large enough. Importantly, this will establish that Problem 2 also has a solution for large enough N . Thus, this yields that Y exists (without assuming a priori that Problem 1 admits a solution) and it provides the asymptotics of this solution as N → ∞ (with the required uniformity). Moreover, we recall that this solution is unique and given explicitly by (C.6), although we will not need this fact.
Since the uniformity is different in the cases of Assumptions 1.6 and 1.7, we have to treat these situations separately. Let us first consider the case of Assumption 1.6. Lemma 5.10. Fix a (small) η > 0 and assume that V satisfies (5.12). Let also α > 0 be as in (5.27). The (unique) solution R = R N of Problem 3 satisfies uniformly for x 1 , x 2 ∈ T with |e ix 1 − e ix 2 | ≥ ∆ −1 , ζ 1 , ζ 2 ∈ D R N , t ∈ [0, 1], and locally uniformly for z ∈ {z ∈ C : dist(z, Γ R ) ≥ 1 2∆ }. In particular, there exists N R,α ∈ N such that for N ≥ N R,α , the solution Y = Y N of Problem 1 exists.
Proof. This is very standard in the Riemann-Hilbert literature, so we will simply refer the reader to the relevant references at various points. As a general reference, see e.g. [16, Section 7.2], and for something closer to our setting, see [37, Section 9]. First of all, the jump condition for R can be written as where the jump matrix satisfies, by (5.27), J R (z) − I = O(N −α ) for z ∈ ∪ 2 j=1 ∂U ∆,j . On the remainder of Γ • R , by (5.14), |z| = 1 ± 1 4∆ so that |z ∓N | = O(e −cN ∆ −1 ) for some universal constant c > 0. Hence, by (5.17), one readily checks that J R (z) − I = O(e −cN η ) for a fixed constant c > 0 on ∂ L. In particular, this quantity is negligible. Using the conditions from Problem 3, one can use the Sokhotski-Plemelj identity to solve (5.30) in terms of R − ; for z ∈ C \ Γ R , Taking the boundary values of this equation from the −-side, this implies that R − satisfies the singular integral equation where C − denotes the boundary values from the −-side of the Cauchy integral operator associated with Γ R . We rewrite this as The main point is that the operator I − C ∆ on L 2 (Γ R , C 2×2 ) is invertible. This follows from the relationship between C ∆ and the weighted Hilbert transform on Γ R , so one can control the norm of the operator C ∆ in order to invert I − C ∆ via Neumann-series techniques. To be more specific, C − is a bounded operator on L 2 (Γ R , C 2×2 ) whose norm is uniformly bounded in ∆ (which depends on N ) and in the other relevant parameters; this relies on a celebrated result of David combined with a simple argument allowing for moving contours (see [37, Lemma 9.2] for more details). Since we have established that the jump matrix is close to the identity, this allows writing, on Γ R , an inverse (I − C ∆ ) −1 which is uniformly bounded by a numerical constant (for N large enough), and then bounding R using formula (5.31); we conclude that for z ∈ C \ Γ R , where · is any suitable matrix norm, and |dw| means integration with respect to arc-length measure.
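For orientation, the standard small-norm scheme invoked here can be summarized as follows; this is a schematic sketch in generic notation (in the text, the operator C ∆ and the labels (5.30)-(5.31) play these roles):

```latex
% Schematic small-norm argument: given the jump R_+ = R_- J_R on \Gamma_R
% with R(z) \to I as z \to \infty, the Sokhotski-Plemelj representation is
R(z) = I + \frac{1}{2\pi i} \int_{\Gamma_R}
    \frac{R_-(w)\,\bigl(J_R(w) - I\bigr)}{w - z}\, dw,
  \qquad z \in \mathbb{C} \setminus \Gamma_R .
% Taking boundary values from the minus side yields the singular integral equation
(I - \mathcal{C}_{\Delta})(R_- - I) = \mathcal{C}_{\Delta} I,
  \qquad \mathcal{C}_{\Delta} f := \mathcal{C}_-\bigl( f \, (J_R - I) \bigr),
% and if \| \mathcal{C}_{\Delta} \|_{L^2(\Gamma_R)} < 1, the Neumann series
R_- - I = \sum_{k \ge 1} \mathcal{C}_{\Delta}^{\,k}\, I
% converges, giving \| R_- - I \|_{L^2(\Gamma_R)} = O( \| J_R - I \| ).
```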
We have seen that J R (w) − I = O(N −α ) for w ∈ ∂U j,∆ , so that if dist(z, Γ R ) ≥ 1 2∆ , On the remainder of Γ R , we have seen that J R − I = O(e −cN η ), so a similar bound shows that the integral on ∂ L is negligible as N → ∞. This yields the first estimate. Then, to prove the second, one can use that R is analytic on C \ Γ R : by the Cauchy integral formula, where the integration contour is a (simple) loop around z (not intersecting Γ R , since we assume that dist(z, Γ R ) ≥ 1 ∆ ). Running a similar argument again (the main contributions coming from ∂U j,∆ ), using the previous estimate for R − I instead, this yields the second estimate, uniformly for dist(z, Γ R ) ≥ 1 ∆ , and also uniformly in the other relevant parameters. This argument not only provides the relevant asymptotics; it also shows that Problem 3 has a unique solution R = R N if N ≥ N R,α . Hence, by inverting the transformations (5.28) and (5.18), one constructs a solution Y = Y N to Problem 1.
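The derivative estimate mentioned above follows from the Cauchy integral formula in the form below (since I is constant, subtracting it does not change the derivative); the loop radius is chosen here for illustration:

```latex
R'(z) = \frac{1}{2\pi i} \oint_{|w - z| = \frac{1}{2\Delta}}
    \frac{R(w) - I}{(w - z)^2}\, dw ,
```

so that if R − I is O(N −α ) on the loop, then R ′ (z) is O(∆ N −α ).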
In the setting of Assumption 1.7, the corresponding statement is as follows. Recall that A c ⊂ {(x, y) ∈ T 2 : |e ix − e iy | ≥ c} for a fixed (small) c > 0. Lemma 5.11. Let ∆ ≥ c −1 be fixed and assume that R N = o(log N ) as N → ∞. For any ǫ > 0, the (unique) solution R = R N of Problem 3 satisfies uniformly for ζ 1 , ζ 2 ∈ D R N , Here, Θ j is a meromorphic function in U j,∆ with a simple pole at e ix j (it is explicit and depends on N ; see the proof), and A j is another explicit quantity (see e.g. [14, (4.71)]). In particular, there exists N c ∈ N such that for N ≥ N c , the solution Y = Y N of Problem 1 exists.
Proof. The proof follows along the same lines as that of Lemma 5.10, so we give only a few details.
The key point is again estimating the jump matrix J R − I.
By (5.26), using that |ζ j | ≤ R N and (5.25) with η = 1 (here, we assume that ∆ is fixed), for any ǫ > 0, it holds for z ∈ ∂U j,∆ with, say, ξ N,j (z) ∈ I, Note that we used Lemma 5.9 and the assumption R N = o(log N ) to control the error term (uniformly in all the relevant parameters). We denote the main term by Θ j (z); it is basically of order 1 and, according to (5.15), the corresponding quantity is analytic in a neighborhood of e ix j . Hence, if ∆ is sufficiently small (see (5.13)), Θ j (z) is meromorphic in U j,∆ with a simple pole at z = e ix j coming from the local coordinate ξ N,j . On the remainder of the jump contour, by (5.16) and a similar argument, there is a constant c > 0 depending on ∆ so that with the sign depending on whether |z| < 1 or |z| > 1.
Thus, the same argument used to control (5.32) applies, and we find that if, say, dist(z, Γ R ) > 1 2∆ , with the required uniformity. To compute this integral, one needs to evaluate the residue of Θ j at the simple pole w = e ix j . This is done in detail in [14, Section 4.3], leading to the quantity A j , and we omit further details. Note that in the case z ∈ int(U j,∆ ), there is an extra residue at w = z which is given by Θ j (z). We conclude by mentioning that the estimates can be extended near the contour Γ R by the standard contour deformation argument (see [16, Section 7]). The estimate for the derivative of R is obtained using the Cauchy integral formula, and the claim about Y is obtained exactly as in the proof of Lemma 5.10 with ∆ fixed. This concludes the proof. Now that we have good asymptotics for the function R, we can proceed to integrate the differential identities of Lemmas 5.6 and 5.7 to verify our assumptions. 5.8. Verification of Assumption 1.6 (2). The claim is already known if ξ = 0; this is a combination of [11, Theorem 1.11] (which holds if |e ix 1 − e ix 2 | < ǫ for some fixed ǫ > 0) and [15, Theorem 1.1] (see [15, Remark 1.4] for a comment on uniformity). Together, these two results state exactly that, with Ψ as in Lemma 5.3, Note that by (1.14) and (5.9), the LHS equals . We now extend these asymptotics to fixed ξ ≠ 0 by using (5.9) and integrating the differential identity of Lemma 5.6. In this case, the potential V is given by (5.6), where for δ ∈ (0, 1], e iθ → C X,δ (θ, x) is a trigonometric polynomial as in Lemma 5.1. In particular, the assumptions (5.12)-(5.13) hold if ∆ = CN 1−η for a C ≥ 1. There is a standard analyticity argument, which we omit, referring instead to e.g. [15, Section 5.1 and Section 5.3] (see in particular [15, (5.23)]), which ensures that the condition D n (F t ) ≠ 0 for all n ∈ N from Lemma 5.6 holds for all but finitely many t ∈ [0, 1].
Then, with Y = Y N,t and F = F t , we have A straightforward calculation using the jump conditions for Y (Problem 1) implies that for 11,− . Thus we can deform our integration contour; let L ± be two circles (positively oriented) of radius 1 ∓ 1/∆ so that we can use Lemma 5.10. In particular, we obtain N is given by (5.19), and R is as in Lemma 5.10. This implies that there exists α > 0 so that Note that when the derivative hits R, we are conjugating the matrix R −1 R ′ by N and this does not affect the diagonal entries, which yields the error term. By (5.15), we obtain for z ∈ L + , In particular, the residue at e ix j does not contribute to the integral over L + and the error is integrable. Similar reasoning shows that for z ∈ L − , Putting everything together, this shows that as N → ∞, with the required uniformity. Note that according to formulae (5.6)-(5.7), this implies that Then, according to (5.9) and (1.14), we conclude from these asymptotics that as N → ∞ Combining these asymptotics with (5.33), this completes the proof of Assumption 1.6 (2).

5.9. Verification of Assumption 1.7 (1). Starting from Lemma 5.3 (which holds independently of (1), provided that ζ 1 = o(N 1/3 ) as N → ∞), we will integrate the differential identity from Lemma 5.7 to obtain an approximation of the quantity of interest. Here, the parameter ∆ ≥ c −1 is fixed (see (5.13)), so the arguments are exactly as in [14, 15] and we will be rather brief with the details, just providing the appropriate references and emphasizing the main differences. Lemma 5.11 provides the relevant asymptotics of the matrices Y N , Y N +1 (using the relationships (5.18) and (5.28)) and also of χ N (through e.g. formula (C.6)); see e.g. [14, the proof of Theorem 1.8 in Section 5]. Moreover, these quantities are analytic for ζ 1 , ζ 2 ∈ D ∞ so we can apply Cauchy's formula to also obtain asymptotics of their derivatives with respect to ζ 2 . Note that in this case, it is crucial that the asymptotics of Lemma 5.11 hold up to o(N ) since the main term in formula (5.11) involves N ∂ ζ 2 log(χ N ). The assumptions of Lemma 5.7 can also be dispensed with by an analyticity argument, as explained in Section 5.8. In fact, repeating (word for word, noting that β j = 0 in this case) the arguments in the proof of [15, Proposition 5.1], one deduces from (5.11) that if R N = o(log N ) as N → ∞, then for any ǫ > 0, uniformly for ζ 1 , ζ 2 ∈ D R N and (x 1 , x 2 ) ∈ A c (with ǫ as in Lemma 5.11).

Appendix A. Approximation of distribution functions
In this appendix we consider some general probabilistic methods for estimating the tail of a probability distribution from information about its characteristic function. In the one-dimensional case, this is a classical problem, where the basic fact we rely on is due to Feller (see [20, Chap. XVI.3]). We will also need a multidimensional version of the result.
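For the reader's convenience, the one-dimensional smoothing inequality we have in mind can be stated as follows (up to the value of the numerical constant): if F, G are distribution functions on R with characteristic functions ϕ F , ϕ G , and G has a bounded derivative, then for any ̺ > 0,

```latex
\sup_{x \in \mathbb{R}} \bigl| F(x) - G(x) \bigr|
  \;\le\; \frac{1}{\pi} \int_{-\varrho}^{\varrho}
      \biggl| \frac{\varphi_F(\xi) - \varphi_G(\xi)}{\xi} \biggr|\, d\xi
  \;+\; \frac{24}{\pi \varrho}\, \sup_{x} \bigl| G'(x) \bigr| .
```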
We fix a small parameter W > 0 and let S = {z ∈ C : |Re z| < W}.
A.1. 1d case. We consider a sequence of (real-valued) random variables (X N ) N ∈N whose distributions depend on an external set of parameters λ ∈ Λ, where for simplicity Λ ⊂ R m is an open set. In our applications to Ψ N from (1.14) (though this is the multidimensional case, to which we turn shortly), the parameter λ corresponds to the variables x i , δ, ξ. We assume that the characteristic function of X N satisfies, for ξ ∈ R, where ǫ N → 0, the ψ N,λ are analytic functions in S such that (λ, z) → ψ N,λ (z) is locally bounded, and there exist compact sets A N ⊂ Λ such that (A.1) ψ N,λ = ψ λ + o(1) locally uniformly in S and uniformly for λ ∈ A N as N → ∞.
Of course due to the local uniform convergence, the limit ψ λ is also analytic in S and λ → ψ λ is bounded. Note that we assume that ψ N,λ is analytic in the symmetric strip S while the Laplace transform considered in Section 1.3 (see e.g. (1.14)) is assumed to be analytic in D ∞ = [0, √ 2d] × R. This is not inconsistent because we always apply the results from this section under a measure which is biased by a e ζX N with Re ζ > 0 fixed. Moreover, while the distinction between A N and Λ may seem irrelevant, we emphasize that in our applications, we are interested in situations where the parameter space may depend on N , e.g. |x − y| ≥ N −1+η as in Assumption 1.6.
In this setting, we want to obtain a uniform approximation for the distribution functions F N of X N in terms of G N,λ . This is a perturbation of the distribution function of a standard Gaussian. Note that if N is large enough, for any λ ∈ A N , G N,λ takes values in [0, 2] and ‖G ′ N,λ ‖ L ∞ ≤ 1. Then, it is proved in [20, Chap. XVI.3] that there exists a universal constant C so that for any ̺ > 0, To estimate this, note that Since ψ N,λ (0) = 1 for any N ∈ N, using our assumption (A.1), we see by Taylor's theorem that for any R ≥ 1, there is a constant C R > 0 so that if λ ∈ A N and |ξ| ≤ Rǫ −1 N , By choosing ̺ = Rǫ −1 N , letting N → ∞ and then R → ∞, we conclude from (A.2) that, uniformly for λ ∈ A N , This type of approximation is directly relevant in our context when X N is an approximately Gaussian random variable with variance log N , in order to obtain the asymptotics of probabilities of the form P[X N ≥ γ log N ] for γ > 0. Such approximations are obtained e.g. in [21, Chap. 4]. However, the (local) uniformity over the various parameters of the distribution is crucial for our applications, so we formulate a general result.
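To illustrate how such approximations produce thick-point probabilities, here is a sketch assuming X N were exactly Gaussian with variance log N (the classical Mills ratio asymptotics):

```latex
% Gaussian tail: as t \to \infty,
1 - \Phi(t) = \frac{e^{-t^2/2}}{t \sqrt{2\pi}} \bigl( 1 + O(t^{-2}) \bigr),
% so, taking t = \gamma \sqrt{\log N},
\mathbb{P}\bigl[ X_N \ge \gamma \log N \bigr]
  = \frac{N^{-\gamma^2/2}}{\gamma \sqrt{2\pi \log N}} \bigl( 1 + o(1) \bigr).
```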
Below one should think of P N,θ being some parametrized family of probability measures giving rise to a family of approximately Gaussian random variables X N = X N,θ .
Lemma A.1. Let P N,θ be a sequence of probability measures depending on N ∈ N and θ ∈ Θ for some parameter space Θ. For β > 0, define a new measure P N,θ,β by dP N,θ,β dP N,θ = e βX N E N,θ [e βX N ] and consider the random variable X N = ǫ N (X N − γ log N ) with ǫ N = 1/ √ log N . If the characteristic function ξ ∈ R → E N,θ,β [e iξX N ] satisfies (A.1) with λ = (γ, β, θ) ∈ A N , a compact subset of (0, ∞) × (0, ∞) × Θ, then locally uniformly in g ∈ R and uniformly for (γ, β, θ) ∈ A N , Proof. We consider first the case g = 0. We can rewrite, for γ ≥ 0, Using that the characteristic function of X N under P N,θ,β satisfies (A.1), we can use the uniform approximation (A.4), so that, integrating by parts, By a change of variables, this implies that for β > 0 By Lebesgue's dominated convergence theorem, this completes the proof with the required uniformity if g = 0. Moreover, since these asymptotics are locally uniform in γ > 0, replacing γ ← γ + g/ log N , we obtain the claim for general g ∈ R.
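For orientation, the exponential-tilting identity behind the proof can be written schematically as follows (we write X̃ N for the rescaled variable ǫ N (X N − γ log N ) to avoid a clash of notation):

```latex
% Exponential tilting: by the definition of P_{N,\theta,\beta},
\mathbb{P}_{N,\theta}\bigl[ X_N \ge \gamma \log N \bigr]
  = \mathbb{E}_{N,\theta}\bigl[ e^{\beta X_N} \bigr]\;
    e^{-\beta \gamma \log N}\;
    \mathbb{E}_{N,\theta,\beta}\Bigl[ e^{-\beta \widetilde{X}_N / \epsilon_N}\,
      \mathbf{1}\bigl\{ \widetilde{X}_N \ge 0 \bigr\} \Bigr],
% where \widetilde{X}_N = \epsilon_N ( X_N - \gamma \log N ).
```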
The asymptotics from Lemma A.1 are instrumental in several arguments of this paper. To close this section, let us make a few comments on the method. A.2. Multidimensional case. We now adapt the previous argument to arbitrary dimension. This requires stronger assumptions on the characteristic function of the random vector X N . Fix n ∈ N with n ≥ 2. We consider a sequence of continuous R n -valued random vectors (X N ) N ∈N whose probability distributions depend on an external parameter λ ∈ Λ for an open set Λ.
We assume that the characteristic function of X N satisfies, for ξ ∈ R n , where ǫ N → 0 and the ψ N,λ are analytic functions in S n such that λ → ψ N,λ is locally bounded (uniformly on compact subsets of S n ). Now, instead of assuming that ψ N,λ can be approximated locally uniformly in S n , we assume that there are η > 0 and compact sets A N ⊂ Λ so that Moreover, we need an assumption about the growth of ψ λ . More precisely, we assume that there exist constants c > 0 and κ ≥ 2 so that for ξ ∈ R n and λ ∈ A N , For x, y ∈ R n , we write x ≥ y if x i ≥ y i for all i ∈ {1, . . . , n}, and 1 = (1, . . . , 1) ∈ R n (and 2 = 2 · 1).
We claim that under these assumptions, we can obtain uniform approximations for the "distribution function" of X N (it turns out that it is F N (x) instead of P(X N ≤ x), x ∈ R n , which is more relevant for our purposes). In a similar way, we let be the "distribution function" of a standard Gaussian. Our main result is then as follows. Lemma A.5. Let φ : R n → R + be a Schwartz function such that ∫ R n φ = 1 and whose Fourier transform φ̂ has compact support. Let Z be a random vector (taking values in R n , independent of X N ) with probability density function φ. Let β, κ > 0 and define, for x ∈ R n with x ≥ 0, . Proof. Since F N is an increasing function (in every coordinate), we have, for t ∈ R n with |t| ≤ 1, Observe that, since the Gaussian p.d.f. is uniformly bounded on R n , we have
for a numerical constant C > 0. In particular, if β > α(n − 1), then the RHS is o(ǫ n N ). This implies that, uniformly in N , for any k ∈ N, since the p.d.f. φ decays faster than any polynomial. Using the same argument, replacing t ← ǫ κ N t and integrating both sides of (A.9) against φ(t) on {t ∈ R n : |t| ≤ ǫ −κ N }, we obtain, after dividing by P[|Z| ≤ ǫ −κ N ], where the error is controlled uniformly for x ∈ [0, ǫ −α N ] n . We can use the same strategy to obtain a lower bound, using that for t ∈ R n with |t| ≤ 1, . We are now ready to give the proof of Proposition A.3.
Proof. Let α, β, κ > 0. The characteristic function of the random variable ǫ −1 N X N − ǫ β+κ N Z is given by, according to (A.6), . In particular, it has compact support (for |ξ| ≤ ǫ −β−κ N , as we may assume that φ̂(u) is supported in {u ∈ R n : |u| ≤ 1}) and, by Fourier's inversion formula, the p.d.f. of this random variable is smooth and given by We have a similar expression (with ψ N,λ ← 1) for the p.d.f. of ǫ −1 N X − ǫ β+κ N Z, where X ∼ N (0, I n ) is independent of Z.
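Schematically, the inversion step reads as follows (the sign convention of the Fourier transform may differ from the one used elsewhere in the text):

```latex
% If W has an integrable characteristic function \varphi_W, its density is
f_W(x) = \frac{1}{(2\pi)^n} \int_{\mathbb{R}^n}
    e^{-i \langle \xi, x \rangle}\, \varphi_W(\xi)\, d\xi .
% For W = \epsilon_N^{-1} X_N - \epsilon_N^{\beta+\kappa} Z with independent summands,
\varphi_W(\xi) = \mathbb{E}\bigl[ e^{i \langle \xi,\, \epsilon_N^{-1} X_N \rangle} \bigr]\,
    \widehat{\phi}\bigl( -\epsilon_N^{\beta+\kappa}\, \xi \bigr),
% which is compactly supported because \widehat{\phi} is.
```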
Appendix B. Consequences of Assumption 1.6 Our main Assumption 1.6 implies that X N really is an approximation of the limiting Gaussian log-correlated field X down to arbitrary mesoscopic scales: Lemma B.1. Under Assumption 1.6, for any compact K ⊂ Ω and (small) η > 0, it holds uniformly for (x, y) ∈ K 2 with |x − y| ≥ N η−1 , Moreover, if one assumes that (1.15) holds uniformly for ξ ∈ C q with |ξ j | ≤ c for some c > 0, then for any test function g ∈ C 1 c (Ω) and q ∈ N, E N ⟨X N , g⟩ q → E ⟨X, g⟩ q as N → ∞.
In particular, X N → X as a random Schwartz distribution on Ω.
We can now prove the second claim, starting from the fact that, choosing δ(N ) = N −1+η for some η ∈ (0, 1), one has for ξ ∈ C q , uniformly in a neighborhood of 0 and locally uniformly for z ∈ Ω q , using that Ψ (0,0,ξ) N,δ (x 1 , x 2 ; z) = 1 + o(1) is independent of x 1 , x 2 (cf. Assumption 1.6). This implies that, locally uniformly for z ∈ Ω q as N → ∞, E N ∏ q j=1 X N,δ (z j ) = E ∏ q j=1 X δ (z j ) (1 + o(1)). Hence, we can integrate this against ∏ q j=1 g(z j ) to get as N → ∞. Here, we used that X is a Gaussian log-correlated field, so that E ∏ q j=1 X δ (z j ) is bounded on the set {z ∈ K q : |z i − z j | ≥ ǫ, ∀ i ≠ j} by a constant depending only on q, ǫ and not on δ(N ). It is also standard that for any g ∈ C 1 c (Ω) and q ∈ N, E ⟨X δ , g⟩ q → E ⟨X, g⟩ q as δ → 0. Thus, combining the previous asymptotics with (B.2), we conclude that E N ⟨X N , g⟩ q = E ⟨X, g⟩ q + o(1) as N → ∞, which is the second claim.
To conclude, we record the following consequence of Lemma A.1. Using (1.16) with ζ = γ, this completes the proof with the required uniformity.

Remark B.3. Let us emphasize that the proof only relies on (1.16) and that the asymptotics from Lemma B.2 are restricted to γ ∈ (0, √ 2d) because of our choice of D ∞ .

Appendix C. Orthogonal polynomials and Riemann-Hilbert problems
In this appendix we review the basic connection between orthogonal polynomials and Riemann-Hilbert problems.
Let F ∈ L 1 (T) and recall the notation (5.8). In terms of F , we define a sequence of polynomials (p m ) m∈N of increasing degree: for every m ∈ N and z ∈ C, Moreover, if F ≥ 0, these form a family of orthogonal polynomials. By multi-linearity of (C.1), for 0 ≤ j < m, 2π 0 p m (e iθ )e −ijθ F (e iθ ) dθ 2π is a determinant with two identical rows, so it must vanish; that is, p m is orthogonal to monomials of lower order. By the same argument, Conversely, if the polynomials p j exist for j < N , we can recover the Toeplitz determinant D N (F ) from them. There is no general result which guarantees that they do exist, but we can argue that apart from countably many values of ζ 1 , ζ 2 , ϕ they do exist, and then work from this.
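Since the display (C.1) is not reproduced above, it may help to recall the standard determinantal (Heine-type) construction of such polynomials; the following sketch uses a normalization that may differ from (C.1):

```latex
% Fourier coefficients and Toeplitz determinants:
F_k = \int_0^{2\pi} F(e^{i\theta})\, e^{-ik\theta}\, \frac{d\theta}{2\pi},
\qquad D_m(F) = \det\bigl( F_{j-k} \bigr)_{j,k=0}^{m-1},
\qquad D_0(F) := 1 .
% One sets (orthonormal normalization)
p_m(z) = \frac{1}{\sqrt{D_m(F)\, D_{m+1}(F)}}
  \det \begin{pmatrix}
    F_0     & F_{-1}  & \cdots & F_{-m}   \\
    \vdots  &         & \ddots & \vdots   \\
    F_{m-1} & F_{m-2} & \cdots & F_{-1}   \\
    1       & z       & \cdots & z^{m}
  \end{pmatrix},
% so the leading coefficient is \chi_m = \sqrt{D_m(F)/D_{m+1}(F)} and
D_N(F) = \prod_{m=0}^{N-1} \chi_m^{-2} .
```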
In addition, one often introduces a family of dual polynomials. Again, if D m (F ) ≠ 0 for m = 0, . . . , N , we define for m ≤ N and z ∈ C, Note that if F is real valued (for us it may not be), then we have q m (z −1 ) = p m (z) for z ∈ T.

To analyze the orthogonal polynomials asymptotically, we encode them into a Riemann-Hilbert problem. For j ∈ N, if p j and q j−1 exist, we define for |z| = 1 The basic result on which everything builds is the following result of Fokas, Its, and Kitaev [23] (which was obtained in a different setting and has since been adapted to many different questions). See also [13, Chapter 3] for a proof in yet another setting; the argument readily adapts to this case (once one replaces Sobolev-theory by Hölder-theory) and we omit the proof.
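The Riemann-Hilbert problem in question has the following standard shape for orthogonal polynomials on the unit circle (a sketch; the exact normalization of (C.3) may differ):

```latex
% Y = Y^{(j)} : \mathbb{C} \setminus \mathbb{T} \to \mathbb{C}^{2 \times 2} analytic, with jump
Y_+(z) = Y_-(z)
  \begin{pmatrix} 1 & z^{-j} F(z) \\ 0 & 1 \end{pmatrix},
  \qquad z \in \mathbb{T},
% and normalization at infinity
Y(z)\, z^{-j \sigma_3} = I + O(z^{-1}), \qquad z \to \infty,
\qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} .
% When it exists, the solution is built out of p_j and the dual polynomial
% q_{j-1} (cf. (C.3)), and uniqueness follows from a Liouville-type argument.
```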
By applying the Deift-Zhou steepest descent analysis, one can invert this statement and, under suitable conditions, construct a function Y which solves this problem if j is sufficiently large. Then, one can argue that the polynomials p j , q j−1 have to exist.
While variants of Lemma 5.6 (e.g. for Hankel determinants) have certainly appeared in the literature, we provide a brief proof of this differential identity since we do not know of a perfect reference.
Proof of Lemma 5.6. Using (C.3), a short calculation (some of whose details we are omitting) shows that where the polynomials are the ones orthogonal with respect to the weight F (e iθ ). Also, we note that ∂ t F = V F , so if we can relate the sum above to Y in a suitable way, then we will be done. The first step is to use the Christoffel-Darboux identity (see [14, Lemma 2.3] for a result that holds also for the complex weights we need): for z ≠ 0 Let us, on the other hand, look at the object that we want to understand: by (C.6) (dropping the t from our notation for now) The next step is to apply a recursion relation for the polynomials. More precisely, [14, Lemma 2.2, (2.4)] says that (C.7) χ N −1 z −1 q N −1 (z −1 ) = χ N q N (z −1 ) − q N (0)z −N p N (z).
This implies that Moreover, multiplying (C.7) by z N and differentiating, we see that We see that the q N (0)-terms cancel, so these considerations lead to Noting that if we make the change of variables z = e iθ in our integral, we have z −1 dz 2πi = dθ 2π so (reintroducing t to our notation) which was the claim.