The Hardy--Littlewood--Chowla conjecture in the presence of a Siegel zero

Assuming that Siegel zeros exist, we prove a hybrid version of the Chowla and Hardy--Littlewood prime tuples conjectures. Thus, for an infinite sequence of natural numbers $x$, and any distinct integers $h_1,\dots,h_k,h'_1,\dots,h'_\ell$, we establish an asymptotic formula for $$\sum_{n\leq x}\Lambda(n+h_1)\cdots \Lambda(n+h_k)\lambda(n+h_{1}')\cdots \lambda(n+h_{\ell}')$$ for any $0\leq k\leq 2$ and $\ell \geq 0$. Specializing to either $\ell=0$ or $k=0$, we deduce the previously known results on the Hardy--Littlewood (or twin primes) conjecture and the Chowla conjecture under the existence of Siegel zeros, due to Heath-Brown and Chinis, respectively. The range of validity of our asymptotic formula is wider than in these previous results.

Here and in the sequel, $n$ is understood to range over natural numbers, and we use the averaging notation $\mathbb{E}_{n\in A} f(n) := \frac{1}{|A|}\sum_{n\in A} f(n)$ for any set $A$ of finite cardinality $|A|$. The primes in the notation $h'_1,\dots,h'_\ell$ are there for compatibility with Conjecture 1.3 below.
For $\ell = 1$, Chowla's conjecture is equivalent to the prime number theorem, but the conjecture is open for all $\ell \geq 2$, although a slightly weaker "logarithmically averaged" conjecture is known to hold for $\ell = 2$ [27] or for odd $\ell$ [28,29]. All the discussion here concerning the Liouville function $\lambda$ has a counterpart for the Möbius function $\mu$, but for simplicity of exposition we restrict attention to the Liouville function.
Clearly Conjectures 1.1, 1.2 correspond to the special cases $k = 0$ and $\ell = 0$ respectively of Conjecture 1.3. One could also generalize this conjecture by replacing the forms $n + h_j$, $n + h'_j$ by more general linear forms $a_j n + b_j$, $a'_j n + b'_j$, no two of which are scalar multiples of each other, but we do not do so here in order to simplify the notation.
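To make the correlations in Chowla's conjecture concrete, the following is a minimal numerical sketch, not part of the argument of the paper. The function names `liouville_table` and `liouville_correlation` are hypothetical, and the shifts are assumed to be non-negative integers so that a single sieve table suffices.

```python
def liouville_table(limit):
    """Return a list lam with lam[n] = (-1)^Omega(n) for 1 <= n <= limit (lam[0] is unused)."""
    spf = list(range(limit + 1))              # spf[n] will hold the smallest prime factor of n
    for p in range(2, int(limit ** 0.5) + 1):
        if spf[p] == p:                       # p is prime
            for m in range(p * p, limit + 1, p):
                if spf[m] == m:
                    spf[m] = p
    lam = [0] * (limit + 1)
    if limit >= 1:
        lam[1] = 1
    for n in range(2, limit + 1):
        # lambda is completely multiplicative with lambda(p) = -1 at every prime p
        lam[n] = -lam[n // spf[n]]
    return lam


def liouville_correlation(x, shifts):
    """Empirical average over n <= x of the product of lambda(n + h) over the given shifts h >= 0."""
    lam = liouville_table(x + max(shifts))
    total = 0
    for n in range(1, x + 1):
        term = 1
        for h in shifts:
            term *= lam[n + h]
        total += term
    return total / x
```

For example, `liouville_correlation(10**5, [0, 1])` computes the empirical two-point average that Chowla's conjecture predicts tends to zero; such an experiment is only an illustration and proves nothing about the limit.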
Only the $k + \ell \leq 1$ cases of Conjecture 1.3 are currently known, even if one assumes the generalized Riemann hypothesis, though see [26] for some recent progress in the function field case, and the recent works [15], [16] for some progress on an averaged version of this conjecture. On the other hand, it turns out (perhaps surprisingly) that some progress on this conjecture can be made under an opposing hypothesis, namely the existence of a Siegel zero. We use the notational conventions from Heath-Brown's work [11]:

Definition 1.4 (Siegel zero). A Siegel zero $\beta$ is a real number associated to a primitive quadratic Dirichlet character $\chi$ of conductor $q_\chi$ such that $L(\beta, \chi) = 0$ and $\beta = 1 - \frac{1}{\eta \log q_\chi}$ for some $\eta \geq 10$ (which we call the quality of the zero).

The lower bound on $\eta$ is mostly in order to ensure that $\log\log \eta$ is positive; the precise numerical value of the lower bound is not important. From Siegel's theorem we have the (ineffective) upper bound $$\eta \ll_\varepsilon q_\chi^{\varepsilon} \tag{1.4}$$ on the quality of a Siegel zero for any $\varepsilon > 0$.
There are prior results in the literature towards Conjecture 1.3 in the presence of a Siegel zero when only either the von Mangoldt function or the Liouville function appears in the correlation. These results are due to Heath-Brown [11] in the case of two-point correlations of the von Mangoldt function, and due to Chinis [2] in the case of the Chowla conjecture (with previous work by Germán and Katái [6] on the two-point case). We can summarize them as follows:

Theorem 1.5 (Prior results on Hardy--Littlewood--Chowla given a Siegel zero). Suppose that one has a Siegel zero $\beta$ with associated conductor $q_\chi$ and quality $\eta$.
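The relation between a zero and its quality in Definition 1.4 can be inverted explicitly. The sketch below does only that arithmetic; the function names are hypothetical, and the inputs `beta` and `conductor` are supplied by the caller (nothing here locates zeros of an $L$-function or asserts that a Siegel zero exists).

```python
import math


def siegel_quality(beta, conductor):
    """Solve beta = 1 - 1/(eta * log q_chi) for the quality eta."""
    if not 0 < beta < 1:
        raise ValueError("beta must lie strictly between 0 and 1")
    if conductor < 2:
        raise ValueError("conductor must be at least 2")
    return 1.0 / ((1.0 - beta) * math.log(conductor))


def meets_definition(beta, conductor, min_quality=10.0):
    """Check the lower bound eta >= 10 from Definition 1.4 (the threshold is a parameter)."""
    return siegel_quality(beta, conductor) >= min_quality
```

The larger the quality, the closer $\beta$ is to $1$ relative to the scale $1/\log q_\chi$.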
Remark 1.7. The $k$-dependent exponent of $10k$ in the range (1.5) can be improved somewhat, particularly when $k = 1$, but we will not attempt to optimize it here. On the other hand, in order to improve the exponent $\frac{1}{2}$ in (1.5) in the case $k = 0$ it seems necessary to be able to obtain non-trivial bounds on short character sums such as $\sum_{n\in I} \chi(n + h'_1) \cdots \chi(n + h'_\ell)$ for intervals $I$ of length less than $q_\chi^{1/2}$, which is beyond the range of direct application of the Weil bounds and completion of sums (and for $\ell > 1$ we were not able to adapt the Burgess argument [1] to such sums due to the lack of multiplicative structure). The exponent $\frac{1}{10\max(1,k)}$ in (1.6) can similarly be improved, but we will not attempt to do so here.
Note that Theorem 1.6 improves the dependence on the quality $\eta$ of the Siegel zero, and also allows for correlations that involve both the von Mangoldt function $\Lambda$ and the Liouville function $\lambda$, so long as the former function appears at most two times. This latter restriction is an inherent limitation of our current state of knowledge of correlations for functions like the divisor function $\tau := 1 * 1$; in particular, $k$-point correlations $\mathbb{E}_{n\leq x} \tau(n+h_1) \cdots \tau(n+h_k)$ are currently only well understood when $k \leq 2$.
As a direct corollary to Theorem 1.6, we can state the following strengthening of previous results.
We lastly note that, after the submission of this paper, Matomäki and Merikoski [17] proved a quantitatively stronger version of Corollary 1.8(i).

Overview of proof
The general strategy for proving results such as Theorem 1.6 is now well known: in the presence of a Siegel zero (and for $x$ comparable in log-scale to $q_\chi$), the function $\lambda$ "pretends" to be like the Dirichlet character $\chi$, and the von Mangoldt function $\Lambda = \mu * \log$ similarly "pretends" to be like $\chi * \log$, so the correlation in (1.6) is of comparable complexity to the average $$\mathbb{E}_{n\leq x} (\chi * \log)(n + h_1) \cdots (\chi * \log)(n + h_k)\chi(n + h'_1) \cdots \chi(n + h'_\ell)$$ (in practice we also have to insert some sieve weights to account for the fact that not all numbers are rough). This is a twisted and weighted version of the divisor correlation which, as previously mentioned, is well understood for $k \leq 2$, basically because the Weil bounds for Kloosterman sums ensure that $\tau$ has level of distribution at least $2/3$, the key point being that this is larger than $1/2$. The twist by $\chi$ introduces the need to estimate character sums such as $$\sum_{n\leq x} \chi(n + h'_1) \cdots \chi(n + h'_\ell),$$ which can be adequately controlled by the Weil estimates for character sums since we are in the regime $x \gg q_\chi^{1/2}$. To make this strategy rigorous, we will approximate the functions $\Lambda, \lambda$ by a series of more tractable approximants that involve the exceptional character $\chi$ (as well as the scale $x$). We will do this by executing the following steps in order:
(i) Replace the Liouville function $\lambda$ with an approximant $\lambda_{\mathrm{Siegel}}$, which is a completely multiplicative function that agrees with $\lambda$ at small primes and agrees with $\chi$ at large primes. (This step was also performed in [6], [2].)
(ii) Replace the von Mangoldt function $\Lambda$ with an approximant $\Lambda_{\mathrm{Siegel}}$, which is the Dirichlet convolution $\chi * \log$ multiplied by a Selberg sieve weight $\nu$ to essentially restrict that convolution to almost primes. (This step essentially also appears in [11].)
(iii) Replace $\lambda_{\mathrm{Siegel}}$ with a more complicated truncation $\lambda^\sharp_{\mathrm{Siegel}}$ which has the structure of a "Type I sum", and which agrees with $\lambda_{\mathrm{Siegel}}$ on numbers that have a "typical" factorization.
(iv) Replace the approximant $\Lambda_{\mathrm{Siegel}}$ with a more complicated approximant $\Lambda^\sharp_{\mathrm{Siegel}}$ which has the structure of a "Type I sum". (This step is inspired by a similar Type I approximation to the divisor function $\tau$ (and its higher order generalizations) recently introduced in [19], [18].)
(v) Now that all terms in the correlation have been replaced with tractable Type I sums, use standard Euler product calculations and Fourier analysis, similar in spirit to the proof of the pseudorandomness of the Selberg sieve majorant for the primes in [9, Appendix D], to evaluate the correlation to high accuracy.
More succinctly, the proof of Theorem 1.6 proceeds by justifying all of the following approximations: where the precise meaning of the symbol ≈ is given in (2.11) below.
The steps (i)-(v) are executed in Sections 4-8 respectively. Interestingly, the hypothesis k ≤ 2 is only used in step (iv) of this process.
Steps (i) and (ii) of the strategy rely ultimately on the well known phenomenon that in the presence of a Siegel zero, one has χ(p) = −1 for most primes p that are comparable to the conductor q χ in log-scale. Traditionally, such phenomena are justified using complex-analytic methods, and in particular by exploiting the Deuring-Heilbronn phenomenon. It turns out that an alternate approach relying almost entirely on elementary methods leads instead to significantly superior dependence on the quality η of the zero; see Proposition 3.5. This eventually enables us to obtain a wider x range in Theorem 1.6 than in previous results.
Step (iii) involves splitting $\lambda_{\mathrm{Siegel}}$, which is a kind of character-twisted divisor sum, into two parts as $\lambda^\sharp_{\mathrm{Siegel}} + \lambda^\flat_{\mathrm{Siegel}}$, where $\lambda^\sharp_{\mathrm{Siegel}}$ accounts for the small divisors (with a smooth truncation) and $\lambda^\flat_{\mathrm{Siegel}}$ accounts for the large divisors. It turns out that $\lambda^\flat_{\mathrm{Siegel}}$ has a negligible contribution to the correlation (basically because smooth numbers become extremely rare at large scales). This is shown by first constructing a majorant for $\lambda^\flat_{\mathrm{Siegel}}$ (in Lemma 6.1) that after some Euler product computations is seen to be small "on average" in a suitable sense.

Steps (iv) and (v) morally speaking amount to computing correlations such as $$\mathbb{E}_{n\leq x,\, n=a\ (q)} (\chi * \log(n))(\chi * \log(n + h)) \tag{1.8}$$ with power-saving error term (for $1 \leq a \leq q \leq x^\delta$ for a small $\delta > 0$), as well as correlations (1.9) of functions of the form $f(n) = \sum_{d|n,\, d\leq x^\delta} b_d$, that is, Type I sums with explicit coefficients $b_d$. However, both of these tasks are rather tedious as such; the first correlation (1.8) has secondary main terms of order $O(\frac{1}{\log x})$ times the main term (cf. [4]), and we would need a fully explicit asymptotic in terms of $h, a, q$; meanwhile, evaluating the second correlation (1.9) with the Goldston-Yıldırım approach [7] leads to some tricky contour integrals. We therefore smoothen $\Lambda_{\mathrm{Siegel}}$ by inserting a smooth partition of unity; the smoothness of the resulting functions makes handling error terms easier, just as in the smoothed approach to Goldston-Yıldırım type correlations in [9, Appendix D]. We can also avoid explicitly obtaining asymptotics for sums such as (1.8) by using the Dirichlet hyperbola method, although the main ingredient for evaluating such correlations (namely Kloosterman sum bounds) is still needed. Our use of smooth weights does still necessitate some lengthy yet standard Fourier-analytic computations, but the arithmetic input is easier than in a direct approach involving an evaluation of (1.8), (1.9).
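For readers unfamiliar with the Dirichlet hyperbola method mentioned above, the following sketch shows it in its simplest classical setting, the summatory function of the untwisted divisor function $\tau = 1 * 1$. This is only an illustration of the technique; the paper applies the method to twisted and weighted correlations, which this code does not attempt. The function names are hypothetical.

```python
import math


def divisor_sum_hyperbola(x):
    """Sum of tau(n) over n <= x, counting lattice points under the hyperbola d*m <= x.

    Points with d <= sqrt(x) and points with m <= sqrt(x) are counted separately,
    and the square d, m <= sqrt(x) that is counted twice is subtracted once.
    """
    r = math.isqrt(x)
    return 2 * sum(x // d for d in range(1, r + 1)) - r * r


def divisor_sum_direct(x):
    """Same quantity via sum over d <= x of floor(x/d); uses about x terms instead of sqrt(x)."""
    return sum(x // d for d in range(1, x + 1))
```

The point of the method is that only divisors up to $\sqrt{x}$ are ever enumerated, which is what makes "small divisor" (Type I) information sufficient.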

Acknowledgments
TT was supported by a Simons Investigator grant, the James and Carol Collins Chair, the Mathematical Analysis & Application Research Fund Endowment, and by NSF grant DMS-1764034. JT was supported by a Titchmarsh Fellowship and Academy of Finland grant no. 340098.
The authors thank Kaisa Matomäki and Jori Merikoski for pointing out a slight correction to the proof of Proposition 3.5 in an earlier version of this paper. The authors would also like to thank the referee for helpful comments and suggestions.

Asymptotic notation
For the rest of the paper, we let $k, \ell, h_1, \dots, h_k, h'_1, \dots, h'_\ell, \varepsilon_0, \beta, \chi, q_\chi, \eta, x$ be as in Theorem 1.6, save that we will not require the hypothesis $k \leq 2$ except in Section 7, and that we do not impose the restriction (1.5) on $x > 1$ before Section 4. We use the asymptotic notation $X \ll Y$, $Y \gg X$, or $X = O(Y)$ to denote the bound $|X| \leq CY$ where $C$ is a constant which is allowed to depend on the "fixed" quantities $k, \ell, h_1, \dots, h_k, h'_1, \dots, h'_\ell, \varepsilon_0$; we permit the constants to be ineffective. Thus for instance the singular series $\mathfrak{S}$ in Conjecture 1.3 obeys the bound $\mathfrak{S} = O(1)$. If we need the constant $C$ to depend on additional parameters, we will indicate this by subscripts, for instance $X \ll_A Y$ denotes the bound $|X| \leq C_A Y$ where $C_A$ depends on the parameter $A$ as well as the fixed quantities. We write $X \asymp Y$ for $X \ll Y \ll X$. By shrinking $\varepsilon_0$ if necessary, we may assume that $\varepsilon_0$ is sufficiently small depending on $k, \ell$. We will also assume that $\eta$ is sufficiently large depending on the fixed quantities, since otherwise the claim follows from standard upper bound sieves (such as Lemma 3.2). By (1.4), this also means that $q_\chi$ (and hence $x$) is sufficiently large depending on the fixed quantities.

Indicator and exponential functions
If $S$ is a sentence, we use $1_S$ to denote its indicator, thus $1_S = 1$ when $S$ is true and $1_S = 0$ otherwise. If $E$ is a set, we use $1_E$ to denote the indicator function $1_E(n) := 1_{n\in E}$. In addition to the notation $e(\theta) := e^{2\pi i\theta}$, we also write $e_q(a) := e(a/q) = e^{2\pi i a/q}$ for natural numbers $q$ and $a \in \mathbb{Z}/q\mathbb{Z}$. We also write $\|\theta\|_{\mathbb{R}/\mathbb{Z}}$ for the distance of $\theta$ to the nearest integer.
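The notation of this subsection translates directly into code. The helper names below are hypothetical and exist only to fix ideas; representing a class $a \in \mathbb{Z}/q\mathbb{Z}$ by an arbitrary integer representative is an assumption of this sketch.

```python
import cmath
import math


def indicator(statement):
    """1_S: equals 1 when the statement S is true and 0 otherwise."""
    return 1 if statement else 0


def e(theta):
    """e(theta) = exp(2*pi*i*theta)."""
    return cmath.exp(2j * math.pi * theta)


def e_q(a, q):
    """e_q(a) = e(a/q); a is reduced mod q first, so the value depends only on a mod q."""
    return e((a % q) / q)


def dist_to_nearest_integer(theta):
    """The distance of theta to the nearest integer, written ||theta||_{R/Z} in the text."""
    frac = theta - math.floor(theta)
    return min(frac, 1.0 - frac)
```

A quick sanity check of the orthogonality that underlies the later Fourier expansions: the sum of `e_q(a, q)` over a complete set of residues `a` vanishes for every `q > 1`.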

Primes and prime factorization
Unless otherwise specified, all sums and products will be over the natural numbers $\mathbb{N} = \{1, 2, \dots\}$, with the exception of sums and products involving the variable $p$ (or $p'$, $p_1$, etc.), which will be over primes. We define an exceptional prime to be a prime $p^*$ such that $\chi(p^*) \neq -1$; sums over $p^*$ (or $p^*_1$, etc.) will always be understood to be over exceptional primes.
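If $n$ is a natural number and $p$ is a prime, we let $n_{(p)}$ denote the largest power of $p$ dividing $n$, thus from the fundamental theorem of arithmetic $$n = \prod_p n_{(p)}.$$ For any threshold $z > 1$, we may therefore factor a natural number $n$ as $$n = n_{(\leq z)} n_{(>z)}, \tag{2.2}$$ where the $z$-smooth and $z$-rough components $n_{(\leq z)}, n_{(>z)}$ of $n$ are defined as $$n_{(\leq z)} := \prod_{p\leq z} n_{(p)}, \qquad n_{(>z)} := \prod_{p>z} n_{(p)}.$$ We write $\mathbb{N}_{(\leq z)}$ and $\mathbb{N}_{(>z)}$ for the sets of $z$-smooth and $z$-rough natural numbers respectively. Given natural numbers $d_1, \dots, d_m$, we use $(d_1, \dots, d_m)$ and $[d_1, \dots, d_m]$ to denote their greatest common divisor and least common multiple, respectively. We use $d\ (q)$ to denote the reduction of $d$ to $\mathbb{Z}/q\mathbb{Z}$, and $q|d$ to denote the assertion that $q$ divides $d$ (or equivalently $d = 0\ (q)$).

We will frequently rely on Dirichlet convolution $$f * g(n) := \sum_{d|n} f(d) g\left(\frac{n}{d}\right).$$ We let pointwise product take precedence over convolution, thus for instance $fg * h$ denotes $(fg) * h$. From (2.2) we observe the identity $$f = f_{(\leq z)} * f_{(>z)} \tag{2.6}$$ for any multiplicative function $f$ and any threshold $z > 1$, where $$f_{(\leq z)} := f 1_{\mathbb{N}_{(\leq z)}}, \qquad f_{(>z)} := f 1_{\mathbb{N}_{(>z)}}$$ are the restrictions of $f$ to $z$-smooth and $z$-rough numbers respectively. Thus for instance $1_{(\leq z)} = 1_{\mathbb{N}_{(\leq z)}}$. Observe that this splitting respects Dirichlet convolutions, in the sense that $$(f * g)_{(\leq z)} = f_{(\leq z)} * g_{(\leq z)}, \qquad (f * g)_{(>z)} = f_{(>z)} * g_{(>z)}$$ for any $f, g : \mathbb{N} \to \mathbb{C}$.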
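The smooth/rough factorization and the convolution splitting of a multiplicative function into its restrictions to $z$-smooth and $z$-rough numbers can be checked by brute force on small inputs. The sketch below uses trial division and hypothetical helper names; it is meant as a sanity check of the definitions, not as an efficient implementation.

```python
def prime_factorization(n):
    """Return {prime: exponent} for n >= 1 by trial division."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def smooth_rough_split(n, z):
    """Return the z-smooth and z-rough components of n, whose product is n."""
    smooth, rough = 1, 1
    for p, a in prime_factorization(n).items():
        if p <= z:
            smooth *= p ** a
        else:
            rough *= p ** a
    return smooth, rough


def dirichlet_convolution(f, g, n):
    """(f * g)(n) = sum over divisors d of n of f(d) * g(n/d)."""
    return sum(f(d) * g(n // d) for d in range(1, n + 1) if n % d == 0)


def liouville(n):
    """lambda(n) = (-1)^Omega(n)."""
    return (-1) ** sum(prime_factorization(n).values())


def restrict_smooth(f, z):
    """f restricted to z-smooth numbers (zero elsewhere)."""
    return lambda n: f(n) if smooth_rough_split(n, z)[1] == 1 else 0


def restrict_rough(f, z):
    """f restricted to z-rough numbers (zero elsewhere)."""
    return lambda n: f(n) if smooth_rough_split(n, z)[0] == 1 else 0
```

With these helpers, `dirichlet_convolution(restrict_smooth(liouville, z), restrict_rough(liouville, z), n)` reproduces `liouville(n)`, because the only divisor of $n$ that is $z$-smooth with $z$-rough cofactor is the $z$-smooth component of $n$.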

Scales
We will make frequent use of the scales We will also occasionally need the auxiliary scale (2.10) The reader may wish to keep in mind the hierarchy of scales which follows easily from (1.4). The conductor q χ lies between log x and x 2 but can be either smaller or larger than R 0 , R, or D.
We adopt the notation $\log_z y := \frac{\log y}{\log z}$ for the logarithm of $y$ to base $z$ for any $y, z > 0$, and use the notation $X \approx Y$ as an abbreviation for $$X = Y + O\left(\log^{-\frac{1}{10\max(1,k)}} \eta\right). \tag{2.11}$$ Thus for instance the estimate (1.6) can be abbreviated to $$\mathbb{E}_{n\leq x} \Lambda(n + h_1) \cdots \Lambda(n + h_k)\lambda(n + h'_1) \cdots \lambda(n + h'_\ell) \approx \mathfrak{S}.$$ The scales $R_0, R, D$ have been chosen so that certain combinations of these scales with $x, \eta, q_\chi$ that will arise in our calculations are negligible with respect to the relation $\approx$.

The Selberg sieve
We fix a smooth function ψ : R → R supported on [−1, 1] that equals to 1 on [−1/2, 1/2], and define the smooth cutoffs for any z > 1. We then define the Selberg sieve 4 Note that ν is an upper bound sieve for 1 (>R) , thus for all natural numbers n. 4 Here we use the Selberg sieve with smoothed coefficients, which was implicitly introduced by Goldston and Yıldırım; see for instance [9, Appendix D] for further discussion. Other sieve approximants to 1 (>z) could be used as a substitute for this sieve if desired; for instance the beta sieve was used in place of a Selberg sieve in the recent work [17], which appeared subsequently to the initial release of this paper.

Tools
In this section we collect some (mostly standard) estimates on various arithmetic functions which will be used in our main argument.

Multiplicative number theory bounds
We recall the crude divisor bound for any n ≥ 1 and ε > 0; see e.g., [20, (2.20)]. From the Euler product formula and the fact that ζ has a simple pole at s = 1 with residue 1 and no zeroes in {s : |s − 1| ≤ 1 2 }, we see that whenever s is a complex number with Res > 1 and |s − 1| < 1 2 . From Mertens' theorem we easily verify that for any σ > 0 and R, z ≥ 1, as can be seen by verifying the cases σ log R z < 1 and σ log R z ≥ 1 separately; in exponential form we thus have Mertens' theorem also gives (by dyadic decomposition) the bounds for any y ≥ z ≥ 2 and m ≥ 1. In particular We recall an elementary inequality of Landreau [14] that allows one to upper bound the divisor function τ by a Type I sum: (i) If n is a natural number and y > z > 1, then we can factor where n 1 , . . . , n m ≤ y lie in N (≤z) and 0 ≤ m ≤ 1 + log y/z n. Also, n (>z) is the product of at most log z n primes. (1) for all n ≥ 1. In particular, by (2.9), one has for n x.
Proof. Observe from the greedy algorithm that any number in N (≤z) is either greater than y, or contains a factor between y/z and y. Iterating this fact, we can factor n (≤z) = n 1 · · · n m where n 1 , . . . , n m ≤ y and all but at most one of the n 1 , . . . , n m are greater than or equal to y/z. This gives the bound m ≤ 1 + log y/z n. Since n (>z) is the product of primes greater than z, the total number of primes is at most log z n. This gives (i). For (ii), we apply (i) with y = n ε and z = n ε/2 and use (2.12) to obtain the factoriza- and hence by the pigeonhole principle for d equal to one of the n 1 , . . . , n m . The claim (ii) follows.
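The greedy factorization used in part (i) of the proof can be sketched as follows. This is one possible way to realize the greedy step, under the assumption that $n$ is $z$-smooth and $y > z > 1$; the function names are hypothetical. The loop keeps a running product that is below $y/z$, so multiplying by one more prime $p \leq z$ keeps it below $y$, and a factor is emitted as soon as it reaches $y/z$.

```python
def prime_factors_with_multiplicity(n):
    """List the prime factors of n in increasing order, repeated according to multiplicity."""
    primes = []
    p = 2
    while p * p <= n:
        while n % p == 0:
            primes.append(p)
            n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes


def greedy_factorization(n, y, z):
    """Factor a z-smooth n as n_1 * ... * n_m with every n_i <= y and all but at most one n_i >= y/z."""
    if not y > z > 1:
        raise ValueError("need y > z > 1")
    primes = prime_factors_with_multiplicity(n)
    if any(p > z for p in primes):
        raise ValueError("n must be z-smooth")
    factors = []
    current = 1
    for p in primes:
        current *= p              # current was < y/z and p <= z, so current < y
        if current * z >= y:      # current has reached size y/z: emit it and start afresh
            factors.append(current)
            current = 1
    if current > 1:
        factors.append(current)   # the single possible leftover factor below y/z
    return factors
```

Since all but one of the factors are at least $y/z$, their number $m$ satisfies $m \leq 1 + \log_{y/z} n$, matching the bound stated in (i).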
We also record a standard sieve upper bound, which can easily be deduced from the fundamental lemma of sieve theory (or the large sieve): Lemma 3.2 (Sieve upper bound). Suppose that for every prime p ≤ x there is a natural number 0 ≤ ω(p) 1, and let E be a subset of {n : n ≤ x} which avoids at least ω(p) residue classes modulo p for each p ≤ x. Then we have Proof. We may assume that ω(p) < p for all p, since otherwise E is empty and the claim is trivial (of course, this assumption is only non-trivial for the very small primes p = O(1)). By Mertens' theorem the contribution of the primes x 1/100 < p ≤ x to the right-hand is 1, so we may replace the product p≤x here with p≤x 1/100 .
For each d ≤ x 1/2 , let E d be the set formed by removing the ω(p) residue classes modulo p from {n : n ≤ x} for all p|d. Then from (3.11) (with n replaced by p≤x 1/100 :n ∈Ep p) we have the pointwise bound From the Chinese remainder theorem we have and thus by (3.12) By Mertens' theorem (3.6), the second term on the right-hand side is certainly dominated by the former, and the claim follows.
We also record the following easy consequence of the Chinese remainder theorem.
Proof. All the claims are immediate except for the existence of the residue class a in part (ii) (the final part of (ii) following from the general relation By the Chinese remainder theorem we may assume that the d i are all powers of a single prime p. Then we have d = d i for some 1 ≤ i ≤ k , and the claim follows by setting a := −h i .

Some Fourier analysis
Recall the Fourier inversion formula: if g : R → C is a Schwartz function, then one has for all u ∈ R, where the Fourier transform f : R → C of g is another Schwartz function defined by the formula As a special case of this, if ϕ : R → C is a function such that u → e u ϕ(u) is Schwartz, then In particular, for any real n, z > 0 we have Evaluating this formula at n = 1 we conclude that and if one differentiates at n = 1 instead one obtains the variant identity As an application of these Fourier representations, we give an analogue of Lemma 3.2 for the Selberg sieve ν (cf., [22,Lemma 3]
The R 2k x error term is negligible in practice. The σ variable of integration is technical and as a first approximation the reader is invited to replace σ with 1 (and delete the integral). The key feature of this estimate are the factors of min(σ log R p, 1), which make the left-hand side of (3.17) small when d 1 , . . . , d k have one or more small prime factors. With further effort one could obtain a more precise asymptotic for the left-hand side of (3.17) (in the spirit of [9, Theorem D.3]) but we will not need to do so here.
Proof. It is convenient to relabel by writing k := k + and h k+j := h j , d k+j := d j for j = 1, . . . , . By Lemma 3.3 we may assume that (2.23), the left-hand side of (3.17) may be expanded as x ), so it suffices to show that We can expand the left-hand side using (3.14) and Fubini's theorem as for some Schwartz function f . Changing variables using the substitution σ := 1 + k j=1 |t j | + |t j |, and using the rapid decay of f and the triangle inequality, it will suffice to establish the pointwise bound for all t 1 , . . . , t k ∈ R. By (2.3) and the fact that µ is supported on square-free numbers, the left-hand side factors as an Euler product p E p where From the triangle inequality we have Now let p ≤ R and d (p) > 1. Then from the triangle inequality we have Finally, suppose that p ≤ R and d (p) = 1. Then from the triangle inequality we have (1.2). From the Cauchy integral formula (in the case σ log p ≤ log R) or (3.18) (otherwise) we thus have and hence by (1.3) Putting all this together, we see that and hence by Mertens' theorem (3.6) and the claim follows.

Elementary consequences of a Siegel zero
Recall from Section 2 that we use $p^*$ to denote primes that are exceptional in the sense that $\chi(p^*) \neq -1$. It is a well known phenomenon that exceptional primes become rare at scales comparable in log-scale to $q_\chi$. For instance, in [11, Lemma 3] it was shown that 5 while in [6] it was shown more generally that In fact we can do better: The first bound is non-trivial for $x$ as large as $q_\chi^{\eta^{1-\varepsilon_0}}$, while the second bound is nontrivial for primes $p^*$ as small as q . It is not difficult to recover (3.21) (and hence (3.20)) from the above proposition by taking a suitable linear combination of (3.22) and (3.23) for $m \leq \sqrt{\log \eta}$, and using Mertens' theorem to control the contribution of exceptional primes $p^* \leq q_\chi^{10/\sqrt{\log \eta}}$ (say); we leave the details to the interested reader.
5 Strictly speaking, these results only claim to control the set where $\chi(p^*) = 1$, ignoring the relatively small number of primes where $\chi(p^*) = 0$, but it is not difficult to modify the arguments to also include the latter set.
From Siegel's theorem we have L(1, χ) ε q −ε/10 χ , and hence From [20,Theorem 11.4] we also have L L (1, χ) η log q χ . Thus, (3.24) gives On the other hand, from the non-negativity and multiplicativity of 1 * χ we have Since 1 * χ(p) is non-negative and is at least one when p is exceptional, the first claim (3.22) follows.
In a similar vein, since any n ≤ q 10 χ has ≤ 20m m representations in the form n p 1 · · · p m with q (1+ε)/2 χ < p 1 < p 2 < · · · < p m , we have for any natural number m ≥ 2 that q 1+ε 2 χ <n≤q 10 Observe that once n < q (1+ε)/2 χ and some of the exceptional primes p * 1 , . . . , p * j , j < m have been chosen, the restrictions that the exceptional prime p * j+1 be distinct from p * 1 , . . . , p * j and not divide n only excludes at most 2m primes p * j+1 from the range q 1+ε 2m since n has at most m factors in this range. Thus we have where the asterisk in the sum means that we are allowed to delete the 2m largest terms from the sum (or delete the sum entirely, i.e. replace it by zero, if there are fewer than 2m terms in all). The estimates (3.25), (3.26) then give * One can reinstate the top 2m terms from the sum on the left-hand side, since their contribution is m/q This bound will be used in steps (i), (ii) of the main argument.
Proof. From (3.22) (with ε = 1) and (1.5) we have and from (3.23) we similarly have for all m ≥ 2. Summing over 2 ≤ m ≤ √ log η + 1, we obtain the first claim. Now we prove the second claim. The contribution of those p * with p * ≥ x log 0.1 η is acceptable by (3.5), while the contribution of those p * with p * ≤ R 1/ log 0.4 η is also acceptable by (3.3). Thus it remains to show that The contribution of those p * with q χ < p * < x log 0.1 η is acceptable by (3.22) (for ε = 1), (1.5), while the contribution of those p * with R 1/ log 0.4 η < p * ≤ q χ is acceptable by (3.23) (for ε = 1 and 2 ≤ m log 0.5 η, say) and (2.8).

Consequences of the Weil bound for character sums
uniformly for all integers a whenever f is not a constant multiple of perfect square modulo p, where χ p is the quadratic character modulo p; see [30] (or [21]). When f is a constant multiple of a perfect square, we can of course use the trivial bound of O(p). Since the exceptional modulus q χ is a fundamental discriminant, it is of the form 2 j p 1 · · · p m for some j ≤ 3 and distinct odd primes p 1 , . . . , p m , and so from the Chinese remainder theorem we obtain the bounds uniformly in a, where d is the largest factor of q χ for which f is a constant multiple of a perfect square modulo d. Applying for any interval I of length at most q χ and any ε > 0; by subdividing longer intervals into intervals of length q χ , plus a remainder, we conclude that for any interval I and any ε > 0. This gives us the following bounds: Lemma 3.7. Let d 1 , . . . , d k+ be natural numbers. Let I be an interval in [1, x]. Let J be a non-empty subset of {1, . . . , k + }, and for each j ∈ J, let d j be a factor of d j . Then for any ε > 0, where we use the notation h k+j := h j for j = 1, . . . , .
This bound will be used in step (v) of the main argument, to dispose of any "Type I sum" contributions that are twisted by one or more factors of the exceptional character χ.
Proof. By Lemma 3.3 we may assume that (d i , d j )|h i − h j for all 1 ≤ i < j ≤ k + and replace the conditions d j |n + h j with n = a (d) where and a = −h j (d j ) for j = 1, . . . , k + . Our task is now equivalent to showing that n:dn+a∈I j∈J We can write the left-hand side as n:dn+a∈I Suppose that there is a prime p not dividing d such that f is a constant multiple of a square modulo p. Then the roots − a+h j d (p) of f must experience a repetition, and hence p divides h i − h j for some 1 ≤ i < j ≤ k + . Since the h 1 , . . . , h k+ are fixed, this forces p = O(1). From the Chinese remainder theorem (and the fact that q χ is a fundamental discriminant), we conclude that the largest factor d of q χ for which f is a constant multiple of a square modulo d is O ((d, q χ )). The claim now follows from (3.27).
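The square-root cancellation that drives this subsection can be observed numerically for complete sums modulo a single odd prime. The sketch below is an illustration only: it treats the quadratic character $\chi_p$ modulo a prime $p$ (computed by Euler's criterion) rather than the exceptional character $\chi$, and the function names are hypothetical. It relies on two classical facts: the complete sum of $\chi_p(n(n+h))$ equals $-1$ when $p \nmid h$, and for three shifts that are distinct modulo $p$ the Weil bound gives absolute value at most $2\sqrt{p}$.

```python
def legendre_symbol(a, p):
    """Quadratic character chi_p(a) modulo an odd prime p, via Euler's criterion."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def shifted_character_sum(p, shifts):
    """Complete sum over n mod p of chi_p((n + h_1) * ... * (n + h_m))."""
    total = 0
    for n in range(p):
        term = 1
        for h in shifts:
            term *= legendre_symbol(n + h, p)
        total += term
    return total
```

When the polynomial is a perfect square, for example with the repeated shifts `[3, 3]`, no cancellation occurs and the sum is $p - 1$, which is why the case of "a constant multiple of a perfect square" has to be treated separately with the trivial bound.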

Consequences of Kloosterman sum bounds
We recall 6 Estermann's form [5] x∈Z/qZ:(x,q)=1 e q (u 1 x + u 2 x * ) ≤ τ (q)q 1/2 (u 1 , u 2 , q) 1/2 of the Weil bound for Kloosterman sums, where x * is the inverse of x in Z/qZ and u 1 , u 2 are arbitrary integers. From this and a simple change of variables we see that for any natural number q and integers w, a, u 1 , u 2 with (w, q) = (a, q) = 1, where we use the averaging notation f (n 1 , n 2 ) whenever f : Z 2 → C is a periodic function with some period L (thus, f (n 1 + Lm 1 , n 2 + Lm 2 ) = f (n 1 , n 2 ) for all integers n 1 , n 2 , m 1 , m 2 ). We will need to extend the bound (3.28) to the case where a shares a common factor with q, and where we also insert a periodic weight: Lemma 3.8 (Fourier coefficients on a hyperbola). Let q be a natural number, and let a, u 1 , u 2 be integers. Let q 0 be a factor of q such that (a, q)|q 0 . Let f : Z 2 → C be a 1-bounded 7 function with period q 0 . Then The factor τ (q 0 ) 2 q 3/2 0 can be improved somewhat, but we will not attempt to optimize it here. This bound will be needed in step (iv) of the main argument, in order to dispose of the non-Type I portion Λ Siegel to the Siegel approximant Λ Siegel . 6 For the applications in this paper one could also proceed using the weaker but more elementary bounds of Kloosterman [13], as the important thing is that we gain a power savings over the trivial bound of q, at the cost of degrading the numerical exponent 10k in (1.5) somewhat. We leave the details of this variant of the argument to the interested reader. Proof. If n 1 n 2 = a (q), then from considering the prime factorisations of n 1 , n 2 , a, q we see that (n 1 , q 0 ), (n 2 , q 0 ) must be factors of (a, q) and hence of q 0 ; also, we have ((n 1 , q 0 )(n 2 , q 0 ), q) = (a, q 0 ) = (a, q). 
Thus there are at most τ (q 0 ) 2 possible choices for (n 1 , q), (n 2 , q), and by the triangle inequality it suffices to show that (3.29) |E n 1 ,n 2 ∈Z f (n 1 , n 2 )1 n 1 n 2 =a (q) e q (u 1 n 1 + u 2 n 2 )| ≤ q 3/2 0 τ (q)q −3/2 (u 1 , u 2 , q) 1/2 . under the additional hypothesis that f is supported in the region where (n 1 , q 0 ) = q 1 , (n 2 , q 0 ) = q 2 for some factors q 1 , q 2 of q 0 with (3.30) (q 1 q 2 , q) = (a, q).
In particular, if we write q := q (a,q) , then the quantity w = q 1 q 2 (a,q) is a primitive element of Z/q Z. Making the change of variables n 1 = q 1 n 1 , n 2 = q 2 n 2 , we can now rewrite the left-hand side of (3.29) as 1 q 1 q 2 |E n 1 ,n 2 ∈Z f (q 1 n 1 , q 2 n 2 )1 wn 1 n 2 = a (a,q) (q ) e q (u 1 q 1 n 1 + u 2 q 2 n 2 )|. By Fourier inversion and the Plancherel formula we have where the coefficients c k 1 ,k 2 obey the bound and hence by Cauchy-Schwarz Thus by the triangle inequality and pigeonhole principle, we can bound the left-hand side of (3.29) by q 0 q 3/2 1 q 3/2 2 E n 1 ,n 2 ∈Z 1 wn 1 n 2 = a (a,q) (q ) e q u 1 + k 1 q q 0 q 1 n 1 + u 2 + k 2 q q 0 q 2 n 2 for some integers k 1 , k 2 . Since 1 wn 1 n 2 = a (a,q) (q ) is a q -periodic function of n 1 , n 2 , this expression vanishes unless the integers (u 1 + k 1 q q 0 )q 1 , (u 2 + k 2 q q 0 )q 2 are divisible by q/q = (a, q). Since w and a (a,q) are both primitive in Z/q Z, we may then apply (3.28) and bound the left-hand side of (3.29) by which we can rewrite as q 0 By construction, we have and hence by taking suitable linear combinations We conclude in particular that d|q 0 q 1 q 2 (u 1 , u 2 , q), and the claim follows (noting from (3.30) that (a, q) ≤ q 1 q 2 ).
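Estermann's form of the Weil bound for Kloosterman sums, recalled at the start of this subsection, can be sanity-checked by direct summation for small moduli. The following sketch assumes $q \geq 2$ and Python 3.8 or later (for the modular inverse via `pow(x, -1, q)`); the function names are hypothetical, and the divisor count is computed naively.

```python
import cmath
import math


def kloosterman_sum(u1, u2, q):
    """Sum of e_q(u1*x + u2*x^*) over x in Z/qZ coprime to q, where x^* is the inverse of x."""
    total = 0
    for x in range(1, q):
        if math.gcd(x, q) == 1:
            x_inv = pow(x, -1, q)                  # inverse of x modulo q
            phase = (u1 * x + u2 * x_inv) % q      # reduce before dividing, for accuracy
            total += cmath.exp(2j * math.pi * phase / q)
    return total


def divisor_count(q):
    """tau(q), by naive enumeration of divisors."""
    return sum(1 for d in range(1, q + 1) if q % d == 0)


def estermann_bound(u1, u2, q):
    """tau(q) * q^(1/2) * (u1, u2, q)^(1/2)."""
    g = math.gcd(math.gcd(u1, u2), q)
    return divisor_count(q) * math.sqrt(q) * math.sqrt(g)
```

Comparing `abs(kloosterman_sum(u1, u2, q))` with the trivial bound of `q` shows the power saving that the argument needs; as noted in the text, any power saving would suffice at the cost of worse numerical exponents.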
From Lemma 3.8 and the Fourier inversion formula one can express the periodic function f (n 1 , n 2 )1 n 1 n 2 =a (q) as a linear combination of Fourier phases e q (u 1 n 1 + u 2 n 2 ) with good bounds on the Fourier coefficients. However, the contribution of those terms in which one of u 1 , u 2 is divisible by q (or by a very large factor of q) will be inconvenient to handle. We therefore perform the following substitute expansion: Lemma 3.9 (Modified Fourier expansion). Let q be a natural number, and let a be an integer. Let q 0 be a factor of q such that (a, q)|q 0 . Let f : Z 2 → C be a 1-bounded function with period q 0 . Define q 0 := (q 0 (a, q), q). Then we have f (n 1 , n 2 )1 n 1 n 2 =a (q) = αq 0 q f (n 1 , n 2 )1 n 1 n 2 =a (q 0 ) 1 (n 1 n 2 ,q)=(a,q) where α is the quantity α := p| q q 0 ;p q 0 (a,q) p p − 1 and the coefficients c u 1 ,u 2 obey the bounds |c u 1 ,u 2 | ≤ 2τ (q 0 ) 2 q 3/2 0 τ (q)q −3/2 (u 1 , u 2 , q) 1/2 . Proof. We may assume without loss of generality that 1 ≤ a ≤ q. Let A denote the collection of those 1 ≤ a ≤ q such that a = a (q 0 ) and (a , q) = (a, q). From Lemma 3.8 we see that the Fourier coefficient (3.31) E n 1 ,n 2 ∈Z f (n 1 , n 2 )(1 n 1 n 2 =a (q) − 1 n 1 n 2 =a (q) )e q (u 1 n 1 + u 2 n 2 ) for u 1 , u 2 ∈ Z/qZ is bounded in magnitude by 2τ (q 0 ) 2 q 3/2 0 τ (q)q −3/2 (u 1 , u 2 , q) 1/2 for any a ∈ A. We claim furthermore that this Fourier coefficient vanishes whenever one of u 1 , u 2 is divisible by q/q 0 . Indeed, suppose for instance that u 2 is divisible by q/q 0 , so that n 2 → f (n 1 n 2 )e q (u 1 n 1 + u 2 n 2 ) is q 0 -periodic for any n 1 . To obtain the vanishing of (3.31), it suffices to show that (3.32) n 2 ∈Z/qZ:n 2 =a 2 (q 0 ) 1 n 1 n 2 =a (q) = n 2 ∈Z/qZ:n 2 =a 2 (q 0 ) 1 n 1 n 2 =a (q) for any integers n 1 , a 2 . 
But since (a , q) = (a, q), we can write a = wa (q) for some primitive w ∈ Z/qZ; since a = a ((q 0 (a, q), q)) we have w = 1 ((q 0 , q/(a, q))); as we have the freedom to adjust w by an arbitrary multiple of q/(a, q) we may in fact assume that w = 1 (q 0 ). The claim (3.32) then follows after applying the change of variables n 2 → wn 2 on the right-hand side. We argue similarly if u 1 is divisible by q/q 0 instead of u 2 .
Averaging in a , we conclude that the Fourier coefficient E n 1 ,n 2 ∈Z f (n 1 , n 2 )(1 n 1 n 2 =a (q) − E a ∈A 1 n 1 n 2 =a (q) )e q (u 1 n 1 + u 2 n 2 ) is bounded in magnitude by 2τ (q 0 ) 2 q 3/2 0 τ (q)q −3/2 (u 1 , u 2 , q) 1/2 , and vanishes whenever u 1 or u 2 vanish in Z/qZ. To establish the claim, it now suffices by the Fourier inversion formula to obtain the identity for any integer n. By the Chinese remainder theorem, it suffices to establish this identity at each prime p, that is to say it suffices to show that whenever p is a prime, 0 ≤ j 0 ≤ j, and a is an integer with (a, p j )|p j 0 , where α p := p p−1 if j > j 0 and (a, p j ) = p j 0 , and α p = 1 otherwise. But this follows by a direct case analysis: • If j = j 0 , then the conditions (a , p j ) = (a, p j ) and (n, p j ) = (a, p j ) are redundant, α p = 1, a is restricted to a single residue class mod p j , and both sides are equal to 1 n=a (p j 0 ) . • If j < j 0 and (a, p j ) < p j 0 , then the conditions (a , p j ) = (a, p j ) and (n, p j ) = (a, p j ) are redundant, α p = 1, a is restricted to p j−j 0 residue classes mod p j , and both sides are equal to 1 p j−j 0 1 n=a (p j 0 ) . • If j < j 0 and (a, p j ) = p j 0 , then α p = p p−1 , a is restricted to p−1 p p j−j 0 residue classes mod p j , and both sides are equal to p p−1 1 p j−j 0 1 n=a (p j 0 ) .

First step: replacing the Liouville function with a Siegel model
We now execute step (i) of the strategy outlined in the introduction. From (2.6) we have the splitting $\lambda = \lambda^{(\leq R)} * \lambda^{(>R)}$. In view of Corollary 3.6, we expect $\lambda$ to resemble the exceptional character $\chi$ on the rough numbers $\mathbb{N}^{(>R)}$. It is therefore natural to introduce the Siegel approximant
$$\lambda_{\mathrm{Siegel}} := \lambda^{(\leq R)} * \chi^{(>R)}; \quad (4.1)$$
thus $\lambda_{\mathrm{Siegel}}$ is the completely multiplicative function that agrees with $\lambda$ for primes $p \leq R$ and agrees with $\chi$ for primes $p > R$. Similar approximants were also introduced in [6], [2]. Clearly $\lambda, \lambda_{\mathrm{Siegel}}$ are both bounded by 1:
$$|\lambda(n)|, |\lambda_{\mathrm{Siegel}}(n)| \leq 1. \quad (4.2)$$
The error between $\lambda$ and $\lambda_{\mathrm{Siegel}}$ can be controlled by exceptional primes and by rough numbers:

Proof. If $n$ is not divisible by any exceptional prime $p^* > R$, then we have $\lambda(n) = \lambda_{\mathrm{Siegel}}(n)$ since $\lambda, \lambda_{\mathrm{Siegel}}$ agree on every prime dividing $n$. Clearly (4.3) holds in this case. If $n$ is divisible by an exceptional prime $R < p^* \leq x/R$, then the first term on the right-hand side of (4.3) is at least one, and the claim (4.3) then follows from (4.2).
The only remaining case is if $n$ is divisible by an exceptional prime $p^* \geq x/R$, so $n = p^* d$ for some $d \leq 2R$. Since $n/d = p^* \geq x/R$ is prime, the second term on the right-hand side of (4.3) is at least one, and the claim (4.3) again follows from (4.2).
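The first case of this proof can be illustrated numerically. The sketch below builds the completely multiplicative function that agrees with $\lambda$ on primes $p \leq R$ and with a real character on primes $p > R$, and checks that it coincides with $\lambda$ on every $n$ with no exceptional prime factor above $R$. Two assumptions are made purely for illustration: the character is taken to be the Legendre symbol modulo 7 (a toy stand-in, not an actual exceptional character), and a prime $p$ is called exceptional when $\chi(p) \neq -1$. All function names are hypothetical.

```python
def factorize(n):
    """Return {prime: exponent} for n >= 1, by trial division."""
    out, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            out[p] = out.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        out[n] = out.get(n, 0) + 1
    return out


def chi(n, q=7):
    """Toy real character: the Legendre symbol modulo the odd prime q."""
    r = pow(n % q, (q - 1) // 2, q)  # Euler's criterion gives 0, 1 or q - 1
    return -1 if r == q - 1 else r


def liouville(n):
    """Liouville function: (-1)^(number of prime factors with multiplicity)."""
    return (-1) ** sum(factorize(n).values())


def liouville_siegel(n, R):
    """Completely multiplicative: equals lambda on primes <= R and chi on primes > R."""
    value = 1
    for p, e in factorize(n).items():
        value *= (-1 if p <= R else chi(p)) ** e
    return value


def has_large_exceptional_prime(n, R):
    """True if some prime p > R with chi(p) != -1 divides n."""
    return any(p > R and chi(p) != -1 for p in factorize(n))
```

With $R = 5$, the prime 11 is exceptional for this toy character (11 is a square modulo 7), so the two functions differ at $n = 11$, while they agree at every $n$ free of such primes.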
In this section we establish Proposition 4.2 (Replacing λ with a Siegel model). We have

From (4.2) and the triangle inequality, it suffices to show that
We begin with (4.4). For n ≤ x and 1 ≤ j ≤ k, the quantity Λ(n + h j ) is bounded by log(2x)1 (≥ √ 2x) (n + h j ) unless we are in the exceptional case where n + h j is of the form p i for some prime p < √ 2x (cf. the sieve of Eratosthenes). The contribution of such exceptional cases can easily be shown to be ≈ 0, so it suffices to show that (log k x) Let p * be as in the above sum. Changing variables, we have Let C 0 be a sufficiently large constant depending on h 1 , . . . , h k , h j . Then for any prime C 0 < p ≤ √ 2x other than p * , the support set of 1 and hence by Mertens' theorem (3.6) and the bound p * ≤ x/R The claim (4.4) now follows from Corollary 3.6 and (2.17). Now we prove (4.5). Arguing as in the proof of (4.4), it suffices to show that For d ≤ 2R, we have after change of variables that With C 0 as before, we see that for any prime C 0 ≤ p < √ 2x not dividing d we are excluding k + 1 residue classes modulo p (since h j is distinct from h 1 , . . . , h k ), hence by Lemma 3.2 By (2.5) we may therefore bound the left-hand side of (4.6) by By (3.6) this latter quantity is O(log x R), and the claim follows from (2.13).

Second step: replacing the von Mangoldt function with a Siegel model
We now execute step (ii) of the strategy outlined in the introduction. In order to (mostly) restrict to rough numbers, we will insert the Selberg sieve $\nu$ defined in (2.23). Namely, observe that $\Lambda - \Lambda\nu$ is supported on prime powers $p^j$ with $p \leq R$ and can be crudely bounded by $O(\log^2 x)$ on such powers. Since the number of such powers of size $O(x)$ is crudely bounded by $O(R \log x)$, one easily sees from the triangle inequality that
$$\mathbb{E}_{n \leq x} \Lambda(n+h_1) \cdots \Lambda(n+h_k) \lambda_{\mathrm{Siegel}}(n+h'_1) \cdots \lambda_{\mathrm{Siegel}}(n+h'_\ell) \approx \mathbb{E}_{n \leq x} \Lambda\nu(n+h_1) \cdots \Lambda\nu(n+h_k) \lambda_{\mathrm{Siegel}}(n+h'_1) \cdots \lambda_{\mathrm{Siegel}}(n+h'_\ell) \quad (5.1)$$
(with plenty of room to spare in the error term). Next, we expand $\Lambda\nu = (\mu * \log)\nu$.
Since $\mu$ is expected to be close to $\chi$ on rough numbers, and the Selberg sieve $\nu$ is mostly restricted to such numbers, it is then natural to introduce the Siegel approximant $\Lambda_{\mathrm{Siegel}} := (\chi * \log)\nu$.
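The crude count of prime powers used in the derivation of (5.1) can be made explicit: each prime $p \leq R$ has at most $\log X / \log 2$ powers up to $X$, so there are at most $R \log X / \log 2$ prime powers $p^j \leq X$ with $p \leq R$. The following sketch, with hypothetical function names, counts these prime powers directly so the bound can be compared against small cases.

```python
from math import log


def primes_up_to(R):
    """Sieve of Eratosthenes: all primes p <= R."""
    if R < 2:
        return []
    is_prime = [True] * (R + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, int(R ** 0.5) + 1):
        if is_prime[p]:
            for m in range(p * p, R + 1, p):
                is_prime[m] = False
    return [p for p in range(2, R + 1) if is_prime[p]]


def count_smooth_prime_powers(X, R):
    """Number of prime powers p^j <= X (j >= 1) with p <= R."""
    count = 0
    for p in primes_up_to(R):
        power = p
        while power <= X:
            count += 1
            power *= p
    return count


def crude_bound(X, R):
    """The bound R * log(X) / log(2) described in the lead-in."""
    return R * log(X) / log(2)
```

For example, with $X = 100$ and $R = 3$ the prime powers are $2, 4, \dots, 64$ and $3, 9, 27, 81$, ten in total, against a crude bound of about 19.9.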
From the triangle inequality we have the crude bounds We also have the following bound for the error between Λν and Λ Siegel : For n ≤ 2x, we have the bounds Proof. If n is divisible by an exceptional R 0 < p * ≤ √ 2x, then E τ (n)ν(n) log x, and (5.3) then follows from (5.2) and (5.4). Similarly if n is divisible by the square of a prime p > R 0 (which must then necessarily be at most √ 2x). Next, suppose that n is not divisible by any exceptional prime p * > R 0 , nor by any square p 2 of a prime p > R 0 . We write χ * log = (1 * χ) * µ * log = (1 * χ) * Λ.
Note that 1 * χ(d) is only non-zero when d is the product of exceptional primes times a perfect square, so if d|n and n is as above then d must be the product of some primes less than or equal to R 0 . Also d|n Λ(d) = log n. Thus, for n as above, we have where we recall that n (≤R 0 ) is the largest factor of n that is the product of primes less than or equal to R 0 . Applying (3.10) (with n replaced by n (≤R 0 ) ), we have (1) and the claim (5.3) now follows in this case from (5.5).
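The convolution identity $\chi * \log = (1 * \chi) * \Lambda$ and the support property of $1 * \chi$ used in this proof can be checked numerically. The sketch below does so for a toy real character, taken here to be the Legendre symbol modulo 7 as an illustrative assumption (it is not an exceptional character). It also checks that $1 * \chi(d) \neq 0$ forces every prime $p | d$ with $\chi(p) = -1$ to occur to an even power, which is the sense in which $d$ is a product of exceptional primes times a perfect square. Function names are hypothetical.

```python
from math import log, isclose


def factorize(n):
    """Return {prime: exponent} for n >= 1, by trial division."""
    out, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            out[p] = out.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        out[n] = out.get(n, 0) + 1
    return out


def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]


def chi(n, q=7):
    """Toy real character: the Legendre symbol modulo the odd prime q."""
    r = pow(n % q, (q - 1) // 2, q)
    return -1 if r == q - 1 else r


def von_mangoldt(n):
    """Lambda(n) = log p if n is a power of the prime p, and 0 otherwise."""
    f = factorize(n)
    return log(next(iter(f))) if len(f) == 1 else 0.0


def one_star_chi(d):
    """(1 * chi)(d) = sum of chi(e) over divisors e of d."""
    return sum(chi(e) for e in divisors(d))


def chi_star_log(n):
    """(chi * log)(n) = sum over d | n of chi(d) * log(n / d)."""
    return sum(chi(d) * log(n // d) for d in divisors(n))


def one_star_chi_star_mangoldt(n):
    """((1 * chi) * Lambda)(n)."""
    return sum(one_star_chi(d) * von_mangoldt(n // d) for d in divisors(n))


def nonresidue_part_is_square(d):
    """True if every prime p | d with chi(p) = -1 divides d to an even power."""
    return all(e % 2 == 0 for p, e in factorize(d).items() if chi(p) == -1)
```

The identity follows from $\log = 1 * \Lambda$ and the commutativity of Dirichlet convolution; the floating-point comparison uses a small absolute tolerance.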

Now we can prove
Proposition 5.2 (Replacing Λ with a Siegel model). We have In view of (5.1) it suffices to show that

By (4.2) and the triangle inequality it suffices to show that
for j = 1, . . . , k. Multiplying these estimates together, we conclude that Λ Siegel (n + h 1 ) · · · Λ Siegel (n + h k ) = Λν(n + h 1 ) · · · Λν(n + h k ) By the triangle inequality and relabeling, it thus suffices to establish the bounds We begin with (5.9), which is a variant of (4.5). We can bound (Λν + G)(n + h j ) by O(log(2x)1 (≥R 1/4 ) (n + h j )), and we also have the bound unless n + h j is of the form p m for some p < R 1/4 and m ≥ 1, or p p m for some p < R 1/4 , m ≥ 2, and √ 2x ≤ p ≤ 2x/R 1/2 . There are only O(x log x/R 1/4 ) such exceptional values of n and their contribution is easily seen to be negligible using (3.1). Thus it will suffice to show that Making the change of variables n = p * n − h 1 and using Lemma 3.2 and Mertens' theorem (3.6), we see that The claim (5.10) now follows from Corollary 3.6 and (2.8). Now we turn to (5.7). Observe using (3.10) that and so we can bound the left-hand side of (5.7) by Using Euler products (2.5) we can bound If p = p 0 , then d (p) = 1, and we can calculate

From (3.4) we then have
Also, we have the crude bound Putting all these estimates together, and choosing A large enough, we conclude that Inserting this into (5.11) and using Corollary 3.6, (2.17), (2.16), we obtain the claim (5.7). Finally, we establish (5.8). Observe from (5.5), (5.6) that and so it suffices to show that Applying Lemma 3.4 and (3.1), we may estimate the left-hand side as for any A > 0. The second term is ≈ 0 by (2.16). Replacing the condition d 1 > 1 by (d 1 , . . . , d k ) = (1, . . . , 1), removing the constraints d 1 , . . . , d k ≤ D, and factoring the Euler product using (2.5), the first term can be bounded by Direct calculation givesẼ We can thus bound (5.12) for A large enough by (log k R x) log R R 0 which is ≈ 0 by (2.14). This concludes the proof of (5.9) and hence of Proposition 5.2.

Third step: replacing the Liouville Siegel model with a Type I approximant
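We now execute step (iii) of the strategy outlined in the introduction. From (4.1), (2.7), (2.6) and Möbius inversion we have
$$\lambda_{\mathrm{Siegel}} = (\lambda * \mu\chi)^{(\leq R)} * \chi.$$
We now split
$$\lambda_{\mathrm{Siegel}} = \lambda'_{\mathrm{Siegel}} + \lambda''_{\mathrm{Siegel}}$$
where $\lambda'_{\mathrm{Siegel}}$ is the Type I approximant
$$\lambda'_{\mathrm{Siegel}} := (\lambda * \mu\chi)^{(\leq R)} \psi_{\leq D} * \chi \quad (6.2)$$
and $\lambda''_{\mathrm{Siegel}}$ is the error
$$\lambda''_{\mathrm{Siegel}} := (\lambda * \mu\chi)^{(\leq R)} \psi_{>D} * \chi.$$
Here $\psi_{\leq D}, \psi_{>D}$ are the smooth cutoffs defined in (2.21), (2.22). In particular we see that $\lambda_{\mathrm{Siegel}}(n) = \lambda'_{\mathrm{Siegel}}(n)$ whenever $n^{(\leq R)} \leq \sqrt{D}$. Since $\sqrt{D}$ is significantly larger than $R$, and $R$-smooth numbers become extremely sparse at scales much larger than $R$, we thus see that $\lambda_{\mathrm{Siegel}}, \lambda'_{\mathrm{Siegel}}$ agree with each other for "typical" $n$, and would thus be heuristically expected to be close to each other; in other words, $\lambda''_{\mathrm{Siegel}}$ would be expected to be small on average.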
Unfortunately, $\lambda'_{\mathrm{Siegel}}, \lambda''_{\mathrm{Siegel}}$ are not bounded. However, we can still obtain a reasonable bound on the latter quantity: and $\alpha(d)$ are non-negative quantities obeying the bounds for any $A \geq 1$.
for any A ≥ 1. We shall need (6.7) later, but we stated Lemma 6.1 in a stronger form to emphasize that it does not use any information on exceptional characters.
To control $\beta$ we perform a Fourier expansion on $\psi_{>D}$, which is the only term on the right-hand side of (6.9) which is not multiplicative. Applying Fourier inversion (3.13) to the function $g(u) := e^{-u}(1 - \psi((\log_D R) u))$ and setting $u := \log_R n$, we conclude the identity From the triangle inequality we have for any $A > 0$, while from repeated integration by parts we have for any positive integer $A$. Combining the two bounds, we conclude that From (6.9), (6.10) we have From (6.8) and the triangle inequality we then have The function $|\beta_t|$ is multiplicative and supported on $\mathbb{N}^{(\leq R)}$, thus From (6.12) we see that $|\beta_t(p^j)| = 1$ when $p \leq R$ and $\chi(p) = -1$ (because $\lambda * \mu\chi$ agrees with $\lambda * \mu\lambda = 1_{\{1\}}$ on $\mathbb{N}^{(p)}$), and when $p \leq R$ and $\chi(p) = 0$. For $\chi(p) = +1$ the situation is more complicated: direct calculation gives where $P_j$ is the polynomial $P_j(z) := 1 - 2z + 2z^2 - \cdots + (-1)^j 2z^j$.
Note that $|P_j(1)| \leq 1$ and $P'_j(z) \ll j^{O(1)}(1 + |z|^{j-1})$ for any $z$, hence by the fundamental theorem of calculus $|P_j(z)| \leq 1 + O(|z-1| j^{O(1)})$ whenever $|z| \leq 1 + \frac{1}{j}$. Also from the triangle inequality we have $|P_j(z)| \ll j|z|^j$ for $|z| \geq 1$. We thus have $|P_j(z)| \leq \min(1 + j^{O(1)}|z-1|, \exp(O(j \log|z| + 1 + \log j)))$ for $|z| \geq 1$. Thus regardless of the value of $\chi(p)$, we have the upper bound for $p \leq R$, where $a_{t,p^j}$ is the quantity $a_{t,p^j} := \min(j^C (1+|t|) \log_R p,\ j \log_R p + 1 + \log j)$ for some large constant $C \geq 1$ and for all $j \geq 1$, with the convention $a_{t,1} = 0$. We conclude that To convert the right-hand side into Type I sums we apply Lemma 3.1(i) to split $n = n^{(>R)} n_1 \cdots n_m$ where $m = O(1)$ and $n_1, \dots, n_m \leq D$ lie in $\mathbb{N}^{(\leq R)}$. We then have for all $p \leq R$, and hence Using the definition of $a_{t,p^j}$ and the inequality $(j_1 + j_2)^C \ll_C j_1^C + j_2^C$, we conclude that for some $i = 1, \dots, m$. In particular, we see that We therefore obtain the bound (6.4) with It remains to establish the bound (6.6). We use Fubini's theorem and Euler product expansion (2.5) to bound For $j \geq 2$ we use the crude bound for any $p \leq R$. For $j = 1$ we have $\exp(O(a_{t,p^j})) \leq 1 + O(\min((1+|t|) \log_R p, 1))$, and thus using $1 + x \leq e^x$ we obtain Using (6.11) we obtain (6.6) as required.
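The two elementary facts about the polynomial $P_j(z) = 1 - 2z + 2z^2 - \cdots + (-1)^j 2z^j$ used in this proof, namely $|P_j(1)| \leq 1$ and the triangle-inequality bound $|P_j(z)| \ll j|z|^j$ for $|z| \geq 1$, can be checked directly. The sketch below uses the explicit constant $2j + 1$ for the second bound, which is a choice made here and not a constant taken from the source. It also records the value of $\lambda * \mu\chi$ on prime powers, $(\lambda * \mu\chi)(p^i) = (-1)^i (1 + \chi(p))$ for $i \geq 1$; for $\chi(p) = +1$ these are the coefficients of $P_j$, an observation of this sketch rather than a restatement of (6.12). Function names are hypothetical.

```python
import cmath


def P(j, z):
    """P_j(z) = 1 - 2z + 2z^2 - ... + (-1)^j 2 z^j."""
    return 1 + sum(2 * (-z) ** i for i in range(1, j + 1))


def lambda_star_mu_chi(i, chi_p):
    """(lambda * mu chi)(p^i) for a prime p with chi(p) = chi_p in {-1, 0, 1}."""
    if i == 0:
        return 1
    # lambda(p^i) + lambda(p^(i-1)) * mu(p) * chi(p) = (-1)^i * (1 + chi(p))
    return (-1) ** i * (1 + chi_p)


def triangle_bound_holds(j, z, tol=1e-9):
    """Check |P_j(z)| <= (2j + 1) |z|^j, valid for |z| >= 1 by the triangle inequality."""
    return abs(P(j, z)) <= (2 * j + 1) * abs(z) ** j + tol


def sample_points(radius, count=16):
    """Equally spaced points on the circle |z| = radius."""
    return [radius * cmath.exp(2j * cmath.pi * k / count) for k in range(count)]
```

At $z = 1$ the alternating terms cancel in pairs, so $P_j(1)$ equals $-1$ for odd $j$ and $+1$ for even $j$.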
From Lemma 6.1 we have for j = 1, . . . , . Multiplying these estimates using (4.2) and the triangle inequality, and relabeling, we reduce to showing that E n≤x |Λ Siegel (n + h 1 ) · · · Λ Siegel (n + h k )|H(n + h 1 ) · · · H(n + h ) ≈ 0 for any 1 ≤ ≤ . By (5.2) it suffices to show that Expanding out (6.5), the left-hand side is using (3.10), one can bound this further by where we use the notations k := k + , h k+j := h j , and d k+j := d j for j = 1, . . . , . Applying Lemma 3.4, we can bound this by for any A > 0. The contribution of the latter term D k R 2k x is ≈ 0 thanks to (2.16). By (6.6), the former term can be bounded by for any A > 0, which by (2.5) can be bounded by We can bound E p (σ) ≤ 1 + O min(σ log R p, 1) p so by (3.4), (2.15) and setting A large enough we conclude (6.13). This completes the proof of Proposition 6.3.

Fourth step: replacing the von Mangoldt Siegel model with a Type I approximant
We now execute step (iv) of the strategy outlined in the introduction. In this step we will achieve power savings in many of our error terms, and as a consequence we can often afford to lose factors such as $x^{O(\varepsilon)}$, in contrast to other sections where even a loss of $\log x$ is often unacceptable.
It is convenient to perform a smooth dyadic decomposition of the convolution $\chi * \log$ in order to run a smoothed version of the Dirichlet hyperbola method. Let $\varphi : \mathbb{R} \to \mathbb{R}$ be a smooth even function supported on $[-1, 1]$ of total mass one. For any $t > 0$, define the function which is a smooth cutoff to the interval $[t/e, et]$. Then for any natural number $n$, one has the identity where we made the change of variables $u := \log n - \log t$. We conclude that As it turns out, the Dirichlet convolution $\chi * \Phi_t$ is of an adequate "Type I" form when $t \leq Dq_\chi^2$ or $x/t \leq (Dq_\chi^2)^2$. Accordingly, we split and $(\chi * \log)''$ is the error Thus, $(\chi * \log)'$ is the modification of $\chi * \log$ formed by replacing the cutoff $\Phi_t$ with Φ Dq 2 χ ) 2 of t (using a smoothed version of the upper cutoff t ≤ x (Dq 2 χ ) 2 in order to facilitate some technical computations in the next section). As it turns out, it will be the second term in the right-hand side of (7.3) (in which the $\Phi_t$ term is supported in values $\gg x/(Dq_\chi^2)^2$, so that the $\chi$ term is supported in values $\ll (Dq_\chi^2)^2$) that will give the main contributions, being a more complicated version of the (untwisted) Type I sum $(\chi\psi_{\leq (Dq_\chi^2)^2}) * \log$. We then have a similar splitting $\Lambda'_{\mathrm{Siegel}} := (\chi * \log)' \nu$ and $\Lambda''_{\mathrm{Siegel}} := (\chi * \log)'' \nu$.
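The smooth dyadic decomposition rests on a partition-of-unity identity in the multiplicative variable $t$. Assuming the cutoff has the form $\Phi_t(n) = \varphi(\log n - \log t)$ (a reading inferred from the support $[t/e, et]$ and the change of variables $u := \log n - \log t$, not a formula quoted from the source), the substitution gives $\int_0^\infty \Phi_t(n) \frac{dt}{t} = \int_{\mathbb{R}} \varphi(u)\, du = 1$ for every natural number $n$. The sketch below checks this numerically for one concrete choice of $\varphi$, a normalized bump function; the choice of $\varphi$, the quadrature rule and all names are assumptions made for illustration.

```python
from math import exp, log, e


def bump(u):
    """Smooth even function supported on [-1, 1] (not yet normalized)."""
    return exp(-1.0 / (1.0 - u * u)) if abs(u) < 1 else 0.0


def midpoint(f, lo, hi, steps=20000):
    """Composite midpoint rule for the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (k + 0.5) * h) for k in range(steps))


MASS = midpoint(bump, -1.0, 1.0)  # normalizing constant so that phi has total mass one


def phi(u):
    return bump(u) / MASS


def Phi(t, n):
    """Assumed form of the cutoff: Phi_t(n) = phi(log n - log t), supported on t/e <= n <= e*t."""
    return phi(log(n) - log(t))


def dyadic_integral(n):
    """Integral of Phi_t(n) dt/t over t > 0; the integrand vanishes outside [n/e, n*e]."""
    return midpoint(lambda t: Phi(t, n) / t, n / e, n * e)
```

The integral is computed in the original variable $t$ rather than in $u$, so the agreement with 1 is a genuine check of the change of variables and not a tautology of the quadrature.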
We have good bounds on the distribution of (χ * log) or Λ Siegel in residue classes a (q) with q almost as large as x 2/3 , as long as (a, q) is not too large: Proposition 7.1 (2/3 level of distribution). Let 0 < ε < 1 2 , 1 ≤ q ≤ x, and a be an integer. Let I be a subinterval of [0, 2x]. Let f : Z → [−1, 1] be a q χ -periodic function.
(i) We have (ii) If $\varepsilon$ is sufficiently small depending on $k, \ell, \varepsilon_0$, then we have The powers of $(a,q)$ and $q_\chi$ are of minor importance and these terms can be neglected on a first reading. The key point here is that we can obtain a power saving over the trivial bound of $O_\varepsilon(x^{1+\varepsilon}/q)$ even when $q$ is somewhat above $x^{1/2}$ (indeed, the above bounds can remain non-trivial as $q$ approaches $x^{2/3}$).
Proof. We first prove (i). Note that 0 ≤ (χ * log) (n) ≤ χ * log(n). From (3.1) we may bound the left-hand side of the claim by O ε (x O(ε) (1 + x/q)). From this we see that we may assume without loss of generality that we may take ε is sufficiently small depending on k , and we may also assume that q ≤ x 2/3 , since otherwise the above crude bound is already dominated by x O(ε) q 1/2 and hence by x q (a,q) 3/2 q 9/2 . By shrinking I slightly (and using (3.1) to treat the error) we may assume that The integrand in (7.4) is only non-zero in the range Dq 2 By the fundamental theorem of calculus one has dt t whereΦ t (n) := φ log n t so by the triangle inequality (and increasing ε slightly) it will suffice to show that We can approximate 1 I by a cutoff ψ I : R → R supported on I obeying ψ I (y) = 1 whenever dist(y, I) ≥ x 1−2ε , and additionally obeying the derivative estimates )j for all j ≥ 0 and y ∈ R, with the error being acceptable by (3.1). It thus remains to establish the bound The left-hand side can be rewritten as n 1 ,n 2 χ(n 1 )Φ t (n 2 )ψ I (n 1 n 2 )1 n 1 n 2 =a (q) f n 1 n 2 − a q .
We first estimate the quantity Y . From repeated summation by parts we have n 1 ,n 2Φ t (n 2 )ψ I (n 1 n 2 )e q (u 1 n 1 + u 2 n 2 ) ε x −1+O(ε) Writing u 1 = du 1 , u 2 = du 2 with d = (u 1 , u 2 , qq χ ) 1/2 , we then have Thus we see that the contribution of Y is acceptable. Now we consider the contribution of X. From Möbius inversion we have for some d with (a , qq χ )|d|qq χ . On the one hand, we see from (3.1) (noting that the constraints n 1 n 2 = a (q 0 ), d|n 1 n 2 constrain n 1 n 2 to at most one residue class modulo [d, q 0 ]) that On the other hand, we can write where F is the [d, q 0 ]-periodic function F (n 1 , n 2 ) := χ(n 1 )1 n 1 n 2 =a (q 0 ) 1 d|n 1 n 2 .
Finally, suppose that $t < \sqrt{x}$. Now we make the change of variables $d_2^* := \frac{n+h_2}{d_2}$ and rewrite the bound as Observe that the summand vanishes unless $d_2^* \ll \sqrt{x}$. Now we can repeat the previous arguments (using $d_2^*$ in place of $d_2$, and the $q_\chi$-periodic function $\chi$ in place of $\Phi_t$, noting that (2.20) can handle several additional losses of $q_\chi$) to conclude.

Fifth step: computing the Type I correlations
We now execute step (v) of the strategy outlined in the introduction by establishing Proposition 8.1 (Evaluating the Type I correlation). We have where S is the quantity in Conjecture 1.3.
We first dispose of the easy case > 0, in which S vanishes. For 1 ≤ j ≤ k, we see from (7.3) and replacing d by n/d in the first and third factors, and truncating the very small or very large values of t (where the summand vanishes) that (χ * log) (n) = In all of these terms, the summands vanish unless d (Dq 2 χ ) 2 . One can then write where Ψ : R + → R is the smooth function and c d is the coefficient For the current analysis we will need the crude bound where we use the total variation norm Combining this with the expansion (7.8), we see that where Ψ d : R + → R is a smooth function and g d,d is a coefficient obeying the bounds Using the decomposition (7.13) to expand λ Siegel (n + h j ), we can thus write Λ Siegel (n + h 1 ) · · · Λ Siegel (n + h k )λ Siegel (n + h 1 ) · · · λ Siegel (n + h ) as which on evaluating the d j sums, and then writing d := d 1 · · · d k , can be bounded by From (2.16) we see that and then by (2.5) we can bound (8.6) by One can calculate when p q χ and otherwise, thus by (3.7) the preceding expression (8.7) is if ε is small enough. Applying (2.18) we conclude that This concludes the treatment of the > 0 case. Now suppose that = 0. The above arguments allow us to dispose of the g d,d contributions in (8.4), leaving us with the task of showing that This is a correlation of Goldston-Yıldırım type and can be calculated by a lengthy but straightforward calculation, basically a more careful variant of Lemma 3.4. We follow the Fourier-analytic method laid out in [9, Appendix D], as follows. Using Lemma 3.3, (8.5), and summation by parts, we can write the left-hand side here as Using (3.1), the contribution of the error term is at most ε (RDq 2 χ ) 2k x ε x for any ε > 0, which is ≈ 0 for ε small enough thanks to (2.16). 
Thus it remains to show that The contribution of those y with y ≤ x 1−ε 2 0 is bounded by This function is not multiplicative in d, but it can be Fourier expanded as a linear combination of multiplicative functions: which on making the change of variables v := u − s factors as From the triangle inequality one has R e (1+it)v φ log y + h j x − v log x dv 1 log x while from integration by parts (and (2.9)) one has R e (1+it)s ψ log x 2 log(Dq 2 χ ) s (1 − s) ds m (1 + |t|) −m for any m ≥ 0, thus yielding (8.13). Also, from (3.16), and integration by parts one has where we have used the observation that ψ( log(x/t) 2 log(Dq 2 χ ) ) equals to 1 on the support of φ (log y+h j t ) (since one then has x/t x/y x ε 2 0 ). This gives (8.15).
We now perform an even more precise analysis of the Euler factors $E_{p,t_{0,1},\dots,t_{2,k}}$. Let us first suppose that $p$ is larger than $C_0$ for some sufficiently large $C_0$ (depending on $h_1, \dots, h_k, k$). Then $p$ does not divide $\prod_{1 \leq i < j \leq k} (h_i - h_j)$. Thus in order for the sum in (8.17) to be non-zero, at most one of the $d_j$ can be greater than 1, and hence
$$E_{p,t_{0,1},\dots,t_{2,k}} = 1 + \sum_{j=1}^{k} \sum_{l=1}^{\infty} \frac{c_{p^l,t_{0,j},t_{1,j},t_{2,j}}}{p^l}.$$