The polynomials $X^2+(Y^2+1)^2$ and $X^2+(Y^3+Z^3)^2$ also capture their primes

We show that there are infinitely many primes of the form $X^2+(Y^2+1)^2$ and $X^2+(Y^3+Z^3)^2$. This extends the work of Friedlander and Iwaniec showing that there are infinitely many primes of the form $X^2+Y^4$. More precisely, Friedlander and Iwaniec obtained an asymptotic formula for the number of primes of this form. For the sequences $X^2+(Y^2+1)^2$ and $X^2+(Y^3+Z^3)^2$, we establish Type II information that is too narrow for an asymptotic formula, but we can use Harman's sieve method to produce a lower bound of the correct order of magnitude for primes of the form $X^2+(Y^2+1)^2$ and $X^2+(Y^3+Z^3)^2$. Estimating the Type II sums is reduced to a counting problem that is solved by using the Weil bound, where the arithmetic input is quite different from the work of Friedlander and Iwaniec for $X^2+Y^4$. We also show that there are infinitely many primes $p=X^2+Y^2$ where $Y$ is represented by an incomplete norm form of degree $k$ with $k-1$ variables. For this, we require a Deligne-type bound for correlations of hyper-Kloosterman sums.


INTRODUCTION
Friedlander and Iwaniec [6] famously showed that there are infinitely many primes of the form $a^2+b^4$. This result is striking as integers of this form are very sparse: the number of integers of the form $a^2+b^4$ less than $X$ is of order $X^{3/4}$. Prior to this, Fouvry and Iwaniec [3] had shown that there are infinitely many primes of the form $a^2+b^2$ with $b$ also a prime. The result of [6] was extended to primes of the form $a^2+b^4$ with $b$ also a prime by Heath-Brown and Li [11]. Pratt [18] has shown that there are infinitely many primes of the form $a^2+b^2$ with $b$ missing any three fixed digits in its decimal expansion, which is also a very sparse sequence.
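The sparsity claim above can be checked numerically. The following is a minimal sketch, not taken from the paper: the function name `count_representations` is hypothetical, and it counts pairs $(a,b)$ with $a,b\geq 1$ (an assumption of this sketch) rather than distinct integers, which has the same order of magnitude.

```python
import math

def count_representations(X):
    """Count pairs (a, b) with a, b >= 1 and a^2 + b^4 <= X.

    Hypothetical helper for illustration only.
    """
    total = 0
    b = 1
    while b ** 4 < X:
        # Number of a >= 1 with a^2 <= X - b^4.
        total += math.isqrt(X - b ** 4)
        b += 1
    return total

# The ratio count / X^(3/4) should stay bounded above and below as X grows.
for X in (10 ** 4, 10 ** 6, 10 ** 8):
    c = count_representations(X)
    print(X, c, c / X ** 0.75)
```

The ratio in the last column stabilizes, consistent with a count of order $X^{3/4}$.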
Another key result concerning primes in sparse polynomial sets is the result of Heath-Brown of infinitely many primes of the form $a^3+2b^3$ [9], which has been generalized to binary cubic forms by Heath-Brown and Moroz [12] and to general incomplete norm forms by Maynard [16]. Recently, Li [15] showed that there are infinitely many primes of the form $a^3+2b^3$ with $b$ relatively small.
The work of Friedlander and Iwaniec for primes of the form $a^2+b^4$ relies on the factorization over Gaussian integers $a^2+b^4=(b^2+ia)(b^2-ia)$ and the great regularity in the distribution of squares $b^2$. In particular, the argument fails to capture primes of the form $a^2+f(b)^2$ for nonhomogeneous quadratic polynomials $f(b)$. Our first main result resolves this, with the caveat that instead of an asymptotic formula, we get lower and upper bounds of the correct order of magnitude for primes of this shape.
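The Gaussian factorization used above is elementary and can be verified with exact integer arithmetic. In this small sketch, the pair representation `(real, imag)` and the helper names are assumptions made for illustration.

```python
def gauss_mul(z, w):
    """Multiply two Gaussian integers given as (real, imag) integer pairs."""
    (x1, y1), (x2, y2) = z, w
    return (x1 * x2 - y1 * y2, x1 * y2 + y1 * x2)

def check_factorization(a, b):
    """Check a^2 + b^4 == (b^2 + i a)(b^2 - i a) in Z[i]."""
    return gauss_mul((b * b, a), (b * b, -a)) == (a * a + b ** 4, 0)

# Exhaustive check on a small box of integers.
assert all(check_factorization(a, b)
           for a in range(-20, 21) for b in range(-20, 21))
```

The same computation with `b * b` replaced by `b * b + 1` gives the factorization of $a^2+(b^2+1)^2$, but then the real part of each factor is no longer a perfect square, which is the regularity the original argument exploits.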

Theorem 1.
There are infinitely many primes of the form $a^2+(b^2+1)^2$. More precisely, we have
$$\sum_{p\leq X}\ \sum_{p=a^2+(b^2+1)^2} 1 \asymp \frac{X^{3/4}}{\log X}.$$

The methods developed in this paper are quite flexible and work for Gaussian primes with one coordinate given by a polynomial sequence that is not too sparse compared to the degree. We are also able to show the following.

Theorem 2. There are infinitely many primes of the form $a^2+(b^3+c^3)^2$ with $b,c>0$. More precisely, we have
$$\sum_{p\leq X}\ \sum_{\substack{p=a^2+(b^3+c^3)^2\\ b,c>0}} 1 \asymp \frac{X^{5/6}}{\log X}.$$
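To get a feeling for the first of these sequences, one can list small primes of the form $a^2+(b^2+1)^2$ by brute force. This is only an illustrative sketch: the function names are hypothetical, the restriction to $a,b\geq 1$ is an assumption of the sketch, and trial division is used purely for simplicity.

```python
import math

def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    for d in range(3, math.isqrt(n) + 1, 2):
        if n % d == 0:
            return False
    return True

def primes_of_form(X):
    """Sorted distinct primes p <= X with p = a^2 + (b^2+1)^2, a, b >= 1."""
    found = set()
    b = 1
    while (b * b + 1) ** 2 + 1 <= X:
        c = b * b + 1
        for a in range(1, math.isqrt(X - c * c) + 1):
            n = a * a + c * c
            if is_prime(n):
                found.add(n)
        b += 1
    return sorted(found)

X = 10 ** 6
print(len(primes_of_form(X)), X ** 0.75 / math.log(X))
```

The printed count can be compared with $X^{3/4}/\log X$. Note that this counts distinct primes rather than representations, so the comparison is only indicative of the order of magnitude.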
Remark 1. The proof of Theorem 1 is easily generalized to primes of the form $a^2+f(b)^2$, where $f(b)=b^2+D$ for any integer $D\neq 0$, with $D=0$ corresponding to [6]. The proof of Theorem 2 as given below generalizes (with minor technical nuisances) to primes of the form $a^2+(f_1(b)+f_2(c))^2$, where $f_1$ and $f_2$ are polynomials of degree at most 3. It is also possible to handle primes of the form $a^2+g(b,c)^2$ for a cubic polynomial $g(b,c)$ that could involve mixed terms, but this would require additional work (replacing the Weil bound by the Deligne bound for two-dimensional sums to get an analog of Lemma 21).
The assumption that $K/\mathbb{Q}$ is Galois could be removed with more work. With a lot more precise numerical work, it should be possible to give a non-trivial lower bound for all $k\geq 3$. The result we prove actually approaches an asymptotic formula as $k$ becomes large. The only reason for restricting to a norm form is that we need the fact that the number of representations $n=N(b_1,\dots,b_{k-1})$ with $b_j\ll n^{1/k}$ is bounded by $\ll n^{\varepsilon}$, which is not known for general forms. For the diagonal form $b_1^k+\cdots+b_{k-1}^k$, we can obtain a result conditional on such a hypothesis. Working with a norm form also simplifies the bounding of the relevant exponential sums of Deligne type.

Sketch of the arguments
Let us focus on the case $a^2+(b^2+1)^2$ in this sketch, as the proofs of the other results are similar. Using a sieve argument, counting primes is reduced to estimating Type I and Type II sums with bounded coefficients, where one of the coefficient sequences in the Type II sums satisfies the Siegel-Walfisz property (4.1). We can evaluate the Type I sums in the range $D\leq X^{3/4-\eta}$ by essentially the same argument as in [6, Section 3], applying the large sieve inequality for roots of quadratic congruences.
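The Type I argument is driven by the distribution of roots of a quadratic congruence modulo $d$. As a hedged illustration, the sketch below takes $\nu^2+1\equiv 0 \pmod d$ as the model congruence (an assumption of this sketch, chosen because it is the congruence attached to sums of two squares) and lists its roots by brute force. The function name `roots_mod` is hypothetical.

```python
def roots_mod(d):
    """All residues v modulo d with v^2 + 1 == 0 (mod d), by brute force."""
    return [v for v in range(d) if (v * v + 1) % d == 0]

# The number of roots is multiplicative over coprime moduli
# (Chinese remainder theorem): 65 = 5 * 13.
print(roots_mod(5), roots_mod(13), roots_mod(65))

# The normalized roots v/d are the points whose equidistribution
# a large sieve inequality for roots quantifies on average over d.
fractions = [(v / d) for d in range(2, 50) for v in roots_mod(d)]
print(len(fractions))
```

A large sieve inequality for roots of this kind expresses that the points $\nu/d$ are well spaced on average over $d$, which is what permits a long Type I range.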
We will be able to handle the Type II sums in a range of $N$ depending on an arbitrarily small parameter $\eta>0$. This is in contrast to [6], where the range is essentially $X^{1/4+\eta}\ll N\ll X^{1/2-\eta}$. Our Type II information is not sufficient for an asymptotic formula, but using Harman's sieve method [7], we are nonetheless able to get a lower bound of the correct order of magnitude. In contrast to the work of Friedlander and Iwaniec [5], our sieve argument is much more similar to the one in Heath-Brown's work [9].
In the situation of Theorem 2, we are able to handle Type I sums for $D\leq X^{5/6-\eta}$ and Type II sums in the range $X^{1/6+\eta}\ll N\ll X^{2/9-\eta}$.
For the Type II sums, we first use a device of Heath-Brown to reduce to the special case where ( ) satisfies the Siegel-Walfisz property with main term 0 (see (4.3)) -this replaces the sieve argument of Friedlander and Iwaniec [5] in our work. It then suffices to show that for satisfying The initial part of the argument is the same as in [6,Sections 4 and 5]. Assuming for simplicity that ( , ) = 1, we may write = | | 2 , = | | 2 , = 2 + 1 + for Gaussian integers , ∈ ℤ[ ], and denote = (| | 2 ) and = (| | 2 ). Then our Type II sums are essentially ∑ 1.
For simplicity, assume that Δ is square-free. Writing By using the bound ( ) ≪ 1∕2+ , we can truncate the sum at ⩽ log 2 , getting which can now be bounded by the same argument as in [6,Section 16], using the Siegel-Walfisz property (4.3). In contrast, Friedlander and Iwaniec [6] have a singular curve with which produces the sum This sum cannot be truncated and for large , they have to show cancelation in the sums over 1 and 2 . Thus, for the main term having ( 2 + 1) 2 rather than 4 turns out to be a friend rather than an enemy. Similar arguments apply for the sequence 2 + ( 3 + 3 ) 2 .
The article is structured as follows. In Section 3, we state and prove a Type I estimate (Proposition 4). Using this, we prove a result of the fundamental-lemma-of-the-sieve type (Proposition 5). In Section 4, we state our Type II information (Proposition 12) and reduce it to a simpler case (Proposition 13). In Section 5, we state several corollaries of the Weil bound. After this preparation, we give the proof of our Type II estimate in Section 6. In Section 7, we apply Harman's sieve method with the gathered arithmetic information to prove Theorems 1 and 2. In Sections 8, 10, and 11, we give the proof of Theorem 3, where we assume that the reader is familiar with the techniques from the previous sections. The key estimates are Lemmas 32 and 36. The first gives essentially a square-root bound for the completed exponential sums unless the frequency parameters lie in some bad hyperplanes, as an application of bounds for correlations of hyper-Kloosterman sums [17]. The second shows that, on average, the frequency parameters do not lie in these bad hyperplanes too often.

Notations
For functions $f$ and $g$ with $g\geq 0$, we write $f\ll g$ or $f=O(g)$ if there is a constant $C$ such that $|f|\leq Cg$. The notation $f\asymp g$ means $g\ll f\ll g$. The constant may depend on some parameter, which is indicated in the subscript (e.g., $\ll_{\varepsilon}$). We write $f=o(g)$ if $f/g\to 0$ for large values of the variable. For summation variables, we write $n\sim N$ meaning $N<n\leq 2N$, that is,
$$\sum_{n\sim N} a_n=\sum_{N<n\leq 2N} a_n.$$
We let $C>0$ denote a large constant. In particular, we write $f\ll g/\log^{C}X$, meaning that this bound holds for any fixed $C>0$.
For a statement $E$, we denote by $\mathbf{1}_E$ the characteristic function of that statement. We let $e(x):=e^{2\pi i x}$ and $e_q(x):=e(x/q)$ for any integer $q\geq 1$. We write $P^+(n)$ for the largest prime factor of $n$, with the convention that $P^+(1)=1$. We denote
Lemma 7 (Poisson summation). Let be as in Section 3.1 for some ∈ (0, 1∕10) and denote ( ) ∶= ( ∕ ). Let ≫ 1 and let ∼ be an integer. Let > 0 and denote Then for any > 0

Proof. By the usual Poisson summation formula, we have For |ℎ| > , we have by integration by parts ⩾ 2 times which gives the result. □

We need the following lemma bounding the number of representations of an integer as a sum of two cubes.
Lemma 10. For every square-free integer and every ⩾ 2, there exists some | such that ⩽ 1∕ and From this, we get the more general version.

Proof of Proposition 4
In this section, all implicit constants depend on the parameter , that is, and ≪ stand for and ≪ throughout this section. Let us first consider the case = 1, and denote The contribution from > is trivially bounded by For ⩽ in ( , ), we apply the Poisson summation formula (Lemma 7) with ∶= ′ 1 ∕ for some small ′ > 0 to the sum over to get We can absorb the factor 1 ∕ ≍ 1 into the coefficient for the second sum.

Bounding the main term
We get by Poisson summation (Lemma 7, using > 1∕2− ) Summing over and recalling the definition (2.3) of 1 , we see that the main terms cancel in so that the total contribution from the main term is which is sufficient for (3.1).
Let us consider this for = 2, the case = 1 being similar but easier. Let ∶= ∕ and denote Then by Cauchy-Schwarz and Lemma 6, the left-hand side in (3.2) is at most .
We have The details are essentially the same, with the only change that we let bounding the contribution from min{ 0 , 0 } ⩽ 1∕6− trivially. The contribution of the main term from Poisson summation is handled by the same argument as in Section 3.3.1, applying Poisson summation to the variables 0 , 0 to extract 2 ( ).
For the nonzero frequencies, the argument is essentially the same as in Section 3.3.2. Let To get the bound corresponding to (3.3), one also has to use Lemma 8 to bound the number of representations as a sum of two cubes, which gives ∑ The end bound is (using ⩽ 2 1∕2 , 1 ⩽ ′ 1 ∕ , ⩽ , and ⩾ 1∕2− ) which is sufficient for the bound (3.2) if 2 ⩽ 5∕6− ′ with = 4 and ′ = 100 .
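The argument above relies on a bound for the number of representations of an integer as a sum of two cubes. The following sketch simply enumerates such representations for a given integer. The function name `two_cube_reps` and the restriction to positive ordered pairs are assumptions of this illustration, not the statement of Lemma 8.

```python
def two_cube_reps(n):
    """Ordered pairs (b, c) with b, c >= 1 and b^3 + c^3 == n."""
    reps = []
    b = 1
    while b ** 3 < n:
        rest = n - b ** 3
        c = round(rest ** (1.0 / 3.0))
        # Guard against floating-point error in the cube root
        # by testing the neighbouring integers exactly.
        for cand in (c - 1, c, c + 1):
            if cand >= 1 and cand ** 3 == rest:
                reps.append((b, cand))
        b += 1
    return reps

# Largest number of representations among integers up to 10^4.
print(max(len(two_cube_reps(n)) for n in range(1, 10 ** 4 + 1)))
```

Most integers have no representation at all, and the representation counts that do occur are small, in line with the kind of divisor-type bound the proof needs.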

Proof of Proposition 5
To prove Proposition 5, we write For ⩽ , we get by Proposition 4 with level For > , we note that | ( ) implies that there is a divisor 1 | with 1 ∈ [ , 2 ]. Thus, by crude bounds, Lemma 11, and Proposition 4, we get by applying Lemma 9 to get the last bound. For coefficients ( ), we denote ∶= (| | 2 ) for ∈ ℤ[ ] and for any > 0 write ( ) ∶= ( ∕ ). It is convenient for us to give the following definition of the Siegel-Walfisz property over Gaussian integers, where the variable is weighted by a smooth function supported on a polar box.

Definition 1 (Siegel-Walfisz property on ℤ[i]). We say that coefficients ( ), supported on ∼ , satisfy the Siegel-Walfisz property if the following holds for any > 0 and any smooth weight as in Section 3.1 with −1 = log (1) . For any ∈ [ , 5 ], 1 ∼ , ∈ ℤ[i] and = log (1) , we have

For all coefficients ( ) that we consider, this property follows by standard arguments (similar to [6, Section 16]). More precisely, we will need this for ( ) which have a large prime factor, that is, for ( ) such that for some ≍ with ≫ and for some divisor bounded coefficients ( ) supported on ( , ( )) = 1, we have Fixing | | 2 = and relabeling, it then suffices to show that for ≫ , we have Using Dirichlet characters to detect ≡ ( ) and the Fourier series expansion to the contribution from the principal character and the zeroth frequency in the Fourier expansion of cancels the main term on the right-hand side of (4.1). For nonprincipal characters or nonzero frequencies, we use Mellin inversion for and shift the contour past = 1. Shifting the contour is justified by applying the Siegel-Walfisz bound [6, Lemma 16.1] with the character where a Dirichlet character of (ℤ[i]∕ ℤ[i]) × .

Our Type II information for Theorems 1 and 2 is given by the following.
Definition 2 (Siegel-Walfisz property with main term equal to 0). We say that coefficients ( ), supported on ∼ , satisfy the Siegel-Walfisz property with main term equal to 0 if the following holds for any > 0 and any smooth weight as in Section 3.1 with −1 = log (1) . For any ∈ [ , 5 ], 1 ∼ 2 , ∈ ℤ[ ] and = log (1) , we have We show in Section 4.1 that for Proposition 12, it suffices to prove the following.

Proof That Proposition 13 implies Proposition 12
The idea is similar to one that appears in Heath-Brown's work [9]. Let ∶= 1∕(log log ) 2 . For a given > 0, let ′ = (log ) − for some large > 0 and let ′ be a smooth function as in Section 3.1 with parameter ′ . Let Given the coefficients ( ), we write approximates ( ) by ( , ( ))=1 while mimicking the distribution of (| | 2 ) over short intervals near . If ( ) is bounded and supported on ( , ( )) = 1, then the coefficient # ( ) is also bounded since | ( , ′ )| ≪ 1 and ∫ ′ (1∕ ) = ′ . By Proposition 5, the claim of Proposition 12 holds for replaced by # , that is, if we let ( , ) denote the Type II sum, then since the weight ′ ′ ( ) may be removed by partial summation. Furthermore, by the fundamental lemma of the sieve (arguing similarly to Section 3.4), it follows that # ( ) satisfies the Siegel-Walfisz condition (4.1). Thus, if ( ) also satisfies the Siegel-Walfisz property, then the coefficient ( − # )( ) satisfies the Siegel-Walfisz property (4.3) with a main term equal to 0, as can be seen from where the last bound holds by the construction (4.4) because the parameter ′ (associated to ′ ′ ) is small in comparison to the parameter (associated to 1 ). Thus, using the decomposition

APPLICATIONS OF THE WEIL BOUND
For the proof of Proposition 13, we require the Weil bound both for counting points on certain curves over finite fields as well as for bounding algebraic exponential sums. Let be a prime and fix an algebraic closure of the finite field . Let be the projective curve defined by a polynomial equation with varying . For ≠ 0, 1, the homogenization (1 − ) 2 0 + 2 1 − 2 2 = 0 defines a nonsingular curve. For Theorem 2, we end up with affine varieties defined by the equation with varying . By fixing 3 and 4 , this reduces to understanding curves defined by the equation which is nonsingular for ≠ 0.

Points on curves
We set For ⩾ 2, we define and for any , Then, by the Chinese remainder theorem, For any and ⩾ 1, we have For any , we have | ( )| ⩽ ( ) (1) .
Proof. For ≢ 0, 1 ( ), the projective curve over defined by is nonsingular, so the claim follows from the classical Hasse-Weil bound (see, e.g., [21] or [8, Exercise V.1.10]). The last claim follows from

Recall that an integer $n$ is said to be powerful if $p\mid n$ implies $p^2\mid n$. The next lemma gives an evaluation of 1 ( ; ) for a generic unless has a large powerful factor. When we apply this, we take = log and make use of the fact that integers that have a large powerful factor are very sparse.
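The sparsity of powerful numbers invoked above is easy to observe numerically. The sketch below tests the definition directly by trial division. The function name `is_powerful` is a hypothetical helper, and the comparison with $\sqrt{X}$ reflects the standard fact that the number of powerful integers up to $X$ is of order $X^{1/2}$.

```python
def is_powerful(n):
    """True if every prime p dividing n satisfies p^2 | n (1 is powerful)."""
    p = 2
    while p * p <= n:
        if n % p == 0:
            exponent = 0
            while n % p == 0:
                n //= p
                exponent += 1
            if exponent == 1:
                return False
        p += 1
    # Any remaining factor > 1 is a prime dividing n exactly once.
    return n == 1

for X in (10 ** 2, 10 ** 3, 10 ** 4):
    count = sum(1 for n in range(1, X + 1) if is_powerful(n))
    print(X, count, count / X ** 0.5)
```

The ratio in the last column stays bounded, so integers divisible by a large powerful number contribute negligibly after summing over the powerful part.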
Proof. By the Chinese remainder theorem, we have ∑ where ( ∕ ) −1 denotes the multiplicative inverse modulo . For ||( ∕ 1 ), we use the trivial bound For | 1 , we have by Lemma 19
Using Lemma 21, we get the following by the Chinese remainder theorem.
that is, We first bound separately the contribution from the diagonal terms where Δ = 0 or ( 1 , 2 ) > 1.
Hence, from arg 1 = arg 2 , we get a contribution to (1) 1 of size by using the bound ( ) ≪ and the assumption 1 ≫ 1∕4+ . The same bound holds for (1) 2 , so that the diagonal contribution is sufficiently small in terms of (6.2).
Similar arguments apply in the case = 2, with 2 + 1 replaced by 3 + 3 , and by using the trivial bound Ω( , ) ≪ max{ , } ≪ 1∕6 , we get a contribution Using Lemma 8 to bound the number of representations as a sum of two cubes, we get ∑
Thus, the last expression is bounded by where we have used the bounds ∑ (1) .
Similar arguments again apply when = 2 or = 2.
In particular, we have Recall that That is, is fixed for given 1 , 2 , 1 , 2 , and we have Note that ( 1 denoting the complex conjugate) is a congruence in ℤ despite the fact that 1 , 2 ∈ ℤ[ ], and note that ( , Δ) = 1. Hence, we have .
Hence, for any fixed tuple ( , , ), there is a constant ( , , ) such that for all , in the support of (1) Accordingly, we write where in (1) 1112 , we have used the triangle inequality to put absolute values around . We will show in Section 6.4.3 that so that by taking 1 to be large enough in terms of , we get √ (1) 1112 ≪ 1 log − .
Before this, we consider the main term (1) 1111 , for which we need to bound sums of the form Since the cost from introducing smooth weights is bounded by ≪ 6 log 6 = log (1) , to handle (1) 1111 , it suffices to show that for any > 0,  Note that herê(0) ≪ .
Using Lemma 23, we see (taking to be sufficiently large power of log ) In 1 the condition |Δ is equivalent to 2 ≡ 1 ( ) for some ( ), so that The condition ( 1 , 2 ) = 1 can be dropped with a negligible error term by trivial bounds, recalling that ( ) is supported on ( , ( )) = 1. Similarly, the condition max{Δ 2 , ( 2 − 1 , Δ)} ⩽ may also be dropped with a negligible error term by crude bounds.
Recall that in the support of (1) 1 (2) 2 , we have for some constant The contribution from ( √ ) is bounded by trivial bounds, recalling that ( ) are supported on small polar boxes. Applying (4.3) (the Siegel-Walfisz property), we get which is sufficient for (6.7) since the cost from introducing the smooth weights was −6 .
Remark 4. At first, it may seem surprising that in our problem, the evaluation of the main term is vastly easier than in the situation in [6]. This is because here with ( ; Δ), we are computing solutions to a nonsingular equation. This allowed us to bound the large moduli by using the Hasse-Weil bound (Lemma 14).
By Lemma 23, the last expression is at most Remark 5. Morally speaking, we have in the above applied the bound 1 ( ; Δ) ≪ Δ 1∕2 . The corresponding exponential sum in [6] is bounded in terms of Ramanujan sums, so that in there one morally gets a bound ≪ 1 for the exponential sums. This loss is the reason why our Type II range is narrower than in [ 2 by a trivial point count with 1 and 2 in small polar boxes, we can evaluate the sum to obtain for any > 0 Since the integration over the tuple ( , , ) is weighted by −6 log (1) (as in Section 3.1), we get 1112 ≪ 1 log (1) .

√ ) (mod )
To complete the proof for (1) , we need to show the bound (6.4) for the contribution from (1) 112 , where arg 1 = arg 2 + ( √ ) (mod ). The idea is similar to that of the previous section, but we cannot ignore the weight , which now restricts the variables to a narrow subset. We split (1) 112 dyadically according to the size of Δ ∼ 1 to get .
We now evaluate the inner sum by Poisson summation (Lemma 7) to get The nonzero frequencieŝare bounded just as before, making use of bound which is ≪ 1 log − once we choose a sufficiently large 1 in = log − 1 .
Remark 6. When we split 1 into intervals of length 1 1 , it can happen that 1 1 < 1 for small 1 , so that the sum over 1 may be empty. This is not a problem since we are not trying to show that the error term̂from the Poisson summation is smaller than the main term, only that both terms are smaller than ≪ 1 log − .

6.5
The off-diagonal for ( ) Here, we apply the same arguments as with (1) 11 , except that we count solutions to By similar arguments as in Section 6.4.5, in place of (6.11), we get (2) 11> ≪ 5∕6+2 3∕2 2 + 2 log (1) , which suffices since 2 ≪ 1∕3−10 . Note that it is not necessary to separate large values of ( 2 − 1 , Δ) since the bounds from Section 5 we will use here do not care about the common factor ( − 1, ).
For (2) 11⩽ , the technical issues of removing the smooth cross-condition 2 and bounding the part 1 = 2 + ( √ ) (mod ) are handled similarly as for (1) 11⩽ (cf. Sections 6.4.3 and 6.4.4), so that we need to consider sums of the form  we get a sufficient bound by using Lemma 18 and the Siegel-Walfisz property with main term 0 (4.3) similarly as in Section 6.4.1.

6.6
The off-diagonal for ( ) Here, we can apply the same arguments as in the above sections, but the evaluation is, of course, much easier. We are counting 1 , 2 that satisfy the simple equation (with ( , Δ) = 1) Here, the lengths of satisfy > 1∕2− > ⩾ |Δ| , so that after Poisson summation, we only get a contribution from the frequency (0,0). The point count modulo Δ corresponding to 1 ( ; Δ) is trivially equal to |Δ|. Thus, applying the Siegel-Walfisz property (4.3), we get Again, the technical issues of the contribution from max{Δ 2 , ( 2 − 1 , Δ)} > , removing 1 , and bounding the part 1 = 2 + ( √ ) (mod ) are handled similarly as above.

THE SIEVE ARGUMENT FOR THEOREMS 1 AND 2
In this section, we give the proofs of Theorems 1 and 2 by applying Harman's sieve method with the arithmetic information given by Propositions 5 and 12. Define While our arithmetic information is not sufficient to give an asymptotic formula for primes, we can still give an asymptotic formula for certain sums of almost-primes with no prime factors below , as the following Proposition shows. Proof. Let  = ( ) ∈ { ( ) , }. Then by expanding using the Möbius function, we get ∑ We split the sum in two parts, ⩽ and > , and show that in each part, we can get an asymptotic formula by Type I information and Type II information, respectively.
For the first part, we write where for some large constant > 0, For ′ 2 ( ′ ) ⩽ ( ′ ), we get by Lemma 11 with = 10 and by trivial bounds 10  where we have plugged in the factor ( ) 10 3 ∕ log ⩾ 1 to get the penultimate step.
Similarly, in the part > , we get an asymptotic formula by Proposition 12. To see this, let us write = 1 ⋯ to get (noting that ≠ 0 since > ∕ > 1) Since 1 ⋯ > , ⩽ and < , by the greedy algorithm, there is a unique 0 ⩽ ⩽ such that Note that [ − , ] is now exactly the admissible range for in Proposition 12. We obtain that the part > is partitioned into ≪ log 2 sums of the form The cross-conditions +1 > and 1 ⋯ > are easily removed by applying a finer-than-dyadic decomposition (Section 3.1 with = log − ) to the four variables , +1 , 1 ⋯ , and +1 ⋯ . Hence, writing we obtain for some coefficients ′ ( ′ ) and ′ ( ′ ) supported on ( ′ ′ , ( )) = 1 sums of the form Here ′ ( ′ ) is supported on ′ ∈ [ − , 2 ] with ( ′ , ( )) = 1. The cross-condition ′ ′ ∼ can be removed by a finer-than-dyadic decomposition and the coefficients ′ ( ′ ), ′ ( ′ ) can be reduced to bounded coefficients by a similar argument as in (7.1). Hence, by Proposition 12, we get In the first term, we see by the support of ′ ′ that is restricted to ≡ 1 (4) by the sum-of-two-squares theorem. Hence, by dropping ∤ ′ , we get
The contribution from the second term is negligible by trivial bounds because we get a square factor ⩾ . Similarly, we may reduce to the case where ′ ( ′ ) is supported on square-free integers. By the construction of ′ ( ′ ), we can write and remove any cross-conditions involving by a finer-than-dyadic decomposition (Section 3.1 with ) applied to and at most (1) variables. Hence, for some ′′ ( ), the coefficient ′ ( ′ ) is up to a negligible error term replaced by coefficients of the form for = ( ) , we can insert the trivial estimate ( ) ( ( ) ) ⩾ 0 for any such that the sign = 1; we say that these sums are discarded. For the remaining , we will obtain an asymptotic formula by using Propositions 12 and 25. That is, if is the set of indices of the sums that are discarded, then we will show

To bound the error terms corresponding to ∈ , we need a lemma that converts sums over almost primes into integrals that can be bounded numerically. Let $\omega(u)$ denote the Buchstab function (see, e.g., [7, Chapter 1] for the properties below), so that by the prime number theorem for Similarly, using the prime number theorem for primes ≡ 1 (4), we get Note that for $1<u\leq 2$, we have $\omega(u)=1/u$. In the numerical computations, we will use the following upper bound for the Buchstab function (see, e.g., [10, Lemma 5]) In the lemma below, we assume that the range ⊂ [ 2 , ] is sufficiently well behaved, for example, an intersection of sets of the type { ∶ < } or { ∶ < ( 1 , … , ) < } for some polynomial and some fixed , .

Proof. Recall that 1.
By (7.2) and by the prime number theorem for primes ≡ 1 (4), the left-hand side in the lemma is ∑ by the change of variables = . □
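For orientation, the Buchstab function that enters these integrals can be tabulated numerically. The sketch below is a plain floating-point approximation, not a rigorous computation and not the code referred to in the text. It uses the standard characterization $\omega(u)=1/u$ for $1\leq u\leq 2$ and $u\,\omega(u)=1+\int_1^{u-1}\omega(t)\,dt$ for $u\geq 2$, with the trapezoid rule on a uniform grid. The function name and grid parameters are assumptions.

```python
def buchstab_table(u_max, steps_per_unit=2000):
    """Approximate omega(u) on the grid u = i * h for 1 <= u <= u_max.

    Returns (w, h) where w[i] approximates omega(i * h); entries with
    i * h < 1 are unused and left as 0.0. Non-rigorous sketch.
    """
    s = steps_per_unit
    h = 1.0 / s
    n = int(round(u_max * s))
    w = [0.0] * (n + 1)
    # Initial segment: omega(u) = 1/u on [1, 2].
    for i in range(s, min(2 * s, n) + 1):
        w[i] = 1.0 / (i * h)
    # For u >= 2: u * omega(u) = 1 + integral of omega over [1, u - 1].
    integral = 0.0
    for i in range(2 * s + 1, n + 1):
        # Trapezoid rule on the new piece [u - 1 - h, u - 1].
        integral += 0.5 * h * (w[i - 1 - s] + w[i - s])
        w[i] = (1.0 + integral) / (i * h)
    return w, h

w, h = buchstab_table(6.0)
for u in (1.5, 2.0, 3.0, 4.0, 6.0):
    print(u, w[int(round(u / h))])
```

The values settle quickly near $e^{-\gamma}$, where $\gamma$ is Euler's constant, which is why crude upper bounds for $\omega$ lose little in the numerical work. A rigorous treatment would replace floating-point quadrature by interval arithmetic or explicit inequalities.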
Python codes that compute rigorous upper bounds for these integrals can be found at the following links. This completes the proof of Theorem 24, and as mentioned above, we then get Theorems 1 and 2.

The setup
Let ∕ℚ be a Galois extension of degree and fix a basis 1 , … , for the ring of integers . Define the form  Let be as in Section 3.1 with = log − for some large > 0. We define where > 0 is a large enough constant so that ∫ ∈[0, 1∕ ] −1 2 ( ( 1 , … , −1 )) ≠ 0, so that the denominator in the definition of Ω( 1 , … , −1 ) is always nonzero. We also define the arithmetic factor We set For typical 1 , … , −1 , we have where the upper bound holds for all 1 , … , −1 , so that ∑ ∼ ( ) ≍ .
For the comparison sequence, we set which is the same as before except that we essentially restrict to ⩽ 2 −1 √ . When we write ( ) , we have suppressed the fact that the sequence depends also on and the choice of basis .
We will use a similar convention below for other quantities, and with the exception of Section 11, all implied constants are allowed to depend on $K$ and the chosen basis.

Lemmas
We need the following lemma for handling the diagonal terms after Cauchy-Schwarz with the Type I and Type II sums. Basically, this ensures that the density of the numbers 2 + ( 1 , … , −1 ) 2 ∼ is ≍ −1∕(2 )+ (1) , as expected, so that we get an optimal control for the diagonal terms. Here, we do not need to restrict to = 0, and the lemma is essentially an analog of the divisor bound ( ) ≪ for the number field . Proof. Let 1 , … , be such that ( 1 , … , ) = . The principal ideal factorizes uniquely into prime ideals We have for some ∈ Gal( ∕ℚ). The number of choices for that give a different ideal is at most by a divisor bound. To see this, by multiplicativity of ∕ℚ , it suffices to consider the case that = so that ⩽ and for some g = , we have = 1 ⋯ g for some prime ideals ⊆ . Then, the number of choices for that give a distinct ideal is bounded by the number of ways choosing (possibly repeating) elements from a set of size , which is Thus, it suffices to show that the number of units 0 of the ring of integers with indexed by the = 1 + 2 complex embeddings ∶ ↪ ℂ. For every such embedding, we have Thus, our task is reduced to enumerating units 0 such that The set of units of is mapped by to a rank − 1 lattice in ℝ of determinant ≪ 1, so that by considering a Minkowski-reduced basis (see [16,Lemma 4.1], e.g.), we see that the number of such units is For a matrix denote by , the matrix that is obtained by deleting th row and th column. For the exponential sums in the next section, we need the following two lemmas. Then det ≠ 0, that is, ∶ → is invertible. Furthermore, for any fixed , the numbers det ∈ , ⩽ are linearly independent over ℚ, so that in particular det ≠ 0 for all , ⩽ .
Note that is independent of as it only depends on the sign of the permutation of rows defined by ( , ) . Thus, we get Then the column vectors of the matrix = ( ) = ( det ) are linearly dependent, that is, for = ( det ) ⩽ , we have Hence, det = 0, which implies that also for the cofactor matrix = ((−1) + det ), we have det = 0, since can be obtained from by the row/column operations of multiplying the rows by (−1) and the columns by (−1) . Then det −1 = (det ) −1 det = 0, which is a contradiction, since in the above, we have shown that det ≠ 0. □ We include the following lemma for completeness, even though we will not need it.
We shall first consider the exponential sum with the complete norm form with variables ∑ 1 , 1 ,…, , ( ) ( 1 ,…, )≡ ( 1 ,…, ) ( ) ( 2 ⋅ ( , )), and in the proof of Lemma 33, we reduce the case of incomplete forms with − 1 variables to this. The square-root bound would be ≪ −3∕2 , whereas we lose a factor of 3∕2 and get only ≪ for the generic . This does not matter in the application, where we take a large so that the relative loss is quite small.
As is often the case with such exponential sums, it turns out to be helpful to consider a more general sum over finite fields , where the additive character ( ) on is replaced by the additive character on (Tr ∕ ( )).
We need the following lemma, which is equivalent to the rationality of the -function associated with the exponential sum. Lemma 31. Let , g ⩾ 1 and let 1 , … , g ∈ ℂ. Suppose that there are constants , > 0 such that for every ⩾ 1, we have for Then | | ⩽ for all ∈ {1, … , g}.
The purpose of the above two lemmas is that they allow us to reduce bounding an exponential sum over to bounding the corresponding sum over for some fixed = . The benefit of this is that after a suitable extension ∕ , the norm form ( 1 , … , ) factors into a product of linear forms. We need to set up some notations for the next lemma. Recall that for any prime , there are integers , , g with g = such that for some prime ideals of . The integer is called the inertia degree of , and we have for all ⩽ g ∕ ≅ .
We will denote = and choose some for each prime (the exact choice is not important, see Remark 7). For each prime , we denote the reduction map by ∶ → . There are algebraic sets with 2 = 0 ⊇ 1 ⊇ ⋯ ⊇ 2 ⊇ 2 +1 = ∅ such that the following hold.
Since ( 1 , … , ) is a norm form, it splits into linear factors over and hence over where ( 1 , … , ) is the th coordinate of (with as in Lemma 28). Note that the linear map is invertible for sufficiently large, since det ≠ 0 has finitely many prime factors. Thus, after the linear change of variables ( , ) ↦ ( , ) =∶ ( ′ , ′ ), it suffices to consider where ′ = ( ′ 1 , ′ 2 ) is obtained from 2 = ( 2 ,1 , 2 ,2 ) by the linear map We split into separate cases depending on whether ′ 1 ⋯ ′ = 0 or ≠ 0. In the latter case, the contribution is ) .
Suppose first that all of the coordinates of ′ 1 and ′ 2 are ≠ 0. Denote If ′ ≠ (−1) −1 , then this is ≪ ( ) −1∕2 by [17, (6.12)]. If ′ = (−1) −1 , then we use a pointwise bound for the hyper-Kloosterman sums and bound the sum over trivially, which gives ≪ ( ) . The equation ′ = (−1) −1 holds precisely when (9.2) holds, so that this is covered by 1 . Suppose then that exactly 1 and 2 of the coordinates in ′ 1 and ′ 2 are 0, and denote the sets of such indices 1 and 2 . By symmetry, we may suppose that 1 ⩾ 1. Then, after a change of variables and denoting which by [17, (6.11)] is We now note that (with as in Lemma 28) so that the equations ℎ ′ = 0 are precisely of the form (9.3). By Lemma 28, the equations are independent (for sufficiently large), so that has dimension − for each . Consider then the part where ′ 1 ⋯ ′ = 0. If ≠ 0, we also have ′ 1 ⋯ ′ = 0. Thus, this part of the sum is 3) are in by the reduction to ∕ , where = 1 ⋯ g and = . If = 1, then they are in and we get a system of linear equations for ∈ . Since any ∈ Gal( ∕ℚ) permutes the primes 1 , … , g , we see that changing in ≅ ∕ simply changes the set of indices ⩽ such that ℎ ′ = 0, so that the choice of the reduction modulus is inconsequential for our application, since we get the same bound for any set of indices. We can eliminate the use of the (countable) axiom of choice simply by averaging over all choices ∈ { 1 , … , g }. If = , then by Lemma 28, if one of the coordinates of ′ is 0, then all are 0 (for ≫ 1), so that the only solution ∈ is = .
Remark 8. Using the Hasse-Davenport relation, one can show that the function is a constant multiple of the hyper-Kloosterman sum Kl ( ; ) (see [2, Applications de la formule des traces aux sommes trigonométriques (7.2.5)], thanks to Emmanuel Kowalski for pointing this out).
We need to set up some more notation for the following lemma, which uses the Chinese remainder theorem to combine Lemma 32 for different primes . Then, it may happen that lie in different for different primes , where are as in Lemma 32. For any prime , we extend the reduction map ∶ → defined in (9.1) to a map ∶ 2 → 2 ∶ ( 1 , … , 2 ) ↦ ( ( 1 ), … , ( 2 )).
Let 1 = 1 … be a square-free integer and denote = . For ( 1 , … , ), ⩽ , we define That is, we have Then we have the following lemma.
Let denote any of the hyperplanes defining in Lemma 32. We need the following bound for the number of (ℎ 1 , … , ℎ ) ∈ ℤ ⊆ which are in ( ) after reduction modulo some , where as before = 1 ⋯ g and = . For a square-free integer 1 = 1 ⋯ and = ( 1 , … , ) ∈ ℤ ⩾0 , denote Then, the following lemma is simply a generalization of the elementary bound: Furthermore, if is the set of indices such that for ∈ , we have = , then Proof. If we fix ℎ 1 , … , ℎ −1 , then for every such that ≠ 0 and ≫ 1, there is at most one ℎ ∈ with (ℎ 1 , … , ℎ ) ∈ ( ), since by Lemma 28, the coefficients det ≠ 0 for ≫ 1. For ≪ 1, the number of ℎ ∈ is trivially ≪ 1. Thus, the number of ℎ ⩽ 1 is ≪ 1 ( ), and the numbers ℎ 1 , … , ℎ −1 are restricted to a set of type which is a set such that ∈ ′ ′ ( 1 ) if for every ⩽ , for some ′ ′ ⊆ −1 which is a hyperplane of dimension − ′ . That is, in this first step, we have simply solved for ℎ in terms of ℎ 1 , … , ℎ −1 in one of the equations and substituted this to the remaining equations to get ′ − 1 independent equations in − 1 variables.
To get the second bound, we note that = means that ℎ ′ ≡ 0 ( ) for all ⩽ , which implies that ℎ ≡ 0 ( ) for all ⩽ . We sum over (ℎ 1 , … , ℎ −1 ) ≠ , so suppose by symmetry that ℎ −1 ≠ 0. Then, ℎ −1 ≡ 0 ( ∏ ∈ ) is nontrivial and there are no solutions if < ∏ ∈ . Hence, the number of ℎ −1 for fixed ℎ 1 , … , ℎ −2 is
Remark 10. It is important for our applications that we get full savings in the longest sum ℎ ⩽ 1 , so that for large , we can cancel the losses from expanding the condition = 0 in Lemma 33.
For < 1∕2, we may assume that ⩾ (since ⩾ ∏ is as close as possible to without exceeding it and for ∉ 0 , we have = − 1, so that we get The form ( 1 , … , ) splits into linear factors of , so that after a linear change of variables, we are counting solutions to

Type II sums
Our Type II information is given by the following.
Hence, we can give an asymptotic formula for the terms ⩽ 2 − 1 by Proposition 44 if for some small ′ > 0, we take ∶= ′ log , and our task is reduced to estimating Here, we could iterate Buchstab's identity provided that at each stage 1 ⋯ −1 2 ⩽ , but this is not necessary for us. Note also that this sum is too large to be dropped completely, since by Lemma 7.2 and Stirling's approximation, 2 () (, 2 √ ) = (1 + (1)) ∫ ( ) 1 since the largest we can take is = ⌊ log 2 log 2 ⌋ + 1. The problem is that the primes range over too long intervals, and thus, we seek to replace them by variables with shorter ranges. To do so,

ACKNOWLEDGMENTS
I am grateful to Kaisa Matomäki and James Maynard for helpful discussions and comments. I am also grateful to the anonymous referee for their careful reading of the manuscript, helpful suggestions, and spotting errors in a previous version of the manuscript. The author was supported by a grant from the Emil Aaltonen Foundation. This project has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement No 851318).

JOURNAL INFORMATION
The Proceedings of the London Mathematical Society is wholly owned and managed by the London Mathematical Society, a not-for-profit Charity registered with the UK Charity Commission. All surplus income from its publishing programme is used to support mathematicians and mathematics research in the form of research grants, conference grants, prizes, initiatives for early career researchers and the promotion of mathematics.