An upper bound on the mean value of the Erdős–Hooley Delta function

The Erdős–Hooley Delta function is defined for n ∈ ℕ as Δ(n) := sup_{u∈ℝ} #{d | n : e^u < d ≤ e^{u+1}}.
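For concreteness, Δ can be computed by brute force for small n. In the sketch below (function names are ours), we use the fact that the supremum over u is attained as e^u approaches a divisor from below, so it suffices to count divisors in the windows [d, e·d):

```python
import math

def divisors(n):
    """Return the sorted list of divisors of n."""
    small = [d for d in range(1, int(math.isqrt(n)) + 1) if n % d == 0]
    large = [n // d for d in reversed(small) if d * d != n]
    return small + large

def delta(n):
    """Erdos-Hooley Delta: max number of divisors in a window (e^u, e^{u+1}]."""
    divs = divisors(n)
    # The sup over u is attained when e^u approaches a divisor d from below,
    # so it suffices to count divisors d' with d <= d' < e*d.
    return max(sum(1 for dp in divs if d <= dp < math.e * d) for d in divs)
```

For instance, Δ(12) = 3, attained by the divisor cluster {2, 3, 4}, while τ(12) = 6.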

Erdős introduced this function in the 1970s [3, 4] and studied certain aspects of its distribution in joint work with Nicolas [5, 6]. However, it was not until the work of Hooley in 1979 that Δ was studied in more detail [14]. Specifically, Hooley proved that (1.1) ∑_{n≤x} Δ(n) ≪ x (Log x)^{4/π − 1} for any x ≥ 1. Here and in the sequel we use the notation Log x := max{1, log x} for x > 0, and also define Log_2 x := Log(Log x) and Log_3 x := Log(Log_2 x); see also Section 2 below for our asymptotic notation conventions. To put Hooley's estimate (1.1) into context, note that 1 ≤ Δ ≤ τ, with τ(n) = #{d | n} the divisor function, so that we have the trivial bounds (1.2) x ≤ ∑_{n≤x} Δ(n) ≤ ∑_{n≤x} τ(n) ≪ x Log x for x ≥ 1. Comparing (1.1) with (1.2), we see that Δ is on average of genuinely smaller order than τ. This savings is crucial: as Hooley demonstrated (see [14, 23], and Remarks 2 and 4 below), it can be exploited to count solutions to certain Diophantine equations that are not amenable to more "standard" techniques, as well as to improve bounds in certain Diophantine approximation results.
In a series of papers, Hall and Tenenbaum significantly improved Hooley's estimate for Δ and for various generalizations of it; see [10, 11, 12], and also [13]. Their work culminated in the two-sided estimates (1.3) [13, Theorems 60 and 70], valid for every fixed ε > 0 and every x ≥ 1; the leftmost inequality of (1.3) reads x Log_2 x ≪ ∑_{n≤x} Δ(n). The upper bound of (1.3) was improved recently by La Bretèche and Tenenbaum [1], again for every fixed ε > 0 and every x ≥ 1.
The main result of this note is the following further sharpening of the upper bound.
Theorem 1 (Mean value bound). For x ≥ 1, we have ∑_{n≤x} Δ(n) ≪ x (Log_2 x)^{11/4}.

Remark 1. The average value of Δ is dominated by "atypical" integers. Indeed, we know from results in [1] and in [8] that, for every fixed ε > 0, we have (Log_2 n)^{η−ε} ≤ Δ(n) ≤ (Log_2 n)^{θ+ε} for all but o(x) integers n ∈ [1, x], where θ := (log 2)/(log 2 + 1/log 2 − 1) = 0.6102… and η = 0.3533… is another constant. However, the leftmost inequality in (1.3) implies that the mean value of Δ(n) over n ∈ [1, x] is of larger order. As a matter of fact, it appears that the average value has significant contributions from integers for which Δ(n) is as large as (log x)^{log 4 − 1}. Indeed, in a recent preprint of Kevin Ford and the two authors of the present paper [9], it was shown that ∑_{n≤x} Δ(n) ≫ x (Log_2 x)^{1+η}, with η as above. Ignoring factors of (Log_2 x)^{O(1)}, that paper shows roughly that, for any choice of Log_2 y ∈ [ε Log_2 x, (1 − ε) Log_2 x], we have Δ(n) ≥ (Log y)^{log 4 − 1} for ≥ x/(Log y)^{log 4 − 1} integers n ≤ x with ω(n) = Log_2 y + Log_2 x + O(1) (those that have about 2 Log_2 y prime factors ≤ y, and about Log_2 x − Log_2 y prime factors in (y, x]).

Remark 2. As indicated before, estimates on the partial sums of the Δ-function have applications to counting solutions to certain Diophantine equations. In [21], Olivier Robert studied the number of solutions to a Diophantine equation of the shape considered there. A straightforward adaptation of [21] leads to the estimate (1.5), with L = max{ℓ_1, …, ℓ_k} and the implied constant depending at most on the parameters k, c_1, …, c_k and ℓ_1, …, ℓ_k, which improves Theorem 1.1 of [21]. In turn, this leads to a similar improvement of Theorem 1.2 of [21]. We will outline the proof of (1.5) in Section 8.
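The numerical constants quoted in Remark 1 are quick to verify; a minimal check (our own illustration — the defining formula for η is not reproduced here, so only θ and log 4 − 1 are computed):

```python
import math

# theta = log 2 / (log 2 + 1/log 2 - 1), the exponent describing the typical
# size of Delta(n) in Remark 1
theta = math.log(2) / (math.log(2) + 1 / math.log(2) - 1)

# log 4 - 1, the exponent governing the "atypical" integers in Remark 1
log4_minus_1 = math.log(4) - 1
```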
Remark 3. Theorem 1 has applications to a problem of Erdős on sets whose subset sums are not squares. Specifically, assume that c is a constant such that (1.6) ∑_{n≤N} Δ(n) ≪ N (Log_2 N)^c. David Conlon, Jacob Fox and Huy Pham [2] have developed a new combinatorial argument that deduces from (1.6) that any subset A of {1, 2, …, N} with |A| ≥ N^{1/3} (Log_2 N)^{c′}, for some appropriate c′ = c′(c), has the property that its set of subset sums {∑_{b∈B} b : B ⊆ A} contains a square. This improves the earlier bound of N^{1/3} (Log N)^C with C > 0 of Nguyen and Vu [19].
Remark 4. In [14], Hooley used the bound (1.1) to show that for any irrational θ and real γ, and any ε > 0, the inequality ‖n²θ − γ‖ ≤ n^{−1/2+ε} holds for infinitely many n, where ‖t‖ denotes the distance from t to the nearest integer. Tenenbaum [23] improved the logarithmic factor in this bound using (1.3). Similarly, it should be possible to use Theorem 1 to improve the logarithmic factor further, but we will not pursue this matter here. In the homogeneous case γ = 0, the more significant improvement ‖n²θ‖ ≤ n^{−2/3+ε} was achieved (for arbitrary real θ) by Zaharescu [24].

DK dedicates this paper to his son Paris Christopher, who rested in his arms as a newborn many sleepless nights during the writing of the paper.

NOTATION
We use X ≪ Y, Y ≫ X, or X = O(Y) to denote a bound of the form |X| ≤ CY for a constant C. If we need this constant to depend on parameters, we indicate this by subscripts; for instance, X ≪_k Y denotes a bound of the form |X| ≤ C_k Y, where C_k can depend on k. We also write X ≍ Y for X ≪ Y ≪ X. All sums will be over natural numbers unless the variable is p, in which case the sum will be over primes. We use 1_E to denote the indicator of a statement E; thus 1_E equals 1 when E is true and 0 otherwise.
Given an integer n, we write τ(n) := ∑_{d|n} 1 for its divisor function and ω(n) := ∑_{p|n} 1 for the number of its distinct prime factors.
It will be convenient, for each x ≥ 1, to work with the set S_{<x} of square-free numbers all of whose prime factors p satisfy p < x. Observe that if 1 ≤ y ≤ x, then every n ∈ S_{<x} has a unique factorization n = n_{<y} n_{≥y}, where n_{<y} ∈ S_{<y} and n_{≥y} lies in the set S_{[y,x)} of square-free numbers all of whose prime factors lie in the interval [y, x).
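This factorization is easy to realize algorithmically; a sketch (function name ours, assuming n is square-free as in the definition of S_{<x}):

```python
def split_at(n, y):
    """Factor squarefree n as (n_{<y}, n_{>=y}): the parts of n built from
    prime factors below y and from prime factors >= y, respectively."""
    small = 1
    m, p = n, 2
    while p * p <= m:
        if m % p == 0:
            m //= p          # n is squarefree, so p divides at most once
            if p < y:
                small *= p
        p += 1
    # m is now the largest prime factor of n (or 1)
    if 1 < m < y:
        small *= m
    return small, n // small
```

For example, with n = 2·3·5·11 = 330 and y = 7, the split is (30, 11).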

METHODS OF PROOF
Similarly to other authors, we shall work with logarithmic weights; the passage from ∑_{n≤x} Δ(n) to such weights is provided by [13, Theorem 61] (3.1). Now, for each u ∈ ℝ, let us define Δ(n; u) := #{d | n : e^u < d ≤ e^{u+1}}, so that Δ(n) = sup_{u∈ℝ} Δ(n; u). As in previous work, we introduce the moments M_q(n) := ∫_ℝ Δ(n; u)^q du for q ≥ 1. Thus, for instance, M_1(n) = τ(n), since each divisor of n contributes to Δ(n; u) precisely for u in an interval of length 1. In view of (3.4), it is then natural to try to control M_q(n) for large q, keeping track of the dependence of constants on q. In order to exploit the multiplicative nature of Δ, we employ the identity Δ(np; u) = Δ(n; u) + Δ(n; u − log p) whenever n is a natural number, p is a prime not dividing n, and u is real. Taking q-th moments of both sides of this identity, we obtain M_q(np) = ∑_{a+b=q} (q!/(a! b!)) ∫_ℝ Δ(n; u)^a Δ(n; u − log p)^b du. Extracting out the extreme terms with b ∈ {0, q}, we can write this as (3.5) M_q(np) = 2 M_q(n) + ∑_{a+b=q, a,b≥1} (q!/(a! b!)) ∫_ℝ Δ(n; u)^a Δ(n; u − log p)^b du. By the use of Hölder's inequality and other tools, one can use this identity to recursively control weighted averages of M_q(n) involving quantities such as σ^{ω(n)}, for various σ > 1 and exponents k ≥ 1, where ω(n) denotes the number of distinct prime factors of n. See for instance [1] for an example of this approach.
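The identity Δ(np; u) = Δ(n; u) + Δ(n; u − log p) can be tested numerically, since the divisors of np split as d and pd with d | n; a brute-force check (names ours):

```python
import math

def delta_at(n, u):
    """Delta(n; u) = #{d | n : e^u < d <= e^{u+1}}."""
    return sum(1 for d in range(1, n + 1)
               if n % d == 0 and math.e ** u < d <= math.e ** (u + 1))

# Check Delta(n*p; u) = Delta(n; u) + Delta(n; u - log p) for p not dividing n:
# the divisors d of n account for the first term, the divisors p*d for the second.
n, p = 12, 5
for u in [k / 10 for k in range(-20, 60)]:
    assert delta_at(n * p, u) == delta_at(n, u) + delta_at(n, u - math.log(p))
```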
In our work, we use a variation of the above ideas. Our main guiding heuristic is that Δ(n) behaves roughly as (3.6) max_{y∈[1,x]} τ(n_{<y})/Log y. To give some support to this heuristic, let us note that Δ(a) ≥ τ(a)/(1 + log a) for any a ∈ ℕ. Applying this with a = n_{<y}, and noticing that Δ(n_{<y}) ≤ Δ(n) and that log n_{<y} is typically of size Log y, we find that the expression in (3.6) is morally a lower bound (up to constants) for Δ(n).
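The inequality supporting this heuristic follows from the fact that each divisor contributes to Δ(a; u) on a u-interval of length 1; in our normalization:

```latex
% A divisor d of a is counted in \Delta(a; u) exactly when
% \log d - 1 \le u < \log d, an interval of length 1, so
\int_{\mathbb{R}} \Delta(a; u)\, du = \tau(a).
% Since \Delta(a; u) = 0 unless -1 \le u < \log a, a range of measure 1 + \log a,
\Delta(a) = \sup_{u \in \mathbb{R}} \Delta(a; u) \;\ge\; \frac{\tau(a)}{1 + \log a}
\qquad (a \in \mathbb{N}).
```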
Motivated by the discussion in the previous paragraph, we introduce certain sets that are meant to act roughly as level sets of the Δ-function. Precisely, given a parameter A ≥ 1, we define S^A_{<x} to be the set of integers n ∈ S_{<x} such that τ(n_{<y}) ≤ A Log y for all y ∈ [1, x]. Using a simple Markov inequality, we may show that a proportion 1 − O(1/A) of the integers in S_{<x} also lie in S^A_{<x}. As a matter of fact, using a more careful analysis, the same statement holds if we replace S^A_{<x} by the set of n ∈ S_{<x} with τ(n_{<y}) ≤ A e^{−f_A(y)} Log y for all y ∈ [1, x], where e^{−f_A(y)} is a Gaussian-type weight concentrated around the region Log_2 y = (Log A)/(log 4 − 1). Our goal would then be to also show that Δ(n) ≪ A for most n ∈ S^A_{<x}. (In fact, we will only be able to show a weaker version of this, which is why the exponent in Theorem 1 is larger than in the lower bound of (1.3).) In order to achieve this goal, we use (3.5) and a recursive argument that allows us to control averages of M_q(n) when n ranges over the set S^{q−1,A}_{<x}, defined in Section 6 to be the set of n ∈ S^A_{<x} satisfying the moment bounds M_a(n)/τ(n) ≤ m_{a,A} for a ≤ q − 1, where the m_{j,A}'s are certain suitable quantities growing roughly like (jA)^j (log A)^{3j/4}. It is important to note that our recursive argument makes use of the following simple but crucial observation: the integral (3.9) ∫_ℝ Δ(n; u)^a Δ(n; u − log p)^b du is symmetric in a and b. Combining (3.5) with the symmetry of (3.9), we have the inequality (3.10) M_q(np) ≤ 2 M_q(n) + 2 ∑_{a+b=q, a≥b≥1} (q!/(a! b!)) ∫_ℝ Δ(n; u)^a Δ(n; u − log p)^b du. To eliminate the factors of 2, we observe that τ(pn) = 2τ(n) (recall that p ∤ n here), and hence (3.11) M_q(np)/τ(np) ≤ M_q(n)/τ(n) + (1/τ(n)) ∑_{a+b=q, a≥b≥1} (q!/(a! b!)) ∫_ℝ Δ(n; u)^a Δ(n; u − log p)^b du. We can then apply Hölder's inequality (treating the Δ(n; u)^a and Δ(n; u − log p)^b terms differently) to (3.11), and use our pointwise bounds (3.7) and (3.8), which will allow us to inductively obtain efficient estimates for the sum T_q(x) := ∑_{n∈S^{q−1,A}_{<x}} M_q(n)/(τ(n) n), where q ≥ 1, A ≥ 1, x ≥ 1 are parameters.
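The "simple Markov inequality" step can be sketched as follows, with logarithmic weights and with our own normalization of the constants:

```latex
% With logarithmic weights, Mertens' theorem gives, for 1 \le y \le x,
\sum_{n \in S_{<x}} \frac{\tau(n_{<y})}{n}
  = \prod_{p < y}\Bigl(1 + \frac{2}{p}\Bigr)\prod_{y \le p < x}\Bigl(1 + \frac{1}{p}\Bigr)
  \asymp (\operatorname{Log} y)(\operatorname{Log} x),
% while \sum_{n \in S_{<x}} 1/n \asymp \operatorname{Log} x.  Markov's inequality then yields
\sum_{\substack{n \in S_{<x} \\ \tau(n_{<y}) > A \operatorname{Log} y}} \frac{1}{n}
  \;\ll\; \frac{\operatorname{Log} x}{A}
  \qquad \text{for each fixed } y \in [1, x].
```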

BASIC ESTIMATES
We record here a couple of simple lemmas for easy reference, starting with the following standard consequence of Mertens' theorem.

Lemma 4.1 (Mertens' theorem estimate). Fix k ≥ 0. For x ≥ y ≥ 1, we have ∏_{y≤p<x} (1 + k/p) ≍_k (Log x/Log y)^k.
Proof. We have ∏_{y≤p<x} (1 + k/p) ≤ exp(k ∑_{y≤p<x} 1/p), and similarly for the lower bound, so the lemma follows from a classical estimate of Mertens [15, Theorem 3.4(b)].

We also note the following estimate.

Lemma 4.2 (Brun–Titchmarsh inequality). For z ≥ y ≥ z/100 ≥ 1, we have π(z) − π(y) ≪ (z − y)/Log z + √z.

Proof. Note that log(z/y) ≍ (z − y)/y, and apply the Brun–Titchmarsh inequality to the primes in (y, z].

We now give a refinement of this simple analysis, in which we have a single exceptional set that covers all y ∈ [1, x], and in which there is an additional Gaussian-type decay outside of the critical regime Log_2 y = (Log A)/(log 4 − 1) + O(√(Log A)).

Proposition 5.1. Let x ≥ 1 and A ≥ 1, and let S^A_{<x} denote the set of n ∈ S_{<x} satisfying (5.2) τ(n_{<y}) ≤ A e^{−f_A(y)} Log y for all y ∈ [1, x], where (5.3) f_A(y) := δ · min{(Log_2 y − (Log A)/(log 4 − 1))²/Log A, Log A + Log_2 y} and δ > 0 is a sufficiently small absolute constant. Then (5.4) ∑_{n∈S_{<x}\S^A_{<x}} 1/n ≪ Log x/A.

Remark. The upper bound (5.4) is sharp. When Log_2 y = (Log A)/(log 4 − 1), relation (5.2) becomes τ(n_{<y}) ≤ (log y)^{log 4} or, equivalently, ω(n_{<y}) ≤ 2 Log_2 y. This event occurs with probability roughly equal to 1 − (log y)^{−(log 4 − 1)} = 1 − 1/A. A more refined analysis, using appropriately adapted results of Ford [7], can show that the left-hand side of (5.4) is ≍ Log x/A. Hence, the naive Markov bound (5.1) is actually close to the truth in the critical range of y.
Proof. We may assume that A is large, as the claim is immediate from Mertens' inequality otherwise.
Suppose n ∈ S_{<x} \ S^A_{<x}. Then there exists y_0 ∈ [1, x] such that τ(n_{<y_0}) > A e^{−f_A(y_0)} Log y_0. We claim that this implies the existence of an absolute constant c > 0 such that (5.5) τ(n_{<y}) ≥ c A e^{−f_A(y)} Log y for all y ∈ [y_0, y_0²]. Indeed, if Log_2 y_0 ≥ 10 Log A, then f_A(y) = δ(Log A + Log_2 y) for all y ∈ [y_0, y_0²], so (5.5) holds for some appropriate choice of c > 0; on the other hand, if Log_2 y_0 ≤ 10 Log A, then both functions on the right-hand side of (5.3) change by at most O(1) when y ranges over [y_0, y_0²], so (5.5) holds again, provided we choose c > 0 small enough. Now, using (5.5) together with the fact that ∫_{y_0}^{y_0²} dy/(y Log y) ≍ 1, we may average over y and swap the orders of summation and integration. Factoring n = n_{<y} n_{≥y} and using Mertens' theorem, we then see that it suffices to establish a bound (5.6) for the y-averaged logarithmic sum of those a ∈ S_{<y} with τ(a) ≥ cAe^{−f_A(y)} Log y. First, we dispose of some easy contributions. If Log y ≤ A^{0.01}, then we bound this sum via Markov's inequality and Lemma 4.1, and the contribution of this case to the left-hand side of (5.6) is easily seen to be acceptable for δ ≤ 1/3, which we may assume.
In the other extreme, if Log y ≥ A^{100}, then we may use the crude bound A^{1/2} e^{f_A(y)/2} together with Lemma 4.1 again, and one can check here too that the corresponding contribution to the left-hand side of (5.6) is acceptable if δ ≤ 1/20, which we may assume.
In conclusion, in order to prove (5.6), it will suffice to establish a bound of the form (5.7) in the remaining range A^{0.01} ≤ Log y ≤ A^{100}. This essentially follows from work of Norton [20] (see also [13, Theorems 08 and 09]). We give the details below.
We have τ(n) = 2^{ω(n)}, and thus τ(n) ≥ cAe^{−f_A(y)} Log y if, and only if, ω(n) ≥ k_y := ⌈log(cAe^{−f_A(y)} Log y)/log 2⌉. In addition, for each k ∈ ℤ_{≥0} we have ∑_{n∈S_{<y}: ω(n)=k} 1/n ≤ (1/k!)(∑_{p<y} 1/p)^k ≤ (1/k!)(Log_2 y + C)^k for some constant C > 0, by Mertens' theorem [15, Theorem 3.4(b)]. Notice that k_y ≥ 1.1(Log_2 y + C), which implies that the quantities (1/k!)(Log_2 y + C)^k decay at least exponentially fast for k ≥ k_y. We thus conclude that ∑_{n∈S_{<y}: ω(n)≥k_y} 1/n ≪ (1/k_y!)(Log_2 y + C)^{k_y}. By Stirling's formula and the bounds k_y ≍ Log_2 y ≍ Log A, we then have (5.8) ∑_{n∈S_{<y}: ω(n)≥k_y} 1/n ≪ (Log A)^{O(1)} exp(−Q(t_y) Log_2 y), where t_y := k_y/Log_2 y and Q(t) := t log t − t + 1. Observe that t_y ∈ [1.1, 150] when A^{0.01} ≤ Log y ≤ A^{100}, δ ≤ 1/5 and A is large enough. Now, note that (5.9) t_y = (Log A − f_A(y) + Log_2 y + O(1))/((log 2) Log_2 y). In addition, we have 0 ≤ f_A(y) ≤ 100δ |Log_2 y − (Log A)/(log 4 − 1)|, and thus t_y − 2 has the same order of magnitude as (Log A − (log 4 − 1) Log_2 y)/Log_2 y if δ is small enough and A is large enough. We shall now use Taylor's theorem to approximate Q(t_y) by Q(2). Since t_y ∈ [1.1, 150], there must exist some ξ ∈ [1.1, 150] such that Q(t_y) = Q(2) + Q′(2)(t_y − 2) + (1/2) Q″(ξ)(t_y − 2)². We have Q′(2) = log 2 and Q″(ξ) = 1/ξ ≥ 1/150. We then use (5.10) to obtain a lower bound on (t_y − 2)², and subsequently (5.9) to estimate t_y − 2. In conclusion, we obtain a lower bound for Q(t_y) that suffices, as long as δ is small enough. Inserting this estimate into (5.8) completes the proof of (5.7), and thus of the proposition.
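For reference, the rate function Q and the Taylor data used above can be collected as follows (a short summary, assuming the standard normalization from Norton-type large-deviation estimates):

```latex
Q(t) := t \log t - t + 1, \qquad Q'(t) = \log t, \qquad Q''(t) = \tfrac{1}{t},
% so that at the critical value t = 2,
Q(2) = 2\log 2 - 1 = \log 4 - 1, \qquad Q'(2) = \log 2,
% and Taylor's theorem gives, for some \xi between t_y and 2,
Q(t_y) = (\log 4 - 1) + (\log 2)(t_y - 2) + \frac{(t_y - 2)^2}{2\xi}.
```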

THE KEY MOMENT ESTIMATE
For inductive purposes, we will need to introduce quantities m_{q,A} depending on the parameters C_0, A, q. In terms of these quantities, we shall then define S^{q,A}_{<x} to be the set of all integers n ∈ S^A_{<x} such that (6.1) M_a(n)/τ(n) ≤ m_{a,A} for all a = 1, 2, …, q.
Observe that M_1(n) = τ(n), and thus the above inequality is trivially satisfied when a = 1, as long as we ensure that m_{1,A} ≥ 1.
In particular, (6.2) S^{1,A}_{<x} = S^A_{<x}. Clearly we have the inclusions S^{q,A}_{<x} ⊆ S^{q−1,A}_{<x} ⊆ ⋯ ⊆ S^{1,A}_{<x} = S^A_{<x}. In addition, from (3.5) we have M_a(np) ≥ 2 M_a(n) = (τ(np)/τ(n)) M_a(n) whenever p is a prime, n is coprime to p, and a ≥ 1. In particular, M_a(n_{<y})/τ(n_{<y}) is a nondecreasing function of y, and thus M_a(n_{<y})/τ(n_{<y}) ≤ m_{a,A} for a = 1, 2, …, q and y ∈ [1, x].
In other words, we have that (6.3) n_{<y} ∈ S^{q,A}_{<y} whenever n ∈ S^{q,A}_{<x} and y ∈ [1, x].
We shall choose (6.4) m_{q,A} := (C_0 A)^{q−1} q! (Log A)^{(q−1+⌊q/2⌋)/2}, where C_0 is a large enough constant to be determined. We now show that our choice satisfies certain properties.
Lemma 6.1 (The recursive upper bound). The following properties hold, with all implied constants independent of q, A and C_0: (i) One has m_{1,A} ≥ 1, m_{2,A} ≫ A Log A, and m_{q,A} ≫ (C_0 A/3)^{q−1} q^q.
(ii) For any q ≥ 3, one has the corresponding splitting bound for the binomially weighted sums of the products m_{a,A} m_{b,A} over a + b = q that we will require in (6.9) below. (iii) For any q ≥ 1, one has m_{q,A} ≤ (C_0 A)^{q−1} q^q (Log A)^{3q/4}.

Proof. The claims (i) and (iii) are clear from (6.4) (bounding q! ≤ q^q and q − 1 + ⌊q/2⌋ ≤ 3q/2). For (ii), we calculate the relevant sum directly from (6.4). Noticing that a + b = q, that ⌊a/2⌋ + ⌊b/2⌋ ≤ ⌊q/2⌋, and that a² ≍ q², the claim follows from the summability of the resulting series.

We now prove the following key moment estimate. In its proof, we shall only use the three properties of the parameters m_{q,A} given in Lemma 6.1. We may thus think of these properties as the only axioms our parameters need to satisfy.

Proposition 6.2 (Key moment estimate). Suppose that C_0 ≥ 1 is a sufficiently large constant, and A ≥ 1. Then for any q ≥ 2 and x > 1 we have the bound (6.5) ∑_{n∈S^{q−1,A}_{<x}} M_q(n)/(τ(n) n) ≤ C_0 q² A^{−1} m_{q,A} Log x.

Proof. We induct on q, assuming that the claim has already been proven for all smaller values of q (this assumption is vacuous for q = 2). We fix A and introduce the notation T_q(x) := ∑_{n∈S^{q−1,A}_{<x}} M_q(n)/(τ(n) n). Every natural number n ∈ S^{q−1,A}_{<x} other than 1 is expressible in the form n = pm, with p < x a prime not dividing m and m ∈ S^{q−1,A}_{<p} (here we use (6.3), taking p to be the largest prime factor of n). Thus M_q(n) = M_q(pm) may be expanded via (3.5).

Applying (3.11), we conclude a recursive inequality for T_q, in which the cross terms are collected in the quantity (6.6) Q_q(y) := ∑_{a+b=q, a≥b≥1} (q!/(a! b!)) ∑_{m∈S^{q−1,A}_{<y}} (1/(τ(m) m)) ∫_ℝ Δ(m; u)^a Δ(m; u − log y)^b du. We can iterate this inequality in the obvious fashion to arrive at a bound in terms of sums over integers with prescribed least prime factor, where P⁻(n) denotes the least prime factor of n, with the convention that P⁻(1) = +∞. Note that ∑_{n∈S_{<x}: P⁻(n)>p_0} 1/n ≪ Log x/Log p_0 for any prime p_0 < x, and thus (6.7) T_q(x) ≪ Log x (1 + ∑_{p<x} Q_q(p)/(p Log p)).

We now turn to the estimation of Q_q(x). Recall its definition in (6.6). Note that if n ∈ S^{q−1,A}_{<p}, then n ∈ S^{q−1,A}_{<y} for all y ∈ [p, p²], because n_{<y} = n_{<p} for all such values of y and the function w ↦ e^{−f_A(w)} Log w is increasing. Since ∫_p^{p²} dy/(y Log y) ≍ 1, we conclude that Q_q(p) is bounded by a corresponding average over y ∈ [p, p²], thanks to the triangle inequality in L^b (the proof of inequality (6.8) goes back to Maier and Tenenbaum [17]). Combining all these estimates, we obtain the bound (6.9). At this point we split our analysis into the base case q = 2 and the inductive case q > 2.
Base case q = 2. We must then have a = b = 1. Since M_1(n) = τ(n) and S^{1,A}_{<x} = S^A_{<x} (cf. (6.2)), the bound (6.9) simplifies considerably. On the one hand, we may bound the complete logarithmic sum over m by Mertens' theorem; on the other hand, (5.2) and Lemma 4.1 bound the contribution of the cross terms. Consequently, by (6.7), dividing the summation over p into the ranges Log p ≤ A^{50} and Log p > A^{50} and using Mertens' theorem once more, we conclude an estimate for T_2(x) that is acceptable thanks to Lemma 6.1(ii). Thus the claim (6.5) follows for C_0 large enough. This concludes the treatment of the base case q = 2.
Inductive case q > 2. We first handle the lower-order term R_q(x) appearing in (6.9). We crudely use Hölder's inequality, and since ∑_{a+b=q} (q!/(a! b!)) 2^b = 3^q, we conclude that R_q(x) ≪ 3^q A^{q−2} ∫_1^∞ (Log y)^q dy/y^{5/4} = 3^q A^{q−2} · 4^{q+1} q! ≤ 12^{q+1} q^q A^{q−2}, as can be seen by the change of variables y = e^{4u}. Inserting this into (6.9), we conclude that (6.10) Q_q(x) ≪ 12^q q^q A^{q−2} + Q′_q(x), where Q′_q(x) denotes the contribution of the pairs (a, b) with a, b ≥ 1 and b ≤ q/2. Applying successively (6.1) and (5.2), we find that the contribution of each such pair is controlled by an integral of the shape ∫ T_a(y) dy/(y Log y).
Since q > 2, a + b = q, and 1 ≤ b ≤ q/2, we have 2 ≤ a < q, and hence, by the induction hypothesis, T_a(y) ≤ C_0 a² A^{−1} m_{a,A} Log y. Since a ≥ q/2, we have a² ≥ q²/4. As a consequence, we may sum the contributions of all such pairs (a, b) and apply Lemma 6.1(ii). We then make the change of variables y = e^{e^t}, using (5.3) with δ small enough to check that the function t − f_A(exp(exp(t))) is piecewise differentiable with derivative bounded from below by an absolute positive constant. Together with (6.10), and after inserting the resulting bound into (6.7) and using Mertens' theorem, we conclude that (6.11) T_q(x) ≪ 12^q q^q A^{q−2} Log x + (q²/(A (Log A)^{1/2})) m_{q,A} Log x ∑_{p<x} e^{−f_A(p)}/p, where we used that the sum ∑_p 1/(p log p) converges. Finally, we break up the sum ∑_{p<x} e^{−f_A(p)}/p on the right-hand side of (6.11) into the ranges j ≤ Log_2 p < j + 1 with j ∈ ℤ_{≥0}.
For each fixed j, we have f_A(p) = f_A(exp(exp(j))) + O(1), as well as ∑_{j≤Log_2 p<j+1} 1/p ≪ 1 by Mertens' theorem, so that ∑_{p<x} e^{−f_A(p)}/p ≪ ∑_{j≥0} e^{−f_A(exp(exp(j)))} ≪ (Log A)^{1/2}, by the definition of f_A (cf. (5.3)). Hence, using Lemma 6.1(i), we conclude (for C_0 large enough) that T_q(x) ≤ C_0 q² A^{−1} m_{q,A} Log x.
This completes the proof of the proposition.
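The change of variables y = e^{4u} behind the identity ∫_1^∞ (Log y)^q y^{−5/4} dy = 4^{q+1} q!, used for the lower-order term R_q(x), can be sanity-checked numerically (step size and cutoff below are ad hoc):

```python
import math

def integral_log_power(q, steps=200_000, cutoff=200.0):
    """Numerically evaluate the integral of (log y)^q * y^(-5/4) over [1, oo).

    After the change of variables y = exp(4u), the integrand becomes
    (4u)^q e^{-5u} * 4 e^{4u} = 4^(q+1) u^q e^{-u}, which we integrate over
    [0, cutoff] by the midpoint rule (the tail beyond the cutoff is negligible).
    """
    h = cutoff / steps
    return sum(4 ** (q + 1) * ((i + 0.5) * h) ** q * math.exp(-(i + 0.5) * h) * h
               for i in range(steps))
```

For instance, for q = 2 the closed form gives 4³ · 2! = 128, which the quadrature reproduces.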

CLOSING THE ARGUMENT
Henceforth we fix C_0 so that Proposition 6.2 applies, and allow implied constants to depend on C_0. We may assume that A is large, as the estimate is trivial otherwise. Our task is now to bound the logarithmic sum of Δ over S_{<x}. From Proposition 5.1 and relation (6.2), we have (7.1) ∑_{n∈S_{<x}\S^{1,A}_{<x}} 1/n ≪ Log x/A. Also, from (6.1), Proposition 6.2, and Markov's inequality, we have for all j ≥ 2 that (7.2) ∑_{n∈S^{j−1,A}_{<x}\S^{j,A}_{<x}} 1/n ≤ (1/m_{j,A}) ∑_{n∈S^{j−1,A}_{<x}} M_j(n)/(τ(n) n). Summing (7.1) and (7.2) for j = 2, …, q, we conclude that ∑_{n∈S_{<x}\S^{q,A}_{<x}} 1/n ≪ Log x/A for all q ∈ ℕ.
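The Markov step behind (7.2) can be spelled out as follows; this is our reading of the argument, with (6.1) as the definition of the sets S^{j,A}_{<x}:

```latex
% If n \in S^{j-1,A}_{<x} \setminus S^{j,A}_{<x}, then M_j(n)/\tau(n) > m_{j,A} by (6.1), so
\sum_{n \in S^{j-1,A}_{<x} \setminus S^{j,A}_{<x}} \frac{1}{n}
  \;\le\; \frac{1}{m_{j,A}} \sum_{n \in S^{j-1,A}_{<x}} \frac{M_j(n)}{\tau(n)\, n},
% and the right-hand side is exactly the moment sum controlled by Proposition 6.2.
```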
The corollary will then follow if we can show that there exists q ∈ ℕ such that (7.3) Δ(n) < λ Log_2 x for all n ∈ S^{q,A}_{<x}.
