Large sums of high‐order characters

Let $\chi$ be a primitive character modulo a prime $q$, and let $\delta > 0$. It has previously been observed that if $\chi$ has large order $d \geqslant d_0(\delta)$ then $\chi (n) \ne 1$ for some $n \leqslant q^{\delta}$, in analogy with Vinogradov's conjecture on quadratic non‐residues. We give a new and simple proof of this fact. We show, furthermore, that if $d$ is squarefree then for any $d$th root of unity $\alpha$ the number of $n \leqslant x$ such that $\chi (n) = \alpha$ is $o_{d \rightarrow \infty}(x)$ whenever $x > q^\delta$. Consequently, when $\chi$ has sufficiently large order the sequence $(\chi (n))_{n \leqslant q^\delta}$ cannot cluster near $1$ for any $\delta > 0$. Our proof relies on a second moment estimate for short sums of the characters $\chi ^\ell$, averaged over $1 \leqslant \ell \leqslant d-1$, that is non‐trivial whenever $d$ has no small prime factors. In particular, given any $\delta > 0$ we show that for all but $o(d)$ powers $1 \leqslant \ell \leqslant d-1$, the partial sums of $\chi ^\ell$ exhibit cancellation in intervals $n \leqslant q^\delta$ as long as $d \geqslant d_0(\delta)$ is prime, going beyond Burgess' theorem. Our argument blends together results from pretentious number theory and additive combinatorics. Finally, we show that, uniformly over prime $3 \leqslant d \leqslant q-1$, the Pólya–Vinogradov inequality may be improved for $\chi ^\ell$ on average over $1 \leqslant \ell \leqslant d-1$, extending work of Granville and Soundararajan.


Introduction and Main Results
1.1. Background. Understanding the value distribution of Dirichlet characters is a central theme in analytic number theory. An old and famous conjecture of I. M. Vinogradov predicts that the least quadratic non-residue $n_p$ modulo a prime $p$ satisfies $n_p \ll_\delta p^\delta$ as $p \to \infty$ for any $\delta > 0$. This conjecture, relating to negative values of the Legendre symbol $\left(\frac{\cdot}{p}\right)$, may be generalized to other primitive Dirichlet characters. One can ask whether the least integer $n_\chi$ for which a primitive character $\chi \pmod{q}$ yields $\chi(n) \neq 0, 1$ satisfies $n_\chi \ll_\delta q^\delta$ as $q \to \infty$ for any $\delta > 0$. These problems may be recast in terms of cancellation in short character sums. For the Legendre symbol modulo $p$ it is a folklore conjecture going beyond Vinogradov's that $\sum_{n \le x} \left(\frac{n}{p}\right) = o_{p \to \infty}(x)$ whenever $x > p^\delta$ for any $\delta > 0$, and more generally for any primitive Dirichlet character $\chi \pmod{q}$ we expect that (1) $\sum_{n \le x} \chi(n) = o_{q \to \infty}(x)$ whenever $x > q^\delta$ for any $\delta > 0$.
Currently, the best general result towards (1), due to Burgess [3], allows for such cancellation (at least for $q$ cube-free) whenever $\delta > 1/4$, improved to $\delta \ge 1/4$ for prime $q$ by Hildebrand [17]. However, it is a notoriously difficult problem to extend the zero-free regions of Dirichlet L-functions for individual characters. It is therefore desirable to determine other sufficient criteria on a primitive character that guarantee cancellation in its partial sums in a range going beyond Burgess' theorem. Both of the above conjectures are well-known to hold (in a much stronger form) under the assumption of the Generalized Riemann Hypothesis (GRH), but should even hold assuming far less. Indeed, Granville and Soundararajan [14, Cor. 1.2] showed that (1) holds as long as $L(s, \chi)$ has sufficiently few zeros in certain small rectangles near the line $\mathrm{Re}(s) = 1$, a condition that is easily implied for typical characters $\chi$ by classical zero density estimates.
As has been elaborated upon in various works (see e.g., [2], [6], [21] and [10]), there is also a relationship between such questions about short character sums and corresponding estimates for maximal character sums, in particular regarding improvements to the Pólya-Vinogradov inequality, which asserts that any non-principal character modulo $q$ satisfies (2) $M(\chi) := \max_{1 \le t \le q} \left| \sum_{n \le t} \chi(n) \right| \ll \sqrt{q} \log q$.
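As a numerical illustration of (2), the following Python sketch builds one character of order $d$ modulo a small prime $q$ and computes $M(\chi)$ by brute force. The function names, the brute-force primitive root search and the normalisation $\chi(g^k) = e(k/d)$ for a primitive root $g$ are choices made here for illustration; they are assumptions of this sketch and do not come from the paper.

```python
import cmath
import math

def primitive_root(q):
    """Smallest primitive root modulo an odd prime q (brute force, small q only)."""
    m, factors, p = q - 1, set(), 2
    while p * p <= m:                 # collect the prime factors of q - 1
        while m % p == 0:
            factors.add(p)
            m //= p
        p += 1
    if m > 1:
        factors.add(m)
    for g in range(2, q):
        if all(pow(g, (q - 1) // r, q) != 1 for r in factors):
            return g
    raise ValueError("no primitive root found; q should be an odd prime")

def character_values(q, d):
    """List [chi(0), ..., chi(q-1)] for the character with chi(g^k) = e(k/d)."""
    if (q - 1) % d != 0:
        raise ValueError("d must divide q - 1")
    g = primitive_root(q)
    chi = [0j] * q                    # chi(0) = 0
    x = 1
    for k in range(q - 1):
        chi[x] = cmath.exp(2j * math.pi * (k % d) / d)
        x = x * g % q
    return chi

def max_partial_sum(chi):
    """M(chi): the largest absolute value of a partial sum over n <= t, t < q."""
    running, best = 0j, 0.0
    for value in chi[1:]:
        running += value
        best = max(best, abs(running))
    return best
```

For example, `max_partial_sum(character_values(101, 5))` can be compared directly with `math.sqrt(101) * math.log(101)`.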
Montgomery and Vaughan [22] showed that, assuming GRH, the factor $\log q$ in (2) may be improved to $\log \log q$. This is sharp (up to the implicit constant) for quadratic characters according to a construction due to Paley [25]. A general, unconditional improvement to the Pólya-Vinogradov inequality has, however, resisted proof for over a century. Characters of various fixed orders $d \ge 2$ have been considered in this connection in a number of works, see e.g. [13] and [8]. Building on a breakthrough by Granville-Soundararajan [13], Goldmakher [7] showed that for characters with fixed odd order $d$ the estimate (2) may be improved unconditionally to (3) $M(\chi) \ll \sqrt{q} (\log q)^{1 - \delta(d) + o(1)}$, with $\delta(d) = 1 - \frac{d}{\pi} \sin(\pi/d) > 0$ (see [20] for more precise results). On the other hand, unconditional improvements to (2) for fixed even order characters seem to provide a logjam to generally improving (2). Once again, it would be valuable to determine other collections of primitive characters for which one may improve the Pólya-Vinogradov inequality.
In this paper, we will study the relationship between cancellation in short and maximal character sums and the size of the order of a character $\chi \pmod{q}$, i.e., the minimal positive integer $d$ such that $\chi^d$ is principal.
1.2. Motivating Questions. In order to obtain improvements in estimates for short and maximal sums of a primitive character $\chi \pmod{q}$, we must rule out the heuristic possibility that $\chi(p) = 1$ for all but very few $p \le q^\delta$. Since $\chi$ is multiplicative, this would further imply that $\chi(n) = 1$ for many integers $n \le q^\delta$.
In order to preclude this possibility, therefore, we would like to quantify the frequency with which a character takes the value 1 at integers $n$ up to some small threshold $x$. Note that this question is more refined than simply trying to bound $n_\chi$.
In this paper, we shall study the value distribution of characters $\chi$ of large order (in a sense to be made precise shortly). If $\chi$ has modulus $q$ then the order $d$ of $\chi$ divides $\phi(q)$, and most such divisors grow as a function of $q$. Thus, the collection of such characters is rather substantial. Moreover, as the elementary results in Section 2 below show, such characters do exhibit some variation in their values $\chi(n)$ for small $n$, which might suggest that their short and maximal sums also exhibit cancellation.
The role of a character's order in its value distribution has previously been considered in the work of K. K. Norton on small upper bounds for $n_\chi$ (see [23] and especially [24], which contains a substantial survey on the topic), and of Granville and Soundararajan [11] on upper bounds for $|L(1, \chi)|$. To our knowledge, however, questions surrounding especially the paucity of solutions to $\chi(n) = 1$ and distribution of values $\chi(n) \neq 1$ for small $n$ have not appeared previously in the literature, in particular when the order of the character is allowed to grow as a function of its modulus.
In this direction we pose the following three motivating questions:
Question 1. Fix $\delta \in (0, 1)$. If $\chi$ is a primitive character modulo $q$ with order $d = d(q)$ growing with $q$, how few solutions $n \le q^\delta$ are there to $\chi(n) = 1$, or more generally to $\chi(n) = \alpha$ when $\alpha^d = 1$?
Question 2. If $\chi$ is a character as in the previous question, can it be shown that for any fixed $\delta \in (0, 1)$, $\sum_{n \le x} \chi(n) = o_{d \to \infty}(x)$ for all $x > q^\delta$?
Question 3. For a character $\chi$ as in the previous two questions, can it be shown that $M(\chi) = o_{d \to \infty}(\sqrt{q} \log q)$?
The rationale for these questions is the following: if, say, $\chi$ is primitive modulo a prime $q$ of order $d$ then $\{\chi(n)\}_{n<q}$ equidistributes among the $d$th order roots of unity (by orthogonality of Dirichlet characters). If $d$ is large then one expects that the level sets should become sparse, i.e., of size $o(x)$, even when $x$ is rather small relative to $q$. Naturally, we would like to understand how quickly this can occur (i.e., how small can $x$ be for this to happen). The variation in the values of $\chi(n)$ also suggests the possibility that the short and maximal sums of $\chi$ might exhibit cancellation as well.
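The equidistribution over the full range $n < q$ can be checked directly for small moduli. The Python sketch below counts, for each $j$, the integers $n \le x$ with $\chi(n) = e(j/d)$; it assumes the concrete character $\chi(g^k) = e(k/d)$ for a primitive root $g$, and the helper names are hypothetical ones introduced only for this illustration.

```python
from collections import Counter

def generator(q):
    """A primitive root modulo a small prime q, found by brute force."""
    for g in range(1, q):
        if len({pow(g, k, q) for k in range(q - 1)}) == q - 1:
            return g
    raise ValueError("q must be prime")

def level_set_counts(q, d, x):
    """Counter mapping j to #{n <= x : chi(n) = e(j/d)}, where chi(g^k) = e(k/d)."""
    if (q - 1) % d != 0:
        raise ValueError("d must divide q - 1")
    g = generator(q)
    index = {pow(g, k, q): k for k in range(q - 1)}   # discrete logarithm table
    return Counter(index[n] % d for n in range(1, min(x, q - 1) + 1))
```

Taking $x = q - 1$ gives every level set exactly $(q-1)/d$ elements, whereas for small $x$ the counts produced by `level_set_counts` need not be balanced; the questions above concern how small $x$ can be taken before each level set has size $o(x)$.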
1.3. Main Results. Our main results address each of the three questions above. In the interest of clarity we defer relevant remarks about the theorems below to Section 1.5. Our first main result addresses Question 1, provided $q$ is prime and $d$ is squarefree. See Remarks 2 and 3 below regarding the necessity of these assumptions. (Unless indicated otherwise, all implicit constants in this paper are absolute.)
Theorem 1. Let $q$ be a large prime, let $d \ge 2$ be squarefree with $d \mid (q - 1)$ and let $\chi$ be a primitive character modulo $q$ of order $d$. Then there is an absolute constant $c_1 > 0$ such that if , $(\log q)^{-c_1}$ and $x > q^\delta$ then for any
Our second main theorem is a mean-square estimate for short sums of order $d$ characters, showing that for typical $1 \le \ell \le d - 1$ the partial sums of $\chi^\ell$ over intervals $[1, q^\delta]$ exhibit cancellation for any fixed $\delta > 0$, as long as $d$ has no small prime factors. This shows that Question 2 has a positive answer for almost all powers $\chi^\ell$ of $\chi$. See Remark 4 for a discussion of the strength of these bounds.
Theorem 2. Let $q$ be a large prime, let $d \ge 2$ with $d \mid (q - 1)$ and let $\chi$ be a primitive character modulo $q$ of order $d$. Define² $G = G(d) := \min\{P^-(d), \log \log(ed)\}$. Then there is an absolute constant $c_2 > 0$ such that if , $(\log q)^{-c_2}$ and $x > q^\delta$ then
Our third main theorem gives an upper bound for the average size of $M(\chi^\ell)$ with $1 \le \ell \le d - 1$ that improves on the Pólya-Vinogradov inequality for any $d \mid (q - 1)$ having no small prime factors. This addresses Question 3, again for almost all powers $\chi^\ell$ of $\chi$.
Theorem 3. Let $q$ be a large prime, let $d \ge 2$ with $d \mid (q - 1)$ and let $\chi$ be a primitive character modulo $q$ of order $d$. Then .
Combined with a (slight extension of a) result of Granville and Soundararajan on characters of fixed odd order [13, Thm. 2], this gives the following bound, which provides a uniform saving for all prime $d$, even when $d = d(q) \to \infty$. See Remark 5 for a discussion of the novelty of this result.
Corollary 4. Let $q$ be a large prime. Then, uniformly over all primitive characters $\chi$ modulo $q$ of prime order $d \ge 3$ with $d \mid (q - 1)$,
1.4. More precise results. Theorems 1 and 2 may be extended to a larger collection of completely multiplicative functions whose non-zero values are roots of unity of a large order. Considerations of such a general (but slightly different) nature arose also in [11].
Definition 1. Let $d \in \mathbb{N}$ and $x_0 \ge 1$. We say that a completely multiplicative function $f : \mathbb{N} \to \mu_d \cup \{0\}$ weakly equidistributes beyond (a scale) We define $M(x_0; d)$ to be the collection of all such completely multiplicative functions. For $f \in M(x_0; d)$ and $x \ge x_0$ we write and we call $x_0$ the threshold of $f$. We define $M_1(x_0; d)$ to be the subcollection of those $f \in M(x_0; d)$ for which (4)
For example, let $\chi$ be a primitive character modulo $q$ prime and order $d \mid (q - 1)$. By (2), we have with $x > q^{3/2+\varepsilon}$ uniformly over $1 \le \ell \le d - 1$ and $d \mid (q - 1)$. Thus, by the Weyl criterion, max Thus, $\chi \in M_1(q^\theta; d)$ for any $\theta > 3/2$. Our first general theorem shows that for any fixed $\delta > 0$ and any⁴ $d$ large enough relative to $\delta$, if $f \in M_1(x_0; d)$ then $f(p) \neq 0, 1$ for many primes $p \le x_0^\delta$.
Theorem 1.1. Let $x_0 \ge 3$ be large and let $\eta, \delta \in (0, 1)$. Then there are absolute constants $c_3 \in (0, 1)$ and
³ The constant 100 here could be replaced by any fixed constant, and is merely chosen for concreteness.
⁴ In contrast to our main theorems, this result makes no assumptions on the arithmetic nature of $d$.
⁵ Here and elsewhere, $\rho : [0, \infty) \to \mathbb{R}$ denotes the Dickman-de Bruijn function, defined uniquely by the initial condition See [27, Sec. III.5.3-III.5.4] for an account of some of its useful properties.
(b) If $f$ is a primitive Dirichlet character modulo a large prime $q$ and $d \mid (q - 1)$, and $\eta\delta \ge (\log q)^{-c}$ for some $0 < c < c_3$, then the upper bound constraint in (5) may be removed.
Our second general theorem gives a mean-square estimate for short sums of powers $f^\ell$, for $f \in M(x_0; d)$. It shows that if $f(p) \neq 0, 1$ sufficiently often and $d$ has no small prime factors then the short sums of $f^\ell$ exhibit cancellation for most 1 Then In Section 3 we will combine Theorems 1.1 and 1.2 to deduce the following corollary.
1.5. Remarks on the results.
Remark 1. Underlying the results above is the commonly-exploited strategy that while information about individual characters is usually difficult to ascertain, it is often possible to make progress on average over a family of characters. When $d$ is a large prime, for instance, the characters $\{\chi^\ell\}_{1 \le \ell \le d-1}$ all have exact degree $d$, and this collection, though thin, is large enough for averaging techniques to be effective. Fortunately, since these powers are all generated by $\chi$ we are able to use this average information to elucidate some properties of $\chi$, e.g., Theorem 1.
Remark 2. The condition that $q$ is prime in our main theorems is mainly a convenience that ensures that (4) holds, and hence $\chi \in M_1(x_0; d)$ for an appropriate scale $x_0$. As such, Theorems 1.1 and 1.2 can be applied to $\chi$. However, the bound (4) is only used in the proof of Theorem 1.1, and could be removed at the expense of replacing the quantity $\rho\left(\frac{C_2}{\eta\delta}\right)$ by $\rho\left(\frac{q}{\phi(q)} \frac{C_2}{\eta\delta}\right)$ in (5). As a result, we may equally well extend Corollary 1.3 to a collection of moduli $q$ with uniformly bounded sums $\sum_{p \mid q} p^{-1}$. Note in this connection that the trivial bound valid for any $x > q^\delta$ and $\delta > 0$ fixed, shows that if $\sum_{p \mid q} p^{-1}$ is unbounded as a function of $q$, in contrast, then we can trivially answer Question 2 in the affirmative.
Remark 3. Our requirement that $d$ be squarefree in Theorem 1 and Corollary 1.3 is needed to ensure that when $d$ is large so are most of its prime factors. As such, the group $\mu_d$ does not have "too many" small subgroups. Morally, this prevents the situation in which $\chi(n)$ has order much smaller than $d$ for many $n$, which would lead to much repetition in the sequence $(\chi(n))_n$. In place of the squarefreeness of $d$, it would be sufficient to assume that holds for individual character sums modulo $q$. Even unconditionally, if we average over all characters $\chi \pmod{q}$ (i.e., the case $d = q - 1$) then far stronger results than Theorem 2 can be proved. In particular, Harper [16] has recently shown using the theory of random multiplicative functions that We might expect by analogy that at least in some range of $x \in [1, q]$. It would be interesting to understand whether Harper's tools (suitably adapted to treat random multiplicative functions taking uniformly distributed values in $\mu_d$) could be used to study the average (6), especially when $d$ grows only slowly with $q$.
Remark 5. Note that Theorem 3 is only non-trivial when $d$ has no small prime factors, and is therefore odd. In view of (3), Theorem 3 is thus much weaker than existing results when $d$ is slowly growing. However, note that the exponent ) is no stronger than (2) as soon as⁶ $d \gg \sqrt{\log \log q}$. Theorem 3 and Corollary 4 are therefore new in the range $d \gg \sqrt{\log \log q}$. It is worth noting that Theorem 3 is also related to (though not implied by) [11, Thm. 3], where, for slowly-growing $d$, an upper bound for the geometric mean of the related quantities is obtained that goes beyond the Pólya-Vinogradov bound. The estimate there does not, however, extend uniformly to the full range of $d$ considered here.
Remark 6. We have made no attempt to optimize any of the exponents in Theorems 1 or 2, and we do not believe our results to be best possible.
1.6. Plan of the paper. The paper is structured as follows.
In Section 2 we give context for our main theorems by providing elementary proofs of two results, Propositions 2.1 and 2.3. These results show how assuming $d$ is large helps in finding small $n$ with $\chi(n)$ (at times significantly) different from 1 in value. In Section 3.1 we give a brief review of pretentious number theory, then in Section 3.2 we deduce Corollary 1.3 and Theorem 1 from the more general Theorems 1.1 and 1.2. As Theorem 1.2 is the more novel and involved of the two theorems, we provide a sketch of the proof of that theorem in Section 3.3. The proof itself appears in Section 4. In Section 5 we derive Theorem 1.1. Combining this with the work of Section 3, we then deduce Theorem 2. Finally, in Section 6 we prove Theorem 3 and Corollary 4 by combining (slightly extended) work of Granville and Soundararajan with some combinatorial observations related to sum-free sets in abelian groups. Sections 2 and 6 may be read independently of the remaining sections.
Acknowledgments. Parts of this work were completed during visits by the author to the University of Bristol, to the Institut Élie Cartan de Lorraine and to King's College London. We would like to warmly thank Bristol, IECL and KCL for their hospitality and excellent working conditions. We are most grateful to Andrew Granville, Oleksiy Klurman, Youness Lamzouri and Aled Walker for helpful discussions, references and encouragement. We also thank the anonymous referee for reading a previous version of this paper and for providing helpful comments.

Elementary Arguments Towards
Using only elementary arguments, in this section we will prove two results about large order characters that are only conjectural for characters of fixed order.This provides evidence that large order characters are easier to study than their fixed order counterparts, and motivates the investigations in the remainder of the paper.

2.1. Estimates for $n_\chi$. Let $\delta \in (0, 1)$. In this subsection we show that if $\chi$ is a primitive character modulo a prime $q$ with order $d$ sufficiently large in terms of $\delta$ then one can find solutions to $\chi(n) \neq 1$ with $n \le q^\delta$. When $\delta < 1/4$ this goes beyond what can be obtained using Burgess' theorem. Such an observation has previously been made⁷ by Norton (see e.g., [23, Thm. 6.4] or [24, Thm. 1.20]), but we give an alternative, short proof.
Proposition 2.1. Let $\delta > 0$ and let $q \ge q_0(\delta)$ be prime. If $\chi$ is a primitive character modulo $q$ of order $d > \rho(1/\delta)^{-1}$ then there is $1 \le n \le q^\delta$ with $\chi(n) \neq 1$. In particular, for some $C > 0$ absolute.
To prove this we use the following simple combinatorial lemma.
Proof. Assume the contrary, so that d By assumption, we have $t^d \equiv 1 \pmod{q}$ for all $1 \le t \le q^\delta$, and in particular for all primes $p \in [1, q^\delta]$ this congruence holds. But then since $(t_1 t_2)^d \equiv 1 \pmod{q}$ whenever $t_j^d \equiv 1 \pmod{q}$ for $j = 1, 2$, all $q^\delta$-friable⁸ integers $1 \le t \le q - 1$ also satisfy this congruence. But for large enough $q$ there are $(\rho(1/\delta) + o(1))q$ such integers up to $q$ [27, Thm. III.5.8]. Thus, if $0 < c < \rho(1/\delta)$ and $q$ is large enough then we find that there are $> cq \ge d$ solutions to the polynomial equation , where $\chi_q$ generates the character group modulo $q$ and $(\ell, d) = 1$. Setting $\chi_1 := \chi_q^{(q-1)/d}$, note that $\chi_1$ takes values in roots of unity of order $d$, and so if we can show that $\chi_1(n) \neq 1$ for some $n \le q^\delta$ then the same is true for $\chi = \chi_1^\ell$. Note that $1 \le \frac{q-1}{d} \le cq$ for some $0 < c < \rho(1/\delta)$. Now assume for the sake of contradiction that $\chi_1(n) = 1$ for all $1 \le n \le q^\delta$. Since $\chi_q$ is injective on $\mathbb{Z}/q\mathbb{Z}$ it follows that $n^{\frac{q-1}{d}} \equiv 1 \pmod{q}$ for all $1 \le n \le q^\delta$. By the previous lemma we deduce that (q − 1)/d = 0, which is a contradiction. This establishes the first claim. If $\delta$ is small enough then, using (7) ρ(u) ≫ e 2u log u u with $u = 1/\delta$ (see [9, Sec. 3.9]) we deduce that if then $d > \rho(1/\delta)^{-1}$ and we may apply the first claim. The choice $\delta := \frac{\log \log(Cd \log d)}{\log d}$ with $C > 0$ absolute and sufficiently large, furnishes the second claim.
⁷ Strictly speaking, Norton states his results as $n_\chi \ll q^{\frac{1}{4\alpha_w} + \varepsilon}$ for prime $q$ and $d \ge w$, where $\alpha = \alpha_w$ is the unique solution to $\rho(\alpha) = 1/w$. Aside from the factor 1/4 that arises from his use of Burgess' theorem, it is easy to see that the parameter choices in Proposition 2.1 correspond with his.
⁸ Given $y \ge 2$ we say that a positive integer $n$ is $y$-friable if any prime factor $p \mid n$ must satisfy $p \le y$.
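For small moduli the quantity bounded in Proposition 2.1 can be computed directly. The Python sketch below uses the fact, visible in the proof above, that a character $\chi$ of exact order $d$ modulo a prime $q$ satisfies $\chi(n) = 1$ precisely when $n^{(q-1)/d} \equiv 1 \pmod{q}$; the function name is a hypothetical one chosen for this illustration.

```python
def least_n_with_chi_not_one(q, d):
    """Least n >= 1 with chi(n) != 0, 1 for a character chi of exact order d mod prime q.

    Uses the criterion chi(n) = 1 iff n^((q-1)/d) = 1 (mod q), so no explicit
    character needs to be constructed. Returns None when d = 1.
    """
    if (q - 1) % d != 0:
        raise ValueError("d must divide q - 1")
    exponent = (q - 1) // d
    for n in range(2, q):
        if pow(n, exponent, q) != 1:
            return n
    return None
```

With $d = 2$ this returns the least quadratic non-residue modulo $q$, which is the quantity in Vinogradov's conjecture.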

2.2. Small $n$ with $\chi(n)$ bounded away from 1. We next set out to study to what extent the values $\chi(n)$ can vary for $n \le x$ and $x$ not much larger than $n_\chi$. It is possible that while $\chi(n) \neq 1$, $\chi(n)$ might still take values that are very close to 1 in this range, i.e., $\chi(n) = e(j/d)$ where⁹ $\|j/d\|$ is quite small. As a consequence, the partial sum of $\chi$ up to $x$ would not witness significant cancellation, contrary to expectations in line with the conjectured estimate (1). By an elementary argument, however, we show that this is not necessarily the case when $d$ is large. In the sequel, for $z \in S^1$ we write $\arg(z)$ to denote the element of $(-1/2, 1/2]$ for which $z = e(\arg(z))$.
Proof. Call $S_\delta$ the set of $n = mk$ as above. Note that if $n \in S_\delta$ then its representation $n = mk$ is uniquely determined. Now, for each $m \le q^{\delta/10}$ define $u_m$ to be the unique solution to $(q/m)^{1/u_m} = q^\delta$; explicitly, $u_m = \delta^{-1} \left(1 - \frac{\log m}{\log q}\right)$. Taking $\alpha = 10$ in Friedlander's theorem and using the lower bound (8), we get for large enough $q$. Since $c_0 \ge 1$, $\rho$ is a decreasing function, and $u_m \le 1/\delta$ uniformly over $m \le q^{\delta/10}$, we obtain as claimed.
⁹ Given $t \in \mathbb{R}$ we write $e(t) := e^{2\pi i t}$ and $\|t\| := \min_{n \in \mathbb{Z}} |t - n|$.
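Since the Dickman-de Bruijn function $\rho$ enters the thresholds of this section, a rough numerical evaluation may help in reading them. The Python sketch below assumes the standard characterisation $\rho(u) = 1$ for $0 \le u \le 1$ and $u\rho'(u) = -\rho(u - 1)$ for $u > 1$, and integrates it with the trapezoidal rule on a uniform grid; the function name, the step size and the rounding of $u$ to the grid are choices made here, not part of the paper.

```python
import math

def dickman_rho(u, steps_per_unit=1000):
    """Approximate rho(u) via rho(u) = rho(u - h) - int_{u-h}^{u} rho(t - 1)/t dt."""
    if u < 0:
        raise ValueError("u must be non-negative")
    n = steps_per_unit
    h = 1.0 / n
    target = round(u * n)            # grid index nearest to u
    rho = [1.0] * (n + 1)            # rho = 1 on [0, 1]
    for i in range(n + 1, target + 1):
        left = rho[i - 1 - n] / ((i - 1) * h)    # integrand at t = (i - 1) h
        right = rho[i - n] / (i * h)             # integrand at t = i h
        rho.append(rho[i - 1] - 0.5 * h * (left + right))
    return rho[target]
```

For instance, `dickman_rho(3.0)` is approximately 0.0486, so with $\delta = 1/3$ the condition $d > \rho(1/\delta)^{-1}$ of Proposition 2.1 amounts to $d \ge 21$.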
Lemma 2.5. Let $I \subseteq [0, 1]$ be an open interval with length $|I|$. If $\chi$ is a primitive character modulo prime $q$ of order $d$ then for any $K \ge 1$,
Proof. We apply the Erdős-Turán inequality [27, Thm. I.6.15]. Given $K \ge 1$, If $d \mid k$ then $\chi^k$ is principal and the inner sum is $q - 1$. Otherwise, if $d \nmid k$ then $\chi^k$ is non-principal and the sum is zero by orthogonality. This yields the upper bound and implies the claim.

Background and Proof Strategy
3.1. A Pretentious Primer. The arguments used towards the proof of our main theorems are grounded in notions of pretentious number theory, as developed by Granville and Soundararajan. Here, we give a brief overview of those ideas from that subject that will be relevant in this paper. Let $\mathbb{U} := \{z \in \mathbb{C} : |z| \le 1\}$. Given arithmetic functions $f, g : \mathbb{N} \to \mathbb{U}$ and $x \ge 2$ we define the pretentious distance between $f$ and $g$ (at scale $x$) as x), and by Mertens' theorem, $0 \le D(f, g; x)^2 \le 2 \log \log x$. This distance function also satisfies a triangle inequality: given $f, g, h : \mathbb{N} \to \mathbb{U}$ we have which implies the useful inequality (see [13, Lem. 3.1]) If $f$ and $g$ are multiplicative functions for which $D(f, g; x)^2 = o(\log \log x)$ then $f(p) \approx g(p)$ for most $p$ (in a suitable average sense), and we think of $f$ and $g$ as approximating one another. In the particular case that $D(f, g; x)$ is bounded as a function of $x$ we say that $f$ is $g$-pretentious (or, symmetrically, that $g$ is $f$-pretentious).
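To make the distance concrete, the following Python sketch evaluates it numerically. It takes as an assumption the standard Granville-Soundararajan form $D(f, g; x)^2 = \sum_{p \le x} \frac{1 - \mathrm{Re}(f(p)\overline{g(p)})}{p}$; the helper names and the sieve are illustrative choices, and functions are passed as Python callables on primes.

```python
import cmath
import math

def primes_up_to(x):
    """All primes p <= x by the sieve of Eratosthenes."""
    if x < 2:
        return []
    sieve = bytearray([1]) * (x + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, math.isqrt(x) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, x + 1, p)))
    return [p for p in range(2, x + 1) if sieve[p]]

def distance_squared(f, g, x):
    """Assumed form: sum over primes p <= x of (1 - Re(f(p) * conj(g(p)))) / p."""
    return sum(
        (1 - (complex(f(p)) * complex(g(p)).conjugate()).real) / p
        for p in primes_up_to(x)
    )

def distance(f, g, x):
    """Pretentious distance D(f, g; x) under the assumed definition."""
    return math.sqrt(max(distance_squared(f, g, x), 0.0))
```

For example, `distance(lambda p: 1, lambda p: cmath.exp(0.7j * math.log(p)), 1000)` measures how far the constant function 1 is from $n^{it}$ with $t = 0.7$ at scale $x = 1000$, and the triangle inequality can be checked numerically on any three such functions.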
The pretentious distance can be used to express upper bounds for Cesàro averages of bounded multiplicative functions. The Halász-Montgomery-Tenenbaum inequality [27, Cor. III.4.12], a quantitative refinement of fundamental work of Halász, states that for a multiplicative function $f : \mathbb{N} \to \mathbb{U}$ and parameters $x \ge 3$ and $T \ge 1$, where $M := \min_{|t| \le T} D(f, n^{it}; x)^2$. Thus, if $f$ is not $n^{it}$-pretentious for all $|t| \le T$, and $T$ is large enough, then the partial sums of $f$ are small. This result will be used several times in the sequel.
3.2. Deductions of Corollary 1.3 and Theorem 1. We show in this section that our main results on level sets, Theorem 1 and Corollary 1.3, are consequences of our general Theorems 1.1 and 1.2. Given a multiplicative function $f : \mathbb{N} \to S^1 \cup \{0\}$ and $\alpha \in S^1$ define the level set
Corollary 3.1. Assume the hypotheses and notation of Theorem 1.2. Let $\{\alpha_j\}_{1 \le j \le d}$ be an ordering of $\mu_d$ so that Then for any $J \ge 1$, In particular,
Proof of Corollary 3.1 assuming Theorem 1.2. The second claim follows from the first with $J = 1$ so it suffices to prove the first. By orthogonality modulo $d$, Next, we note that by positivity, Combining this with the previous equation and applying Theorem 1.2, we obtain as claimed.
In Section 4 we will prove Theorem 1.2.Since its proof is somewhat involved, we will explain here our strategy towards its proof.
3.3.1. Initial Setup. To prove Theorem 1.2 we will show that for a judicious choice of We will eventually show that we may take $\varepsilon \ll (\log \Sigma)/\Sigma^{1/15}$, from which the theorem follows.
In this direction, define We may suppose that 12) is verified, and so our task is reduced to understanding the case It turns out that this lower bound on $|C_d(\varepsilon)|$ puts rigid constraints on $f$. Consequently, we show that for almost all $\ell \in C_d(\varepsilon)$ we still obtain some cancellation in the partial sums of $f^\ell$, more precisely For this to be the case, according to (11) it would be sufficient to show that the minimal distances (13) grow as a function of $1/\varepsilon$ for all but $o_{\varepsilon \to 0}(d)$ powers $\ell$. We endeavour to verify this type of condition in the sequel.

3.3.2. Proving the theorem assuming $t_\ell \equiv 0$. Our task turns out to be significantly simplified if we can show, roughly speaking, that $t_\ell$ may be replaced by 0, or more precisely (14) Let us assume this is the case for the moment. Then we may bound the partial sums of $f^\ell$ in terms of the level sets Note that while we know nothing about the sizes of the individual $\sigma_j$, we do know that their sum satisfies S f (x) := 1≤j≤d−1 We heuristically expect that the (non-zero) prime values $f(p)$ with $p \le x$ are uniformly distributed in $\mu_d$, so that each $\sigma_j$ should be of roughly the same size $\sigma_j \approx S_f(x)/d$. In particular, $\sigma_j$ should be small relative to $S_f(x)$ as $d \to \infty$ for every $j$. However, we do not know that this is the case in practice.
As is reflected in the bound in Theorem 1.2, we instead seek lower bounds for Using a simple Fourier analytic argument we are able to show (Lemma 4.4) that if the prime factors of d are all large in terms of ε, and each σ j satisfies σ j < εS f (x) then for most 1 Thus using 1 − cos(2πx) ≥ 8 x 2 in (15), for most ℓ we obtain As S f (x) ≥ Σ, (11) then yields an estimate of the shape for some c > 0. This bound is more than sufficient.The possibility remains that some σ j0 is large in the sense that σ j0 ≥ εS f (x).We show in this case (see Proposition 4.5) that if |C d (ε)| > ε 2 d and ℓ ∈ C d (ε) then as long as ℓj 0 /d (mod 1) is ≫ ε 1, thus bounded away from zero, we still obtain To prove this we use the Turán-Kubilius inequality [27, Thm.III.3.1] and the complete multiplicativity of f to obtain a decomposition Since the normalized partial sums y → y −1 n≤y g(n) of a multiplicative function g are known to be slowly-varying 10 with y, we show roughly speaking that uniformly in the range p ≤ x o (1) .Combined with the decomposition above, this leads to an estimate of the shape 1 It is not hard to show (e.g. using the Erdős-Turán inequality) that |e(ℓj x uniformly in ℓ.To motivate this, suppose for convenience that 1 ∈ C d (ε).Then f is n it1 -pretentious for some |t 1 | ≪ ε −2 .Now assume instead that |t 1 | were bounded away from 0. 
Then f (p) ≈ p it1 for typical primes p, and (at least for ℓ not too large, see Remark 8), f ℓ should be n iℓt1 pretentious and t ℓ ≈ ℓt 1 .Now if ℓ ∈ C d (ε) then f ℓ must have large partial sums as well.On the other hand, it can be shown (see (23) below) that Thus, ℓ|t 1 | cannot be large.However, |C d (ε)| contains many large values of ℓ, thus |t 1 | itself must be quite small.Unfortunately this argument is too simplistic, as when ℓ is large the powers (f (p)p −it1 ) ℓ may be significantly different from 1 even if the values f (p)p −it1 typically are not.To make it rigorous we appeal to the theory of sumset arithmetic in additive combinatorics.Using an inverse sumset result due to Freiman [4], we show that if d has no small prime factors then every 0 where ℓ j ∈ C d (ε) for each 1 ≤ j ≤ m and m = O ε (1) (see Corollary 4.3).Under these conditions, we leverage properties of the pretentious distance to show that 10 This is an oversimplification; in order to apply the appropriate Lipschitz estimates we must first twist f ℓ by a suitable character n iy ℓ ; luckily, we may show that |y ℓ |, like |t ℓ |, is small and therefore negligible in the arguments.
and as a result, that the map $\phi : \mathbb{Z}/d\mathbb{Z} \to \mathbb{R}$ given by $\phi(\ell) := t_\ell$ satisfies the approximate homomorphism condition By applying a result due to Ruzsa on approximate homomorphisms [26], we find that there is a genuine homomorphism $\psi : \mathbb{Z}/d\mathbb{Z} \to \mathbb{R}$ such that max ℓ∈Z/dZ Since $\mathbb{Z}/d\mathbb{Z}$ is a finite group and $\mathbb{R}$ is torsion-free, $\psi$ must be identically zero, which leads to $\max_{\ell \in \mathbb{Z}/d\mathbb{Z}} |t_\ell| \ll_\varepsilon 1/\log x$, as claimed.

Proof of Theorem 1.2
Following the outline provided in Section 3.3, we prove Theorem 1.2 in this section.
4.1. The structure of the minimizers $t_\ell$. In this subsection we show the following proposition, which bounds the minimizers $t_\ell$ uniformly over $\ell$ under the assumptions that $|C_d(\varepsilon)|$ is large and $d$ has no small prime factors.
Proposition 4.1. Let $d$ be a positive integer and let $c > 0$ be chosen such that
Remark 8. If $d$ grows sufficiently slowly then a simpler argument would suffice. By the minimal property of $t_\ell$ and repeated applications of (10), If $d = o(\sqrt{\log \log x})$ and $1 \in C_d(\varepsilon)$ then (using (11)) the right-hand side is $o(\sqrt{\log \log x})$. It can then be shown that Since $t_1 = t_{d+1}$, we deduce that $|t_1| = O_\varepsilon(1/\log x)$ as a result. The novelty of Proposition 4.1 is that the same conclusion still holds, even when $d$ is fairly large, provided many of the powers $f^\ell$ have large partial sums.
To prove Proposition 4.1 we will need the following inverse sumset result, which follows from classical work of Freiman [4] in additive combinatorics (see [28] for an accessible proof).
Lemma 4.2. Let $c > 0$. Let $G$ be a finite Abelian group and let $A \subset G$ be a symmetric subset 11
Proof. Call $K := 1 + \left\lceil \frac{\log(1/c)}{\log(3/2)} \right\rceil$. Since $A$ is symmetric, we have $A = -A$. Now, by Freiman's theorem, we see that if $B \subset G$ is symmetric and $B$ is not contained in a coset of a proper subgroup of $G$ then either $B + B = G$ or else $|B + B| \ge \frac{3}{2} |B|$. Applying this iteratively, we find that if $j \ge 1$ and we assume that none of the sets $A, 2A, \ldots, 2^{j-1}A$ is contained in a coset of a proper subgroup of $G$ then either $2^j A = G$, or else If $j \ge K$ this is impossible, and so we deduce that $2^j A = G$ for some $1 \le j \le K - 1$, as long as we can verify that $2^j A$ is not contained in a coset of a proper subgroup of $G$ for any 1 It follows by induction that if $2^j A$ were contained in a coset of a proper subgroup of $G$ for some $j \ge 1$ then the same is true of $2^i A$ for any $0 \le i \le j$. Since, by hypothesis, $A$ is not contained in a coset of a proper subgroup of $G$ we may conclude that none of the iterated sets $2^j A$ are, and the conclusion follows.
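The iterated doubling in the proof of Lemma 4.2 is easy to observe in the cyclic group $\mathbb{Z}/d\mathbb{Z}$. The Python sketch below repeatedly replaces a set by its sumset with itself and reports how many doublings are needed to fill the group; the function names, and the convention of returning `None` when a doubling fails to enlarge the set (so that the iterates stay inside cosets of a proper subgroup), are choices made for this illustration only.

```python
def sumset(A, B, d):
    """The sumset A + B inside Z/dZ."""
    return {(a + b) % d for a in A for b in B}

def doublings_to_fill(A, d):
    """Least j >= 0 with 2^j A = Z/dZ, or None if some doubling does not grow."""
    current = {a % d for a in A}
    j = 0
    while len(current) < d:
        doubled = sumset(current, current, d)
        if len(doubled) == len(current):
            # No growth: current is a coset of a proper subgroup, so the
            # iterated doublings can never cover the whole group.
            return None
        current, j = doubled, j + 1
    return j
```

For the symmetric set $A = \{\pm 1, \ldots, \pm 5\}$ in $\mathbb{Z}/101\mathbb{Z}$ the successive doublings have sizes 21, 41, 81 and 101, so `doublings_to_fill` returns 4, whereas the symmetric set $\{2, 4\}$ in $\mathbb{Z}/6\mathbb{Z}$ lies in a proper subgroup and never fills the group.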
Corollary 4.3.Let c > 0 and let d ≥ 1 be an integer such that dZ then the result is trivial, so we may assume instead that A is a proper subset of Z/dZ.By Lemma 4.2, it suffices to verify that any symmetric subset A in Z/dZ cannot be a subset of a coset of a proper subgroup of Z/dZ.But if this were the case then where here tℓ ∈ [−1/ε 2 , 1/ε 2 ] is chosen to minimize min |t|≤1/ε 2 D(f ℓ , n it ; x) (note that t ℓ = tℓ in general).If ε is small enough (and thus x is large enough), then upon rearranging we deduce that where r j ∈ C d and 1 ≤ λ ≤ m.But by ( 17), (10) and induction, By the Vinogradov-Korobov zero-free region for the Riemann zeta function, it follows that if |t| ≥ 100 then for large x, if |u ℓ | ≥ 100.Thus, we may assume that |u ℓ | ≤ 100.Taking squares and instead using whenever |t| ≤ 100, say, we thus deduce that Choose any additive representations with λ 1 , λ 2 ≤ m, then by the same argument applied with ℓ ∈ {ℓ 1 , ℓ 2 , ℓ 1 + ℓ 2 } we obtain, uniformly over ℓ 1 , ℓ 2 ∈ Z/dZ, Since we may always select t 0 := 0, and as f ℓ+kd = f ℓ we can choose t ℓ+kd = t ℓ for all k ∈ Z, the map φ : Z/dZ → R given by φ(m) := t m satisfies the approximate homomorphism condition By a generalization due to Ruzsa of a result of Hyers [26,Statement (7.3)], there is a genuine homomorphism ψ : But there are no non-trivial homomorphisms from Z/dZ to R since the latter is torsion-free.Hence, ψ(ℓ) = 0 for all ℓ and we deduce that as claimed.

4.2. Studying the distances $D(f^\ell, 1; x)$ using the level sets of $f(p)$. Having shown that the minimizers $|t_\ell|$ are uniformly small, we next study the sizes of the distances $D(f^\ell, 1; x)$, which control the partial sums of $f^\ell$. We write Our analysis now splits into two cases, according to how large each $\sigma_j$ is relative to the sum
4.2.1. Case 1: each $\sigma_j$ is small. When each $\sigma_j$ is small relative to $S_f(x)$ the following lemma provides lower bounds on the distances $D(f^\ell, 1; x)^2$, for almost all $1 \le \ell \le d - 1$.
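Lemma 4.4. Let $\varepsilon \in (0, 1)$ be small and satisfy $\varepsilon P^-(d) > 1$, and assume that Then for all but O(ε

Proof. Given $t \in \mathbb{R}$, observe first of all the inequality It follows that for any $\ell \ne 0$, (20) We now seek upper bounds for the variance The 1-periodic function $t \mapsto \|t\|^2$, where $\|t\|$ denotes the distance from $t$ to the nearest integer, has the absolutely convergent Fourier series
$$\|t\|^2 = \frac{1}{12} + \frac{1}{2\pi^2} \sum_{r \ne 0} \frac{(-1)^r}{r^2} e(rt).$$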
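The Fourier expansion used at the end of the passage above can be sanity-checked numerically. The sketch below assumes the standard normalization $e(x) = e^{2\pi i x}$ and reads the 1-periodic function as the squared distance to the nearest integer, whose expansion is $\frac{1}{12} + \frac{1}{2\pi^2}\sum_{r \ne 0} \frac{(-1)^r}{r^2} e(rt)$; the function names are hypothetical.

```python
import math


def dist_to_nearest_int_sq(t):
    """Squared distance from t to the nearest integer."""
    frac = t - math.floor(t)
    return min(frac, 1 - frac) ** 2


def fourier_partial_sum(t, R):
    """Partial sum over 0 < |r| <= R of the Fourier series of ||t||^2.

    The terms r and -r are paired, using e(rt) + e(-rt) = 2 cos(2 pi r t).
    """
    s = sum((-1) ** r * math.cos(2 * math.pi * r * t) / r ** 2
            for r in range(1, R + 1))
    return 1 / 12 + s / math.pi ** 2


for t in (0.0, 0.1, 0.25, 0.5, 0.8, 1.3):
    print(t, dist_to_nearest_int_sq(t), fourier_partial_sum(t, 2000))
```

Since the coefficients decay like $1/r^2$, truncating at $|r| \le R$ leaves an error of size at most about $1/(\pi^2 R)$, uniformly in $t$.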
Proof. We may assume that $x$ is as large (and $\varepsilon$ as small) as desired, otherwise at least one of the alternatives is trivial. We may also assume that $\sigma_j \ge 1$ and $\lambda \varepsilon^{-32m^2} < 1/2$, since otherwise the second alternative is trivial. Suppose the first alternative fails. Arguing as in the proof of Proposition 4.1, for $\varepsilon$ small enough. Seeking to apply [12, Thm. 4] below, we must introduce some notation. For each $1 \le \ell \le d - 1$ define Let $y_{\ell,0} \in [-2\log x, 2\log x]$ be chosen so that $|F_\ell(1 + iy_{\ell,0})| = \max_{|y| \le 2\log x} |F_\ell(1 + iy)|$, and set We will need an upper bound for $|y_\ell|$ whenever $y_\ell \ne 0$, so assume for the moment that this is the case. By Mertens' theorem, given any $y \in \mathbb{R}$, Thus, if $y_\ell \ne 0$ then
By (10) and the crude bound $|t_\ell - y_\ell| \le 2\log x$ we thus obtain In light of (18) and the hypothesis $\log(1/\varepsilon) < \frac{1}{32}\log\log x$, we see that Using Proposition 4.1, we therefore conclude that (22) a bound that we shall employ momentarily.
With this setup complete we may now proceed with the proof of the proposition. By [12, Lem. 7.1] we have Let now $S_j := \{p \le q : f(p) = e(j/d)\}$ and define the completely additive function whose mean-value over $n \le x$ is asymptotically By the Turán-Kubilius inequality [27, Thm. III.3.1] and complete multiplicativity, we deduce that We split the sum over $p \le x$ into the segments $p \le x^\lambda$ and $x^\lambda < p \le x$. The contribution from the second segment to the above is Now, for each $p \le x^\lambda$ we apply the Lipschitz bound [12, Thm. 4], which yields
Proof of Theorem 1.2. Let $\varepsilon > 0$ be a small parameter to be chosen later in terms of We may assume in what follows that $\Sigma$ (and thus also $x$ and $d$) is larger than any specified constant, since otherwise the claim is trivial (by adjusting the implicit constant appropriately).
Set $M := \lceil 2/\varepsilon \rceil$. Note that as $\Sigma \le \log\log x + O(1)$, the constraint (24) implies that when $x$ is large enough, As above, write We consider several cases.

Case 1: Suppose first that Adding in the contribution from $\ell = 0$, we trivially have As above, let and define We consider two subcases.

Case 2.a) Suppose there is $1 \le j_0 \le d - 1$ such that (26) $\sigma_{j_0} \ge \varepsilon S_f(x) \gg \varepsilon\Sigma$.

Proof of Theorem 1.1
We now turn to the proof of Theorem 1.1, concerning lower bounds for the sum We will analyze the prime values $(f(p))_{p \le x}$ using the obvious implication and the (weak) equidistribution property of the integer values $(f(n))_{n \le x_0}$.
Our general argument will allow us to handle $d \ll (\eta\delta)^{-1} e^{(\log x_0)^c}$ for some absolute constant $c > 0$.
Using zero-density estimates for Dirichlet L-functions, we may extend this range when f is a character of order d as follows.
As $\theta < c$, we obtain $|B| = o(d)$ when $q$ is sufficiently large. Invoking both of these estimates in (32), we deduce that Comparing this with (31), we find Since $(\sqrt{q}\log q)^{1/d} \ll 1$, on taking $d$th roots and rearranging we obtain As this bound holds for every $\theta \in (0, c)$ the claim follows.
Proof of Theorem 1.1. The conclusion is strongest when $x = x_0^\delta$, so we assume this in what follows.

(a) Write and as above set Since $f$ weakly equidistributes at scale $x_0$, We now derive a corresponding lower bound. Let $g$ be the non-negative completely multiplicative function defined at primes by
$$g(p) := \begin{cases} 1 & \text{if } p \le x \text{ and } f(p) = 1, \\ 0 & \text{otherwise.} \end{cases}$$
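The role of the function $g$ just defined can be illustrated concretely when $f$ is a character of order $d$ modulo a prime $q$. The sketch below is only one natural reading of the construction: since $f$ is completely multiplicative, $g(n) = 1$ forces $f(n) = 1$, so $g$ sits below the indicator of $f(n) = 1$. The helper names and the parameters $q = 31$, $d = 5$, $x_0 = 200$, $x = 20$ are hypothetical choices made for illustration.

```python
def prime_factors(n):
    """Return the set of prime divisors of n (trial division; fine for small n)."""
    factors = set()
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors.add(p)
            n //= p
        p += 1
    if n > 1:
        factors.add(n)
    return factors


def f_is_one(n, q, d):
    """True when chi(n) = 1 for a character chi of order d modulo the prime q.

    For d dividing q - 1 this holds exactly when n is a nonzero d-th power
    residue, i.e. n^((q-1)/d) = 1 (mod q).
    """
    return n % q != 0 and pow(n, (q - 1) // d, q) == 1


def g(n, x, q, d):
    """Completely multiplicative 0/1 function: 1 iff every prime p | n has p <= x and f(p) = 1."""
    return int(all(p <= x and f_is_one(p, q, d) for p in prime_factors(n)))


# Hypothetical illustrative parameters, not taken from the paper.
q, d, x0, x = 31, 5, 200, 20
count_f = sum(f_is_one(n, q, d) for n in range(1, x0 + 1))
count_g = sum(g(n, x, q, d) for n in range(1, x0 + 1))
print(count_g, "<=", count_f)
```

The pointwise inequality $g(n) \le 1_{f(n) = 1}$ is what allows a sieve-type lower bound for the sum of $g$ to be transferred to the count of $n \le x_0$ with $f(n) = 1$.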
We then observe that Now if $x_0$ is large enough then by [18, Thm. 2] there are absolute constants $A, \beta > 0$ such that (35) where $\sigma^-(u) := u\rho(u)$. Evaluating the various factors in this expression, we see that and also We set $c_3 = \beta$, take $C_3 > 0$ to be a large constant and assume that (36) Suppose for the sake of contradiction that $E_{\ne 0,1}(x) < \log(1/\eta)$. Since $\sigma^-(u)$ is a decreasing function of $u$, there is an absolute constant $C' > 0$ such that Thus on combining (33), (34) and (35) we find that for some absolute constant $B > 0$, (37) If $C_3$ is large enough relative to $B$, the error term in (37) will contribute $\le \frac{1}{2} c_f x_0/d$ to the right-hand side. Thus, rearranging (37) and suitably modifying the implicit constant, this term may be deleted from (37). Incorporating the definition of $\sigma^-(u)$, we thus get and part (a) follows.

(b) Assume next that $d \ge C_3 (\eta\delta)^{-1} e^{(\log x)^\beta}$ and that $f = \chi$ is a primitive character modulo a prime $q$, of order $d$ (taking $x_0 = q^{3/2+\varepsilon}$, say). If $q$ is large enough then for any $c \in (0, \beta)$ we have $d \ge e^{(\log q)^c}$. Thus, by Proposition 5.1, Since $\log(1/(\eta\delta)) \le c\log\log q$ for some $0 < c < \beta$, once $q$ is large enough we have and the claim follows.
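The monotonicity of $\sigma^-(u) = u\rho(u)$ used in part (a) can be checked numerically, under the assumption (not confirmed by the text shown here) that $\rho$ denotes the Dickman function, i.e. $\rho(u) = 1$ on $[0, 1]$ and $u\rho(u) = \int_{u-1}^{u} \rho(t)\,dt$ for $u \ge 1$. The following sketch tabulates $\rho$ on a grid with the trapezoid rule; the function name and step size are hypothetical.

```python
import math


def dickman_table(u_max, steps_per_unit=200):
    """Tabulate the Dickman function on the grid k*h, 0 <= k*h <= u_max.

    Uses u*rho(u) = integral of rho over [u-1, u], discretized with the
    trapezoid rule and solved for the newest grid value rho[k].
    """
    n = steps_per_unit
    h = 1.0 / n
    N = int(round(u_max * n))
    rho = [1.0] * (N + 1)  # rho(u) = 1 for 0 <= u <= 1
    for k in range(n + 1, N + 1):
        interior = sum(rho[k - n + 1:k])
        rho[k] = h * (0.5 * rho[k - n] + interior) / (k * h - 0.5 * h)
    return rho, h


rho, h = dickman_table(4.0)
n = round(1 / h)
# Assumption: sigma^-(u) = u * rho(u) with rho the Dickman function.
sigma_minus = [k * h * rho[k] for k in range(len(rho))]
print(rho[2 * n], 1 - math.log(2))
```

On $[1, 2]$ one has the closed form $\rho(u) = 1 - \log u$, which gives a convenient check of the tabulated values. Note that $u\rho(u) = u$ is increasing on $[0, 1]$, so under this reading the decreasing property is relevant only for $u \ge 1$.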

Improvements to Pólya-Vinogradov on Average
Let $\varepsilon > 0$ and let $\chi$ be a primitive character of order $d$ modulo a prime $q$. We define To prove Theorem 3 we will show, using work of Granville and Soundararajan [13], that the iterated sumsets of $\Xi_d(\varepsilon)$ satisfy rigid conditions; see Lemma 6.3 below. When $d$ has no small prime factors this rigidity puts limits on the size of $\Xi_d(\varepsilon)$.
Proposition 6.1. Let $q \ge 10$ be large. Then there is an absolute constant $C > 0$ such that if $k \ge 1$ and $\varepsilon > 0$ satisfy $k \le \frac{\log\log q}{10\log\log\log q}$, ε ≥ C(log q)
To prove Proposition 6.1 we need the following (slight) extension of [13, Thm. 2] that gives a precise dependence of the bound on the number of characters involved.

Lemma 6.2. Let $q \ge 10$ be large and let $3 \le g \le \frac{\log\log q}{\log\log\log q}$ be odd. Then there is an absolute constant $C_0 > 0$ such that the following holds. Let $\chi_1, \ldots, \chi_g$ be primitive characters with respective conductors $q_j \le q$ and for which $M(\chi_j) > \sqrt{q_j}(\log q_j)^{1 - 1/g}$ for all $1 \le j \le g$.

Suppose in addition that
2g .
Lemma 6.3. Let $q \ge 10$ be large. Then there is an absolute constant $C \ge 1$ such that whenever $1 \le k \le \frac{\log\log q}{10\log\log\log q}$ and C(log q)

Proof. For ease of notation, write $A := \Xi_d(\varepsilon)$. Suppose for the sake of contradiction that $a \in 2kA \cap A$.
This gives a contradiction whenever C ≥ C 0 and q is large enough, and the claim follows.
Let $G$ be a finite Abelian group, written additively. For $k, \ell \ge 1$ we say that $A \subseteq G$ is a $(k, \ell)$-set if $kA \cap \ell A = \emptyset$. Lemma 6.3 shows that, under the claimed constraints on $k$ and $\varepsilon$, $\Xi_d(\varepsilon)$ is a $(2k, 1)$-set. It is clear that if $B \subseteq A$ and $A$ is a $(k, \ell)$-set then $B$ is also a $(k, \ell)$-set, so that $(k, \ell)$-sets form a partially-ordered set under inclusion. We say that $A$ is a maximal $(k, \ell)$-set if $A$ is maximal with respect to this partial order.
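The definition of a $(k, \ell)$-set is easy to test by brute force in $G = \mathbb{Z}/d\mathbb{Z}$. The sketch below uses hypothetical helper names and a small illustrative example; it is meant only to make the definition and the inheritance by subsets concrete.

```python
def k_fold_sumset(A, k, d):
    """Return kA = A + ... + A (k summands) inside Z/dZ."""
    result = {0}
    for _ in range(k):
        result = {(s + a) % d for s in result for a in A}
    return result


def is_kl_set(A, k, l, d):
    """True when kA and lA are disjoint in Z/dZ, i.e. A is a (k, l)-set."""
    return k_fold_sumset(A, k, d).isdisjoint(k_fold_sumset(A, l, d))


# Illustrative example in Z/11Z with the symmetric set A = {1, -1}.
d = 11
A = {1, 10}
print(is_kl_set(A, 2, 1, d))   # 2A = {0, 2, 9} misses A
print(is_kl_set(A, 10, 1, d))  # 10A contains 1 and 10, so it meets A
```

Here $A = \{1, -1\}$ is a $(2, 1)$-set but not a $(10, 1)$-set, while the subset $B = \{1\}$ inherits the $(2, 1)$ property, as the text observes.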
Proof of Theorem 3. Let $q \ge 10$ be large, and let $1 \le k \le \frac{\log\log q}{10\log\log\log q}$ be a parameter to be chosen later. Setting ε := C(log q) − 1 3(2k+1) 2 and splitting the sum over $j$ according to whether or not $j \in \Xi_d(\varepsilon)$, Proposition 6.1 implies that
If we set k := 1 10 log log q log log log q then ε ≤ C(log log q) − 1 2 , and the claimed bound follows.
Proof of Corollary 4. Note that since $d \ge 3$ is prime, $\chi^\ell$ has odd order $d$ for all $d \nmid \ell$. Applying Lemma 6.2 with $\chi_j = \chi^\ell$ for all $1 \le j \le d$, we obtain (whether or not $M(\chi^\ell) \le \sqrt{q}(\log q)^{1 - 1/d}$) √ q log q ≤ C 0 ( √ q log q)(log q) − 1+o(1) for each $1 \le \ell \le d - 1$. Combined with Theorem 3, as $q \to \infty$ this gives 1 d 1≤ℓ≤d−1 M (χ ℓ ) ≪ ( √ q log q) min (log q) − 1 3d 2 , log log log q log log q The transition point in these bounds occurs for d ≍ log log q log log log q , and with a suitable implicit constant the claimed upper bound follows.

It is reasonable to ask whether there could be a substantial difference in size between the intersection of the $A_k$ and the minimal $|A_k|$. However, note that any symmetric set $A$ satisfies $(2k)A \subseteq (2k + 2)A$, as we can express any $m$-fold sum of terms in $A$ as an $(m + 2)$-fold sum via $a_1 + \cdots + a_m = a_1 + \cdots + a_m + a' + (-a')$ for any $a' \in A$. It follows that any $(2(k + 1), 1)$-set is also a $(2k, 1)$-set, and it is possible therefore that $A_{k+1} \subseteq A_k$, i.e., the sums are nested. In such an event, the latter inequality in (40) is sharp.
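The nesting $(2k)A \subseteq (2k + 2)A$ for symmetric $A$, obtained above by appending $a' + (-a')$ to a sum, can be confirmed by direct computation in $\mathbb{Z}/d\mathbb{Z}$. This is a small sketch with hypothetical names and example sets; it also shows that symmetry is genuinely used, since the nesting can fail for a non-symmetric set.

```python
def iterated_sumset(A, m, d):
    """Return the m-fold sumset mA inside Z/dZ."""
    result = {0}
    for _ in range(m):
        result = {(s + a) % d for s in result for a in A}
    return result


def is_symmetric(A, d):
    """True when A = -A in Z/dZ."""
    return {(-a) % d for a in A} == {a % d for a in A}


d = 11
A = {2, 9, 3, 8}  # symmetric: 9 = -2 and 8 = -3 modulo 11
nested = all(iterated_sumset(A, 2 * k, d) <= iterated_sumset(A, 2 * k + 2, d)
             for k in range(1, 4))
print(is_symmetric(A, d), nested)

B = {1}  # not symmetric: 2B = {2} is not contained in 4B = {4}
print(iterated_sumset(B, 2, d) <= iterated_sumset(B, 4, d))
```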
Thus, when $M > d$ Proposition 2.3 gives no further information than Proposition 2.1 does. The proposition is interesting, however, when $d$ is significantly large compared to $M$. The proof requires a few auxiliary results. The first is due to Friedlander [5, Thm. 1(B), Thm. 6(B)].

and the hypothesis 1 δ 2 ≪ c 2 1 c 1 log d log log(ed) , choosing $c_1$ smaller if needed we also have the required lower bound $d \ge C_4\,\rho(C_5/(\eta\delta))^{-1}$, with $C_4, C_5$ as in the statement of Corollary 1.3. Moreover, since $q > d$, log log q, log log d − log log log(ed) + O(1) ≫ log log d.
The savings obtained in Theorem 2, though non-trivial, are admittedly weak. By comparison, if we assume GRH then the far stronger square-root cancelling bound 1 [13,• ξ g ; q) 2 from the proof of [13, Lem. 3.3], we thus obtain