The real tau-conjecture is true on average

Koiran's real $\tau$-conjecture claims that the number of real zeros of a structured polynomial given as a sum of $m$ products of $k$ real sparse polynomials, each with at most $t$ monomials, is bounded by a polynomial in $m,k,t$. This conjecture has a major consequence in complexity theory since it would lead to superpolynomial bounds for the arithmetic circuit size of the permanent. We confirm the conjecture in a probabilistic sense by proving that if the coefficients involved in the description of $f$ are independent standard Gaussian random variables, then the expected number of real zeros of $f$ is $O(mk^2t)$.


Introduction
We study the number of real zeros of real univariate polynomials. A polynomial $f$ is called $t$-sparse if it has at most $t$ monomials. Descartes' rule states that a $t$-sparse polynomial $f$ has at most $t-1$ positive real zeros, regardless of the degree of $f$. Therefore, a product $f_1 \cdots f_k$ of $k$ many $t$-sparse polynomials $f_j$ can have at most $k(t-1)$ positive real zeros. What can we say about the number of zeros of a sum of $m$ many products? So we consider real univariate polynomials $F$ of the structure
$$ F \;=\; \sum_{i=1}^{m} \prod_{j=1}^{k_i} f_{ij} , \qquad (1.1)$$
where all $f_{ij}$ are $t$-sparse. In other words, $F$ is given by a depth four arithmetic circuit with the structure $\Sigma\Pi\Sigma\Pi$, where the parameters $m$, $k := \max_i k_i$, and $t$ bound the fan-in at the different levels except at the lowest (since we don't require a bound on the degrees of the $f_{ij}$).
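To make the setting concrete, here is a small numerical illustration (not part of the formal development; the supports, parameters, and tolerance below are arbitrary choices): we sample a random $F$ of the form (1.1) with standard Gaussian coefficients and count its real zeros with numpy.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_sparse(t, max_deg):
    """A t-sparse polynomial with random support and standard Gaussian
    coefficients, as a numpy coefficient array (highest degree first)."""
    support = rng.choice(max_deg + 1, size=t, replace=False)
    coeffs = np.zeros(max_deg + 1)
    coeffs[support] = rng.standard_normal(t)
    return coeffs[::-1]  # numpy.roots expects highest degree first

def random_sigma_pi_sigma_pi(m, k, t, max_deg):
    """A random F = sum_{i<=m} prod_{j<=k} f_ij of the form (1.1)."""
    F = np.zeros(1)
    for _ in range(m):
        prod = np.ones(1)
        for _ in range(k):
            prod = np.polymul(prod, random_sparse(t, max_deg))
        F = np.polyadd(F, prod)
    return F

def count_real_zeros(F, tol=1e-8):
    roots = np.roots(F)
    return int(np.sum(np.abs(roots.imag) < tol * (1 + np.abs(roots.real))))

# Even though the degree is about 2 * max_deg, the number of real zeros
# stays small on average, in line with the results of this paper.
m, k, t, max_deg = 3, 2, 4, 60
counts = [count_real_zeros(random_sigma_pi_sigma_pi(m, k, t, max_deg))
          for _ in range(200)]
print("average number of real zeros:", np.mean(counts))
```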
The following conjecture was put forward by Koiran [12].
Conjecture 1 (Real τ-conjecture). The number of real zeros of a polynomial $F$ of the form (1.1) is bounded by a polynomial in $m$, $k$, and $t$.
Koiran [12] proved that the real τ-conjecture implies a major conjecture in complexity theory, namely the separation of the complexity classes $\mathrm{VP}^0 \neq \mathrm{VNP}^0$ over $\mathbb{C}$. In Tavenas' PhD thesis [17] it is shown that the real τ-conjecture also implies $\mathrm{VP} \neq \mathrm{VNP}$ over $\mathbb{C}$. Tavenas also shows that a seemingly much weaker upper bound on the number of real zeros of $F$ suffices to deduce $\mathrm{VP} \neq \mathrm{VNP}$: in fact, an upper bound polynomial in $m$, $t$, and $2^{\max_i k_i}$ is sufficient [17, §2.1, Cor. 3.23]. In other words, the real τ-conjecture implies that the permanent of $n$ by $n$ matrices requires arithmetic circuits of superpolynomial size. For known upper bounds on the number of real zeros of polynomials of the form $F$, we refer to [13] and the references given there.
The motivation behind Conjecture 1 is Shub and Smale's τ-conjecture [15], asserting that the number of integer zeros of a polynomial computed by an arithmetic circuit is polynomially bounded in the size of the circuit. If true, it gives a superpolynomial lower bound on the circuit complexity of the permanent polynomial [4]. Moreover, it also entails the separation $\mathrm{P}_{\mathbb{C}} \neq \mathrm{NP}_{\mathbb{C}}$ in the Blum-Shub-Smale model [15, 3]. One drawback of the τ-conjecture is that, by referring to integer zeros, it leads to number theory, which is notorious for its hard problems. The τ-conjecture becomes false when we replace "integer zeros" by "real zeros". Koiran's observation is that when restricting to depth four circuits, the conjecture may be true while we can still derive lower bounds for general circuits. We refer to Hrubeš [9] for statements equivalent to the real τ-conjecture that are related to complex zero counting. A τ-conjecture for the Newton polygons of bivariate polynomials, having the same strong complexity theoretic implications, was formulated by Koiran et al. in [14]. Hrubeš [10] recently showed that the real τ-conjecture implies this conjecture on Newton polygons.
In this work, we prove that the real τ-conjecture is true for random polynomials. More specifically, let $k_1, \ldots, k_m$ and $t$ be positive integers, and for $1 \le i \le m$ and $1 \le j \le k_i$ we fix supports $S_{ij} \subseteq \mathbb{N}$ with $|S_{ij}| \le t$ for the $t$-sparse polynomials $f_{ij}$. We choose the coefficients $u_{ijs}$ of the polynomials $f_{ij}(x) = \sum_{s \in S_{ij}} u_{ijs} x^s$ as independent standard Gaussian random variables. The resulting $F$ given by (1.1) is a structured random polynomial and we investigate the random variable defined as the number of real zeros of $F$.
Our main result states that the expectation of the number of real zeros of $F$ is polynomially bounded in $m$, $k := \max_i k_i$, and $t$. In fact, we get a bound that is at most quadratic in each of the parameters!
Theorem 1.1. The expectation of the number of real zeros of a polynomial $F$ of the form (1.1) is bounded as $O(mk^2t)$ if the coefficients $u_{ijs}$ are independent and standard Gaussian. Thus the real τ-conjecture is true on average.
Our result can be interpreted in two ways: on the one hand, it supports the real τ-conjecture, since we show it is true on average; on the other hand, it says that for finding a counterexample to the real τ-conjecture, it is not sufficient to look at generic examples.
We don't think the assumption of Gaussian distributions is essential. In fact, we have a partial result confirming this (Theorem 6.3). If we assume the coefficients $u_{ijs}$ are independent random variables whose distributions have densities satisfying some mild assumptions, then the expected number of real zeros of $F$ in $[0,1]$ is bounded by a polynomial in $k_1 + \cdots + k_m$ and $t$, provided $0 \in S_{ij}$ for all $i, j$. The latter condition means that all the $f_{ij}$ almost surely have a nonzero constant coefficient.
The main proof technique is the Rice formula from the theory of random fields, which has to be analyzed very carefully in order to achieve good upper bounds. (In fact, we rely on a "Rice inequality", which requires fewer assumptions.) An interesting intermediate step of the proof is to express the expected number of real zeros of the random structured $F$ from (1.1) in terms of the expected number of real zeros of random linear combinations $R(x) = \sum_{i=1}^m u_i q_i(x) x^{d_i}$ of certain weight functions $q_i(x) x^{d_i}$. The deterministic functions $q_i(x)$ are obtained by multiplying and dividing sparse sums of squares in a way reflecting the build-up of the arithmetic circuit forming $F$; see (6.11). The randomness comes from the independent coefficients $u_i$, whose distribution is that of a product of $k_i$ standard Gaussians.
It would be interesting to strengthen our result by concentration statements, showing that it is very unlikely that a random F of the above structure can have many real zeros.
Outline of paper. Section 2 provides hands-on information on how to deal with conditional expectations, which is mainly basic calculus. In Section 3 we outline the idea of the Rice formula and state a weak version of it (Theorem 3.2), which requires only a few technical assumptions. In Section 4 we prepare the ground by proving general estimates on conditional expectations of random linear combinations. Section 5 develops general results of independent interest on the expected number of real zeros of random linear combinations $\sum_{i=1}^m w_i(x) u_i$ of weight functions $w_i$, for independent random coefficients $u_i$ having densities satisfying some mild assumptions. We upper bound this in terms of quantities $\mathrm{LV}(w_i)$, for which we coined the name logarithmic variation, and which are crucial for achieving good estimates (see Definition 5.6). Finally, combining everything, we provide the proof of the main results in Section 6.

Preliminaries
We provide some background on conditional expectations in a general continuous setting, relying on some results from calculus related to the coarea formula. Then we discuss some specific properties pertaining to the distribution of products of Gaussian random variables.
2.1. Conditional expectations. We fix a smooth function $f : \mathbb{R}^N \to \mathbb{R}$ with the property that $\{u \in \mathbb{R}^N : \nabla f(u) = 0\}$ has measure zero. In most of our applications, $f$ will be a nonconstant polynomial function, which satisfies this property. By Sard's theorem, almost all $a \in \mathbb{R}$ are regular values of $f$. For those $a$, the fiber $f^{-1}(a)$ is a smooth hypersurface in $\mathbb{R}^N$.
Suppose we are further given a probability distribution on $\mathbb{R}^N$ with density $\rho$. To analyze its pushforward measure with respect to $f$, we define for a regular value $a \in \mathbb{R}$
$$ \rho_f(a) \;:=\; \int_{f^{-1}(a)} \frac{\rho}{\|\nabla f\|} \, df^{-1}(a) , \qquad (2.1)$$
where $df^{-1}(a)$ denotes the volume element of the hypersurface $f^{-1}(a)$. The coarea formula is a crucial tool going back to Federer [7]; see [6, Thm. III.5.2, p. 138] for a comprehensive account. We only need its smooth version [6, p. 159]; see also [8, Appendix] for a short and self-contained proof. The smooth coarea formula implies that $\rho_f$ defined in (2.1) is a probability density on $\mathbb{R}$, namely the density of the random variable $f(u)$. More precisely, the measure with density $\rho_f$ is the pushforward with respect to $f$ of the measure on $\mathbb{R}^N$ with density $\rho$. Let us point out the following simple scaling rule, which we will use all the time: for $\lambda \in \mathbb{R}^*$,
$$ \rho_{\lambda f}(a) \;=\; |\lambda|^{-1}\, \rho_f(\lambda^{-1} a) . \qquad (2.2)$$
We view now $u \in \mathbb{R}^N$ as a random variable with the density $\rho$. Let $a \in \mathbb{R}$ be a regular value of $f$ such that $\rho_f(a) > 0$. We want to define a conditional probability measure on the hypersurface $H := f^{-1}(a)$ that captures the idea that we constrain $u$ to lie in $H$. We do this by defining the conditional density for $u \in H$ as
$$ \rho_H(u) \;:=\; \frac{\rho(u)}{\rho_f(a)\, \|\nabla f(u)\|} . $$
Note that we indeed have $\int_H \rho_H \, dH = 1$ by construction, where $dH$ denotes the volume measure of $H$. (As a warning, let us point out that in general, $\rho_H$ does not only depend on $H$, but also on the representation of $H$ by the function $f$.) Using the conditional density, we can define the conditional expectation of a nonnegative measurable function $Z : \mathbb{R}^N \to [0, \infty)$ as
$$ E\,(Z \mid f = a) \;:=\; \int_H Z \rho_H \, dH . $$
(This quantity is only defined for regular values $a$ such that $\rho_f(a) > 0$.) In our application, we will always use the following equivalent formula,
$$ E\,(Z \mid f = a)\, \rho_f(a) \;=\; \int_{f^{-1}(a)} \frac{Z \rho}{\|\nabla f\|} \, df^{-1}(a) , \qquad (2.3)$$
which is valid for all regular values $a$ of $f$, when interpreting the left-hand side as $0$ if $\rho_f(a) = 0$. Thus by Sard's theorem, the equation makes sense for almost all $a \in \mathbb{R}$. After defining all these notions, we summarize our discussion by stating the following important fact, which is an immediate consequence of the smooth coarea formula (cf. [6, p. 159] or [8, Appendix]).
Proposition 2.1. Let $f : \mathbb{R}^N \to \mathbb{R}$ be a smooth function such that $\{u \in \mathbb{R}^N : \nabla f(u) = 0\}$ has measure zero. Moreover, let $\rho$ be a probability density on $\mathbb{R}^N$ and let $Z : \mathbb{R}^N \to [0, \infty)$ be measurable. Then
$$ E\,(Z) \;=\; \int_{\mathbb{R}} E\,(Z \mid f = a)\, \rho_f(a) \, da . $$
We next discuss how to compute the right-hand side in concrete situations. As a first step, we express the volume element of the hypersurface $H$ in local coordinates. If $\partial_{u_1} f \neq 0$, then by the implicit function theorem, we can locally express $u_1$ as a function $u_1 = h(u_2, \ldots, u_N)$. The following lemma is well known. For the understanding of what follows, it is helpful to provide the proof.
Lemma 2.2. In the above situation, the volume element of $H = f^{-1}(a)$ is locally given by
$$ dH \;=\; \frac{\|\nabla f\|}{|\partial_{u_1} f|} \, du_2 \cdots du_N . $$
Proof. The volume element of the graph of $h$ equals $\sqrt{1 + \|\nabla h\|^2}\, du_2 \cdots du_N$. Implicit differentiation of $f(h(u_2, \ldots, u_N), u_2, \ldots, u_N) = a$ gives $\partial_{u_j} h = -\partial_{u_j} f / \partial_{u_1} f$, hence $\sqrt{1 + \|\nabla h\|^2} = \|\nabla f\| / |\partial_{u_1} f|$, and the assertion follows.
Assume now that $H$ is parametrized by $u_1 = h(u_2, \ldots, u_N)$ when $(u_2, \ldots, u_N)$ runs over (an open dense subset of) $\mathbb{R}^{N-1}$. Then, due to Lemma 2.2, we can express the pushforward density $\rho_f$ as follows:
$$ \rho_f(a) \;=\; \int_{\mathbb{R}^{N-1}} \frac{\rho}{|\partial_{u_1} f|}\big(h(u_2, \ldots, u_N), u_2, \ldots, u_N\big) \, du_2 \cdots du_N . \qquad (2.4)$$
Moreover, Formula (2.3) reads as
$$ E\,(Z \mid f = a)\, \rho_f(a) \;=\; \int_{\mathbb{R}^{N-1}} \frac{Z \rho}{|\partial_{u_1} f|}\big(h(u_2, \ldots, u_N), u_2, \ldots, u_N\big) \, du_2 \cdots du_N . \qquad (2.5)$$
Example 2.3. Consider the linear function $f(u) = w_1 u_1 + \cdots + w_N u_N$ with $w_1 \neq 0$. Then $H = f^{-1}(a)$ is a hyperplane and $\nabla f = w$. We have by definition $h(u_2, \ldots, u_N) = (a - w_2 u_2 - \cdots - w_N u_N)/w_1$, so that (2.5) specializes to
$$ E\,(Z \mid f = a)\, \rho_f(a) \;=\; \frac{1}{|w_1|} \int_{\mathbb{R}^{N-1}} (Z\rho)\big(h(u_2, \ldots, u_N), u_2, \ldots, u_N\big) \, du_2 \cdots du_N . \qquad (2.6)$$
In the special case $f(u) = u_N$, we retrieve the known notion of the marginal distribution $\rho_{u_N}(a) = \int_{\mathbb{R}^{N-1}} \rho(u_1, \ldots, u_{N-1}, a) \, du_1 \cdots du_{N-1}$, and the conditional expectation of $Z$ satisfies the analogous formula.
Example 2.4. Consider the product function $f(y) = y_1 \cdots y_k$, and for nonzero $a \in \mathbb{R}$ the smooth hypersurface $C_a := \{y \in \mathbb{R}^k : y_1 \cdots y_k = a\}$. If $\rho$ is the joint density of $y \in \mathbb{R}^k$, then the pushforward density $\rho_f$ of the product $f(y)$ satisfies, by (2.1) and Lemma 2.2 (note that $y_1 = a/(y_2 \cdots y_k)$ on $C_a$ and $\partial_{y_1} f = y_2 \cdots y_k$),
$$ \rho_f(a) \;=\; \int_{\mathbb{R}^{k-1}} \rho\Big(\frac{a}{y_2 \cdots y_k}, y_2, \ldots, y_k\Big) \, \frac{dy_2}{|y_2|} \cdots \frac{dy_k}{|y_k|} , \qquad (2.7)$$
and accordingly,
$$ E\,(Z \mid f = a)\, \rho_f(a) \;=\; \int_{\mathbb{R}^{k-1}} Z\rho \, \frac{dy_2}{|y_2|} \cdots \frac{dy_k}{|y_k|} . \qquad (2.8)$$
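As a simple numerical cross-check of the linear case (2.6) (illustration only; the weights and the evaluation point are arbitrary choices), take $N = 2$ and $f(u) = u_1 + 2u_2$ with independent standard Gaussian $u_1, u_2$; the pushforward density must then be that of $N(0, 5)$:

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

a = 0.7
# rho_f(a) = (1/|w_1|) * integral of phi(a - 2*u2) * phi(u2) du2, per (2.6)
val, _ = quad(lambda u2: norm.pdf(a - 2 * u2) * norm.pdf(u2), -np.inf, np.inf)
print(val, norm.pdf(a, scale=np.sqrt(5)))  # both approximately 0.17
```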

2.2. Products of Gaussians.
In the sequel, we denote by $\varpi_k$ the density of the product $y_1 \cdots y_k$ of independent standard Gaussian distributed random variables $y_1, \ldots, y_k$; see [16]. According to (2.7) we have for $a \in \mathbb{R}^*$
$$ \varpi_k(a) \;=\; \int_{\mathbb{R}^{k-1}} \varphi\Big(\frac{a}{y_2 \cdots y_k}\Big)\, \varphi(y_2) \cdots \varphi(y_k) \, \frac{dy_2}{|y_2|} \cdots \frac{dy_k}{|y_k|} , \qquad (2.9)$$
where $\varphi(y) = (2\pi)^{-\frac12} e^{-\frac{y^2}{2}}$ denotes the density of the standard Gaussian distribution. More generally, if $y_i \sim N(0, \sigma_i^2)$ are independent centered Gaussians with variance $\sigma_i^2$, then by the scaling rule (2.2) we may write the density of $y_1 \cdots y_k$ as $\sigma^{-1} \varpi_k(\sigma^{-1} a)$ with $\sigma := \sigma_1 \cdots \sigma_k$. It is easy to see that the density $\varpi_k$ of the product of $k$ standard Gaussians is unbounded for $k \ge 2$: we have $\lim_{a \to 0} \varpi_k(a) = \infty$, which causes some technical problems. However, the following lemma states that the growth of $\varpi_k$ for $a \to 0$ is slow, which will be needed for the proof of Theorem 1.1: more specifically, for guaranteeing the assumption (4.1), so that Proposition 5.5 can be applied to the random linear combination $R(x) = \sum_{i=1}^m u_i q_i(x) x^{d_i}$, where the coefficients $u_i$ are independent random variables with the distribution $\varpi_{k_i}$.
Lemma 2.5. We have:
(1) $\varphi(y) \le \frac12 |y|^{-c}$ for all $y \in \mathbb{R}^*$ and all $0 \le c \le 1$;
(2) $\int_{\mathbb{R}} |y|^{-\beta} \varphi(y)\, dy \le (1 - \beta)^{-1}$ for $0 \le \beta < 1$;
(3) $\varpi_k(a) \le e\, |a|^{\frac{1}{2k} - 1}$ for all $a \in \mathbb{R}^*$.
Proof. (1) We have $|y|^c \varphi(y) \le \sup_{v > 0} v^c \varphi(v) = (c/e)^{c/2} (2\pi)^{-\frac12} \le (2\pi)^{-\frac12} < \frac12$.
(2) This follows from the well-known formula $\int_{\mathbb{R}} |y|^{-\beta} \varphi(y)\, dy = 2^{-\beta/2}\, \Gamma\big(\frac{1-\beta}{2}\big) / \sqrt{\pi}$, which gives the growth for $\beta \to 1$.
(3) The case $k = 1$ follows from (1) with $c = \frac12$. Suppose now $k \ge 2$ and put $c := 1 - \frac{1}{2k}$. We have by (2.9), bounding $\varphi(a/(y_2 \cdots y_k))$ with (1),
$$ \varpi_k(a) \;\le\; \frac12\, |a|^{-c} \prod_{j=2}^k \int_{\mathbb{R}} |y_j|^{-\frac{1}{2k}}\, \varphi(y_j)\, dy_j . $$
By item (2) we can bound this by $\frac12 \big(1 + \frac{1}{2k-1}\big)^{k-1} |a|^{\frac{1}{2k}-1}$. It is straightforward to verify that $(1 + \frac{1}{2k-1})^{k-1} < e$, and the assertion follows.
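For $k = 2$ the density is known in closed form, $\varpi_2(a) = K_0(|a|)/\pi$ with $K_0$ the modified Bessel function of the second kind, which diverges only logarithmically at $0$. The following snippet (an illustration; sample size and binning are arbitrary) checks this against simulation:

```python
import numpy as np
from scipy.special import k0  # modified Bessel function K_0

rng = np.random.default_rng(1)
# product of two independent standard Gaussians
samples = rng.standard_normal(10**6) * rng.standard_normal(10**6)

hist, edges = np.histogram(samples, bins=200, range=(-3, 3), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
err = np.max(np.abs(hist - k0(np.abs(centers)) / np.pi))
print("max deviation histogram vs K_0(|a|)/pi:", err)  # small, except near a = 0
```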

The Rice Formula
3.1. Outline. The Rice formula is a major tool in the theory of random fields. It gives a concise integral expression for the expected number of zeros of random functions. For comprehensive treatments we refer to [1, 2]. We are going to apply this formula in the following special situation. Let $\mathbb{R}[X]_{\le D}$ denote the finite dimensional space of polynomials of degree at most $D$ in the single variable $X$. We study a family of structured polynomials given by a parametrization $\mathbb{R}^N \to \mathbb{R}[X]_{\le D},\ u \mapsto F(u)$, such that $(u, X) \mapsto F(u)(X)$ is a polynomial function in the parameter $u$ and the variable $X$. In our case of interest, it is the parametrization of polynomials by arithmetic circuits of depth four in terms of their parameters.
Here is a rough outline of the method. We fix a probability density on the space $\mathbb{R}^N$ of parameters. Its pushforward measure on $\mathbb{R}[X]_{\le D}$ defines a class of random polynomial functions $F : \mathbb{R} \to \mathbb{R}$. (It is common to notationally drop the dependence on the parameter $u$.) The number $\#\{x \in [0,1] : F(x) = 0\}$ of real zeros of $F$ then becomes a random variable, whose expectation we wish to analyze. For this, let us assume that for almost all $x \in \mathbb{R}$, the real random variable $F(x)$ has a density, denoted by $\rho_{F(x)}$. Moreover, we assume that the conditional expectation $E\,(|F'(x)| \mid F(x) = 0)$ is well defined. The Rice formula states that, under some technical assumptions,
$$ E\, \#\{x \in [0,1] : F(x) = 0\} \;=\; \int_0^1 E\,\big(|F'(x)| \,\big|\, F(x) = 0\big)\, \rho_{F(x)}(0) \, dx . $$
While the idea behind this formula is easily explained (e.g., see [2, §3.1]), the rigorous justification can be quite hard, especially in the case of non-Gaussian distributions as encountered in our work; compare [2, Thm. 3.4]. For this reason, we will rely on a weaker version of the Rice formula, tailored to our situation, that only claims the inequality $\le$ above, but has the advantage of requiring fewer assumptions. This is the topic of the next subsection. Let us emphasize that we do not attempt to state this weaker version of the Rice formula in the greatest generality possible.
3.2. A Rice inequality. Let $F : \mathbb{R}^N \times I \to \mathbb{R},\ (u, x) \mapsto F_u(x)$, be a polynomial function, where $I$ is a compact interval. We think of $F$ as a parametrization of structured polynomial functions in the variable $x$ in terms of the parameters $u_1, \ldots, u_N$. We assume that for all $x \in I$, the polynomial function $F(x) : \mathbb{R}^N \to \mathbb{R},\ u \mapsto F_u(x)$, is not constant and thus $\{u \in \mathbb{R}^N : \nabla F(x)(u) = 0\}$ has measure zero.
For example, for the parametrization (1.1) of sums of products of sparse polynomials, $F(x)$ is surjective for all $x \in \mathbb{R}$ and $0$ is its only singular value (this is shown in Lemma 6.1 below). Following Section 2.1, if a probability distribution with a density $\rho$ is given on the space $\mathbb{R}^N$ of parameters, then for all $x \in I$, $F(x)$ becomes a random variable with a well-defined density $\rho_{F(x)}$.
The following "Rice inequality" is the version of Rice's formula that we apply in this paper. It is essentially Azaïs and Wschebor [2, Exercise 3.9, p. 69]. We state it in a way that makes the method convenient to apply in our setting. We provide the proof for lack of a suitable reference.
Theorem 3.2. Let $F : \mathbb{R}^N \times [x_0, x_1] \to \mathbb{R},\ (u, x) \mapsto F_u(x)$, be as above; in particular, for all $x \in [x_0, x_1]$ the set $\{u \in \mathbb{R}^N : \nabla F(x)(u) = 0\}$ has measure zero. Moreover, we assume that, for almost all $u \in \mathbb{R}^N$, the function $[x_0, x_1] \to \mathbb{R},\ x \mapsto F_u(x)$, has only finitely many zeros. Further, let a probability density $\rho$ be given on $\mathbb{R}^N$. We assume there exist an integrable function $g : [x_0, x_1] \to \mathbb{R}$ and $\varepsilon > 0$ such that for all $x \in [x_0, x_1]$ and almost all $a \in (-\varepsilon, \varepsilon)$ we have
$$ E\,\big(|F'(x)| \,\big|\, F(x) = a\big)\, \rho_{F(x)}(a) \;\le\; g(x) . $$
Then, for a random $u$ with the density $\rho$, we can bound the expected number of zeros of the random function $x \mapsto F_u(x)$ in the interval $[x_0, x_1]$ as follows:
$$ E\, \#\{x \in [x_0, x_1] : F_u(x) = 0\} \;\le\; \int_{x_0}^{x_1} g(x)\, dx . $$
The proof relies on the following counting lemma going back to Kac.
Lemma 3.3. Let $f : [x_0, x_1] \to \mathbb{R}$ be continuously differentiable such that $f$ and $f'$ have only finitely many zeros, and let $N(f) := \#\{x \in (x_0, x_1) : f(x) = 0\}$. Then
$$ N(f) \;\le\; \liminf_{\delta \to 0} \frac{1}{2\delta} \int_{x_0}^{x_1} |f'(x)|\, \mathbf{1}_{\{|f(x)| < \delta\}} \, dx . $$
In fact, for sufficiently small $\delta > 0$, the right-hand side equals $N(f) + \eta$, where $\eta = 0, \frac12, 1$ according as none, one, or both of the numbers $x_0, x_1$ are zeros of $f$.
Proof of Theorem 3.2. In the setting of this theorem, we apply Lemma 3.3 to $f = F_u$ for a random $u \in \mathbb{R}^N$. Taking expectations over $u$ and using Fatou's lemma, we obtain (for convenience, we drop the index $u$)
$$ E\, N(F) \;\le\; \liminf_{\delta \to 0} \frac{1}{2\delta}\, E \int_{x_0}^{x_1} |F'(x)|\, \mathbf{1}_{\{|F(x)| < \delta\}} \, dx . $$
Due to Tonelli's theorem (nonnegative integrands), we can interchange the integral over $x$ and the expectation. We obtain
$$ E\, N(F) \;\le\; \liminf_{\delta \to 0} \int_{x_0}^{x_1} J_\delta(x)\, dx, $$
where we have put
$$ J_\delta(x) \;:=\; \frac{1}{2\delta} \int_{-\delta}^{\delta} E\,\big(|F'(x)| \,\big|\, F(x) = a\big)\, \rho_{F(x)}(a) \, da . $$
By assumption, the integrand is upper bounded by $g(x)$ for almost all $a \in (-\varepsilon, \varepsilon)$, hence we obtain $J_\delta(x) \le g(x)$ for $\delta < \varepsilon$. Therefore,
$$ E\, N(F) \;\le\; \int_{x_0}^{x_1} g(x)\, dx . $$
Finally, $E\, \#\{x \in [x_0, x_1] : F(x) = 0\} = E\, (N(F))$, since $F(x_0) = 0$ and $F(x_1) = 0$ happen with probability zero.
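As a quick numerical sanity check of the counting formula in Lemma 3.3 (a sketch; the polynomial and the values of $\delta$ below are arbitrary choices), one can compare the zero count of a fixed polynomial with the Kac-type integral:

```python
import numpy as np

# f has three zeros in (0, 2): x = 0.5, 1.0, 1.5
f = np.poly1d([1.0, -3.0, 2.75, -0.75])  # (x - 0.5)(x - 1)(x - 1.5)
df = f.deriv()

def kac_integral(f, df, x0, x1, delta, n=2_000_000):
    # (1/(2*delta)) * integral of |f'| over {|f| < delta}, on a fine grid
    x = np.linspace(x0, x1, n)
    integrand = np.abs(df(x)) * (np.abs(f(x)) < delta)
    return integrand.sum() * (x1 - x0) / n / (2 * delta)

for delta in [1e-1, 1e-2, 1e-3]:
    print(delta, kac_integral(f, df, 0.0, 2.0, delta))  # -> approaches 3
```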

Conditional expectations of random linear combinations
Throughout, we assume that $u_1, \ldots, u_m$ are independent real random variables having the densities $\varphi_1, \ldots, \varphi_m$, respectively. We fix real weights $w_1, \ldots, w_m$, not all zero, and study the random variable $f := w_1 u_1 + \cdots + w_m u_m$.
We shall derive bounds for the quantity $E\,(|u_i| \mid f = a)\, \rho_f(a)$. Since $\nabla f = w \neq 0$, there is no singular value of $f$. We begin with a simple bound on the density $\rho_f$ of $f$. It is only useful if the densities $\varphi_i$ are bounded (which is not the case for $\varphi = \varpi_k$).
Lemma 4.1. For all $a \in \mathbb{R}$ and all $i$ with $w_i \neq 0$ we have $\rho_f(a) \le \|\varphi_i\|_\infty / |w_i|$.
Proof. Suppose $w_1 \neq 0$. By (2.5) we have
$$ \rho_f(a) \;=\; \frac{1}{|w_1|} \int_{\mathbb{R}^{m-1}} \varphi_1\Big(\frac{a - w_2 u_2 - \cdots - w_m u_m}{w_1}\Big)\, \varphi_2(u_2) \cdots \varphi_m(u_m) \, du_2 \cdots du_m, $$
which we can bound as $\rho_f(a) \le \|\varphi_1\|_\infty / |w_1|$. Since the same argument works for $w_i$, this finishes the proof.
Definition 4.2. We call a probability density $\varphi$ on $\mathbb{R}$ convenient if $\varphi$ is monotonically decreasing on $(0, \infty)$ and symmetric around the origin, i.e., $\varphi(-u) = \varphi(u)$ for all $u \in \mathbb{R}$. Moreover, we require that $E_\varphi := \int_{\mathbb{R}} |u| \varphi(u)\, du$ is finite.
Clearly, a distribution with a convenient density $\varphi$ is centered: $\int_{\mathbb{R}} u \varphi(u)\, du = 0$. The densities $\varpi_k$ of the products of independent Gaussian random variables provide examples of convenient densities (see Section 2.2). Note that $E_{\varpi_k} = (E_\varphi)^k \le 1$ with $\varphi$ denoting the density of the standard Gaussian distribution.
Lemma 4.3. Let $\varphi$ and $\psi$ be densities on $\mathbb{R}$ and assume that $\varphi$ is convenient. Then:
(1) $|u| \varphi(u) \le \frac12$ for all $u \in \mathbb{R}$;
(2) $\int_{\mathbb{R}} |u| \varphi(u)\, \psi(a - u)\, du \le \frac12$ for all $a \in \mathbb{R}$.
Proof. For $u > 0$ we write $u \varphi(u) = \int_0^u \varphi(u)\, dv$. Now we use that $\varphi$ is monotonically decreasing on $(0, \infty)$ to upper bound this by $\int_0^u \varphi(v)\, dv \le \frac12$. The assertion (1) follows by the symmetry of $\varphi$. Assertion (2) is immediate from (1), since the left-hand side is at most $\sup_u |u|\varphi(u) \cdot \int_{\mathbb{R}} \psi \le \frac12$.
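As a concrete instance (a routine computation, added here for illustration): the standard Gaussian density $\varphi$ is convenient, with
$$ E_\varphi \;=\; \int_{\mathbb{R}} |u|\, \frac{e^{-u^2/2}}{\sqrt{2\pi}}\, du \;=\; \sqrt{\frac{2}{\pi}} \;\approx\; 0.80, \qquad \sup_{u} |u| \varphi(u) \;=\; \varphi(1) \;=\; \frac{1}{\sqrt{2\pi e}} \;\approx\; 0.24 \;\le\; \frac12 , $$
in accordance with Lemma 4.3(1); moreover $E_{\varpi_k} = (2/\pi)^{k/2} \le 1$, as noted above.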
Proposition 4.4. Consider $f = w_1 u_1 + \cdots + w_m u_m$, where $(w_1, \ldots, w_m) \neq 0$. If the density $\varphi_i$ of $u_i$ is convenient, then we have for any $a \in \mathbb{R}$
$$ E\,(|u_i| \mid f = a)\, \rho_f(a) \;\le\; \frac{1}{2 |w_i|} . $$
Proof. We begin with a general observation. Let $v_1$ and $v_2$ be independent random variables with the densities $\psi_1$ and $\psi_2$, and assume $\psi_1$ to be convenient. Consider the sum $g(v_1, v_2) := v_1 + v_2$. By (2.5) we have for $a \in \mathbb{R}$,
$$ E\,(|v_1| \mid g = a)\, \rho_g(a) \;=\; \int_{\mathbb{R}} |v|\, \psi_1(v)\, \psi_2(a - v)\, dv \;\le\; \frac12 , $$
where the inequality is due to Lemma 4.3(2). Applying this observation to $v_1 := w_i u_i$ and $v_2 := \sum_{j \neq i} w_j u_j$ yields the assertion, noting that the density of $w_i u_i$ is again convenient and $E\,(|u_i| \mid f = a) = |w_i|^{-1} E\,(|v_1| \mid f = a)$.
We now provide another bound on the conditional expectation, which is better for small weights $w_i$. For this we need a stronger assumption on the densities. We will have to deal with unbounded densities, namely with the density $\varpi_k$ of the product of $k \ge 2$ standard Gaussian random variables. Lemma 2.5 will allow us to apply the following result to these densities.
Proposition 4.5. Suppose that the densities $\varphi_2, \ldots, \varphi_m$ are convenient with $E_{\varphi_i} \le B$, and that
$$ \varphi_1(u) \;\le\; C |u|^{\delta - 1} \quad \text{for all } u \in \mathbb{R}^* \qquad (4.1)$$
for some constants $C > 0$ and $0 < \delta \le 1$. Then, for all $w_2, \ldots, w_m \in \mathbb{R}$, the random linear combination $f := u_1 + w_2 u_2 + \cdots + w_m u_m$ satisfies for $i \ge 2$ and all $a \in \mathbb{R}$,
$$ E\,(|u_i| \mid f = a)\, \rho_f(a) \;\le\; C (\delta^{-1} + B)\, |w_i|^{\delta - 1} . $$
Proof. Using the symmetry of $\varphi_i$, we can assume w.l.o.g. that all the weights $w_i$ are positive. We first provide the proof in the case $m = 2$. Let $f = u_1 + w u_2$ with $w > 0$. By (2.5) we have
$$ E\,(|u_2| \mid f = a)\, \rho_f(a) \;=\; \int_{\mathbb{R}} |u_2|\, \varphi_1(a - w u_2)\, \varphi_2(u_2)\, du_2 . \qquad (4.2)$$
By assumption, we have $\varphi_1(a - w u_2) \le C |a - w u_2|^{\delta - 1}$ for all $u_2 \in \mathbb{R}$. Using this, we obtain
$$ E\,(|u_2| \mid f = a)\, \rho_f(a) \;\le\; C w^{\delta - 1} \int_{\mathbb{R}} \Big|\frac{a}{w} - u_2\Big|^{\delta - 1} |u_2|\, \varphi_2(u_2)\, du_2 . $$
We bound this integral by splitting according to whether $|\frac{a}{w} - u_2|$ is smaller or larger than one. Using that $|u_2| \varphi_2(u_2) \le \frac12$, which holds since $\varphi_2$ is convenient (see Lemma 4.3(1)), we get
$$ \int_{|\frac{a}{w} - u_2| < 1} \Big|\frac{a}{w} - u_2\Big|^{\delta - 1} |u_2|\, \varphi_2(u_2)\, du_2 \;\le\; \frac12 \int_{-1}^{1} |s|^{\delta - 1}\, ds \;=\; \delta^{-1}, $$
while on the complementary region $|\frac{a}{w} - u_2|^{\delta - 1} \le 1$, so that the corresponding integral is at most $\int_{\mathbb{R}} |u_2| \varphi_2(u_2)\, du_2 = E_{\varphi_2} \le B$. We have thus shown that $E\,(|u_2| \mid f = a)\, \rho_f(a) \le C' w^{\delta - 1}$, where $C' := C(\delta^{-1} + B)$, settling the case $m = 2$.
We now turn to the general case $m \ge 2$. Let $f := u_1 + w_2 u_2 + \cdots + w_m u_m$ and w.l.o.g. $i = 2$. As for (4.2),
$$ E\,(|u_2| \mid f = a)\, \rho_f(a) \;=\; \int_{\mathbb{R}^{m-2}} \Big( \int_{\mathbb{R}} |u_2|\, \varphi_1\big(a - {\textstyle\sum_{j \ge 3}} w_j u_j - w_2 u_2\big)\, \varphi_2(u_2)\, du_2 \Big) \prod_{j=3}^m \varphi_j(u_j)\, du_j . $$
We bound the inner integral using the case $m = 2$ and obtain
$$ E\,(|u_2| \mid f = a)\, \rho_f(a) \;\le\; C' w_2^{\delta - 1} \int_{\mathbb{R}^{m-2}} \prod_{j=3}^m \varphi_j(u_j)\, du_j \;=\; C' w_2^{\delta - 1}, $$
which finishes the proof.

Random linear combinations of functions
Throughout this section we fix analytic functions $w_1, \ldots, w_m : [x_0, x_1] \to \mathbb{R}$ and study for $u \in \mathbb{R}^m$ their linear combination
$$ F(x) \;:=\; w_1(x) u_1 + \cdots + w_m(x) u_m . $$
We assume that $w_1, \ldots, w_m$ do not have a common zero in $[x_0, x_1]$. Note that $\nabla F(x) = (w_1(x), \ldots, w_m(x)) \neq 0$ for all $x$.
Lemma 5.1. The set of $u \in \mathbb{R}^m$ such that $\sum_{i=1}^m w_i(x) u_i$ has infinitely many zeros in $[x_0, x_1]$ has measure zero.
Proof. W.l.o.g. we can assume that $w_1, \ldots, w_k$ is a basis of the span of $w_1, \ldots, w_m$, where $k \ge 1$; write $w_i = \sum_{j=1}^k \lambda_{ij} w_j$ for $i > k$, so that $F(x) = \sum_{j=1}^k v_j w_j(x)$ with $v_j := u_j + \sum_{i=k+1}^m \lambda_{ij} u_i$.
If the analytic function $F(x)$ has infinitely many zeros in $[x_0, x_1]$, then it must vanish identically, and thus $v_j = 0$ for all $j \le k$. Since the set of $u \in \mathbb{R}^m$ satisfying these conditions lies in a lower dimensional subspace, the assertion follows.
We note that any family of polynomials without common zeros in $[x_0, x_1]$ satisfies the above assumptions. For instance, we can take the family of monomials $w_i(x) = x^{d_i}$ with $d_1 = 0 \le d_2 \le \ldots \le d_m$, which amounts to studying the random fewnomial
$$ F(x) \;=\; \sum_{i=1}^m u_i x^{d_i} . $$
We assume now that $u_1, \ldots, u_m$ are independent real random variables with the densities $\varphi_1, \ldots, \varphi_m$ and consider the random linear combination $F(x)$. (Notationally, we again drop the dependence on $u$.) Our goal is to bound the expected number of real zeros of $F$ via the Rice inequality.
We begin with a simple estimate.
Proposition 5.2. Suppose that $w_1 = 1$, that $\|\varphi_1\|_\infty \le A$, and that $E_{\varphi_i} \le B$ for $2 \le i \le m$. Then for all $x \in [x_0, x_1]$, all $a \in \mathbb{R}$, and all $i \ge 2$,
$$ E\,(|u_i| \mid F(x) = a)\, \rho_{F(x)}(a) \;\le\; AB . $$
Moreover, we have
$$ E\, \#\{x \in [x_0, x_1] : F(x) = 0\} \;\le\; AB \sum_{i=2}^m \int_{x_0}^{x_1} |w_i'(x)|\, dx . $$
Proof. Fix $x$ and, say, $i = 2$, and abbreviate $w_j := w_j(x)$. Then $u_2$ and $v := u_1 + w_3 u_3 + \cdots + w_m u_m$ are independent random variables and $F(x) = w_2 u_2 + v$. Let $\vartheta$ denote the density of $v$. By Lemma 4.1 (applied with the weight $1$ of $u_1$) we have $\|\vartheta\|_\infty \le A$. Hence, by (2.5),
$$ E\,(|u_2| \mid F(x) = a)\, \rho_{F(x)}(a) \;=\; \int_{\mathbb{R}} |u_2|\, \varphi_2(u_2)\, \vartheta(a - w_2 u_2)\, du_2 \;\le\; A\, E_{\varphi_2} \;\le\; AB . $$
The same bound holds for all $u_i$ with $i \ge 2$ and the first assertion follows. Since $w_1' = 0$, it implies $E\,(|F'(x)| \mid F(x) = a)\, \rho_{F(x)}(a) \le AB \sum_{i=2}^m |w_i'(x)|$. By the assumptions on the functions $w_i$ made at the beginning of Section 5, we can apply Theorem 3.2 and the second assertion follows.
The following corollary is of independent interest.
Corollary 5.3. In the situation of Proposition 5.2, consider the case of monomials $w_i(x) = x^{d_i}$ with $d_1 = 0 \le d_2 \le \ldots \le d_m$. Since $\int_0^1 |w_i'(x)|\, dx = 1$, we obtain
$$ E\, \#\{x \in [0,1] : F(x) = 0\} \;\le\; AB\,(m - 1) , $$
which can be seen as a probabilistic version of Descartes' rule. This applies in particular to random fewnomials $\sum_{i=1}^m u_i x^{d_i}$ with independent standard Gaussian coefficients; see [5].
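A quick simulation matches this (illustrative only; the degree pattern is an arbitrary choice): for standard Gaussian coefficients one may take $A = (2\pi)^{-1/2}$ and $B = \sqrt{2/\pi}$, so that the bound above reads $(m-1)/\pi$ for zeros in $[0,1]$.

```python
import numpy as np

rng = np.random.default_rng(2)

def real_zeros_in_01(coeffs_by_degree):
    p = np.poly1d(coeffs_by_degree[::-1])  # poly1d wants highest degree first
    roots = p.roots
    real = roots[np.abs(roots.imag) < 1e-9].real
    return int(np.sum((0 < real) & (real < 1)))

m, degrees, trials = 5, [0, 3, 7, 19, 40], 4000
counts = []
for _ in range(trials):
    coeffs = np.zeros(max(degrees) + 1)
    coeffs[degrees] = rng.standard_normal(m)  # random fewnomial coefficients
    counts.append(real_zeros_in_01(coeffs))
print("empirical mean:", np.mean(counts), " bound (m-1)/pi:", (m - 1) / np.pi)
```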
Following Proposition 4.5, we now provide an estimate which is better for small values of $w_i(x)$. It is relevant that this does not require the densities $\varphi_i$ to be bounded. This estimate can be applied to the distributions of products of independent Gaussians, which will be of importance for the proof of the main result.
Proposition 5.5. Suppose $u_i$ has a convenient density $\varphi_i$ with $E_{\varphi_i} \le B$, for $i = 2, \ldots, m$. Further, assume there are $C \ge 1$ and $0 < \delta \le 1$ such that the density $\varphi_1$ of $u_1$ satisfies $\varphi_1(u) \le C |u|^{\delta - 1}$ for all $u$. Then, for all $w_2, \ldots, w_m$, the random linear combination $F(x) := u_1 + \sum_{i=2}^m w_i(x) u_i$ satisfies for all $x \in [x_0, x_1]$ and all $a \in \mathbb{R}$:
$$ E\,\big(|F'(x)| \,\big|\, F(x) = a\big)\, \rho_{F(x)}(a) \;\le\; C(\delta^{-1} + B) \sum_{i=2}^m |w_i'(x)|\, \min\big\{ |w_i(x)|^{\delta - 1}, |w_i(x)|^{-1} \big\} . $$
Proof. Put $C' := C(\delta^{-1} + B)$. Proposition 4.5 gives for $i \ge 2$, $a \in \mathbb{R}$, and $x \in [x_0, x_1]$,
$$ E\,(|u_i| \mid F(x) = a)\, \rho_{F(x)}(a) \;\le\; C' |w_i(x)|^{\delta - 1} . $$
On the other hand, Proposition 4.4 gives
$$ E\,(|u_i| \mid F(x) = a)\, \rho_{F(x)}(a) \;\le\; \frac{1}{2 |w_i(x)|} . $$
Therefore, since $C' \ge C \ge 1$,
$$ E\,(|u_i| \mid F(x) = a)\, \rho_{F(x)}(a) \;\le\; C' \min\big\{ |w_i(x)|^{\delta - 1}, |w_i(x)|^{-1} \big\} . $$
The assertion now follows from $E\,(|F'(x)| \mid F(x) = a)\, \rho_{F(x)}(a) \le \sum_{i=2}^m |w_i'(x)|\, E\,(|u_i| \mid F(x) = a)\, \rho_{F(x)}(a)$, noting $w_1' = 0$.
In order to make effective use of Proposition 5.5 for certain structured weight functions having product form, we introduce the following notion, related to the total variation $\int_0^1 |q'(x)|\, dx$ of a function $q$.
Definition 5.6. The logarithmic variation of a differentiable function $q : [x_0, x_1] \to \mathbb{R}_{>0}$ is defined as
$$ \mathrm{LV}(q) \;:=\; \int_{x_0}^{x_1} \frac{|q'(x)|}{q(x)}\, dx . $$
The logarithmic variation has the following basic properties, whose proof is obvious.
Lemma 5.7. For positive differentiable functions $q_1, q_2, q$ on $[x_0, x_1]$ and $c > 0$:
(1) $\mathrm{LV}(q_1 q_2) \le \mathrm{LV}(q_1) + \mathrm{LV}(q_2)$, $\mathrm{LV}(q^{-1}) = \mathrm{LV}(q)$, and $\mathrm{LV}(q^c) = c\, \mathrm{LV}(q)$;
(2) if $q$ is monotonic, then $\mathrm{LV}(q) = |\ln q(x_1) - \ln q(x_0)|$.
For reasons to become clear in the next section, we assign to a finite subset $S \subseteq \mathbb{N}$ the sparse sum of squares with "support" $S$, defined as the polynomial
$$ \alpha_S(x) \;:=\; \sum_{s \in S} x^{2s} . $$
We will assume $0 \in S$, hence $\alpha_S(x) \ge 1$ for all $x \in \mathbb{R}$ and $\alpha_S(0) = 1$. Moreover, $\alpha_S(1) = |S|$.
Assume now we have a family of subsets $S_i \subseteq \mathbb{N}$ satisfying $0 \in S_i$ and $|S_i| \le t$, for $1 \le i \le \ell$. We choose $1 \le k \le \ell$ and define the function
$$ q(x) \;:=\; \prod_{i=1}^{k} \alpha_{S_i}(x)^{\frac12} \cdot \prod_{i=k+1}^{\ell} \alpha_{S_i}(x)^{-\frac12} . $$
Proposition 5.8. Let $d \in \mathbb{N}$ and $0 < \delta \le 1$. The function $w : (0,1] \to \mathbb{R}_{>0}$, $w(x) := q(x) x^d$, satisfies:
(1) $\mathrm{LV}(q) \le \frac{\ell}{2} \ln t$;
(2) $\int_0^1 |w'(x)| \min\{w(x)^{\delta - 1}, w(x)^{-1}\}\, dx \le 2\, \mathrm{LV}(q) + kt + \delta^{-1}$.
Proof. 1. By Lemma 5.7, we have $\mathrm{LV}(\alpha_{S_i}) \le \ln t$, since $\alpha_{S_i}$ is monotonically increasing on $[0,1]$. Moreover, $\alpha_{S_i}(0) = 1$, and $\alpha_{S_i}(1) \le t$. Again using Lemma 5.7, we get $\mathrm{LV}(q) \le \frac12 \sum_{i=1}^{\ell} \mathrm{LV}(\alpha_{S_i}) \le \frac{\ell}{2} \ln t$, showing the first assertion.
2. If $d = 0$, then $w = q$ and the integral is bounded by $\mathrm{LV}(q)$; so assume $d \ge 1$. We will choose $\varepsilon = \varepsilon(k, t, d) \in (0, 1)$ and bound
$$ \int_0^1 |w'| \min\{w^{\delta-1}, w^{-1}\}\, dx \;\le\; \int_0^{\varepsilon} |w'|\, w^{\delta-1}\, dx + \int_{\varepsilon}^1 \frac{|w'|}{w}\, dx . $$
For bounding the left-hand integral, we take logarithmic derivatives to get from $w(x) = q(x) x^d$
$$ \frac{w'(x)}{w(x)} \;=\; \frac{q'(x)}{q(x)} + \frac{d}{x} . \qquad (5.1)$$
Integrating over $[0, \varepsilon]$ and using that $1 \le \alpha_{S_i} \le t$ on $[0,1]$ implies $q^\delta \le t^{\frac{k\delta}{2}}$, we obtain
$$ \int_0^{\varepsilon} |w'|\, w^{\delta-1}\, dx \;\le\; \sup_{[0,\varepsilon]} \big( q^\delta x^{d\delta} \big) \cdot \mathrm{LV}(q) \;+\; t^{\frac{k\delta}{2}}\, d \int_0^{\varepsilon} x^{d\delta - 1}\, dx \;\le\; t^{\frac{k\delta}{2}}\, \varepsilon^{d\delta} \big( \mathrm{LV}(q) + \delta^{-1} \big) . $$
We now choose $\varepsilon := e^{-\frac{kt}{d}}$. Then $\varepsilon^d = e^{-kt}$ and
$$ t^{\frac{k\delta}{2}}\, \varepsilon^{d\delta} \;=\; e^{\frac{k\delta}{2} \ln t \,-\, kt\delta} \;\le\; 1, $$
since $\ln t \le t$. With this choice of $\varepsilon$, we therefore have $\int_0^{\varepsilon} |w'|\, w^{\delta-1}\, dx \le \mathrm{LV}(q) + \delta^{-1}$. We next bound the integral over $[\varepsilon, 1]$, again using (5.1):
$$ \int_{\varepsilon}^1 \frac{|w'|}{w}\, dx \;\le\; \mathrm{LV}(q) + d \ln \frac{1}{\varepsilon} \;=\; \mathrm{LV}(q) + kt , $$
where we used $d \ln \frac{1}{\varepsilon} = kt$ by our choice of $\varepsilon$. Altogether, we obtain
$$ \int_0^1 |w'| \min\{w^{\delta-1}, w^{-1}\}\, dx \;\le\; 2\, \mathrm{LV}(q) + kt + \delta^{-1} , $$
completing the proof.
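For a concrete illustration (a toy instance, not from the formal development): take $S = \{0, 1, 3\}$, so $t = 3$ and
$$ \alpha_S(x) \;=\; 1 + x^2 + x^6, \qquad \mathrm{LV}(\alpha_S)\big|_{[0,1]} \;=\; \ln \alpha_S(1) - \ln \alpha_S(0) \;=\; \ln 3 , $$
by monotonicity (Lemma 5.7(2)); a quotient such as $q = (\alpha_{S_1} \alpha_{S_2})^{1/2}\, \alpha_{S_3}^{-1/2}$ with three such supports then has $\mathrm{LV}(q) \le \frac{3}{2} \ln 3$ by Lemma 5.7(1).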

Sum of products of sparse polynomials
Let us first fix some notation. We assign to a finite subset $S \subseteq \mathbb{Z}$ of exponents and a collection of coefficients $u_s$, for $s \in S$, the Laurent polynomial
$$ f_S(x) \;:=\; \sum_{s \in S} u_s x^s . $$
This allows us to achieve a normalization by shifting exponents: let $d$ be the minimum of $S$ and put $S' := S - d$. Then $S' \subseteq \mathbb{N}$ and $0 \in S'$. Since $f_S(x) = x^d f_{S'}(x)$, the functions $f_S$ and $f_{S'}$ have the same number of nonzero roots.
Let now $k_1, \ldots, k_m$ and $t$ be positive integers and fix supports $S_{ij} \subseteq \mathbb{Z}$ for $1 \le i \le m$ and $1 \le j \le k_i$ such that $|S_{ij}| \le t$. We study the number of nonzero real roots of the sum of products $\sum_{i=1}^m \prod_{j=1}^{k_i} f_{S_{ij}}(x)$. By shifting exponents, we assume without loss of generality
$$ \forall i, j \quad S_{ij} \subseteq \mathbb{N}, \quad 0 \in S_{ij}, \quad |S_{ij}| \le t, $$
and consider
$$ F(x) \;:=\; \sum_{i=1}^m x^{d_i} \prod_{j=1}^{k_i} f_{ij}(x), \qquad f_{ij} := f_{S_{ij}}, \qquad (6.1)$$
where we allow for a degree pattern $0 = d_1 \le d_2 \le \ldots \le d_m$ consisting of natural numbers $d_i$.
The probabilistic setting is as follows. For each $i, j$ and $s \in S_{ij}$ we fix a convenient probability density $\varphi_{ijs}$ on $\mathbb{R}$ and assume that there are constants $A, B$ such that
$$ \forall i, j, s \quad \|\varphi_{ijs}\|_\infty \le A, \quad E_{\varphi_{ijs}} \le B . \qquad (6.2)$$
We suppose that we have random univariate polynomials $f_{ij}(x) = \sum_{s \in S_{ij}} u_{ijs} x^s$ with independent real coefficients $u_{ijs}$ having the convenient density $\varphi_{ijs}$. The goal is to study the expected number of real zeros of the resulting random polynomial $F$. We assign to the support $S_{ij}$ the following generating functions:
$$ \beta_{ij}(x) \;:=\; \sum_{s \in S_{ij}} x^s . \qquad (6.3)$$
Note that $E\, F(x) = 0$, since $E\,(u_{ijs}) = 0$. The next lemma makes sure we can apply Theorem 3.2 in the above setting. The density $\rho_{F(x)}(a)$ is defined at every nonzero $a \in \mathbb{R}$. However, it is undefined at $a = 0$, unless $k_1 = \ldots = k_m = 1$.
Lemma 6.1. Let $x \in \mathbb{R}$ and view $F(x)$ as a polynomial function $\mathbb{R}^N \to \mathbb{R}$ of the coefficient vector $u = (u_{ijs})$. Then:
(a) $F(x)$ is surjective;
(b) $0$ is the only possible singular value of $F(x)$; it is indeed a singular value iff $k_i > 1$ for some $i$;
(c) the set $\{u \in \mathbb{R}^N : \nabla F(x)(u) = 0\}$ has measure zero.
Proof. We fix $x \in \mathbb{R}$. (a) After specializing $u_{ijs} := 0$ for $s \neq 0$, $F(x)$ becomes the function mapping $(u_{ij0})$ to $\sum_{i=1}^m x^{d_i}\, u_{i10} \cdots u_{ik_i 0}$, which clearly is a surjective function (recall $d_1 = 0$).
(b) The $f_{ij}(x)$ are linear functions in disjoint sets of variables and each has a nonzero coefficient. Therefore, their gradients, viewed as vectors in $\mathbb{R}^N$, are linearly independent. Suppose now $u = (u_{ijs}) \in \mathbb{R}^N$ is a singular point of $F(x)$. We have (dropping the argument $u$)
$$ \nabla F(x) \;=\; \sum_{i=1}^m x^{d_i} \sum_{j=1}^{k_i} f_{i,1} \cdots f_{i,j-1}\, (\nabla f_{i,j})\, f_{i,j+1} \cdots f_{i,k_i} . $$
Since the $\nabla f_{i,j}$ are linearly independent, we must have $f_{i,1} \cdots f_{i,j-1}\, f_{i,j+1} \cdots f_{i,k_i} = 0$ for all $i, j$ (with the terms for which $x^{d_i} = 0$ being trivial). This means that for all relevant $i$ there are distinct $j$ and $j'$ such that $f_{ij}(x) = 0$ and $f_{ij'}(x) = 0$. In particular, we have $F(x)(u) = 0$ for such $u$ and hence $0$ is the only possible singular value of $F(x)$. If $k_i > 1$ for some $i$, then $u = 0$ is a singular point of $F(x)$ and thus $0$ is a singular value.
(c) This follows from the reasoning in (b): the singular points of $F(x)$ lie in a finite union of proper linear subspaces, which has measure zero.
For applying Theorem 3.2, the main work consists now in exhibiting a "small" integrable function $g(x)$ that upper bounds the conditional expectations. We embark on this next.
6.1. Products of sparse polynomials. We analyze here the case $m = 1$ of one product of random $t$-sparse polynomials $f_j(x) = \sum_{s \in S_j} u_{js} x^s$, where for convenience we drop the index $i = 1$. In particular, we write $\beta_j(x) := \sum_{s \in S_j} x^s$. So we assume $0 \in S_j$ and $|S_j| \le t$ for all $j$, and consider the product $g(x) := f_1(x) \cdots f_k(x)$.
By Lemma 6.1, every nonzero $a \in \mathbb{R}$ is a regular value of the map $g(x) : \mathbb{R}^N \to \mathbb{R}$, thus the density $\rho_{g(x)}(a)$ is well defined and so are the conditional expectations with respect to the condition $g(x) = a$, provided $\rho_{g(x)}(a) > 0$.
Lemma 6.2. For all $x \in \mathbb{R}$ and all nonzero $a \in \mathbb{R}$ we have
$$ E\,\Big( \Big|\frac{f_j'(x)}{f_j(x)}\Big| \,\Big|\, g(x) = a \Big)\, \rho_{g(x)}(a) \;\le\; \frac{AB\, \beta_j'(x)}{|a|} \qquad \text{and} \qquad E\,\big( |g'(x)| \,\big|\, g(x) = a \big)\, \rho_{g(x)}(a) \;\le\; AB \sum_{j=1}^k \beta_j'(x) . $$
Proof. Fix $x \in \mathbb{R}$ and consider the random variables $y_j := f_j(x)$ and $z_j := f_j'(x)$. If $\psi_j(y_j, z_j)$ denotes the joint density of $(y_j, z_j)$, then by the independence of $(y_1, z_1), \ldots, (y_k, z_k)$, the probability density of $(y, z) \in \mathbb{R}^k \times \mathbb{R}^k$ is given by $\psi_1(y_1, z_1) \cdots \psi_k(y_k, z_k)$. We write $\psi_j(y_j) := \int_{\mathbb{R}} \psi_j(y_j, z_j)\, dz_j$ for the marginal density of $y_j$. Note that $g(x) = y_1 \cdots y_k$.
By symmetry, it suffices to prove the first statement for $j = 1$. We obtain from (2.8), applied to $Z = |z_1|/|y_1|$,
$$ E\,\Big( \frac{|z_1|}{|y_1|} \,\Big|\, g(x) = a \Big)\, \rho_{g(x)}(a) \;=\; \int_{\mathbb{R}^{k-1}} \frac{h_1(y_1)}{|y_1|}\, \psi_2(y_2) \cdots \psi_k(y_k)\, \frac{dy_2}{|y_2|} \cdots \frac{dy_k}{|y_k|} , \qquad (6.5)$$
where $y_1 := a/(y_2 \cdots y_k)$ and $h_1(b) := \int_{\mathbb{R}} |z_1|\, \psi_1(b, z_1)\, dz_1 = E\,(|f_1'(x)| \mid f_1(x) = b)\, \rho_{f_1(x)}(b)$. Proposition 5.2 (more precisely, the conditional bound established in its proof), applied to the random linear combination $f_1(x) = \sum_{s \in S_1} u_{1s} x^s$, implies
$$ h_1(b) \;\le\; AB\, \beta_1'(x) \qquad \text{for all } b \in \mathbb{R} . $$
Here we essentially use that, due to the assumption $0 \in S_1$, the polynomial $f_1(x) = u_{10} + \ldots$ has a constant term. Using this bound and $\frac{1}{|y_1|} = \frac{|y_2 \cdots y_k|}{|a|}$, we get from (6.5) that the integral over $C_a$ simplifies to
$$ E\,\Big( \frac{|z_1|}{|y_1|} \,\Big|\, g(x) = a \Big)\, \rho_{g(x)}(a) \;\le\; AB\, \beta_1'(x) \int_{(y_2, \ldots, y_k) \in \mathbb{R}^{k-1}} \frac{|y_2 \cdots y_k|}{|a|}\, \psi_2(y_2) \cdots \psi_k(y_k)\, \frac{dy_2}{|y_2|} \cdots \frac{dy_k}{|y_k|} \;=\; \frac{AB\, \beta_1'(x)}{|a|} . $$
Therefore, indeed the first statement holds. The same argument works with $f_j$ instead of $f_1$, so that we have proved the first statement. In order to show the second statement, taking logarithmic derivatives, we get
$$ \frac{g'(x)}{g(x)} \;=\; \sum_{j=1}^k \frac{f_j'(x)}{f_j(x)} . $$
Therefore,
$$ E\,\big( |g'(x)| \,\big|\, g(x) = a \big)\, \rho_{g(x)}(a) \;\le\; |a| \sum_{j=1}^k E\,\Big( \Big|\frac{f_j'(x)}{f_j(x)}\Big| \,\Big|\, g(x) = a \Big)\, \rho_{g(x)}(a) . $$
Inserting here the bound of the first statement yields the second statement.
6.2. Polynomials with nonzero constant coefficient. We deal here with the special case $d_1 = \ldots = d_m = 0$. So we are in the situation where all the $f_{ij}$ almost surely have a nonzero constant coefficient. It turns out that this situation is considerably easier to analyze than the general case. The next result shows that the real τ-conjecture is true on average under the assumption $d_1 = \ldots = d_m = 0$, if we only count zeros in $[0,1]$. It is worthwhile noting that this result holds for any convenient distributions of the coefficients $u_{ijs}$ satisfying (6.2), as long as they are independent.
Theorem 6.3. Under the assumptions from the beginning of Section 6, the random polynomial $F(x) = \sum_{i=1}^m \prod_{j=1}^{k_i} f_{ij}(x)$ satisfies
$$ E\, \#\{x \in [0,1] : F(x) = 0\} \;\le\; AB\, (k_1 + \cdots + k_m)(t - 1) . $$
Proof. Lemma 6.1 guarantees that $(u, x) \mapsto F(x)(u)$ satisfies the assumptions of Theorem 3.2.
We are going to show that for all $x \in \mathbb{R}$ and all nonzero $a \in \mathbb{R}$,
$$ E\,\big( |F'(x)| \,\big|\, F(x) = a \big)\, \rho_{F(x)}(a) \;\le\; AB \sum_{i=1}^m \sum_{j=1}^{k_i} \beta_{ij}'(x) , \qquad (6.6)$$
where we recall that $\beta_{ij}(x)$ was defined in (6.3). Then, taking into account Lemma 6.1 and $\int_0^1 \beta_{ij}'(x)\, dx = \beta_{ij}(1) - \beta_{ij}(0) = |S_{ij}| - 1 \le t - 1$, the assertion will follow by Theorem 3.2. Towards proving (6.6), we put $g_i(x) := \prod_{j=1}^{k_i} f_{ij}(x)$, so that $F = g_1 + \cdots + g_m$. Lemma 6.2 gives for nonzero $b \in \mathbb{R}$ that
$$ E\,\big( |g_i'(x)| \,\big|\, g_i(x) = b \big)\, \rho_{g_i(x)}(b) \;\le\; AB \sum_{j=1}^{k_i} \beta_{ij}'(x) . \qquad (6.7)$$
For proving (6.6), it suffices to show that the same bound holds when conditioning on $F(x) = a$, namely
$$ E\,\big( |g_i'(x)| \,\big|\, F(x) = a \big)\, \rho_{F(x)}(a) \;\le\; AB \sum_{j=1}^{k_i} \beta_{ij}'(x) , \qquad (6.8)$$
since $|F'| \le \sum_i |g_i'|$. For showing this, we fix $1 \le i \le m$, w.l.o.g. $i = 1$. We put $y_i := g_i(x)$ and $z_i := g_i'(x)$, and denote by $\psi_i(y_i, z_i)$ the joint density of $(y_i, z_i)$. Moreover, we write $\psi_i(y_i) := \int_{\mathbb{R}} \psi_i(y_i, z_i)\, dz_i$ for the first marginal distribution. By construction, the pairs $(y_1, z_1), \ldots, (y_m, z_m)$ are independent. We claim that
$$ E\,(|z_1| \mid y_1 + \cdots + y_m = a)\, \rho_{y_1 + \cdots + y_m}(a) \;=\; \int_{\mathbb{R}} E\,(|z_1| \mid y_1 = b)\, \psi_1(b)\, \rho_{y_2 + \cdots + y_m}(a - b)\, db . \qquad (6.9)$$
Indeed, this follows from (2.5) by first conditioning on $y_1 = b$ and using the independence of the pairs $(y_i, z_i)$. Combining (6.9) with the bound (6.7) on $E\,(|z_1| \mid y_1 = b)\, \psi_1(b)$, which is uniform in $b$, and with $\int_{\mathbb{R}} \rho_{y_2 + \cdots + y_m}(a - b)\, db = 1$ yields (6.8). Summing (6.8) over $i$ proves (6.6) and thus the theorem.
6.3. Proof of main result. We specialize the setting described at the beginning of Section 6 to the case where all the coefficients $u_{ijs}$ are standard Gaussian. For $1 \le i \le m$ we define the auxiliary analytic weight functions
$$ q_i(x) \;:=\; \prod_{j=1}^{k_i} \alpha_{S_{ij}}(x)^{\frac12} \cdot \prod_{j=1}^{k_1} \alpha_{S_{1j}}(x)^{-\frac12} \qquad (6.11)$$
and put $w_i(x) := q_i(x) x^{d_i}$; note that $q_1 = 1$ and $w_1 = 1$. The relevance of these weights is that, for fixed $x$, the product $\prod_j f_{ij}(x)$ is distributed as $\gamma_1(x)\, q_i(x)\, u_i$, where $\gamma_1(x) := \prod_{j=1}^{k_1} \alpha_{S_{1j}}(x)^{1/2}$ and $u_i$ has the distribution $\varpi_{k_i}$, so that $F(x)/\gamma_1(x)$ is distributed as the random linear combination
$$ R(x) \;:=\; \sum_{i=1}^m u_i\, w_i(x) \qquad (6.12)$$
with independent coefficients $u_i$ having the distributions $\varpi_{k_i}$.
Proposition 6.4. For all $x \in (0, 1]$ and all nonzero $a \in \mathbb{R}$ we have
$$ E\,\big( |F'(x)| \,\big|\, F(x) = a \big)\, \rho_{F(x)}(a) \;\le\; \frac12 \sum_{i=1}^m \frac{|q_i'(x)|}{q_i(x)} \;+\; E\,\big( |R'(x)| \,\big|\, R(x) = a \big)\, \rho_{R(x)}(a) . $$
Proof. Conditioning on the values of the $f_{ij}(x)$ and splitting $F'(x)$ into the contribution coming from differentiating the weights and a remaining contribution, the latter is identified, after normalizing by $\gamma_1(x)$, with the derivative of the comparison model (6.12). Note that the right-hand contribution equals $E\,(|R'(x)| \mid R(x) = a)\, \rho_{R(x)}(a)$, as desired. In order to bound the left-hand sum, we can apply Proposition 4.4, since the densities $\varpi_{k_i}$ of the $u_i$ are convenient, and we thus obtain
$$ E\,(|u_i| \mid R(x) = a)\, \rho_{R(x)}(a) \;\le\; \frac{1}{2\, w_i(x)} . $$
This yields
$$ \sum_{i=1}^m \frac{|q_i'(x)|}{q_i(x)}\, w_i(x)\, E\,(|u_i| \mid R(x) = a)\, \rho_{R(x)}(a) \;\le\; \frac12 \sum_{i=1}^m \frac{|q_i'(x)|}{q_i(x)} . $$
Summarizing, we have shown that
$$ E\,\big( |F'(x)| \,\big|\, F(x) = a \big)\, \rho_{F(x)}(a) \;\le\; \frac12 \sum_{i=1}^m \frac{|q_i'(x)|}{q_i(x)} + E\,\big( |R'(x)| \,\big|\, R(x) = a \big)\, \rho_{R(x)}(a) , $$
which completes the proof.
We can finally provide the proof of the main result.
Proof of Theorem 1.1. The right-hand term in the statement of Proposition 6.4 can be bounded with Proposition 5.5. Indeed, due to Lemma 2.5 we know that $\varpi_{k_1}(a) \le e\, |a|^{\frac{1}{2k_1} - 1}$, so that assumption (4.1) holds with $C = e$ and $\delta = \frac{1}{2k_1}$; moreover, $E_{\varpi_{k_i}} \le 1 =: B$. Applying Proposition 6.4 thus implies for $x \in (0,1]$ and $a \in \mathbb{R}^*$, recalling that $w_i(x) := q_i(x) x^{d_i}$,
$$ E\,\big( |F'(x)| \,\big|\, F(x) = a \big)\, \rho_{F(x)}(a) \;\le\; g(x) \;:=\; \frac12 \sum_{i=1}^m \frac{|q_i'(x)|}{q_i(x)} \;+\; e\,(2k_1 + 1) \sum_{i=2}^m |w_i'(x)|\, \min\big\{ w_i(x)^{\frac{1}{2k_1} - 1}, w_i(x)^{-1} \big\} . \qquad (6.13)$$
The function $g(x)$ on the right-hand side of (6.13) is integrable: by Proposition 5.8(2),
$$ \int_0^1 g(x)\, dx \;\le\; \frac12 \sum_{i=1}^m \mathrm{LV}(q_i) \;+\; e\,(2k_1 + 1) \sum_{i=2}^m \big( 2\, \mathrm{LV}(q_i) + k_i t + 2 k_1 \big) \;<\; \infty . $$
By Proposition 5.8(1) we can bound $\mathrm{LV}(q_i) \le \frac12 (k_i + k_1) \ln t \le k \ln t$, where $k$ denotes the maximum of the $k_i$. Moreover, Theorem 3.2 can be applied (see Lemma 6.1) and yields
$$ E\, \#\{x \in [0,1] : F(x) = 0\} \;\le\; \int_0^1 g(x)\, dx \;=\; O\big(m k^2 t\big) . $$
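Spelling out the arithmetic behind the last step (with the constants as reconstructed above):
$$ \int_0^1 g \;\le\; \tfrac12\, m k \ln t \;+\; e\,(2k + 1) \sum_{i=2}^m \big( 2k \ln t + kt + 2k \big) \;\le\; \tfrac12\, m k t \;+\; e\,(2k + 1)\, m \cdot 5kt \;=\; O\big(m k^2 t\big), $$
using $\ln t \le t$ and $2k \ln t + kt + 2k \le 5kt$ (as $t, k \ge 1$).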
The number of zeros of $F$ in $[1, \infty)$ equals the number of zeros $x \in (0, 1]$ of $F(x^{-1})$. Moreover, $F(x^{-1})$ has the same structure as $F$, except that the supports $S_{ij}$ are replaced by $-S_{ij}$. Since we can shift the degrees without changing the number of positive zeros, we conclude that $E\, \#\{x \in [1, \infty) : F(x) = 0\}$ is bounded by $O(mk^2t)$ as well. Finally, replacing $x$ by $-x$ leaves the structure (1.1) and the distribution of the coefficients invariant, so the expected number of negative real zeros of $F$ obeys the same bound; also, $F(0) \neq 0$ almost surely. Altogether, the expected number of real zeros of $F$ is $O(mk^2t)$, which completes the proof of Theorem 1.1.