Although a large quantity of geochemical data on oceanic basalts has been collected, it is not sufficient to characterize the recycling processes between the Earth's mantle, on the one hand, and the crust and the lithosphere, on the other hand. In particular, it remains unclear why mid-oceanic ridge basalts (MORBs) are relatively homogeneous throughout the world, while oceanic island basalts (OIBs) present a wide spectrum of heterogeneities. Assuming whole mantle convection, it is often argued that the increase of viscosity with depth could be responsible for some stratification of the mixing properties of the mantle and consequently could generate the observed geochemical and isotopic differences between MORBs and OIBs. In this study we test this assumption by means of two-dimensional circulation models where passive tracers are advected. First, we quantify the ability of the upper and lower mantles to erase heterogeneities. Second, we simulate the evolutions of U, 4He, and 3He concentrations, taking into account the magmatic processes at ridges (differentiation, outgassing, and recycling). We conclude that the viscosity layering does not induce any vertical stratification of the mixing properties of the mantle. On the contrary, the partial segregation of the oceanic crust in the D″ layer would explain the generation of large zones of high 3He/4He ratios on top of D″. In our simulations these zones have a primitive He signature, although they have already been processed at ridges. Trapping oceanic crust in D″ would not only explain the presence of recycled components in hotspots and particularly in the high μ type hotspots [Hofmann and White, 1982] but would also suggest that the high 3He/4He ratios of volcanoes like Loihi could have their origin in ancient subducted lithosphere.
 Geochemical systematics has given a relatively clear picture of the compositional and isotopic differences between mid-oceanic ridge basalts (MORBs) and oceanic island basalts (OIBs). OIBs present wide spectra of isotopic heterogeneities. Some have an apparently primitive signature [Kaneoka and Takaoka, 1980; Kurz et al., 1982; Allègre et al., 1983], although the signature of recycled oceanic crust is at the same time ubiquitous [White, 1985; Blichert-Toft et al., 1999]. Variability is also found in MORBs but with a spread reduced by a factor of 3–4.
 The case of helium exemplifies the differences between MORBs and OIBs: the 3He/4He ratio, which is the ratio of a stable isotope to a radiogenic one, is equal to ∼8Ra (Ra stands for the atmospheric ratio) in MORBs with very small dispersion, whereas in OIBs it ranges from ∼0 to ∼35Ra [Hart and Zindler, 1989]. Undegassed, and consequently old, material has a high 3He/4He ratio because it has kept a large amount of 3He. Plumes like Hawaii, Iceland, Galápagos, and others have such a primitive signature. On the contrary, the helium ratio is low in oceanic (or continental) crust, even though 3He and 4He degas equally, because the crust is enriched in the incompatible elements 235U, 238U, and 232Th, which produce 4He. Plumes called high μ (HIMUs), like St. Helena or Tubuaï, have a He ratio lower than MORBs and thus tap an enriched, rather than depleted, reservoir.
 In the 1970s through 1980s most interpretations of the geochemical data were based on a layered convection model of the mantle: MORBs coming from a depleted, shallow, and young reservoir and OIBs coming from deeper and rather primitive reservoirs. From simple mass balance considerations the depleted reservoir was estimated to have a mass larger than, but of the same order as, the upper mantle [Jacobsen and Wasserburg, 1979; Allègre and Turcotte, 1985]. Although Anderson [1982] considered a rather different model with the various geochemical reservoirs located in the upper mantle, he also assumed that there was very little exchange between the lower and the upper mantle.
 The standard model of the 1980s with a primitive lower mantle is now untenable. The OIBs are too variable in composition to be explained by a uniform reservoir, and the presence of some ancient oceanic crust is documented in most cases. The tomographic studies [van der Hilst et al., 1997; Grand et al., 1997] detect high-velocity anomalies (interpreted as slabs) in the lower mantle, which imply an important stirring of the whole mantle. Moreover, several works [Richards and Engebretson, 1992; Ricard et al., 1993] have demonstrated that deep slab penetration into the lower mantle remarkably explains the large-scale gravity anomalies.
 A modified version of the two-layer model has recently been proposed [Kellogg et al., 1999]. It suggests that the mantle is not stratified below the transition zone but at a much larger depth. The bottom ∼1000 km of the mantle would consist of primitive and chemically dense material separated from the rest of the mantle by a thermochemical interface with large undulations. Although appealing, this model does not clearly explain the variability of OIB signatures and their crustal component, and the model requires a very particular mineralogy. The deep mantle has to be dense enough not to be entrained by convection without being easily detected by seismology, neither in radial models that suggest a uniform lower mantle nor by tomography.
 Assuming whole mantle convection, it is, however, difficult to understand the geochemical data. Some authors have simply studied the mixing properties of the mantle and how initial heterogeneities can survive convection. Their results are complex and sometimes contradictory. Christensen and Kellogg and Turcotte, using two-dimensional (2-D) numerical experiments, agree that no large-scale anomaly could survive after one billion years in a chaotically convective system. This conclusion has been disputed since the flows used in the previous calculations might be too active to be appropriate for the Earth [Gurnis and Davies, 1986; Davies, 1990]. The existence of large-scale plates might organize the flow and decrease its mixing ability. Mixing properties of a real 3-D flow can, however, be quite different from those in 2-D. Three-dimensional convection seems to have less efficient mixing properties [Schmalzl et al., 1995, 1996], and large unmixed zones might survive [Ferrachat and Ricard, 1998].
 More promising may be the studies where rather than erasing heterogeneities the problem has been stated in terms of generating heterogeneities. van Keken and Ballentine [1998, 1999] have used a 2-D cylindrically symmetric convection model mimicking degassing under oceanic ridges. They have shown that even a large viscosity increase with depth does not lead to a stronger degassing of the upper mantle than of the lower mantle.
 At least two mechanisms have been proposed to maintain the heterogeneities continuously created at ridges. Albarède proposed that the slab content in incompatible elements stored in the oceanic crust layer can be totally stripped off by fluid extraction during subduction. This mechanism concentrates the incompatible elements in the upper mantle and keeps a primitive He ratio in the lower mantle while barren slabs penetrate the lower mantle. This model, which leads to an enriched upper mantle, is, however, difficult to reconcile with the apparent depleted nature of the MORB source.
 Another process has been suggested by Hofmann and White [1982]. The oceanic crust, entrained in the deep mantle with the harzburgitic lithosphere, could segregate and accumulate in D″, at the core-mantle boundary. This could be justified by the fact that eclogitized basalts are slightly denser than the surrounding mantle at almost all depths [Ringwood and Irifune, 1988]. Christensen and Hofmann have tested these assumptions in a 2-D convective model and concluded that the process is viable from a fluid dynamic point of view and that enriched HIMU-type OIBs could indeed rise from ancient segregated oceanic crust. More recently, Coltice and Ricard have extended this model to show that it could also explain the apparent primitive He ratios of some Hawaiian-type OIBs if they derived from a mixture made mostly from depleted peridotitic old lithosphere.
 The aim of this paper is to test separately the role of the viscosity increase and the possible trapping of ancient oceanic crust in the D″, on both mixing efficiency and isotopic distribution of material in the case of He–U system. Convection of the mantle will be modeled with 2-D advective systems in a Cartesian geometry.
2. Model Features
 We simulate the convective mantle with a 2-D circulation model. We think that unless the generation and evolution of plate tectonics can be self-consistently modeled in thermomechanical numerical codes, simple Stokes flows driven by both boundary conditions and prescribed internal loads are as realistic to represent the mantle flow as many much more complex simulations. Therefore we do not solve the whole set of convection equations including heat conservation.
 We use a flow that satisfies the Navier-Stokes equation inside a box of depth H, width L = 3H, with periodic conditions and a free slip bottom boundary. The viscosity can increase with depth in a step-like manner between ηUM and ηLM. The flow is partially driven by internal density anomalies beneath the subduction zones in order to simulate cold slabs sinking in the whole mantle. The flow is also forced by diverging-converging velocities at the surface of the box. The configuration of these plate-like velocities simulates a ridge and two subduction zones on the two lateral faces. Time dependence is imposed by moving the position of the ridge, which is successively located at L/8, L/2, and 7L/8. The ridge jumps every time step T = 2H/U. Each plate is in mechanical equilibrium between the slab pull induced by internal loads and the mantle drag due to the imposed surface motion [Ricard and Vigny, 1989] (because our model is missing the ridge push force, we scale the slab pull to be only half of the mantle drag). The requirement of plate equilibrium imposes the plate velocities. When the ridge is at the center of the box, the two plates have the same velocity amplitude U. When the ridge is off-center, the larger plate is the slower.
 Scaling with U = 10 cm yr−1 and H = 3000 km, this model assumes that the transit time H/U is 30 Myr, that the two subduction zones are 9000 km apart, and that the ridge system reorganizes every 60 Myr. To simulate the evolution of the mantle through the Earth's history, ∼100 transit times are needed.
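The scaling arithmetic above can be checked with a few lines. This is only an illustrative sketch: the values of U, H, L = 3H, T = 2H/U, and the 100 transit times come from the text, while the variable names are ours.

```python
# Scaling values quoted in the text; variable names are illustrative only.
CM_PER_KM = 1.0e5

U_cm_per_yr = 10.0        # plate velocity scale U
H_km = 3000.0             # box depth H
L_km = 3.0 * H_km         # box width, L = 3H (distance between subduction zones)

# Transit time H/U, converted from years to Myr.
transit_time_myr = H_km * CM_PER_KM / U_cm_per_yr / 1.0e6
# The ridge jumps every T = 2H/U.
ridge_jump_myr = 2.0 * transit_time_myr
# Duration of a run of 100 transit times, in Gyr.
n_transit_times = 100
total_duration_gyr = n_transit_times * transit_time_myr / 1.0e3

print(transit_time_myr, L_km, ridge_jump_myr, total_duration_gyr)
```

With these numbers a run of 100 transit times covers about 3 billion years, which is the duration quoted later for the experiments.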
 A simple sketch of the flow is depicted in Figure 1. Because of the particular choice of the surface time dependence and of the internal loads, the circulation pattern has a long-term symmetry with a broad central upwelling.
 For a given viscosity profile the total root-mean-square velocity in the box remains roughly constant through time, i.e., does not change much when the ridge jumps. Increasing the lower mantle viscosity decreases the average velocity. However, increasing the mantle viscosity by 100 in the lower half of the box only reduces the root-mean-square velocity by 0.40 (as all models are scaled by the same surface velocity, when the lower mantle is stiffer the amount of internal loads must increase).
 Once the flow is computed, we advect ∼300,000 tracers by means of a fifth-order Runge–Kutta method, interpolating the velocity field with a bilinear scheme from a 64 × 65 grid. These tracers are passive; i.e., they have the same density and viscosity as their vicinity.
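A minimal sketch of passive tracer advection with bilinear velocity interpolation may help fix ideas. It is not the code used in this study: for brevity it uses the classical fourth-order Runge-Kutta scheme instead of the fifth-order method, it ignores the periodic lateral boundaries, and the test flow (solid-body rotation on a small grid) and all function names are hypothetical.

```python
import math

def bilinear(field, x, y, dx, dy):
    """Bilinear interpolation of field[j][i], stored at nodes (i*dx, j*dy)."""
    ny, nx = len(field), len(field[0])
    i = min(max(int(x / dx), 0), nx - 2)    # index of the cell containing x
    j = min(max(int(y / dy), 0), ny - 2)    # index of the cell containing y
    tx, ty = x / dx - i, y / dy - j         # local coordinates inside the cell
    return ((1 - tx) * (1 - ty) * field[j][i] + tx * (1 - ty) * field[j][i + 1]
            + (1 - tx) * ty * field[j + 1][i] + tx * ty * field[j + 1][i + 1])

def velocity(u, v, x, y, dx, dy):
    """Interpolated velocity vector at the tracer position."""
    return bilinear(u, x, y, dx, dy), bilinear(v, x, y, dx, dy)

def rk4_step(u, v, x, y, dt, dx, dy):
    """One classical fourth-order Runge-Kutta step in a steady velocity field."""
    k1 = velocity(u, v, x, y, dx, dy)
    k2 = velocity(u, v, x + 0.5 * dt * k1[0], y + 0.5 * dt * k1[1], dx, dy)
    k3 = velocity(u, v, x + 0.5 * dt * k2[0], y + 0.5 * dt * k2[1], dx, dy)
    k4 = velocity(u, v, x + dt * k3[0], y + dt * k3[1], dx, dy)
    x += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
    y += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return x, y

# Hypothetical test flow: solid-body rotation about the centre of a unit box.
n = 17
dx = dy = 1.0 / (n - 1)
u = [[-(j * dy - 0.5) for i in range(n)] for j in range(n)]
v = [[(i * dx - 0.5) for i in range(n)] for j in range(n)]

x_end, y_end = 0.7, 0.5
n_steps = 200
dt = 2.0 * math.pi / n_steps    # one full revolution in n_steps steps
for _ in range(n_steps):
    x_end, y_end = rk4_step(u, v, x_end, y_end, dt, dx, dy)
```

After one revolution the tracer should return close to its starting point, which gives a simple check on the interpolation and the time stepping.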
3. Erasing Heterogeneities: Insignificant Effects of an Increase in Mantle Viscosity With Depth
 In this section, we present two different types of numerical experiments to try to understand the effects of the viscosity layering on the mantle mixing. We only consider the intrinsic mixing properties of the different flows without any chemical consideration: tracers are simply advected, and we only focus on their distribution through time.
3.1. Mixing Time as a Function of Length Scale
 The main idea of this experiment is to compare the mixing times of upper and lower parts of the mantle with different viscosity jumps between them.
 The concept of mixing time is intrinsically related to the particular length scale considered: the better the resolution (defined as the reciprocal length scale), the longer the time necessary to homogenize. Our way of computing mixing times is largely inspired by Olson et al. [1984], who performed experiments in which mixing times are computed in the following manner: the system is divided into several grids, from the coarsest (size of the cells equal to H/2, H being the total height of the box) to the finest (e.g., H/64). At the beginning, tracers are distributed in a chessboard manner as shown in Figure 2a.
 The principle is to compute, regularly and for each grid, the variance of the number of tracers in the cells. In mathematical terms it can be written as
$$\mathrm{Var}_\lambda(t) = \frac{1}{c_\lambda} \sum_{i=1}^{c_\lambda} \left[ n_{\lambda i}(t) - \bar{n}_\lambda \right]^2 \qquad (1)$$
where λ characterizes the length scale of the grid, cλ the number of cells, nλi the number of tracers in the ith cell, and n̄λ the average number of tracers per cell for the grid λ. The variance characterizes the deviation of the distribution from a strictly homogeneous one. When normalized by Varλ(t = 0), it ranges from 1 to 0. Olson et al. defined the mixing time related to the grid λ as
$$\tau_\lambda = \int_0^\infty \frac{\mathrm{Var}_\lambda(t)}{\mathrm{Var}_\lambda(0)} \, dt \qquad (2)$$
When the variance decreases exponentially with time, τλ corresponds to the usual decay time. The faster the decrease of variance, the shorter the mixing time and the more efficient the mixing. The problem is that as the number of tracers is finite, Varλ tends (in an ideal case of ergodic system) to the statistical limit and not to zero. Therefore the previous integral does not converge. The statistical limit of Varλ(t)/Varλ(0) is
We slightly change the definition of τ and use the implicit expression:
This integral converges and gives a mixing time numerically independent of n. The numerical protocol is depicted in Figure 2. As the purpose is to apply the results for the Earth, the upper bound of the integral is chosen equal to 100 transit times (∼3 billion years). This is sufficient to reach the statistical limit of uniform distribution in cases with homogeneous viscosity. In the case where the lower mantle is very sluggish, a significant departure from uniformity remains at the end of the numerical simulation. This indicates that the Earth's history may not be long enough for a thorough mixing of the mantle.
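The variance and mixing-time computation can be sketched as follows. This is an illustration under stated assumptions, not the code of the study: the variance is taken as the mean squared deviation of the cell counts, the mixing time as the time integral of the normalized variance (the definition attributed to Olson et al., which reduces to the decay time for an exponential decrease), and the finite-tracer correction discussed above is not included. Function names and the demonstration data are hypothetical.

```python
import math

def cell_counts(tracers, width, height, cell):
    """Count tracers (x, y) in square cells of side `cell` covering the box."""
    nx, ny = round(width / cell), round(height / cell)
    counts = [0] * (nx * ny)
    for x, y in tracers:
        i = min(int(x / cell), nx - 1)
        j = min(int(y / cell), ny - 1)
        counts[j * nx + i] += 1
    return counts

def variance(counts):
    """Mean squared deviation of the counts from their average."""
    mean = sum(counts) / len(counts)
    return sum((n - mean) ** 2 for n in counts) / len(counts)

def mixing_time(times, var):
    """Trapezoidal estimate of the integral of var(t)/var(0) over `times`."""
    total = 0.0
    for k in range(1, len(times)):
        total += 0.5 * (var[k] + var[k - 1]) * (times[k] - times[k - 1])
    return total / var[0]

# Chessboard-like start: tracers only in two diagonal cells of a 2 x 2 grid.
tracers_demo = [(0.25, 0.25), (0.30, 0.30), (0.75, 0.75), (0.80, 0.80)]
counts_demo = cell_counts(tracers_demo, 1.0, 1.0, 0.5)
var_demo = variance(counts_demo)

# Synthetic variance history decaying exponentially with decay time 5.
times_demo = [0.01 * k for k in range(10001)]
history_demo = [math.exp(-t / 5.0) for t in times_demo]
tau_demo = mixing_time(times_demo, history_demo)
```

For the synthetic exponential history the integral recovers the decay time, as expected from the definition. With a finite number of tracers the measured variance would level off at a nonzero value instead, which is why a corrected definition is needed in practice.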
Figure 3 summarizes the results in the log(1/λ) − log(τλ) plane for three different viscosity profiles: ηLM/ηUM = 1, 10, and 100. The viscosity discontinuity is placed at middepth in order to coincide with the grid cells used for the variance computation. The mixing time has only been computed up to the resolution 1/λ = 64 in order to have a statistically significant number of tracers in each cell in the uniform distribution limit (at least 25 tracers per cell). We have simultaneously performed the computations for the two subsystems UM (Figure 3a) and LM (Figure 3b) and compared the results (Figure 3c). The mixing times increase regularly with 1/λ [Olson et al., 1984]. We can see that the effects of the viscosity are more or less identical for all length scales. Increasing the lower mantle viscosity increases the mixing times in both the upper and the lower mantle. However, even a 2 order of magnitude increase in ηLM only lengthens the mixing times by a factor lower than 5.
 In Figure 3c a τLM/τUM ratio around 1 indicates that the upper and lower mantle mixing times are comparable in the case of a viscosity jump lower than 10. Even in the case of a rather large viscosity jump, by a factor of 100, the mixing times of the more viscous part are less than 4 times those of the less viscous part. This extreme viscosity ratio is much larger than the ratio of ∼30 generally estimated for the mantle [Ricard et al., 1993; Forte et al., 1991].
 The slight increase of the mixing time ratios in the short length scale limit is likely to be attributed to the intrinsic configuration of the flow. Even in the isoviscous case (solid curve) the velocity field is richer in short wavelength components in the upper mantle than in the lower mantle, as the flow induced by the surface motions smooths out with depth.
 From a physical point of view, it would have been more pertinent to nondimensionalize the computed mixing times by a global quantity like the inverse of the averaged stretching rate є to derive specific mixing properties. Instead, we normalized by the transit time computed from the surface velocity. Transit times are the same for the three viscosity jumps, whereas є varies with the viscosity ratios. Surprisingly, the average stretching rate increases with the lower mantle viscosity, as a rheological stratification develops a strong horizontal shearing near the upper mantle–lower mantle interface. The geophysicist should prefer to normalize by the observed plate velocities rather than by the poorly constrained averaged stretching rate of the mantle.
 These various experiments do not favor the assumption of a strong role of the viscosity layering to substantially differentiate the mixing states of the upper and lower parts of the mantle. This conclusion holds for all wavelengths larger than H/64 ∼ 50 km. In order to understand what happens at shorter wavelengths we can compute Lyapunov exponents rather than mixing times.
3.2. Lyapunov Exponents
 Beyond the computation of mixing times, Lyapunov exponents are another powerful tool to quantify mixing efficiency. The method consists of computing the exponential stretching rates of initially infinitesimal segments. A more rigorous mathematical definition is
$$\alpha = \lim_{t \to \infty} \, \lim_{l(0) \to 0} \frac{1}{t} \ln \frac{l(t)}{l(0)} \qquad (5)$$
where α is the Lyapunov exponent (in fact, the largest Lyapunov exponent), t the time, and l(t) the time-dependent length of the segment [Lichtenberg and Lieberman, 1983]. It has the dimension of an inverse time. The greater the Lyapunov exponent, the faster the mixing. A zero Lyapunov exponent does not mean that there is no mixing but simply that the stretching rate is slower than exponential growth (in the case of simple shear, the distance between two advected tracers increases linearly which implies that the Lyapunov exponent is zero).
 Coming back to the former experiments, the Lyapunov exponents could intuitively be related to mixing times in the short length scale limit. According to a simple interpretation of the definition of a Lyapunov exponent, the size λH of a small segment increases with exp(αt). This size will be comparable to the size of the box itself, H, after a time τλ corresponding to the mixing time associated with the resolution λ. Therefore, in the limit of very small λ,
$$\alpha \simeq -\frac{\ln \lambda}{\tau_\lambda} \qquad (6)$$
As previously, though Lyapunov exponents are defined as a limit at infinite time, here we only consider finite time Lyapunov exponents computed over 100 transit times. Practically, we advect 54,000 pairs of tracers initially regularly distributed in the box. The viscosity jump is kept at middepth of the box for coherence with the previous experiments.
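The finite time Lyapunov exponent of a pair of tracers, and its heuristic link with the mixing time, can be illustrated with two analytical flows. This is a sketch with hypothetical function names. The separations are taken from closed-form solutions (pure strain and simple shear) and not from the circulation model, and the relation between exponent and mixing time assumes, as argued above, that a segment of size λH grows as exp(αt) until it reaches the box size H.

```python
import math

def finite_time_lyapunov(l0, lt, t):
    """Finite time exponent from initial and final separations of a tracer pair."""
    return math.log(lt / l0) / t

def separation_pure_strain(l0, rate, t):
    """Segment along the stretching axis of the flow u = (rate*x, -rate*y)."""
    return l0 * math.exp(rate * t)

def separation_simple_shear(l0, shear, t):
    """Segment initially perpendicular to the shear flow u = (shear*y, 0)."""
    return l0 * math.sqrt(1.0 + (shear * t) ** 2)

def exponent_from_mixing_time(lam, tau):
    """Assumes lam*H*exp(alpha*tau) = H, i.e. alpha = -ln(lam)/tau."""
    return -math.log(lam) / tau

l0 = 1.0e-6
alpha_strain = finite_time_lyapunov(l0, separation_pure_strain(l0, 0.1, 100.0), 100.0)
alpha_shear_early = finite_time_lyapunov(l0, separation_simple_shear(l0, 0.1, 100.0), 100.0)
alpha_shear_late = finite_time_lyapunov(l0, separation_simple_shear(l0, 0.1, 1.0e4), 1.0e4)
alpha_from_tau = exponent_from_mixing_time(1.0 / 64.0, 10.0)
```

In pure strain the exponent equals the strain rate, whereas in simple shear the finite time exponent keeps decreasing toward zero as the integration time grows. This is why a zero exponent indicates sub-exponential stretching and not an absence of mixing.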
 The results are plotted in Figure 4, which shows the histograms of finite time Lyapunov exponents for thousands of tracers initially located in the upper part (Figure 4a) and lower part (Figure 4b) of the mantle (the histograms remain basically unchanged when the tracers are classified by their final, rather than initial, position in the mantle). Histograms in Figures 4a and 4b are very similar, with only a slight difference in the extreme case of ηLM/ηUM = 100. The maxima of these distributions correspond to Lyapunov exponents very similar to those obtained using (6) with λ = 1/64 and the values for τλ taken from Figure 3. This indicates that the behavior of the mixing time discussed at large wavelength (for 1/2 > λ > 1/64) can be safely extrapolated to short wavelengths.
 We conclude that whatever the viscosity stratification (at least up to 2 orders of viscosity increase), the global stretching history is nearly the same in the upper and lower mantle. This is confirmed at large and small wavelengths by the mixing time and the Lyapunov studies. The physical reason is simple: even with a very large viscosity contrast, a significant mass flow crosses the upper mantle–lower mantle interface in our model [Davies, 1988]. This flow continuously introduces tracers in the lower mantle that have experienced a large stretching in the upper mantle and replenishes the upper mantle with poorly stretched material. Of course, this conclusion is obtained under the assumption that the major effects at 670-km depth are related to a viscosity increase and that other phenomena that could impede the slab penetration (such as endothermic phase transitions) are negligible, which is supported by recent convection simulations [Bunge et al., 1997].
4. Generation of Heterogeneities at Ridges
 In a new set of experiments each tracer carries a time-dependent concentration of 238U, 3He, and 4He. Initially, the whole system is homogeneous and Table 1 summarizes the different initial concentrations. These concentrations evolve in different manners, owing to radioactive decay or production and because of chemical fractionation during tectonic processes. 238U decays following
and [4He] is produced following
The factor 18.7 takes into account the fact that 4He is not only provided from the radioactive decay of 238U but also from 235U and 232Th and that the Th/U ratio remains constant.
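The radiogenic evolution carried by each tracer can be sketched as follows. The exact expressions used in the study are not reproduced here. The sketch assumes simple exponential decay of 238U with its half-life of 4.468 Gyr, and assumes that the factor 18.7 is the number of 4He atoms produced per decayed 238U atom once the contributions of 235U and 232Th are folded in. The function name and the unit initial concentrations are hypothetical (the values of Table 1 are not used).

```python
import math

U238_HALF_LIFE_GYR = 4.468
DECAY_CONST_PER_GYR = math.log(2.0) / U238_HALF_LIFE_GYR
# Factor quoted in the text; its use as "4He atoms per decayed 238U atom"
# is an assumption of this sketch.
HE4_PER_U238_DECAY = 18.7

def evolve(u238, he3, he4, dt_gyr):
    """Advance concentrations (in atoms per unit mass) by dt_gyr.

    238U decays exponentially, 4He accumulates, 3He is stable and unchanged.
    """
    u_new = u238 * math.exp(-DECAY_CONST_PER_GYR * dt_gyr)
    he4_new = he4 + HE4_PER_U238_DECAY * (u238 - u_new)
    return u_new, he3, he4_new

# Hypothetical unit concentrations, evolved over one 238U half-life.
u_one, he3_one, he4_one = evolve(1.0, 1.0, 0.0, U238_HALF_LIFE_GYR)

# Two half steps should give the same result as one full step.
u_half, he3_half, he4_half = evolve(1.0, 1.0, 0.0, 0.5 * U238_HALF_LIFE_GYR)
u_two, he3_two, he4_two = evolve(u_half, he3_half, he4_half, 0.5 * U238_HALF_LIFE_GYR)
```

Because 3He is not produced, the 3He/4He ratio of a closed parcel can only decrease with time, and it decreases faster where the U concentration is higher.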
Table 1. Concentrations Associated With Each Tracer at the Beginning of Experiments
 Our model takes into account melt fractionation and outgassing during the formation of oceanic crust and lithosphere. We compute the running average of the global 3He, 4He, and 238U content in a semicircular area beneath the ridge (with a dimensionalized radius of 150 km). In this way, we mimic a magma chamber with full homogenization of material during the partial melting. As soon as a new tracer enters the magma chamber, one tracer is released, either in the crust (7 km thick) with probability p = 1/10 or in the underlying lithosphere (63 km thick) with probability 1 − p = 9/10. This tracer is degassed in 3He and 4He with respect to the magma chamber by a factor of 1/1000 in the crust or 1/100 in the lithosphere. Similarly, its U content is enriched by a factor of 9.82 in the crust or depleted by a factor of 1/50 in the lithosphere (these coefficients respect element conservation as 9.82p + 0.02(1 − p) = 1). Figure 5 depicts the global fractionation process.
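The release rule at the ridge can be sketched as below. This is a hypothetical illustration, not the code of the study: the probabilities and factors are those quoted above, the degassing factors are interpreted as multipliers applied to the magma chamber averages, and the dictionary layout, function names, and chamber values are assumptions.

```python
import random

P_CRUST = 0.1                                              # p = 1/10
HE_FACTOR = {"crust": 1.0 / 1000.0, "lithosphere": 1.0 / 100.0}
U_FACTOR = {"crust": 9.82, "lithosphere": 0.02}

def release_tracer(chamber, rng):
    """Release one tracer from the magma chamber averages.

    `chamber` holds the running averages of U, He3 and He4 (interpreting the
    degassing factors as simple multipliers is an assumption of this sketch).
    """
    kind = "crust" if rng.random() < P_CRUST else "lithosphere"
    tracer = {
        "kind": kind,
        "U": chamber["U"] * U_FACTOR[kind],
        "He3": chamber["He3"] * HE_FACTOR[kind],
        "He4": chamber["He4"] * HE_FACTOR[kind],
    }
    return tracer

def mean_u_factor():
    """Expected U multiplier; should be 1 for element conservation."""
    return P_CRUST * U_FACTOR["crust"] + (1.0 - P_CRUST) * U_FACTOR["lithosphere"]

# Hypothetical chamber averages in arbitrary units.
chamber_demo = {"U": 1.0, "He3": 2.0, "He4": 5.0}
rng_demo = random.Random(42)
released_demo = [release_tracer(chamber_demo, rng_demo) for _ in range(10000)]
crust_fraction = sum(t["kind"] == "crust" for t in released_demo) / len(released_demo)
```

Since 3He and 4He are degassed by the same factor, a tracer leaves the ridge with the 3He/4He ratio of the magma chamber. Crust and lithosphere then diverge only through their different U contents.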
 The viscosity jump (at a depth H/3 to reproduce the volume ratio between the upper mantle and the lower mantle in Cartesian geometry) is deliberately rather large: ηLM = 100ηUM. With such a viscosity stratification, the total mass flux between upper and lower mantle is about half that at mid upper mantle. This flux corresponds to both the motion of the lithosphere and that of the entrained surroundings. Most of the subducting lithosphere and crust do penetrate the lower mantle in our simulation. In the real Earth some slabs may be temporarily stored in the transition zone before further sinking [Fukao et al., 1992; Christensen, 1996]. However, the mass flux integrated over the long term corresponds in numerical simulations to that of a more continuous slab penetration [Bunge et al., 1997].
 We advect ∼300,000 tracers initially homogeneously located, with the chemical rules of evolution described above. The time since each tracer has left the ridge is mapped in Figure 6. To plot this map and the following ones, the discrete values carried by each tracer have been gridded over surfaces equal to that of the magma chamber. The density of tracers remains rather homogeneous during the numerical experiment so that each interpolated value represents an average of ∼100 tracers.
 The first-order symmetry of Figure 6 is, of course, explained by the long-term symmetry of the imposed flow. However, the quantitative considerations that will be developed in the following are mostly insensitive to the particular circulation flow.
 We observe that the lower mantle is on average somewhat younger than the upper mantle in this simulation (Figure 6b). This astonishing distribution of ages is explained by the fact that the viscosity increase does not constitute a strong barrier to the flow: almost all the slabs do penetrate the lower mantle, and material in the upper mantle is mostly provided from the return flow. The existence of a rather young lower mantle seems to be an inescapable implication of whole mantle convection models. The large-scale heterogeneities of the lower mantle correspond to slabs that are thickened and folded during their sinking. Because of the wraparound lateral boundary conditions, the slab folds enter the right side of the box when the surface ridge is close to the left side of the box. As we initialize the computation with the ridge on the left, an asymmetry is maintained through the simulation. The color scale is graduated in transit times: every 180 Myr a new fold is formed (the period of the imposed surface tectonics), and this piece of slab reaches the bottom of the mantle in ∼150 Myr (i.e., the sinking velocities predicted in the lower mantle are of order 1.5 cm yr−1).
Figure 7 depicts the 3He/4He ratios at the end of the simulation (100 transit times). Animation 1 (available at http://www.g-cubed.org/publicationsfinal/articles/2000GC000092/2000GC000092-a1.mpg) shows the evolution of He ratios from t = 0 to t = 100 transit times. The first and surprising remark we can make is that the viscosity layering is absolutely not visible in the He ratio distribution (Figure 7a and Animation 1) nor in the vertical averaged curve (Figure 7b). Although the 3He/4He ratio decreases with depth, no obvious large-scale He anomaly can be detected. The upper part of the mantle has a more primitive 3He/4He ratio because it contains tracers that have undergone fewer degassing events. Small-scale structures with sizes smaller than ∼200 km are rather homogeneously distributed and correspond to lumps of either mantle lithosphere or oceanic crust. Crustal and lithospheric bodies close together generally have a similar age. This explains why the age map (Figure 6a) is smoother than that of the He ratio (Figure 7a).
 The He ratios under the ridge are ∼20 Ra at the end of the simulation, which is rather different from those found in MORBs (8 Ra). However, there are various reasons why the absolute He ratios of the simulation should be significantly higher than those observed. The duration of the experiment, ∼3 billion years, is shorter than the age of the Earth. The vigor of mantle convection has decreased with time [Schubert et al., 1980; Blichert-Toft and Albarède, 1994], and therefore we underestimate the number of melting episodes. The early Earth may have undergone a large degassing [Allègre et al., 1986]. These three phenomena decrease the 3He/4He ratio. The recycling of continental crust would also lower the He ratio, but recent work suggests that recycling is minor [Coltice et al., 2000].
 The comparison of Figures 6a and 7a shows that there is very little correlation between the helium ratio and the time since the last melting at the ridge. This is not very surprising since both crustal and lithospheric layers are simultaneously generated at ridges, whereas they will develop very different helium ratios owing to their drastically different U concentrations.
 As a conclusion of this numerical experiment, whole mantle circulation (with or without viscosity increase) is associated with a lower mantle younger than the upper mantle, and with a more depleted He signature. The viscosity layering alone does not seem to be responsible for large isotopic variations in the mantle. Moreover, it does not induce any vertical stratification of the material properties (composition, age since last ridge, etc.) as the amplitudes of lateral and vertical variations are on the same order of magnitude in Figures 6a and 7a.
5. Role of an Oceanic Crust Trapping in the D″ Layer
 We now test the effects of a possible oceanic crust trapping in D″ on the variations of isotopic compositions. The purpose is not to decipher whether it really happens or not but to evaluate the impact for helium and uranium distributions.
 As the tracers are passive in our model, the segregation of the crust cannot spontaneously occur because of the negative buoyancy of eclogites, as could be the case for the real Earth. We simply mimic the trapping process by removing from the simulations all the tracers coming from the oceanic crust that enter the lowermost 200 km of the mantle. These tracers no longer contribute to the U, 3He, and 4He budgets of the mantle.
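The trapping rule amounts to a simple filter on the tracers. The sketch below is hypothetical: the 200 km trap thickness is the value given above, the box depth of 3000 km is the scaling value of H, and the tracer layout (a dictionary with a kind and a depth) and the function name are assumptions.

```python
def apply_crust_trap(tracers, box_depth_km=3000.0, trap_thickness_km=200.0):
    """Split tracers into those kept in the mantle and those trapped in D''.

    Only tracers of crustal origin are removed, and only once they are
    within `trap_thickness_km` of the bottom of the box.
    """
    kept, trapped = [], []
    trap_top_km = box_depth_km - trap_thickness_km
    for tracer in tracers:
        if tracer["kind"] == "crust" and tracer["depth_km"] >= trap_top_km:
            trapped.append(tracer)
        else:
            kept.append(tracer)
    return kept, trapped

# Hypothetical tracers near and above the bottom boundary layer.
tracers_demo = [
    {"kind": "crust", "depth_km": 2900.0},        # crust inside the trap: removed
    {"kind": "crust", "depth_km": 1500.0},        # crust in mid mantle: kept
    {"kind": "lithosphere", "depth_km": 2950.0},  # lithosphere is never trapped
]
kept_demo, trapped_demo = apply_crust_trap(tracers_demo)
```

Lithospheric tracers pass through the bottom layer unaffected, so the material returning from the base of the box is depleted in U relative to the average mantle.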
 The map of time since each tracer has left the ridge is not shown because it is very similar to Figure 6. Trapping some oceanic crust in the lowermost kilometers of the mantle has no effect on the age distribution in this model.
 Contrary to the previous numerical experiment, the map of 3He/4He ratio (Figure 8a) presents a large, high 3He/4He area in a globally homogenized mantle with a significantly lower 3He/4He (see also Animation 2). It is important to notice that this area has roots in the very deep part of the box, where hotspots (in a globally convective system) are supposed to rise. This large-scale anomaly is not sampled in large proportions in the magma chamber: the present-day average helium ratio is ∼31 Ra in the magma chamber, far from ∼50 Ra in the red area, for example (Figure 8a).
 The apparently “primitive” anomaly of the lower mantle is clearly not associated with a zone of the mantle that has remained pristine (the ages are depicted in Figure 6). This can also be seen in Figure 10a, where 3He/4He ratios are plotted as a function of their 3He content. Although there is some increase of 3He/4He with 3He at very low 3He content, the 3He content of MORBs (at 3He/4He ∼ 31 Ra) is similar to that of the mantle with much larger 3He/4He ratios.
 On the contrary, the 3He/4He ratios and the U concentrations (Figures 8 and 9) are strongly anticorrelated as depicted in Figure 10b. This anticorrelation is indeed observed in real samples [Coltice and Ricard, 1999]. U provides 4He through radioactive decay: where U concentration is high, 3He/4He is likely to be low, and vice versa. The segregation of the oceanic crust has left a U-depleted ancient lithosphere on top of the CMB. The paucity of radioactive elements in this layer freezes the evolution of the 3He/4He ratio. This layer is slowly and passively entrained by the average return flow and forms a huge upwelling zone far from subductions.
 We have performed other simulations by changing the relative compatibilities of He and U. The conclusions remain the same whether He is more incompatible than U, which is the traditional view, or less incompatible [Graham et al., 1990]. As the MORB source is made of stripes of ancient lithosphere and oceanic crust, its 3He/4He ratio is necessarily in between the low 3He/4He ratio of the pieces of oceanic crust and that of the lithosphere. Any zone of the mantle rich in ancient lithosphere therefore has a high, apparently more primitive 3He/4He ratio. At first order our model suggests that most 3He in the MORB source comes from the peridotitic component, while most 4He comes from the ancient oceanic crust component. This interpretation works as long as both U and He are incompatible, whatever their relative incompatibilities.
6. Discussion and Conclusions
 In the previous sections, we saw that, though attractive, the intuitive idea that the viscosity layering of the mantle plays, by itself, an important role for the mixing efficiency and isotopic or age distributions is likely to be wrong. If the convection is assumed to be one layered, almost all the material goes across the viscosity discontinuity, even if the viscosity increase is large. The material properties (whatever their nature) remain more or less homogeneous in the whole system due to this interpenetration.
 On the contrary, we have shown that segregation of ancient oceanic crust could be a viable assumption for explaining the helium data, as already argued by Coltice and Ricard using a simpler box model. If so, hotspot basalts could originate from a variable mixture of two endmembers: ancient lithospheric material in large-scale zones of the lower mantle and ancient oceanic crust in D″. The former could possibly be at the origin of the apparently “primitive” He signature of some hotspots like Loihi (although containing evidence of recycling according to Blichert-Toft and Albarède), the latter being at the origin of HIMU hotspots. The deep mantle heterogeneity would not be due to the preservation of primitive mantle in a poorly mixing regime but to an ongoing mechanism of chemical differentiation by segregation of the oceanic crust. This model implies that the 3He/4He ratio is negatively correlated with the U concentration but mostly uncorrelated with the 3He concentration. This is in contradiction with various models that explain the He data by assuming a two-layered convection with a reduced mass flux between upper and lower mantle [Kellogg and Wasserburg, 1990; Porcelli and Wasserburg, 1995]. With these assumptions, 3He/4He should correlate positively with both the U and 3He contents.
 We have already discussed the fact that we obtain helium ratios that are significantly larger than those found in natural samples. We could have refined the numerical code by renormalizing the unit of time to the secular heat flow decrease, as in the paper of Gurnis and Davies. However, it would not have qualitatively affected the intrinsic dynamics, which is responsible for those specific distributions of the 3He/4He ratios. This added complexity would only have increased the number of times each tracer would have been processed in the magma chamber.
 In contradiction to most usual views of mantle chemistry, it seems difficult in the framework of whole mantle convection to maintain a lower mantle older than the upper mantle. Allègre and Turcotte explain the elemental budget of the Earth by a balance between a pristine lower mantle, a depleted upper mantle, and the enriched continental crust. Geophysical observations, according to our interpretation, favor a balance between a depleted whole mantle and two enriched reservoirs, the continental crust and D″. This depleted whole mantle would contain heterogeneities due to an ongoing process of segregation, not the preservation of pristine components.
 This work benefited from numerous and constructive discussions with Nicolas Coltice. It has been supported by CNRS-INSU programs. Parallel computations were performed thanks to the PSMN computing facilities at Ecole Normale Supérieure in Lyon.