We present a multiple linear regression model developed for describing global river export of dissolved SiO2 (DSi) to coastal zones. The model, with river basin spatial scale and an annual temporal scale, is based on four variables with a significant influence on DSi yields (soil bulk density, precipitation, slope, and area with volcanic lithology) for the predam situation. Cross validation showed that the model is robust with respect to the selected model variables and coefficients. The calculated global river export of DSi is 380 Tg a−1 (340–427 Tg a−1). Most of the DSi is exported by global rivers to the coastal zone of the Atlantic Ocean (41%), Pacific Ocean (36%), and Indian Ocean (14%). South America and Asia are the largest contributors (25% and 23%, respectively). DSi retention in reservoirs in global river basins may amount to 18–19%.
 Silicon (Si) clearly shows the link between rock and life. Silicon dioxide (SiO2) or silica is the most abundant component of the Earth's crust. It occurs as silicate minerals in igneous, metamorphic, and sedimentary rocks. These minerals undergo physical and chemical weathering, which is the major natural source of dissolved silica (dissolved SiO2, hereafter referred to as DSi) in aquatic ecosystems [Berner and Berner, 1996]. On its way through soils, aquifers, and riparian zones, Si exerts control over the cycling and fate of carbon (C), nitrogen (N), phosphorus (P), and other nutrients [Ittekot et al., 2006].
 Terrestrial plants take up a significant portion of the DSi produced during weathering [Bartoli, 1983; Sommer et al., 2006]. Amorphous silica (ASi) in phytoliths in many plants and soils is an important Si reservoir and may have an impact on the Si transfer from the terrestrial to the aquatic biosphere, because of its high dissolution rates compared to other particulate silica compartments in sediment fluxes [Bartoli, 1983; Conley, 1997; Van Cappellen, 2003]. ASi is produced and recycled on the land before its eventual transfer to the coastal areas through rivers. ASi may therefore make up an important contribution to total river Si loads [Conley, 2002; Derry et al., 2005; Kurtz and Derry, 2004].
 Diatoms are the essential phytoplankton group that needs Si as a major nutrient [Conley, 2002]. Marine diatoms in particular are often limited by Si [Kristiansen and Hoell, 2002], while diatoms in river systems experience Si limitation occasionally, for example under high anthropogenic inputs of N and P [Billen and Garnier, 2007]. The Si for diatoms in coastal waters is delivered from rivers, from recycling within the water column and at the sediment-water interface, and from atmospheric deposition. The role of oceans in the global C cycle is coupled with the global Si cycle because diatoms comprise 50% of the biomass of today's ocean with a large contribution to C burial [Treguer and Pondaven, 2000].
 The silicate weathering process (desilication) consumes carbon dioxide (CO2) [Gaillardet et al., 1999] and produces alkalinity. At the global scale, the magnitude of desilication is believed to depend primarily on lithology, runoff, and erosion; weathering rates only showed a reasonable correlation with average temperature when large tropical rivers with abnormally low weathering rates were excluded [Gaillardet et al., 1999]. Because of their geological and climatic settings, tropical river basins play a major role in chemical weathering and transfer of DSi and alkalinity to rivers and oceans [Jennerjahn et al., 2006]. Because global warming is believed to be especially pronounced at high latitudes in the Northern Hemisphere, a change in structure and cover of vegetation could rapidly alter the biogeochemistry of river systems and land-ocean interactions along the coasts of the Arctic Ocean [Humborg et al., 2006].
 While riverine N loads have increased during the past decades [Bouwman et al., 2005] and similar changes have occurred for P [Smith et al., 2003], Si loads have remained constant or even decreased in many rivers primarily as a result of Si retention in reservoirs and lakes through eutrophication and increased diatom productivity [Conley, 2002; Ittekot et al., 2006]. Settling diatom frustules accumulate rapidly in bottom sediments, because their specific gravity is far greater than that of nonsiliceous algae [Reynolds, 1984]. This has often altered the stoichiometric balance of N, P, and Si [Rabalais, 2002] which may not only affect the total production in freshwater and coastal marine systems, but also its quality. When diatom growth is compromised by Si limitation, nondiatoms may be competitively favored, with dominance of flagellated algae including noxious bloom-forming communities [Turner et al., 2003]. Thus the biogeochemical cycling of C, N, and P and food web dynamics leading to fisheries harvests are affected by shifts in the availability of Si [Billen and Garnier, 2007; Ragueneau et al., 2006].
 In this paper we describe, evaluate, and apply a new model for predicting current global river loads of DSi to the world's coastal zones. This model was developed as part of an international interdisciplinary effort to model river export of multiple bioactive elements (C, N, P, and Si) and elemental forms (dissolved/particulate and inorganic/organic) called Global Nutrient Export from Watersheds (Global NEWS). We hereafter refer to our model as “NEWS-DSi.”
 A number of attempts have been made to estimate the current riverine DSi export to coastal zones. Many used simple extrapolation schemes based on data mostly from large rivers. For example, Clarke  and Livingstone  used data for a few large temperate rivers. Meybeck  used a biome typology and included data for 60 rivers. Treguer et al.  estimated a global river export of 0.34 Pg a−1 of dissolved SiO2 using data from Meybeck and Ragu . Lacking a global representative data set for ASi river export, our study focuses on DSi.
 Apart from current DSi loads in rivers, it is important to know how and where these loads will change in the future under changing land use, dam construction, and climate change. Statistical methods using multiple regression are useful for analyzing the relationships between river nutrient export and controlling factors and have been successfully applied to estimate river export of the various compounds of C, N, and P [Seitzinger et al., 2005].
 NEWS-DSi is also based on a regression approach to analyze a large data set of DSi measurements representing the predam situation. The aim of this work is to analyze the controls of DSi. Because NEWS-DSi was developed as part of a larger system of models with consistent input data sets and formulation, its output can be directly compared with output from other NEWS models [Seitzinger et al., 2005]. NEWS-DSi also represents the first spatially explicit, global estimate of river DSi export to the oceans with an uncertainty range, based on a statistical, lumped river basin–scale model.
2. Data and Methods
2.1. River DSi Loads and Ancillary Data
 We used data on DSi load or concentration measured at or close to the river mouth. DSi annual load data were converted to annual DSi yield (DSiY, in ton SiO2 km−2 a−1) using the basin area estimates from Fekete et al. . The data set includes DSiY data for 208 rivers representing the “predam,” natural, or pristine situation (Figure 1 and Data Set S1). The DSiY data cover the period between the 1920s and 1990s. Hence, instead of a fixed base year, the criterion of selection was the absence of dams and reservoirs or human impact. The DSiY data are generally multiyear averages from (1) Meybeck and Ragu  (selecting the “predam” situation), (2) data on pristine rivers with data from numerous reports on river chemistry prior to 1950/1960, i.e., before the main development of large reservoirs in world rivers [Vörösmarty et al., 1997], and (3) recent analyses in regions with limited human impacts like Alaska and Canada, Amazon and Orinoco basins, Patagonia, and West and South Africa. References and more details on the data selection are given by Dürr et al. .
 There are some uncertainties associated with the DSi data. The main sources of uncertainty are inconsistent measurement techniques and insufficient sampling frequency. Considerable bias may be caused by variation in the hydrology, concentration-discharge relationships, and sampling frequency [Stelzer and Likens, 2006]. Regarding the computation of the annual DSi load, Meybeck and Ragu  note that the various reports used are not always clear about how average annual values were obtained. It is therefore not possible to provide accurate uncertainty estimates for our data, but in general, best available discharge-weighted data were used, and the Meybeck and Ragu  database is currently still the most frequently used data set at global scale. Moatar and Meybeck  estimated the uncertainty induced by the frequency of regular river measuring campaigns to result in errors around 10% for major ions, expressed by the electrical conductivity.
 For model development and extrapolation, we used ancillary information at the river basin scale, such as river basin area, climate, elevation, lithology, soil properties, and relief (Table 1). We also included data from different sources representing the same variable (see, for example, relief and climate-related variables). Because of the lack of reliable maps for land use for the first part and middle part of the 20th century, we included data on current land use, assuming that the broad patterns of regional agricultural areas including large areas suitable for irrigation have been similar during the period covered by the DSi load measurements. The error caused by this assumption on the global scale is limited, because the global agricultural area has increased by only 11% between 1960 and 2000 (FAOSTAT database collections, available at http://faostat.fao.org/default.aspx). In addition, the DSi data for river DSi export cover a long period (∼1920–1990), so there is not one single year for land cover that matches all observations. Furthermore, we included current climate data, assuming that the recent changes in climate are insignificant compared to recent anthropogenic modifications of the hydrologic cycle such as dam construction [Vörösmarty et al., 2000]. We recognize that in specific regions where rapid land use or climate changes have occurred, there may be a mismatch between the period represented by the DSi measurements and the land use, climate, and hydrological data.
Table 1. River Basin Characteristics Included in the Regression Analysis
The 79 river basins from Fekete et al.  had to be excluded because of conflicts with our delineation of land areas, and 5342 river basins representing 89% of the global land area were included in our analysis.
For mean annual precipitation we assumed a minimum value of 0.01 mm d−1 for three river basins with less rainfall. We excluded all river basins with annual runoff less than 3 mm a−1 (660 rivers out of 6292) and glaciated river basins (211). The “Fournier” expression of precipitation and runoff is calculated as the sum of the square values for all months divided by the annual sum. These expressions provide a representation of the variability within a year (seasonal variation).
Classes for lithology include 1, sand/sandstone; 2, carbonate rock; 3, shales; 4, plutonic/metamorphic; 5, gabbros; 6, acid volcanic rock; 7, basalt; and 8, ice.
Classes according to Dürr et al.  (1, major water bodies; 2, ice + glaciers; 3, plutonic basic; 4, plutonic acid; 5, volcanic basic; 6, volcanic acid; 7, Precambrian basement; 8, metamorphic rocks; 9, complex lithology; 10, siliciclastic consolidated sedimentary; 11, mixed consolidated sedimentary; 12, carbonated consolidated sedimentary; 13, evaporites; 14, semiconsolidated to unconsolidated sedimentary; 15, alluvial deposits; 16, loess; 17, dunes and shifting sand) were regrouped as follows: 1, water and ice (classes 1 and 2); 2, plutonic and metamorphic rocks (classes 3, 4, 7, and 8); 3, volcanic (classes 5 and 6 and 50% of class 9); 4, siliciclastic sediment (classes 10 and 50% of class 9); 5, carbonated sediment (classes 11 and 12); 6, unconsolidated sediments (classes 14 and 15); 7, Quaternary sediments (classes 13, 16, and 17); 8, all no data classes.
Mechanical erodibility ranges from 1 to 40 with 1, plutonic and metamorphic rocks; 2, volcanic rocks; 4, consolidated sedimentary rocks; 10, different rock types in folded zones; 32, nonconsolidated sedimentary rocks; and 40, recent alluvial.
In the DEM there are 36 cells with 5 by 5 min resolution within one cell of 0.5 degree, our working resolution. The slope is the absolute height difference between two midpoints of 5 by 5 min cells in the latitudinal direction divided by the arclength. The same is done for the longitudinal direction. Slopes are calculated for land cells only. Average slope is the sum of all values divided by the number of values (normally 72 if all 5 by 5 min grid cells are land).
The fractional distribution within each 0.5 degree grid cell of the classes in the Global Agroecological Zones (GAEZ) map (1, 0–2%; 2, 2–5%; 3, 5–8%; 4, 8–16%; 5, 16–30%; 6, 30–45%; 7, >45%) was recalculated to a mean slope using one representative value per class: 1% (class 1), 3.5% (class 2), 6.5% (class 3), 12% (class 4), 23% (class 5), 37.5% (class 6), and 50% (class 7).
The Fournier slope for each 0.5 by 0.5 degree grid cell is calculated as the sum of the squares of the average slope of each 5 by 5 min resolution grid cell in each direction divided by the number of values (72 or less).
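The Fournier expression and the GAEZ mean slope described in the notes above can be illustrated with a minimal sketch. This is not the authors' code: the function names are hypothetical, the zero returned for a completely dry year is an assumption, and the GAEZ mean is assumed to be the fraction-weighted average of the representative class slopes.

```python
# Hypothetical helper names; the formulas follow the table notes above.

def fournier_index(monthly_values):
    """Sum of squared monthly values divided by the annual sum."""
    annual_total = sum(monthly_values)
    if annual_total == 0:
        return 0.0  # assumption: define the index as 0 for a completely dry year
    return sum(v * v for v in monthly_values) / annual_total

# Representative slope (%) per GAEZ class, as listed in the note on the GAEZ map.
GAEZ_CLASS_SLOPE = {1: 1.0, 2: 3.5, 3: 6.5, 4: 12.0, 5: 23.0, 6: 37.5, 7: 50.0}

def gaez_mean_slope(class_fractions):
    """Fraction-weighted mean slope (%) of one grid cell."""
    total = sum(class_fractions.values())
    return sum(GAEZ_CLASS_SLOPE[c] * f for c, f in class_fractions.items()) / total

uniform = [100.0] * 12                  # 1200 mm spread evenly over the year
seasonal = [600.0, 600.0] + [0.0] * 10  # same annual total in two wet months
print(fournier_index(uniform))          # 100.0
print(fournier_index(seasonal))         # 600.0
print(gaez_mean_slope({1: 0.5, 4: 0.5}))  # 6.5
```

For the same annual total, the seasonal series gives a sixfold higher value, which is why the expression serves as a measure of within-year variability.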
 We made plots of the various variables against DSiY to inspect whether other relationships yield a better fit with the data than linear ones. We thus added ln(precipitation) and temperature squared as variables to include in the regression analysis.
2.2. Multiple Regression of DSiY
 We assessed the influence of a range of independent variables listed in Table 1, without a priori selection. The influence of these variables on DSiY was analyzed with S-PLUS [Insightful, 2005] in three steps to develop and validate a statistical model: (1) stepwise regression and identification of outliers to select the important variables which explain the variance in the transformed DSi yield (TDSiY; see section 2.2.1) and obtain the best linear regression model based on the full data set; (2) cross validation of the model using randomly selected subsets of the data containing 75% of the rivers in the full data set to analyze the robustness and the uncertainty of the model predictions; (3) testing and validation of the fitted regression model and Monte Carlo simulation to obtain the distribution and confidence interval of the model coefficients. Steps 1 to 3 are based on the data set of rivers with measurements. The three steps are elaborated below (sections 2.2.1–2.2.3).
2.2.1. Stepwise Regression
 Multiple linear regression requires the observations to be normally distributed. We used the Box-Cox procedure [Box and Cox, 1964] for transforming the DSiY data into a normal shape using an appropriate exponent lambda (λ) according to
$$\mathrm{TDSiY} = \frac{\mathrm{DSiY}^{\lambda} - 1}{\lambda} \qquad (1)$$
where TDSiY is the transformed DSi yield.
 The relevant independent variables for the multiple regression were selected with the S-PLUS function “step.” This function uses forward selection: at each step, either the current model is accepted or the most significant variable not yet included is added. The best model in the function step is that with the lowest Akaike's Information Criterion (AIC), which combines the negative log likelihood of the model (scaled by 2) with a penalty for the number of variables included [Akaike, 1974]. Because of this penalty, a variable is only included in the model if it improves the likelihood enough to justify the added complexity.
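Forward selection by AIC can be sketched in a few lines. The code below is an illustration in plain Python, not the S-PLUS “step” implementation. It assumes Gaussian errors, so that AIC equals n ln(RSS/n) plus twice the number of coefficients, up to an additive constant. The data are synthetic and the variable names are hypothetical.

```python
import math
import random

def ols_rss(columns, y):
    """Residual sum of squares of an OLS fit with intercept (normal equations)."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Augmented normal-equation system [X'X | X'y].
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for k in range(p):
        pivot = max(range(k, p), key=lambda r: abs(A[r][k]))
        A[k], A[pivot] = A[pivot], A[k]
        for r in range(p):
            if r != k:
                factor = A[r][k] / A[k][k]
                A[r] = [a - factor * b for a, b in zip(A[r], A[k])]
    beta = [A[r][p] / A[r][r] for r in range(p)]
    return sum((y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2
               for i in range(n))

def aic(columns, y):
    """Gaussian AIC up to an additive constant: n*ln(RSS/n) + 2*(k + 1)."""
    n = len(y)
    return n * math.log(ols_rss(columns, y) / n) + 2 * (len(columns) + 1)

def forward_select(candidates, y):
    """Add the variable that lowers AIC most; stop when no variable lowers it."""
    selected, best = [], aic([], y)
    while True:
        trials = {name: aic([candidates[s] for s in selected] + [col], y)
                  for name, col in candidates.items() if name not in selected}
        if not trials:
            return selected
        name = min(trials, key=trials.get)
        if trials[name] >= best:
            return selected
        selected.append(name)
        best = trials[name]

# Synthetic data: y depends on two of the three candidate variables.
rng = random.Random(0)
n = 200
candidates = {name: [rng.gauss(0, 1) for _ in range(n)]
              for name in ("ln_precip", "bulk_density", "noise")}
y = [2.0 * a - 1.0 * b + rng.gauss(0, 0.1)
     for a, b in zip(candidates["ln_precip"], candidates["bulk_density"])]
print(forward_select(candidates, y))
```

The variable with the strongest effect enters first, which mirrors the order of entry reported for the standard model. An irrelevant variable can still be added occasionally, since the AIC penalty only requires a modest gain in fit.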
 After each run we checked for the presence of outliers. Potential outliers were identified on the basis of a combination of (1) the highest Cook's distance [Cook, 1977] of all rivers and (2) the distribution of the residuals; that is, if the regression model is adequate, the residuals are normally distributed. When the residuals are plotted against the quantiles of a standard normal distribution, the residuals should be on a straight line. Outliers are visible by their large deviation from the straight line. A potential outlier was actually excluded if there was also a clear effect on the multiple regression coefficient (>1%). Outliers were excluded and the stepwise regression with the same initial model was repeated.
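Cook's distance can be illustrated for the simplest case of a regression with one predictor. The sketch below uses the textbook formula D_i = e_i² h_i / (p s² (1 − h_i)²), with leverage h_i = 1/n + (x_i − x̄)²/S_xx. The function name and the data are hypothetical, and the multiple-regression case used in the study would require the full hat matrix.

```python
def cooks_distance(xs, ys):
    """Cook's distance for each point of a simple regression y = b0 + b1*x."""
    n, p = len(xs), 2  # p counts the intercept and the slope
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b1 = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    b0 = my - b1 * mx
    residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
    s2 = sum(e * e for e in residuals) / (n - p)  # residual variance
    leverage = [1.0 / n + (x - mx) ** 2 / sxx for x in xs]
    return [e * e * h / (p * s2 * (1.0 - h) ** 2)
            for e, h in zip(residuals, leverage)]

xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.1, 1.9, 3.2, 3.9, 5.1, 12.0]  # the last point is a deliberate outlier
distances = cooks_distance(xs, ys)
print(distances.index(max(distances)))  # 5
```

A point combines a large residual with high leverage to obtain a large distance, which is why the last point stands out here.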
 Models thus developed have the following form:
$$E[\mathrm{TDSiY}] = \beta_0 + \sum_{i=1}^{n} \beta_i X_i \qquad (2)$$
where E[TDSiY] is the expectation of the transformed prediction based on the independent variables Xi and the estimated regression coefficients βi; β0 denotes the intercept.
2.2.2. Cross Validation
 To investigate the robustness of the model in equation (2), we cross validated the outcome using two approaches. The first one was used to analyze whether the variables Xi are the same for subsets of the measurement data. This robustness of the selected model variable Xi was tested by constructing 5000 models similar to the standard model. However, in this part of the cross validation we based the 5000 models on a subset of ∼75% of the complete set of rivers with TDSiY data, excluding a randomly selected subset of ∼25%.
 The second approach is used to determine the robustness of the regression coefficients βi and to obtain the distribution and confidence interval of the model coefficients. This is done by estimating βi values on the basis of a randomly selected subset of 75% of the data (training set) and testing predictions against the remaining 25% (validation set). This step was repeated 5000 times.
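The repeated random split can be sketched as follows. For brevity the example fits a single predictor with closed-form least squares rather than the four-variable model. The data are synthetic, the function names are hypothetical, and only 500 repetitions are run to keep the example fast.

```python
import random

def fit_line(xs, ys):
    """Closed-form OLS for y = b0 + b1*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return my - b1 * mx, b1

def repeated_split(xs, ys, n_repeats=5000, train_fraction=0.75, seed=1):
    """Refit on random 75% subsets and score each fit on the held-out 25%."""
    rng = random.Random(seed)
    indices = list(range(len(xs)))
    n_train = round(train_fraction * len(xs))
    slopes, rmses = [], []
    for _ in range(n_repeats):
        rng.shuffle(indices)
        train, test = indices[:n_train], indices[n_train:]
        b0, b1 = fit_line([xs[i] for i in train], [ys[i] for i in train])
        errors = [ys[i] - (b0 + b1 * xs[i]) for i in test]
        slopes.append(b1)
        rmses.append((sum(e * e for e in errors) / len(errors)) ** 0.5)
    return slopes, rmses

# Synthetic stand-in for 204 basins: one predictor, true slope 2.0.
data_rng = random.Random(0)
xs = [data_rng.uniform(0.0, 3.0) for _ in range(204)]
ys = [1.0 + 2.0 * x + data_rng.gauss(0.0, 0.5) for x in xs]
slopes, rmses = repeated_split(xs, ys, n_repeats=500)
print(min(slopes), max(slopes))  # spread of the coefficient across subsets
```

The minimum and maximum of the refitted coefficient give the kind of range that is compared with the standard model value, and the held-out errors indicate how well each training fit predicts rivers it has not seen.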
2.2.3. Testing and Validation
 The transformed prediction for each river with known Xi values is obtained from equation (2). The uncertainty of the model was assessed with the estimated regression coefficients from step 2 having a multinormal distribution with known mean and covariance. We used Monte Carlo simulation to draw 5000 equally probable sets of βi. Each set was used to predict the TDSiY on the basis of the known Xi values.
 Subsequently, we used the model developed for extrapolation to all global rivers, including those for which no measurements are available, to estimate the global DSi river export, and analyzed the effects of dams. We used the DSi load instead of DSiY to present predicted river export of DSi. First TDSiY predictions are back-transformed and DSi load is then calculated as the product of DSiY and basin area. By calculating the global sum of the DSi river export using the sets of equally probable βi from step 3 of the multiple regression, we obtained 5000 estimates representing the complete distribution of the global DSi river export. We used the mean and the 2.5 and 97.5 percentiles of this distribution for our uncertainty estimates of the global DSi river export.
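The chain from coefficient draws to an uncertainty range for the summed export can be sketched as below. Several simplifications are assumed and are not taken from the study: the coefficients are drawn independently instead of from a multinormal distribution with covariance, the Box-Cox back-transform uses the standard one-parameter form, the model has an intercept as its first coefficient, and all basin values and coefficients are invented for illustration.

```python
import random

LAMBDA = 0.0686  # Box-Cox exponent reported for the DSiY data

def back_transform(t, lam=LAMBDA):
    """Invert the Box-Cox transform, assuming the form (y**lam - 1) / lam."""
    return max(lam * t + 1.0, 0.0) ** (1.0 / lam)

def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in 0..100) of a sorted list."""
    pos = (len(sorted_values) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])

def simulate_export(basins, beta_mean, beta_sd, n_draws=5000, seed=0):
    """Sum basin loads (yield x area) for each random draw of the coefficients."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_draws):
        # Simplification: independent normal draws, ignoring coefficient covariance.
        beta = [rng.gauss(m, s) for m, s in zip(beta_mean, beta_sd)]
        total = 0.0
        for x, area_km2 in basins:
            t = beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))
            total += back_transform(t) * area_km2  # ton SiO2 a-1
        totals.append(total)
    totals.sort()
    return sum(totals) / n_draws, percentile(totals, 2.5), percentile(totals, 97.5)

# Hypothetical basins: (predictor values, area in km2); coefficients are made up.
basins = [([1.2, 0.1], 5.0e5), ([0.3, 0.0], 1.0e6), ([2.0, 0.4], 2.0e5)]
mean, low, high = simulate_export(basins, [0.5, 0.8, 1.5], [0.05, 0.05, 0.2])
print(mean, low, high)
```

Summing within each draw before taking percentiles matters: it propagates the shared coefficient uncertainty to the global total instead of averaging it away basin by basin.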
 We also calculated the effect of dam construction on the retention of DSi in river basins. Where the water residence time is increased by dam construction, the growth of diatoms is increased causing a reduction of the river DSi load. Sedimentation of the diatoms in the form of suspended phytoliths [Reynolds, 1984] is closely related to sedimentation of suspended solids [Conley, 2002]. Furthermore, diatom and nondiatom phytoplankton growth depends on the N:P:Si element ratios [Conley, 2002, 2000] and conditions like temperature, light, and water turbidity influencing photosynthesis and respiration. Lacking a globally applicable approach for estimating DSi retention, here we use two methods as a first-order estimate. The first one is the retention of dissolved inorganic phosphate (PR) estimated by Harrison et al.  for all global river basins [Fekete et al., 2002], assuming that retention of dissolved inorganic phosphate and DSi are similar, although this will probably change with photosynthesis:respiration (P:R) ratios and may only be true for high P:R ratios. The second one is the sediment trapping efficiency (SR) for all global rivers [Fekete et al., 2002], proposed by Vörösmarty et al.  on the basis of the idea that sedimentation rates of suspended solids and diatom frustules are similar.
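As a first-order illustration of how basin-level retention fractions translate into a global figure, the sketch below applies a retention fraction to each predam load and reports the load-weighted retention. The function name and all numbers are hypothetical; the retention fractions stand in for PR or SR values and are not taken from the cited data sets.

```python
def global_retention(predam_loads, retention_fractions):
    """Load-weighted retention: 1 - (sum of retained-corrected loads / sum of predam loads)."""
    exported = [load * (1.0 - r)
                for load, r in zip(predam_loads, retention_fractions)]
    return 1.0 - sum(exported) / sum(predam_loads)

predam_loads = [40.0, 15.0, 5.0]  # hypothetical basin loads, Tg SiO2 a-1
pr = [0.05, 0.10, 0.40]           # hypothetical phosphate-based retention
sr = [0.02, 0.20, 0.50]           # hypothetical sediment-trapping retention
print(global_retention(predam_loads, pr))
print(global_retention(predam_loads, sr))
```

Because the result is weighted by load, a small change in retention for a basin with a large load can shift the global value more than a large change for a minor basin.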
3. Results and Discussion
3.1. Multiple Linear Regression Model for DSiY
 Using the Box-Cox transformation procedure, we found that for λ = 0.0686 the distribution of our TDSiY data is as close to a normal distribution as possible. We found four rivers to be consistent outliers, i.e., the Rio Negro (Argentina), Neva (Russia), Sous (Morocco), and Inguri (Georgia) (Data Set S1). These four rivers were excluded from the model development, leaving 204 rivers with DSiY data for all further steps. Together these 204 river basins cover 68 Mkm2 which is about 58% of the global ice-free land area connected to the oceans (i.e., exorheic).
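For reference, the transformation and its inverse can be written as a short sketch, assuming the standard one-parameter Box-Cox form (y^λ − 1)/λ. The function names and the example yield are hypothetical.

```python
import math

LAMBDA = 0.0686  # value reported in the text for the DSiY data

def box_cox(y, lam=LAMBDA):
    """Box-Cox transform of a positive value y."""
    if y <= 0:
        raise ValueError("Box-Cox requires positive values")
    if lam == 0:
        return math.log(y)  # limiting case of the transform
    return (y ** lam - 1.0) / lam

def inverse_box_cox(t, lam=LAMBDA):
    """Back-transform a Box-Cox value to the original scale."""
    if lam == 0:
        return math.exp(t)
    return (lam * t + 1.0) ** (1.0 / lam)

yield_example = 3.0  # hypothetical DSiY in ton SiO2 km-2 a-1
t = box_cox(yield_example)
print(round(inverse_box_cox(t), 6))  # 3.0
```

With λ this close to zero the transform behaves almost like a natural logarithm, which suggests that the DSiY data are close to lognormally distributed.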
 The stepwise regression resulted in the selection of four variables with a significant influence on TDSiY. These variables are the natural logarithm of annual precipitation, topsoil bulk density, the fraction of the river basin area covered by volcanic rocks, and terrain slope (Table 2). The multiple regression coefficient (r2) for this model is 0.80. The predicted and observed values of TDSiY are presented in Figure 2.
Table 2. Selected independent variables with the estimated coefficients, standard errors, t statistics, and p values (Pr(>∣t∣)) for the standard model. The p values show that all variables are highly significant (p ≪ 0.05).
Variables (rows): Ln (precipitation) (mm d−1); Lith. Volcanic (-); Bulk density (Mg m−3); GAEZ slope (m km−1).
 The natural logarithm of annual precipitation has the strongest correlation with TDSiY and is first added to the regression model, followed by the areal fraction covered by volcanic rock. The model is further enhanced by the soil bulk density. The terrain slope is the last variable added to the model and exerts the smallest influence on TDSiY (Table 2). There are no anthropogenic variables with significant effect on TDSiY. These model variables are the large-scale controls of TDSiY in this lumped multiple regression approach at the scale of river basins. The model variables should therefore not be regarded as process parameters.
 The logarithm of annual precipitation is highly significant and important. The river yield of DSi is higher in river basins with high annual precipitation than in rivers with dry climates. The amount of water percolating through the soil, subsoil, and parent material is a major variable determining rock weathering rates [Gaillardet et al., 1999; Hilley and Porder, 2008; Kump et al., 2000].
 A potential role of vegetation is confirmed by the factor bulk density of the topsoil, which has a negative influence on river DSi yields (Table 2). Generally, soils with low bulk density are more developed, have more soil organic matter (including phytoliths), and stable aggregate structure than soils with high bulk density [Brady, 1990]. This also reflects the development of the ecosystem. Moreover, soils with low bulk density have a high porosity, and thus minerals and phytoliths are more easily accessible for dissolution and uptake by the vegetation.
 Terrain slope is the third variable with significant influence on DSi yields (Table 2). Relief is one of the major determinants of natural erosion rates within one climate zone [Schumm, 1977]. Erosion is generally more severe in landscapes with steep slopes compared to gently sloping or flat terrain. In addition, Allison  reported that eroded material contains more organic matter than the soil remaining. Therefore, in sloping areas, erosion should stimulate the transport of soil organic matter containing phytoliths and subsequent dissolution and transport of DSi [Conley, 2002]. In addition, erosion or physical denudation is intimately coupled to chemical weathering. The physical removal of soil material sustains chemical weathering by continuously refreshing mineral surfaces and by precluding the development of thick soils [Dupré et al., 2003; Gaillardet et al., 1999].
 Finally, the occurrence of volcanic rock has a positive influence on DSi yield (Table 2), similar to results for Japan [Hartmann et al., 2009]. The important influence of volcanic material is due to the fast desilication of poorly ordered aluminosilicates such as allophane [Bolt and Bruggenwert, 1976].
 In summary, the overall importance of precipitation and occurrence of volcanic rocks point to the role of weathering as the ultimate source of DSi. The results also support the important biological control of the global silicon cycle [Bartoli, 1983; Conley, 2002], mainly through the factor bulk density (indicator for soil and ecosystem development). Precipitation and slope probably influence erosion, transport, and dissolution of DSi from both the mineral and biological components.
3.2. Cross Validation
 The robustness of the multiple regression model, evaluated by developing 5000 models based on randomly selected subsets of the rivers with DSiY data, is discussed on the basis of the significant model variables and the predictions for the β values.
 To test the robustness of the selected model variable Xi, we made 5000 models based on the measurements with a randomly selected subset of 50 measurements excluded. In all 5000 models both the natural logarithm of annual precipitation and bulk density are significant variables. In 99% of the models the occurrence of volcanic rock is significant. Overall, slope is significant in 92% of the models, although in 65% of the models this is based on the information from the Global Agroecological Zones (GAEZ) data, and in 27% of the models the slope data are from the FAO (Table 1). There were 121 models with only 3 parameters. The combination of the natural logarithm of annual precipitation, bulk density, and the occurrence of volcanic rock was found in 100 out of these 121 models. The slope (GAEZ) instead of the occurrence of volcanic rock was found in 21 models. Precipitation and temperature (both without transformation) were found in none of the 5000 models.
 The robustness of our model is also illustrated by the fact that 58% of the 5000 models have exactly the same model variables as the standard model, while 83% of the models are similar to the standard model, the only difference being the source of information for slope (GAEZ or FAO; see Table 1).
 Temperature squared (not temperature as such) is a significant variable in only 11 out of 5000 models, always in combination with the four variables of the standard model. The lack of a temperature effect may be due to the influence of a small number of tropical lowland rivers such as the Amazon and Congo [Gaillardet et al., 1999]. These large catchments are covered by thick, highly weathered soils [Driessen and Dudal, 1990] with low chemical weathering fluxes of silicates [Dupré et al., 2003]. Close to 50% of the DSi in the Amazon originates in the Andes region, with extensive volcanic deposits [Mortatti and Probst, 2003].
 Temperature determines the rates of chemical and biological processes at all levels [Bolt and Bruggenwert, 1976; Garnier et al., 2006]. However, we find that temperature is not an important control of DSi river export at the global scale. We therefore specifically investigated the effect of temperature by a number of data analyses: excluding the Amazon, Nile, and Congo, excluding all river basins with temperatures > 15 degrees, and excluding all temperatures > 20 degrees. In a further analysis we excluded all river basins with warm humid tropical and warm seasonal tropical dry climates (according to AEZ; see Table 1), and finally we excluded only warm humid tropical climate. None of these experiments yielded temperature as a significant variable. In fact, all the resulting models confirm the above cross validation. The lack of a temperature influence confirms the conclusions of Kump et al. . They showed that laboratory studies reveal a strong dependence of mineral dissolution on temperature, but at larger spatial scales this is often obscured by other environmental factors that covary with temperature. This is also in agreement with Gaillardet et al.  who studied silicate weathering rates and found a weak correlation of weathering rates with temperature only when excluding some large tropical rivers such as the Amazon and Congo.
 To assess the robustness of the model regression coefficients (βi), we estimated βi values on the basis of a randomly selected subset of 75% of the data (training set) and made predictions for the remaining 25% (validation set). The minimum and maximum values of each estimated βi are not fixed but change for random selections of 75% of the 204 rivers with DSiY measurement data. The coefficients obtained with the 5000 selections of rivers generally range within an acceptable factor of 2 around the value of the standard model (Table 3).
Table 3. Minimum and Maximum Values of Regression Coefficients Obtained With 5000 Simulations and the Value for the Standard Model
 Another approach for testing the model is a comparison between the modeled DSi load (note that this is back-transformed from TDSiY and multiplied by the river basin area) with that based on the measurements. We use the Bland-Altman test [Bland and Altman, 1986], which involves comparison of the residuals (difference between observed and predicted DSi load) with the mean of the predicted and observed DSi load. This test showed that there is no systematic relation between the residuals and the mean of the predicted and observed DSi load.
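The quantities behind a Bland-Altman comparison are simple to compute. The sketch below returns the pairwise means, the differences (observed minus predicted), the mean difference (bias), and the correlation between means and differences as one possible check for a systematic relation. The function name and the numbers are hypothetical, and the correlation check is an assumption about how such a relation could be quantified.

```python
def bland_altman(observed, predicted):
    """Return (means, differences, bias, correlation of differences with means)."""
    means = [(o + p) / 2.0 for o, p in zip(observed, predicted)]
    diffs = [o - p for o, p in zip(observed, predicted)]
    n = len(means)
    bias = sum(diffs) / n
    mean_of_means = sum(means) / n
    cov = sum((m - mean_of_means) * (d - bias) for m, d in zip(means, diffs))
    var_m = sum((m - mean_of_means) ** 2 for m in means)
    var_d = sum((d - bias) ** 2 for d in diffs)
    corr = cov / (var_m * var_d) ** 0.5 if var_m > 0 and var_d > 0 else 0.0
    return means, diffs, bias, corr

observed = [10.0, 20.0, 30.0, 40.0]   # hypothetical DSi loads
predicted = [11.0, 19.0, 31.0, 39.0]  # hypothetical model output
means, diffs, bias, corr = bland_altman(observed, predicted)
print(bias, corr)
```

A bias near zero together with differences that show no trend against the means is the pattern described in the text as the absence of a systematic relation.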
 The relationship between observed and predicted river DSi loads is presented in Figure 3. Figure 4 shows the DSi yield of the standard model for all global rivers. We can now use the full DSi model to predict DSi load (i.e., the back-transformed value of DSiY times the river basin area for the river considered) for the 204 rivers with DSiY measurement data (Figure 1). The estimated value with the standard model of 190 Tg a−1 is slightly lower than the total observed DSi load of 194 Tg a−1 for the same rivers (Data Set S1). The 95% confidence interval based on Monte Carlo simulation is (173, 212) (Table 4). Together these 204 river basins cover 68 Mkm2 (Table 4 and Data Set S1), which is about 58% of the global ice-free exorheic land area, i.e., the land area connected to the oceans (118 Mkm2).
Table 4. Area Covered, DSi River Export From the Standard Model, and 2.5 and 97.5 Percentiles Obtained With 5000 Monte Carlo Simulations
Number of River Basins | Area Covered (Mkm2) | Predicted DSi Export (Tg SiO2 a−1) | 2.5 Percentile (Tg SiO2 a−1) | 97.5 Percentile (Tg SiO2 a−1)
204 | 68 | 190 | 173 | 212
3840 | 111 | 346 | 315 | 386
Rows are for the 204 rivers included in the DSi data set, and for all 3840 rivers for which the variables are within the range of the DSi data set (Table 5).
The estimate based on the DSi observations in the data set is 194 Tg SiO2 a−1.
 The estimated global DSi export from the total of 5342 river basins is 380 Tg SiO2 a−1 (Figure 4) with a 95% confidence interval of (340, 427). In our global prediction we avoided using values of variables outside the validity range of the model, which are minimum and maximum values of the 208 rivers with DSiY data in our data set listed in Table 5. The model estimate for those river basins (3840 out of 5342) with all model variables within the validity range is 346 Tg SiO2 a−1 with a 95% confidence interval of (315, 386) (Table 4). This represents an area of 111 Mkm2 or 94% of the global exorheic land area.
Table 5. Minimum and Maximum Values of Variables in the DSi Data Set of Observations, Minimum and Maximum for All Global River Basins, and the Number of River Basins Outside the Range of Values for Rivers Included in the DSi Data Set
Range of Values for Rivers Included in the DSi Data Set
Number of River Basins Outside the Range in the DSi Data Set
The 5342 rivers within the 0.5 by 0.5 degree river network of Fekete et al. that are included in this study.
 For 1502 out of 5342 river basins covering 6% of the global exorheic land area, at least one model variable is outside the validity range (Table 5). If these values are restricted to the minimum and maximum values of the model (Table 5), we obtain an additional load of 33 Tg a−1 with a range of 25 to 41 Tg a−1.
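Restricting out-of-range inputs to the limits of the data set amounts to clamping each predictor before the regression is evaluated. The following is a minimal sketch in Python; the variable names and the numeric limits are hypothetical placeholders chosen for illustration, not the values in Table 5.

```python
def clamp_to_validity_range(basin, limits):
    """Clamp every predictor of one basin to its [minimum, maximum] range.

    Returns the clamped copy and the names of the predictors that changed.
    """
    clamped = {}
    changed = []
    for name, value in basin.items():
        lo, hi = limits[name]
        new_value = min(max(value, lo), hi)
        if new_value != value:
            changed.append(name)
        clamped[name] = new_value
    return clamped, changed


# Placeholder limits for illustration only (NOT the values in Table 5).
EXAMPLE_LIMITS = {
    "precipitation_mm": (100.0, 3000.0),
    "slope_m_per_km": (0.0, 50.0),
}

# A hypothetical basin whose precipitation exceeds the assumed maximum.
basin = {"precipitation_mm": 4200.0, "slope_m_per_km": 12.0}
clamped, changed = clamp_to_validity_range(basin, EXAMPLE_LIMITS)
print(clamped, changed)
```

Returning the list of changed predictors makes it easy to count how many basins fall outside the range for each variable, which is the kind of tally reported in Table 5.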
 Turning to individual rivers, we see that the global DSi river export is dominated by only a small number of rivers. For example, the DSi load of the Amazon is 42 Tg SiO2 a−1, or 11% of global DSi river export, for 5% of the exorheic landmass. The Zaire has the second largest DSi river export with 16 Tg SiO2 a−1, or 4% of global DSi river export, for 3% of the exorheic landmass. There are eight rivers covering 15% of the global exorheic land area (Amazon, Orinoco, Parana, and Magdalena in South America, Chang Jiang, Ganges-Brahmaputra, and Mekong in Southeast Asia and Zaire in Africa) that together contribute 25% to the global DSi river export.
 Most of the predam DSi is exported by global rivers to coastal zones of the Atlantic Ocean (41%, with a dominant contribution of about one quarter from the Amazon), Pacific Ocean (36%) and Indian Ocean (14%) (Table 6). South America and Asia are the largest contributors (25% and 23%, respectively) (Table 7). Oceania, with a total basin area of only 3 Mkm2, contributes 18% of global DSi river export, which is similar to that from south Asia with a total basin area of 18 Mkm2 and a contribution of 17%.
Table 6. Predicted River Export of DSi to the World's Oceans for the Predam Situation and Retention in Global Reservoirs Based on Two Methods
 We recognize that the two methods (PR, representing the retention of inorganic phosphate, and SR, which represents sediment trapping) for estimating Si retention may not correctly describe DSi retention. Also, they differ on the river basin, regional, and continental scale, although the global average DSi retention is similar (retention of 18% for PR approach and 19% for SR) (Tables 6 and 7). The largest differences are found for Africa (26% based on the PR approach and 35% for SR), which results in a global DSi river export difference of 4 Tg SiO2 a−1 (Table 7) and the Arctic Ocean (17% for PR and 9% for SR). However, a smaller difference in estimated retention for the Atlantic Ocean, the largest global recipient of DSi, of 2% has large consequences for the global DSi retention (a global DSi river export difference of 3 Tg SiO2 a−1).
 The DSi retention of 18–19% causes a reduction of global DSi river export from 380 to 307–312 Tg SiO2 a−1. Our global estimate of river DSi export accounting for DSi retention is 9% lower than the 336 Tg SiO2 a−1 estimated by Treguer et al.
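The arithmetic behind the reduction can be checked with a short Python sketch. It uses only the rounded figures quoted in the text (380 Tg SiO2 a−1 and retention of 18% and 19%); the function and constant names are illustrative assumptions. With these rounded inputs the result is about 308 to 312 Tg SiO2 a−1, and the small difference from the lower bound of 307 presumably reflects rounding of the retention percentages.

```python
PREDAM_EXPORT_TG = 380.0  # global predam DSi export quoted in the text

# Global average retention fractions for the two approaches (rounded values).
RETENTION = {"PR": 0.18, "SR": 0.19}


def export_after_retention(predam_export, retention_fraction):
    """Export remaining after a given fraction is retained in reservoirs."""
    if not 0.0 <= retention_fraction <= 1.0:
        raise ValueError("retention_fraction must be between 0 and 1")
    return predam_export * (1.0 - retention_fraction)


for method, fraction in RETENTION.items():
    remaining = export_after_retention(PREDAM_EXPORT_TG, fraction)
    print(f"{method}: {remaining:.1f} Tg SiO2 per year")
```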
 NEWS-DSi explains ∼80% of the variability in the transformed TDSiY, leaving 20% of the variability unexplained. We recognize that the multiple regression coefficient is not a proper indicator of the uncertainty. A better way to express the behavior of the model is to show that predictions for DSiY (ton km−2 a−1) are within a factor of 1.5 of the observations for 50% of the river basins in the data set used (204 rivers) (Figure 5). This relative error of 1.5 and a fraction of the rivers of 0.5 means that for 50% of the rivers with DSiY data the following statement is valid: 1/1.5 < (prediction/observation) < 1.5. DSiY predictions are within a factor of 2 for 70% and within a factor of 3 for 90% of the 204 rivers with observations (Figure 5).
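The factor-of-k criterion described above, 1/k < (prediction/observation) < k, can be expressed as a small Python function. This is a minimal sketch; the yields in the example are synthetic numbers invented for illustration and are not taken from the DSi data set.

```python
def fraction_within_factor(predicted, observed, factor):
    """Share of basins with 1/factor < predicted/observed < factor."""
    if factor <= 1.0:
        raise ValueError("factor must be greater than 1")
    if len(predicted) != len(observed) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    hits = 0
    for p, o in zip(predicted, observed):
        ratio = p / o
        if 1.0 / factor < ratio < factor:
            hits += 1
    return hits / len(observed)


# Synthetic yields (ton km-2 a-1), for illustration only.
observed = [1.0, 2.0, 4.0, 8.0]
predicted = [1.2, 5.0, 4.0, 2.0]

for k in (1.5, 2.0, 3.0):
    share = fraction_within_factor(predicted, observed, k)
    print(f"within a factor of {k}: {share:.2f}")
```

Because the criterion is symmetric in the ratio, an overestimate by a factor of 2 and an underestimate by a factor of 2 are treated as equally large errors.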
 The NEWS-DSi model is robust with respect to the available data. By using Monte Carlo simulations to obtain a range of predictions, and not just one, we have tried to account for model uncertainty. The NEWS-DSi model prediction for the total DSi load for 3840 rivers is 346 Tg a−1. The 97.5% upper bound is 386 Tg a−1, which exceeds the mean by only 11%.
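The way a range of predictions turns into 2.5 and 97.5 percentiles can be illustrated with a short Python sketch. The perturbation scheme below (an independent multiplicative Gaussian error per basin) is a stand-in chosen for illustration and is not the sampling procedure used for NEWS-DSi; the basin loads, the relative standard deviation, and the function names are likewise hypothetical.

```python
import random


def percentile(values, q):
    """Percentile with linear interpolation between order statistics (q in 0..100)."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    weight = pos - lower
    return ordered[lower] * (1.0 - weight) + ordered[upper] * weight


def simulate_total_load(rng, basin_loads, rel_sd):
    """One Monte Carlo draw: perturb each basin load and sum (illustrative scheme)."""
    return sum(load * max(0.0, rng.gauss(1.0, rel_sd)) for load in basin_loads)


rng = random.Random(42)                 # fixed seed so the sketch is repeatable
basin_loads = [10.0, 25.0, 5.0, 60.0]   # synthetic loads, not model output
totals = [simulate_total_load(rng, basin_loads, 0.1) for _ in range(5000)]

low = percentile(totals, 2.5)
high = percentile(totals, 97.5)
print(f"95% interval of the total load: ({low:.1f}, {high:.1f})")
```

Reporting the interval rather than a single total conveys how sensitive the summed load is to the assumed uncertainty in the individual predictions.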
 Apart from the model uncertainty, there are other uncertainties related to the DSiY data and the ancillary basin data. Uncertainty associated with available DSiY data is discussed in detail elsewhere [Stelzer and Likens, 2006]. The main sources of uncertainty are (1) inconsistent measurement techniques and (2) varying and often low sampling frequency. The potential bias caused by these uncertainties in the data is recognized, but because of the lack of information provided in the measurement reports this bias cannot be quantified.
 Regarding the ancillary basin data, there are two major uncertainties. First, we use river basin averages for most basin characteristics. Averages may not reflect the influence of a factor such as temperature or precipitation for large river basins occurring in different climate zones. Also, the effect of seasonal variation in climate, runoff, vegetation and land cover, and agricultural and forestry management is not reflected in our approach. For other factors such as slope or soil properties the variability may be lost by averaging. Our global model should therefore not be used to predict DSiY for individual river basins, or within river basins, but rather for regional-scale to continental-scale applications.
 We found land use not to be significant. This does not mean that differences in agricultural land use versus natural vegetation are not important. For example, Conley et al. showed that deforestation causes increasing DSi river export. For proper analysis of the effects of changing land use, time series would be needed to relate land use changes to TDSiY. Similarly, we have not analyzed the importance of DSi in wastewater flows, which may contribute perhaps 8% in densely populated river basins [Sferratore et al., 2006], although total human DSi additions have been estimated to be <2% at the global scale [van Dokkum et al., 2004].
 There is also uncertainty in the river basin data per se. One example is the uncertainty in the river basin area estimates. Comparison of data provided by Meybeck and Ragu and by Fekete et al. shows significant disagreement for some river basins. This has important repercussions for the calculation of DSiY.
 Finally, globally applicable estimates for DSi retention in reservoirs are not available, so that the uncertainty in our estimates for the actual DSi load may be larger than that for the predam load.
 We developed a robust lumped model for DSiY at the scale of river basins and with an annual temporal scale. The model was cross validated by using training and validation data sets. Our model predictions realistically describe the information in the measurement data set. Our approach provides new insights on the main drivers of river export of DSi at the scale of river basins on the basis of the limited data set of DSi river export available to us. The cross validation of the regression model gives strong indications that the DSi yield depends on the natural logarithm of annual precipitation, bulk density, and the occurrence of volcanic rock. Terrain slope has a smaller influence on DSi export than the other variables, but this influence is significant and robust (more than 90% of the models found this variable significant). Temperature was not found to be a significant variable, even when we focused on river basins in extratropical climates.
 The overall importance of precipitation and the occurrence of volcanic rocks points to the role of weathering as the ultimate source of DSi. The results also support the important biological control of the global silicon cycle proposed earlier by Bartoli, mainly through the factor of bulk density (an indicator of soil and ecosystem development). Precipitation and slope probably influence dissolution of DSi from both the mineral and biological components.
 Our regression approach is not the only way to model DSi river export. Other approaches such as process-based models [Billen and Garnier, 2000; Garnier et al., 2002; Sferratore et al., 2005] require far more knowledge and data on processes and their controls in the DSi cycle at the river basin scale than our approach. Such data and knowledge are currently still lacking for global-scale application of process-based models.
 There are multiple scale problems related to our lumped model approach. Our model should therefore not be used to predict the DSi load for individual rivers; it is more appropriate to estimate the regional, continental, or global DSi river export to the coastal zone and changes therein as a result of climate change and dam construction [e.g., Syvitski et al., 2003].
 Inevitably there is considerable uncertainty associated with our predictions. Nonetheless, as the first attempt to develop a robust, internally consistent, and spatially explicit global model of DSi river export, NEWS-DSi constitutes a significant advance in its own right. With the NEWS-DSi model now available, it is possible to analyze the exported ratios of N, P, and Si in all different forms [Billen and Garnier, 2007; Seitzinger et al., 2005], also including information on the effect of dam construction.
 We thank E. Struyf and one anonymous reviewer for their extensive comments and helpful thoughts about the explanation of the model variables in terms of the biological control. We are also thankful for the support and advice from G. Billen, J. Garnier, and E. Mayorga. We gratefully acknowledge the support of the UNESCO Intergovernmental Oceanographic Commission (IOC) for funding various workshops which formed the basis for the work described in this paper. The work of A. Beusen, L. Bouwman, and A. Dekkers was part of the project Integrated Terrestrial Modeling of the Netherlands Environmental Assessment Agency. The contribution of H. Dürr was funded by Utrecht University (high potential project G-NUX) and by the EU program Si-WEBS (contract HPRN-CT-2002-000218), and J. Hartmann was funded by the German Research Foundation (DFG) (global river project HA 4472/6-1).