## 1. Introduction

[2] A statistical model aims at explaining a response variable *y* from several explanatory variables *x*_{1},…, *x*_{n}. When *x*_{1},…, *x*_{n} are deterministic variables, the model expresses the dependence of the expected value *E*[*y*] on the explanatory variables and an unknown parameter vector ω, as a function *f*(*x*_{1},…, *x*_{n}; ω). In the random case, the model is written conditionally on the observations, i.e., *E*[*y*] is replaced by the conditional expected value *E*[*y*∣*x*_{1},…, *x*_{n}]. The function *f* is called the link function between *y* and *x*_{1},…, *x*_{n} and, depending on its expression, defines a linear or nonlinear regression statistical model. Models such as perceptrons, which fall in the class of so-called ridge constructions, achieve this statistical modeling goal and have several well-known useful properties. These include the density, or universal approximation, property [*Cybenko*, 1989; *Lin and Pinkus*, 1993], and results on the approximation rate, including the dimension-independent upper bound [*Barron*, 1993; *Burger and Neubauer*, 2001; *Makovoz*, 1998] and the asymptotic expression obtained by *Maiorov* [1999].
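
For concreteness, the most familiar ridge construction, a perceptron with one hidden layer, can be written as below. This is only an illustrative form: the number of hidden units *m*, the activation function σ, and the symbols *c*_{i}, **a**_{i}, *b*_{i} are notation assumed here and are not taken from the text above.

$$
f(x_1,\dots,x_n;\,\omega) \;=\; \sum_{i=1}^{m} c_i\,\sigma\!\left(\mathbf{a}_i \cdot \mathbf{x} + b_i\right),
\qquad \mathbf{x} = (x_1,\dots,x_n),\quad
\omega = \left(c_i,\ \mathbf{a}_i,\ b_i\right)_{i=1,\dots,m}.
$$

Each term depends on **x** only through the scalar product **a**_{i} · **x**, which is what makes it a ridge function.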

[3] In this vein, we focus on a slightly different regression problem, for which we propose a modified solution, based on ridge function approximants, that inherits the mathematical properties mentioned above. The problem still consists in explaining *y* from *x*_{1},…, *x*_{n}, but with the difference that only some of the *x*_{i}, say *x*_{1},…, *x*_{d} (*d* < *n*), convey information about *y*, while the remaining variables act as parameters, or conditioning variables, in the sense that they influence the link function between *y* and the truly informative variables *x*_{1},…, *x*_{d}.

[4] Typical examples of this kind of problem are found in geosciences, where the observed data may depend on several angular variables that define the geometry of the observation process. They include the retrieval of ocean color and aerosols from reflectance measurements in the visible and near infrared, and the retrieval of wind speed, salinity, and sea surface temperature from brightness temperature measurements at microwave wavelengths. In ocean color remote sensing, the objective is to estimate the concentration of oceanic constituents, such as phytoplankton chlorophyll-a. The informative variables *x*_{1}, *x*_{2},…, *x*_{d}, in this case the top-of-atmosphere reflectance measurements, depend continuously on the angular variables that characterize the positions of the observing satellite and of the Sun relative to the target on the Earth's surface. Hence these angular variables, which do not themselves carry any information about the chlorophyll-a concentration, have to be taken into account, because the link function between chlorophyll-a concentration and *x*_{1}, *x*_{2},…, *x*_{d} depends on them.

[5] For this kind of problem, it seems natural to separate the variables that are actually informative about *y* from the conditioning variables. We shall denote by **x** the *d*-dimensional vector of informative variables, and by **t** the *p*-dimensional vector of conditioning variables. The proposed solution consists in attaching to each value of **t** a nonlinear regression model explaining *y* from **x**, and in requiring that this attachment vary smoothly in **t**. This approach yields a field of nonlinear regression models over the set of permitted values for **t**.
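
The idea of a model in **x** whose parameters vary smoothly with **t** can be illustrated with a minimal sketch. It is not the construction developed in this paper: it assumes a one-hidden-layer ridge model with a tanh activation and, purely for illustration, parameters that are affine functions of **t**. The function names (`ridge_model`, `make_field`, `field_predict`) and the random initialization are hypothetical.

```python
import numpy as np

def ridge_model(x, weights, biases, coeffs):
    """One-hidden-layer ridge approximant: sum_i c_i * tanh(a_i . x + b_i)."""
    return coeffs @ np.tanh(weights @ x + biases)

def make_field(n_hidden, d, p, seed=0):
    """Return a map t -> model parameters.

    Assumption for illustration only: every parameter is an affine
    function of the conditioning vector t, hence smooth in t.
    """
    rng = np.random.default_rng(seed)
    # Base values (at t = 0) and slopes with respect to t.
    w0 = rng.normal(size=(n_hidden, d))
    w1 = rng.normal(size=(n_hidden, d, p))
    b0 = rng.normal(size=n_hidden)
    b1 = rng.normal(size=(n_hidden, p))
    c0 = rng.normal(size=n_hidden)
    c1 = rng.normal(size=(n_hidden, p))

    def params_at(t):
        t = np.asarray(t, dtype=float)
        return w0 + w1 @ t, b0 + b1 @ t, c0 + c1 @ t

    return params_at

def field_predict(params_at, x, t):
    """Evaluate at x the regression model attached to the value t."""
    x = np.asarray(x, dtype=float)
    return ridge_model(x, *params_at(t))

# Example: d = 3 informative variables, p = 2 conditioning variables.
params_at = make_field(n_hidden=4, d=3, p=2)
y_hat = field_predict(params_at, x=[0.1, 0.2, 0.3], t=[0.5, 1.0])
```

For a fixed **t**, `field_predict` is an ordinary nonlinear regression model in **x**; changing **t** moves continuously from one model to another, which is the behavior a field of models is meant to capture.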

[6] The paper is organized as follows. In section 2, the problem of interest is stated more formally, and fields of nonlinear regression models are defined. In section 3, construction schemes of such a model from scattered data are presented. In section 4, results obtained by applying this methodology to ocean color remote sensing are discussed. Finally, conclusions are given, as well as perspectives on future work.