## 1. INTRODUCTION

Despite the proliferation of methods for semiparametric and nonparametric regression, these techniques remain relatively rare in empirical practice. Increased computational difficulty, greater mathematical sophistication and, perhaps most importantly, the *curse of dimensionality*—wherein the rate of convergence of the nonparametric regression estimator slows as the number of variables treated nonparametrically grows—all seem to pose barriers to the widespread use of nonparametric techniques.

The rapid increase in computing power and the growth in nonparametric routines found in statistical software packages have helped to mitigate computational concerns. To combat the curse of dimensionality, many researchers have adopted the *partially linear* (or *semilinear*) regression model. This model, though not fully nonparametric, provides a convenient generalization of the standard linear model which is less susceptible to the curse of dimensionality since only one, or perhaps a few, variables are treated nonparametrically. Finally, some studies (e.g. Blundell and Duncan, 1998; Yatchew, 1998; DiNardo and Tobias, 2001) have tried to bridge the gap between theory and practice and make these techniques accessible to applied researchers.

In this paper we continue in this tradition and describe and implement simple and intuitive Bayesian methods for semiparametric and nonparametric regression. Importantly, the methods we describe can be used in the context of multiple equation models, thus generalizing the class of models for which simple Bayesian semiparametric methods are available. In our discussion we focus primarily on the seemingly unrelated regression (SUR) model. This model is of interest in and of itself, but is also of interest as the (possibly restricted) reduced form of a semiparametric simultaneous equations model (or the structural form of a triangular simultaneous equations model).

Before describing the contributions of this paper, it is useful to briefly review a simple method used in related work (Koop and Poirier, 2004a) in the single-equation partially linear regression model. This partially linear model divides the explanatory variables into a set which is treated parametrically, *z*, and a set which is treated nonparametrically, *x*, and relates them to a dependent variable *y* as:

$$
y_i = z_i'\beta + f(x_i) + \varepsilon_i
$$

for *i* = 1, …, *N*, where *f*(·) is an unknown function. Because of the curse of dimensionality, *x*_{i} must be of low dimension and is often a scalar (see Yatchew, 1998 for an excellent introduction to the partial linear model). For most of this paper we will assume *x*_{i} is a scalar, although this assumption is relaxed in Section 4.

In this model we assume ε_{i} ∼ i.i.d. *N*(0, σ²) for *i* = 1, …, *N*, and all explanatory variables are fixed or exogenous. Observations are ordered so that *x*_{1} < *x*_{2} < ··· < *x*_{N}. Define *y* = (*y*_{1}, …, *y*_{N})′, *Z* = (*z*_{1}, …, *z*_{N})′ and ε = (ε_{1}, …, ε_{N})′. Letting γ = (*f*(*x*_{1}), …, *f*(*x*_{N}))′, *W* = (*Z* : *I*_{N}) and δ = (β′, γ′)′, Koop and Poirier (2004a) show that the previous equation can be written as:

$$
y = Z\beta + \gamma + \varepsilon = W\delta + \varepsilon
$$
Thus, the partially linear model can be written as the standard normal linear regression model in which the unknown points on the nonparametric regression line are treated as unknown parameters. This regression model is characterized by insufficient observations in that the number of explanatory variables exceeds *N*. However, Koop and Poirier (2004a) show that, if a natural conjugate prior is used, the posterior is still well-defined. In fact, they show that the natural conjugate prior does not even have to be informative in all dimensions: prior information about the smoothness of the nonparametric regression line is all that is required to ensure valid posterior inference. Thus, for the subjective Bayesian, prior information can be used to surmount the problem of insufficient observations. Furthermore, for the researcher uncomfortable with subjective prior information, the required amount of prior information is quite small, involving the selection of a single prior hyperparameter, η, which governs the smoothness of the nonparametric regression line. Koop and Poirier (2004b) go further and show how (under weak conditions) empirical Bayesian methods can be used to estimate η from the data.
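The mechanics of this augmented-regression view can be sketched numerically. The snippet below is a minimal illustration, not the exact prior of Koop and Poirier (2004a): it treats γ as free parameters, penalizes second differences of γ with a single smoothness hyperparameter η, and computes the resulting posterior mean of δ = (β′, γ′)′ in closed form. All function and variable names here are our own.

```python
import numpy as np

def partial_linear_fit(y, Z, x, eta):
    """Posterior mean of delta = (beta', gamma')' in y = Z beta + gamma + eps,
    where gamma_i = f(x_i).  A smoothness prior with precision eta penalizes
    second differences of gamma along the sorted x values.  Illustrative
    sketch only; the exact prior in Koop and Poirier (2004a) differs."""
    order = np.argsort(x)               # order observations so x_1 < ... < x_N
    y, Z = y[order], Z[order]
    N, k = Z.shape
    W = np.hstack([Z, np.eye(N)])       # W = (Z : I_N), delta = (beta', gamma')'
    # D takes second differences of the gamma block; prior precision eta * D'D
    D = np.zeros((N - 2, k + N))
    for i in range(N - 2):
        D[i, k + i:k + i + 3] = [1.0, -2.0, 1.0]
    # Posterior mean under a flat prior on beta and the smoothness prior on gamma
    A = W.T @ W + eta * (D.T @ D)
    delta = np.linalg.solve(A, W.T @ y)
    return delta[:k], delta[k:]         # beta estimate, points on the curve f
```

Although the system has more unknowns than observations, the penalty matrix η*D*′*D* makes the posterior well-defined, exactly as the smoothness prior does in the text.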

The advantages of remaining within the framework of the normal linear regression model with a natural conjugate prior are clear. This model is very well understood and standard textbook results for estimation, model comparison and prediction are immediately available. Analytical results for posterior moments, marginal likelihoods and predictives exist and, thus, there is no need for posterior simulation. This means methods which search over many values for η (e.g. empirical Bayesian methods or cross-validation) can be implemented at a low computational cost. Furthermore, as shown in Koop and Poirier (2004a), the partial linear model can serve as a component in numerous other models which do involve posterior simulation (e.g. semiparametric tobit and probit models or the partial linear model with the errors treated flexibly by using mixtures of normals). The ability to simplify the estimation of the nonparametric component in such a complicated empirical exercise may provide the researcher with great computational benefit.

In this paper we take up the case of Bayesian semiparametric estimation in multiple equation models and adopt a similar approach for smoothing the regression functions. In particular, we consider the estimation of a semiparametric SUR model of the form:

$$
y_{ij} = z_{ij}'\beta_j + f_j(x_{ij}) + \varepsilon_{ij}
$$

where *y*_{ij} is the *i*th observation (*i* = 1, …, *N*) on the endogenous variable in the *j*th equation (*j* = 1, …, *m*), *z*_{ij} is a *k*_{j} × 1 vector of observations on the exogenous variables which enter linearly, *f*_{j}(*x*_{ij}) is an unknown function which depends on a vector of variables, *x*_{ij}, and ε_{ij} is the error term. For equations which have nonparametric components, *z*_{ij} does not contain an intercept since the first point on a nonparametric regression line plays the role of an intercept.
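In the same spirit as the single-equation case, each equation can be given the augmented design matrix (*Z*_{j} : *I*_{N}) and the *m* equations stacked into one large regression. A minimal sketch of this bookkeeping follows; the function name and stacking convention are ours, not the paper's.

```python
import numpy as np

def stack_semiparametric_sur(y_list, Z_list):
    """Stack an m-equation semiparametric SUR system into y = X theta + eps.
    Equation j contributes the block (Z_j : I_N): the identity block carries
    gamma_j = (f_j(x_{1j}), ..., f_j(x_{Nj}))', the points on that equation's
    nonparametric regression line (which also supply its intercept)."""
    N = len(y_list[0])
    blocks = [np.hstack([Z, np.eye(N)]) for Z in Z_list]
    cols = [b.shape[1] for b in blocks]
    X = np.zeros((N * len(blocks), sum(cols)))
    start = 0
    # Block-diagonal layout: each equation's regressors enter only that equation
    for j, b in enumerate(blocks):
        X[j * N:(j + 1) * N, start:start + cols[j]] = b
        start += cols[j]
    return np.concatenate(y_list), X
```

With this stacking, the error vector has covariance Σ ⊗ *I*_{N}, so standard SUR machinery applies directly to the augmented system.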

The approach we describe for the estimation of this model is simple and intuitive and, hopefully, will appeal to practitioners seeking to add flexibility to their multiple equation analyses. As in Koop and Poirier (2004a,b), we employ a prior which serves to smooth the nonparametric regression functions. It is important to recognize that for the (parametric) seemingly unrelated regressions model (and the reduced form of the simultaneous equations model), the natural conjugate prior suffers from well-known criticisms (see Rothenberg, 1963 or Dreze and Richard, 1983). On the basis of these criticisms, Dreze and Richard (1983, p. 541) argue against using the natural conjugate prior (except for certain noninformative limiting cases not relevant for our class of models). Their arguments carry even more force in the present semiparametric context, since the natural conjugate prior places undesirable restrictions on the way smoothing is carried out across equations (i.e., the nonparametric component in each equation would have to be smoothed in the same way). Thus, in the present paper we do not adopt a natural conjugate prior, but rather use an independent normal–Wishart prior.

The basic ideas behind our approach are straightforward extensions of standard textbook Bayesian methods for the SUR model (see, e.g., Koop, 2003, pp. 137–142). Thus, textbook results for estimation, model comparison (including comparison of parametric to nonparametric models), model diagnostics (e.g. posterior predictive *p*-values) and prediction are immediately available. This, we argue, is an advantage relative to the relevant non-Bayesian literature (see, e.g., Pagan and Ullah, 1999, chapter 6; Newey *et al.*, 1999; Darolles *et al.*, 2003) and to other, more complicated, Bayesian approaches to nonparametric seemingly unrelated regression such as Smith and Kohn (2000).
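Under the independent normal–Wishart prior the joint posterior is not available analytically, but the two conditional posteriors are standard (see Koop, 2003, pp. 137–142), so a Gibbs sampler applies. The sketch below illustrates this for a parametric SUR system stacked equation-by-equation; all names, and the integer-degrees-of-freedom Wishart draw, are our own simplifications rather than the paper's code.

```python
import numpy as np

def draw_wishart(df, scale, rng):
    """Wishart(df, scale) draw via a sum of Gaussian outer products
    (valid for integer df >= dimension)."""
    L = np.linalg.cholesky(scale)
    G = rng.standard_normal((df, scale.shape[0])) @ L.T   # rows ~ N(0, scale)
    return G.T @ G

def gibbs_sur(y, X, m, N, b0, B0, nu0, S0inv, draws, rng=None):
    """Gibbs sampler for y = X theta + eps with cov(eps) = Sigma kron I_N
    (equation-major stacking), under the independent prior
    theta ~ N(b0, B0) and Sigma^{-1} ~ Wishart(nu0, S0inv)."""
    rng = rng or np.random.default_rng(0)
    k = X.shape[1]
    B0_inv = np.linalg.inv(B0)
    S0 = np.linalg.inv(S0inv)
    Sigma = np.eye(m)
    keep = np.zeros((draws, k))
    for s in range(draws):
        # theta | Sigma, y ~ N(b1, B1): GLS precision-weighted update
        Om_inv = np.kron(np.linalg.inv(Sigma), np.eye(N))
        B1 = np.linalg.inv(B0_inv + X.T @ Om_inv @ X)
        b1 = B1 @ (B0_inv @ b0 + X.T @ Om_inv @ y)
        theta = b1 + np.linalg.cholesky(B1) @ rng.standard_normal(k)
        # Sigma^{-1} | theta, y ~ Wishart(nu0 + N, (S0 + E E')^{-1})
        E = (y - X @ theta).reshape(m, N)     # row j = equation j residuals
        Sigma = np.linalg.inv(draw_wishart(nu0 + N, np.linalg.inv(S0 + E @ E.T), rng))
        keep[s] = theta
    return keep
```

The same two conditional updates carry over unchanged when the design matrix contains the identity blocks for the nonparametric components and the prior on θ embeds the smoothness penalty.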

We illustrate the use of our methods by estimating a two-equation simultaneous equations model in parallel with the development of our theory. This application takes data from the National Longitudinal Survey of Youth and involves estimating the returns to schooling, job tenure and ability for a cross-sectional sample of white males. Our triangular simultaneous equations model has two equations, one for the (log) wage and the other for the quantity of schooling attained. After estimating standard parametric models that have appeared in the literature, we first extend them to allow for nonparametric treatment of an exogenous variable (weeks of tenure on the current job) in the wage equation (Case 1). Subsequently, we consider Case 2 where single explanatory variables enter nonparametrically in each equation. In this model we additionally allow a measure of cognitive ability to enter the schooling equation nonparametrically. We complete our empirical work with Case 3 by giving cognitive ability a nonparametric treatment in both the wage and schooling equations (with tenure on the job also given a nonparametric treatment in the wage equation).

Our results reveal the practicality and usefulness of our approach. In some cases, our semiparametric treatment yields results which are very similar to those from simple parametric nonlinear models (e.g. quadratic). However, one advantage of a semiparametric approach is that a particular functional form such as the quadratic does not have to be chosen, either in an *ad hoc* fashion or through pre-testing. Furthermore, in some cases our semiparametric approach yields empirical results that could not easily be obtained using standard parametric methods. In terms of our application, our results reveal the empirical importance of controlling for nonlinearities in ability, particularly in the schooling equation, when trying to estimate the return to education.

The outline of our paper is as follows. In the next section, we outline our basic semiparametric SUR model, describe our data, and obtain parametric results and semiparametric results for a model where job tenure is treated nonparametrically. In Section 3, we describe the process of estimating a model with nonparametric components in both equations, and estimate the model in Case 2. Finally, in Section 4, we describe how to handle the estimation of additive models and provide estimation results for our most general Case 3. The paper concludes with a summary in Section 5.