SUMMARY

This paper considers a panel data stochastic frontier model that disentangles unobserved firm effects (firm heterogeneity) from persistent (time-invariant/long-term) and transient (time-varying/short-term) technical inefficiency. This gives a four-way error component model, viz., persistent and time-varying inefficiency, random firm effects and noise. We use Bayesian methods of inference to provide robust and efficient methods of estimating inefficiency components in this four-way error component model. Monte Carlo results are provided to validate its performance. We also present results from an empirical application that uses a large panel of US commercial banks. Copyright © 2012 John Wiley & Sons, Ltd.

INTRODUCTION

In recent years the use of stochastic frontier models to estimate (in)efficiency in production and cost functions has been growing exponentially. Many of these applications use panel models that utilize the data in more efficient ways. The standard stochastic frontier panel data models have been extended in several directions. Estimation of some of these models can be conducted under less demanding assumptions and at the same time using more flexible modeling approaches. For example, heterogeneous technologies have been the focus of much research, including random coefficient stochastic frontier models. Other examples include latent class or mixture models and Markov switching models.1 More recently, an important line of research has been the formulation and estimation of panel models in which firm effects are separated from inefficiency.

In a standard panel data model, the focus is mostly on controlling firm heterogeneity due to unobserved time-invariant covariates. The innovation in the time-invariant stochastic frontier models (developed in the 1980s) was to make these firm effects one-sided so as to give them an inefficiency interpretation. In some models these inefficiency effects were treated as fixed parameters (Schmidt and Sickles, 1984) while others treat them as random variables (Pitt and Lee, 1981; Kumbhakar, 1987). The models proposed by Kumbhakar (1991), Kumbhakar and Heshmati (1995) and Kumbhakar and Hjalmarsson (1993, 1995) treated firm effects as persistent inefficiency and included another random component to capture time-varying technical inefficiency. These formulations are in contrast to the ‘true’ random or fixed-effect models proposed by Greene (2005a,2005b) in which firm effects are not parts of inefficiency. The truth might be somewhere in between. That is, part of the firm effects in Greene (2005a,2005b) might be persistent inefficiency. Similarly, part of persistent inefficiency in the models proposed by Kumbhakar and co-authors might include unobserved firm effects. Since none of the assumptions used in the above-cited models are fully satisfactory, we consider a generalized true random-effects (GTRE) model that decomposes the time-invariant firm effects into a random firm effect (to capture unobserved heterogeneity à la Greene, 2005a,2005b) and a persistent technical inefficiency effect (as in Pitt and Lee, 1981; Schmidt and Sickles, 1984; Kumbhakar, 1987).

Most of the popular panel data models that are widely used in empirical applications do not control for unobserved firm effects. For example, the inefficiency specification used by Battese and Coelli (1992) (which is a variant of the model due to Kumbhakar, 1991) and its variants allow inefficiency to change over time but without firm effects.2 Thus these models confound firm effects with inefficiency. Two other panel data models, viz., the ‘true-fixed’ and ‘true-random’ effects frontier models (Greene, 2005a, 2005b; Kumbhakar and Wang, 2005), separate firm effects (fixed or random) from inefficiency, where inefficiency can be a function of exogenous variables.

Some of the models that are widely used in the literature can be summarized as in Table 1. Details on estimation of these models can be found in Kumbhakar et al. (2011).

Table 1. Main characteristics of some of the panel data models

|                                 | Model 1      | Model 2     | Model 3     | Model 4        | Model 5     | Model 6      |
| Firm effect                     | No           | No          | Yes (fixed) | Yes (random)   | No          | Yes (random) |
| Technical inefficiency          |              |             |             |                |             |              |
|   Persistent                    | No           | No          | No          | No             | Yes         | Yes          |
|   Transient                     | No           | No          | No          | No             | Yes         | Yes          |
| Overall technical inefficiency  |              |             |             |                |             |              |
|   Mean                          | Time-inv.(a) | Time-inv.   | Time-inv.   | Zero trunc.(b) | Zero trunc. | Zero trunc.  |
|   Variance                      | Homo.        | Hetero.     | Hetero.     | Hetero.        | Homo.       | Homo.        |
| Symmetric error term            |              |             |             |                |             |              |
|   Variance                      | Homo.        | Hetero.(c)  | Hetero.     | Homo.          | Homo.       | Homo.        |

Notes:
(a) Time-inv. mean inefficiency models include determinants of inefficiency in the mean function.
(b) Zero truncation models assume the inefficiency distribution to be i.i.d. half-normal.
(c) Hetero. (Homo.) refers to models in which variances are functions of covariates that are both firm-specific and time-varying (constant).

Before proceeding further it might be worth asking: Is there any economic rationale for all these components? Why do we need a model with all of them? To address these questions, we start with the following. Assume that αi represents firm heterogeneity (the effect of unobserved factors). The necessity of controlling for αi in estimating the regression function is well understood in the panel data literature, so it is not necessary to rationalize modeling αi separately from vit. The next question is: Why do we want to decompose inefficiency into various components? To justify this, consider the case where inefficiency is associated with (unobserved) management, and assume that management is time-invariant. Consequently, inefficiency will also be time-invariant (i.e. $u_{it} = \eta_i^+$). This gives us a three-way error component model, i.e. $y_{it} = x_{it}'\beta + \alpha_i + \eta_i^+ + v_{it}$, in which one needs to separate αi from $\eta_i^+$, both of which are time-invariant. Now assume that management changes over time (which is probably more realistic), although most of it might be time-invariant. That is, management has a time-invariant and a time-varying component. If, as before, inefficiency is associated with management, we have a situation in which one part of inefficiency is time-invariant ($\eta_i^+ \geq 0$) and the other part is time-varying ($u_{it}^+ \geq 0$), to be consistent with the management story. This example illustrates the rationale for a four-way error component model, i.e. $y_{it} = x_{it}'\beta + \alpha_i + \eta_i^+ + u_{it}^+ + v_{it}$, which we consider in this paper. Colombi et al. (2011) give some other examples to justify the need for a four-way error component model. If policy makers (regulators) are interested in eliminating persistent inefficiency, which is often attributed to regulation, it is necessary to estimate it first. Estimating a model with only one inefficiency component (with or without controlling for firm effects) is likely to give incorrect estimates of inefficiency.

In view of the above discussion we consider a general model, which in a cost function framework is3

$$ y_{it} = x_{it}'\beta + \alpha_i + \eta_i^+ + u_{it}^+ + v_{it} \qquad (1) $$

where the dependent variable is (log) cost and the independent variables, represented by the vector x, are input prices and outputs (in logs).4 Subscripts i and t refer to firm and time (i = 1, 2, …, n and t = 1, 2, …, T), respectively. Note that the model in (1) has four error components. If we denote the composed error by $\varepsilon_{it} = \alpha_i + \eta_i^+ + u_{it}^+ + v_{it}$, where the superscript (+) indicates a non-negative value of the corresponding error component, we can give a meaningful interpretation of each of the error components. First, the random noise component is vit, which is similar to the noise component in a standard regression model. Second, the persistent (long-run) technical inefficiency component is $\eta_i^+$. Third, short-run or transient technical inefficiency is $u_{it}^+$. Fourth, firm-specific random effects (firm heterogeneity) are captured by the αi term.

In terms of technical inefficiency, the above model decomposes the overall inefficiency $U_{it} = \eta_i^+ + u_{it}^+$ into a long-run or persistent component ($\eta_i^+$) and a short-run or transient component ($u_{it}^+$). The decomposition proposed here is more flexible and does not rest on parametric assumptions about the dynamics of the overall inefficiency, Uit. This decomposition is similar to decomposing the noise term into two orthogonal components, viz., random firm effects and random firm-and-time effects, in a panel data model. The only difference is that these components are one-sided, which helps identify them from the two-sided error components αi and vit. Moreover, the presence of firm effects (αi) can be tested once we allow for a decomposition of technical inefficiency into permanent and transient (short-run) components. Previous panel stochastic frontier models either included firm effects but failed to accommodate persistent inefficiency (Greene, 2005a, 2005b) or included persistent inefficiency but failed to separate it from firm effects (Kumbhakar and co-authors, cited earlier). Failure to accommodate either of these is likely to give incorrect inefficiency estimates.

The generality of the model in (1) can be viewed from both cross-sectional and panel points of view. If only cross-sectional data are available, the model in (1) will have only the cross-sectional components in the error term, viz., $\varepsilon_i = u_i^+ + v_i$, which is the original stochastic frontier model proposed by Aigner et al. (1977) and Meeusen and van den Broeck (1977). A standard pooled panel model will have two error components and the error term will be of the form $\varepsilon_{it} = u_{it}^+ + v_{it}$. Greene's true random-effects model will have the error specification $\varepsilon_{it} = \alpha_i + u_{it}^+ + v_{it}$. The composite error term in the Kumbhakar–Heshmati (1995) and Kumbhakar–Hjalmarsson (1995) models is of the form $\varepsilon_{it} = \eta_i^+ + u_{it}^+ + v_{it}$. However, the fully flexible error specification is $\varepsilon_{it} = \alpha_i + \eta_i^+ + u_{it}^+ + v_{it}$. It is possible to identify each of these components using the standard distributional assumptions of the stochastic frontier literature. For example, $\delta_i = \alpha_i + \eta_i^+$ can be viewed as the firm-specific component in a standard panel model, the predicted value of which can be obtained from estimating (1) using a standard one-way error component panel estimation. Decomposing $\eta_i^+$ from $\delta_i$ is then standard, as in a cross-sectional stochastic frontier model (the Jondrow et al., 1982, procedure), where $\alpha_i$ plays the role of the two-sided noise term and $\eta_i^+$ is the inefficiency. Similarly, estimation of the panel model also gives predicted values of the remainder $u_{it}^+ + v_{it}$, from which one can obtain estimates of $u_{it}^+$ using the exact same procedure.5 Colombi et al. (2011) considered a single-step maximum likelihood method to estimate the technology parameters and technical efficiency components of this model.

In this paper we consider an alternative approach, viz., a Bayesian Markov chain Monte Carlo (MCMC) approach to estimate the GTRE model. There are some advantages to the Bayesian approach compared to the ML approach used in Colombi et al. (2011). First, it has good finite-sample properties even in samples where n and T are relatively small. Second, in light of the recent advances in the treatment of the incidental parameters problem, there is reason to believe that average likelihood or a fully Bayesian approach can perform much better relative to sampling-theory treatments. We provide both simulation results and results from real data.

Note that there are no Bayesian frontier models that allow random firm effects along with time-invariant inefficiency. Koop and Steel (2003) proposed a panel data model where technical inefficiency can be time-varying. This is based on a known parametrization of technical inefficiency through γ = Du, where u is technical inefficiency and D is a known matrix. However, nothing is mentioned about persistent inefficiency or separation of inefficiency from firm effects.6 Thus our model is not just another application of an ‘off-the-shelf’ Bayesian MCMC approach. It contributes to the Bayesian stochastic frontier literature in terms of a new model that has not been applied before.

In summary, the contributions of this paper are twofold. We propose two parametrizations for the Gibbs sampler that provide accurate inferences and reduce autocorrelation in the MCMC scheme, thereby addressing the problem of the relationship between time-invariant (persistent) inefficiency and firm effects. We also propose an efficient reparametrization of the MCMC scheme to account for the correlation of the three random-effect components ($\alpha_i$, $\eta_i^+$ and $u_{it}^+$). This reparametrization improves the performance of MCMC considerably. We show in artificial experiments that the ‘straightforward’ Gibbs sampler suffers from slow convergence and extremely high autocorrelations and thus cannot deliver a full exploration of the posterior. In sampling experiments where our reparametrized Gibbs sampler is used, the Bayes posterior mean (or median) performs quite well for the sample sizes typically encountered in economic applications (n > 100 and T = 5 or 10).

The rest of the paper is organized as follows. The econometric model is discussed in Section 2, followed by the Bayesian inference procedure in Section 3. The performance of the Bayesian estimation procedure in light of some artificial examples is discussed in Section 4. Section 5 reports results from an application using panel data on US banks. Section 6 concludes the paper.

THE ECONOMETRIC MODEL

We consider the model in (1), i.e.

$$ y_{it} = x_{it}'\beta + \alpha_i + \eta_i^+ + u_{it}^+ + v_{it} $$

and rewrite it as

$$ y_{it} = x_{it}'\beta + \delta_i + u_{it}^+ + v_{it}, \qquad \delta_i = \alpha_i + \eta_i^+ \qquad (2) $$

If we treat δi as random firm-specific effects, (2) is Greene's true random-effects model. In Greene's model the δi are nuisance parameters; the focus is on the estimation of the β parameters and the time-varying technical inefficiency $u_{it}^+$. In our model the αi are the nuisance parameters and we are interested in estimating both persistent and time-varying technical inefficiency ($\eta_i^+$ and $u_{it}^+$) along with the β parameters. Since both firm effects and persistent technical inefficiency are time-invariant, one cannot separate $\eta_i^+$ from αi without making some additional assumptions. That is, even if one estimates δi it cannot be separated into $\eta_i^+$ and αi. The firm heterogeneity parameters δi in the above model capture both true firm effects and persistent technical inefficiency. Since the model fails to take persistent inefficiency into account, the resulting estimates of inefficiency capture only the transient (short-run) component, $u_{it}^+$, and will therefore underestimate the overall inefficiency $\eta_i^+ + u_{it}^+$.

If we rewrite the model as

$$ y_{it} = x_{it}'\beta + \eta_i^+ + u_{it}^+ + \xi_{it}, \qquad \xi_{it} = \alpha_i + v_{it} \qquad (3) $$

then (3) resembles the model proposed by Kumbhakar and Heshmati (1995) and Kumbhakar and Hjalmarsson (1993, 1995), among others. The only difference is that in the papers by Kumbhakar and co-authors ξit is assumed to be i.i.d., whereas in (3) ξit is not i.i.d. (although vit and αi are i.i.d.) unless the variance of αi is zero.7 The formulation used by Kumbhakar and Heshmati (1995) is thus the opposite of that of Greene (2005a, 2005b), in the sense that the former ignores firm effects whereas the latter ignores persistent inefficiency. Thus both formulations are misspecified, although their impacts on estimated inefficiency are not the same. The Greene formulation is likely to produce a downward bias in the estimate of overall inefficiency. More precisely, the Greene formulation gives an estimate of transient inefficiency only, and will therefore understate overall inefficiency, especially if persistent inefficiency exists. Given these shortcomings, it is important to consider estimation of the model in (1).

At this point it is useful, perhaps, to explain why we use a random-effects formulation. The model in which the αi are treated as fixed parameters has received a lot of attention in the literature. It is known that such a model is subject to the incidental parameters problem, since the number of αis increases with the sample size, leading to inconsistent inferences by the method of maximum likelihood. Recently, semi-Bayesian approaches have been proposed using the fact that artificial priors can be introduced for the incidental parameters. From that point of view it is natural to be explicit about the nature of the random effects and use a fully likelihood-based procedure.8

Recently, Chen et al. (2011) proposed an estimator for the model in which there is no persistent inefficiency but the firm effects are fixed (nuisance) parameters. Since the model suffers from an incidental parameters problem, they considered the standard within transformation to get rid of the fixed effects. Their model is $y_{it} = \alpha_i + x_{it}'\beta + u_{it}^+ + v_{it}$, and the within transformation removes αi by taking deviations from firm means. The distribution of the within-transformed composed error belongs to the family of the multivariate closed skew-normal (CSN), so estimating the difficult parameters λ and σ is easy. Of course, the multivariate CSN depends on evaluating a multivariate normal integral in $\mathbb{R}^{T+1}$. With T > 5, this is not a trivial problem. In contrast, our Bayesian estimator performs quite well when T is large.

The transformation in Chen et al. (2011) that leads to the multivariate CSN is one class of transformations, but many other transformations are possible because the true fixed-effects model does not have the property of information orthogonality (Lancaster, 2000). The best transformation, the one that is maximally bias reducing, cannot be taking deviations from the means, because the information matrix is not block diagonal between the firm effects and the remaining parameters. Other transformations might be more effective.

BAYESIAN NUMERICAL INFERENCE PROCEDURES

Preliminaries

Our assumptions for the random components in (1) are as follows:

$$ v_{it} \sim \text{i.i.d. } N(0, \sigma_v^2), \quad u_{it}^+ \sim \text{i.i.d. } N^+(0, \sigma_u^2), \quad \eta_i^+ \sim \text{i.i.d. } N^+(0, \sigma_\eta^2), \quad \alpha_i \sim \text{i.i.d. } N(0, \sigma_\alpha^2) \qquad (4) $$

where $N^+$ denotes the half-normal distribution (a normal truncated at zero from below).

All random components are mutually independent as well as independent of xit. For Bayesian analysis it is necessary to specify a prior for the parameter vector, i.e. p(θ) ≡ p(β, σv, σu, ση, σα). The posterior distribution follows from Bayes' theorem, as p(θ|y, X) ∝ p(θ)L(y, X; θ), where L(y, X; θ) denotes the likelihood function.

By stacking the time series observations, the model in (1) can be written as

$$ y = X\beta + (\alpha + \eta^+)\otimes\iota_T + u^+ + v \qquad (5) $$

where ιT is a column of ones (T × 1) and ⊗ denotes the Kronecker product. The convolution of a normal and a half-normal component, such as $\zeta_{it} = u_{it}^+ + v_{it}$, is well known (Kumbhakar and Lovell, 2000, p. 67) to have a skew-normal distribution with density function $f(\zeta) = \frac{2}{\sigma}\,\phi\!\left(\frac{\zeta}{\sigma}\right)\Phi\!\left(\lambda\,\frac{\zeta}{\sigma}\right)$, where $\sigma = (\sigma_v^2 + \sigma_u^2)^{1/2}$, $\lambda = \sigma_u/\sigma_v$, and φ, Φ denote respectively the density and distribution function of the standard normal distribution.

Assuming $\delta_i = \alpha_i + \eta_i^+$, it follows that its density function is

$$ f(\delta_i) = \frac{2}{\sigma_\delta}\,\phi\!\left(\frac{\delta_i}{\sigma_\delta}\right)\Phi\!\left(\lambda_\delta\,\frac{\delta_i}{\sigma_\delta}\right) \qquad (6) $$

where $\sigma_\delta = (\sigma_\alpha^2 + \sigma_\eta^2)^{1/2}$ and $\lambda_\delta = \sigma_\eta/\sigma_\alpha$. Therefore, the likelihood for firm i becomes

$$ L_i(\theta) = \int_{-\infty}^{\infty} f(\delta_i)\,\prod_{t=1}^{T}\frac{2}{\sigma}\,\phi\!\left(\frac{r_{it}-\delta_i}{\sigma}\right)\Phi\!\left(\lambda\,\frac{r_{it}-\delta_i}{\sigma}\right)\,\mathrm{d}\delta_i \qquad (7) $$

where $r_i \equiv y_i - X_i\beta$. The integral above does not appear to have a simple closed form. Therefore, the likelihood $L(\theta; y, X) = \prod_{i=1}^{n} L_i(\theta)$ is not readily available in closed form.
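To make the one-dimensional integral in (7) concrete, the following sketch (in Python; the function names, the quadrature routine and the illustrative values of $r_{it}$ are ours and are not part of the original estimation code) evaluates the two skew-normal densities and approximates the likelihood contribution of a single firm by numerical integration over $\delta_i$, under the half-normal assumptions in (4).

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

def skew_normal_pdf(e, s, lam):
    """Density of the convolution of a normal and a half-normal component,
    with s the overall scale and lam the ratio of the half-normal to the normal scale."""
    return (2.0 / s) * norm.pdf(e / s) * norm.cdf(lam * e / s)

def firm_likelihood(r_i, sigma_v, sigma_u, sigma_alpha, sigma_eta):
    """Approximate L_i in (7): integrate the product of the within-period densities of
    r_it - delta_i against the density of delta_i = alpha_i + eta_i^+."""
    s_zeta, lam_zeta = np.hypot(sigma_v, sigma_u), sigma_u / sigma_v
    s_delta, lam_delta = np.hypot(sigma_alpha, sigma_eta), sigma_eta / sigma_alpha
    def integrand(delta):
        return (skew_normal_pdf(delta, s_delta, lam_delta)
                * np.prod(skew_normal_pdf(r_i - delta, s_zeta, lam_zeta)))
    val, _ = quad(integrand, -np.inf, np.inf)
    return val

r_i = np.array([0.3, 0.5, 0.2, 0.4, 0.6])   # illustrative r_it = y_it - x_it' beta for one firm
print(firm_likelihood(r_i, 0.1, 0.2, 0.2, 0.5))
```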

Recently Colombi et al. (2011) presented the likelihood function of this model in closed form, utilizing properties of the multivariate closed skew-normal distribution (Gonzales-Farias et al., 2004; Arellano-Valle and Azzalini, 2006). In effect, the model in (1) presents substantial computational complications for both sampling-theory and Bayesian inference. From the sampling-theory perspective the application of the model is computationally prohibitive when T is large. This is because the likelihood function depends on a (T + 1)-dimensional integral of the normal distribution. From the Bayesian viewpoint it is well known that a naïve strategy based on data augmentation by all latent variables will result in slow mixing and poor exploration of the posterior distribution.9 In that sense the posterior must be explored using specially tailored techniques that can avoid the extreme correlations inherent in naïve MCMC (Papaspiliopoulos and Roberts, 2003).

To proceed, given the structural parameter vector θ = (β, σv, σu, ση, σα), we define the augmented parameter vector (including latent variables) as Θ = (θ, u+, η+, α). Typically, in naïve Gibbs sampling one would have to draw from the conditional distribution $\theta\,|\,\Theta_{-\theta}, y, X$ as well as from the posterior conditionals of the latent variables, $u^+\,|\,\Theta_{-u^+}, y, X$, $\eta^+\,|\,\Theta_{-\eta^+}, y, X$ and $\alpha\,|\,\Theta_{-\alpha}, y, X$, where, in generic notation, $\Theta_{-a}$ denotes all elements of Θ except the elements in a. Under mild regularity conditions (Roberts and Smith, 1994) the Gibbs sampler produces a sequence {Θ(s), s = 1, …, S} which converges in distribution to the full posterior.10 Therefore, subsamples like {β(s), s = 1, …, S} or $\{u_{it}^{+(s)}, s = 1, \dots, S\}$ can be used to provide marginal inferences about the parameters β or the transient technical inefficiency of a particular firm for a given time period ($u_{it}^+$), etc. For example, a histogram or kernel density of $\{u_{it}^{+(s)}\}$ would provide the marginal posterior of $u_{it}^+$, while a histogram or kernel density of $\{\eta_i^{+(s)}\}$ can be used for marginal inferences about the persistent technical inefficiency of a given firm i, unconditional on parameter uncertainty.

Ever since Koop et al. (1995) it has been made clear in the efficiency literature that retaining latent variables (like $\eta_i^+$, $u_{it}^+$ or αi) results in more efficient computational schemes organized around MCMC using the Gibbs sampler,11 relative to importance sampling (van den Broeck et al., 1994), where such latent variables are integrated out whenever possible. To fix ideas, suppose θ is a structural parameter vector and U is a certain set of latent variables. MCMC schemes are organized around the idea that if we can obtain draws from the posterior conditional distributions p(θ|U, y, X) and p(U|θ, y, X), then the series of draws, say {θ(s), U(s), s = 1, …, S}, converges in distribution to the joint posterior p(θ, U|y, X).

In particular, the subsequences $\{\theta^{(s)}, s = 1, \dots, S\}$ and $\{U^{(s)}, s = 1, \dots, S\}$ converge in distribution to the marginal posteriors p(θ|y, X) and p(U|y, X), respectively, which implies that marginal inferences are possible after using MCMC.

Priors

In this paper we use priors that are flexible enough and at the same time conditionally conjugate, as has been done many times before in the literature (e.g. Koop et al., 1995). For the prior we assume

$$ p(\theta) = p(\beta)\,p(\sigma_v)\,p(\sigma_u)\,p(\sigma_\eta)\,p(\sigma_\alpha) $$

For the regression parameters we assume $\beta \sim N_k(\bar\beta, A^{-1})$, where $N_k$ denotes the k-variate normal distribution with mean vector $\bar\beta$ and prior precision matrix A. For the scale parameters, in generic notation, we assume

$$ \frac{\bar q_\kappa}{\sigma_\kappa^2} \sim \chi^2(\bar n_\kappa), \qquad \kappa = v, u, \eta, \alpha $$

In practical applications we set $\bar\beta = 0_{(k\times 1)}$, $A = 10^{-4}\cdot I_k$, and small values of $\bar n_\kappa$ and $\bar q_\kappa$ for each κ = v, u, η, α, so that the scale priors are only minimally informative.12

The prior for β is very diffuse given the nature of the parameters in our application (a cost function in the banking sector). Each element of β is a priori centered at zero and has prior standard deviation 100. For the scale parameters, including the two new scale parameters proposed in this paper, ση and σα, $\bar n_\kappa$ represents the length of a prior sample from which we obtain a sum of squares $\bar q_\kappa$. Therefore, on a priori grounds we expect the prior sums of squares to be of the order of 10−4, and the priors are, of course, minimally informative (see Fernandez et al., 1997, for a similar prior). The implied prior median of σ is 0.015, the mean is 0.09 and the standard deviation of σ is about 3.7. The 99% value of the prior distribution of σ is 6.6 and the 1% value is about 10−4, so this prior is rather diffuse and cannot dominate the data via the likelihood function.
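As a rough illustration of how diffuse such a scaled-χ² prior is, the short check below simulates σ from $\bar q/\sigma^2 \sim \chi^2(\bar n)$. The hyperparameter values $\bar n = 1$ and $\bar q = 10^{-4}$ are our assumption (they are not stated explicitly above); they are chosen only because they imply a prior median for σ of roughly 0.015, in line with the figure quoted in the text.

```python
import numpy as np

rng = np.random.default_rng(1)
n_bar, q_bar = 1, 1e-4               # assumed hyperparameters, not taken from the paper
chi2 = rng.chisquare(n_bar, size=1_000_000)
sigma = np.sqrt(q_bar / chi2)        # sigma^2 = q_bar / chi2  <=>  q_bar / sigma^2 ~ chi2(n_bar)
print(np.median(sigma))              # roughly 0.015 under these assumed hyperparameters
```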

Conditional Distributions and Parametrizations for MCMC

Nuisance parameters like αi and the various scale parameters are treated in a natural way in the Bayesian analysis through formal integration out of the posterior, via the Gibbs sampler. Implementing Gibbs sampling is straightforward since it requires only drawing random numbers from the various posterior conditional distributions $a\,|\,\Theta_{-a}, y, X$, for every a ∈ Θ. We show below, in artificial examples, that even under the most favorable conditions this straightforward Gibbs sampling scheme will not have good mixing properties and will effectively collapse. Thus we need to consider other strategies: for example, a reparametrization. We use two parametrizations, which we call the δ- and ξ-parametrizations. Their purpose is to group together the various random effects in a way that allows better exploration of the posterior distribution by breaking the inherent correlations among parameters in an MCMC scheme.

In the δ-parametrization we combine the firm effects αi and persistent inefficiency $\eta_i^+$, while in the ξ-parametrization we set ξit = αi + vit, which combines the error term vit and the firm effects αi. The purpose of these reparametrizations is to provide accurate MCMC inferences (by avoiding the inherent correlations of, say, αi and $\eta_i^+$) without hampering our ability to perform inferences for all parameters and latent variables of the model.

δ-Parametrization

We consider $\delta_i = \alpha_i + \eta_i^+$, which groups together the firm-specific effects and persistent inefficiency. The probability density function of δi is given by (6). Suppose $R_{it} \equiv y_{it} - x_{it}'\beta - u_{it}^+$. Then, for each i, the conditional posterior distribution of δi will be

$$ p(\delta_i\,|\,\Theta_{-\delta}, y, X) \;\propto\; \phi\!\left(\frac{\delta_i}{\sigma_\delta}\right)\Phi\!\left(\lambda_\delta\,\frac{\delta_i}{\sigma_\delta}\right)\exp\!\left\{-\frac{1}{2\sigma_v^2}\sum_{t=1}^{T}(R_{it}-\delta_i)^2\right\} \qquad (8) $$

where $\Theta_{-\delta}$ denotes all parameters except the δis, Ri = [Ri1, …, RiT]′, and ιT is the vector of ones in $\mathbb{R}^T$. The kernel density in (8) may seem difficult to simulate from, but it is in fact log-concave,13 so special rejection techniques can be used, requiring only the first and second derivatives of this function, which are easy to find. The following strategy has been found extremely effective: given the mode $\hat\delta_i$ of (8) and an estimate, say s2, of the negative inverse second derivative of the log kernel at the mode,14 we generate a candidate draw $\delta_i^{c} \sim N(\hat\delta_i, s^2)$. The draw is accepted with probability

$$ \min\left\{1,\; \frac{p(\delta_i^{c}\,|\,\Theta_{-\delta}, y, X)\; f_N(\delta_i^{(s)}\,|\,\hat\delta_i, s^2)}{p(\delta_i^{(s)}\,|\,\Theta_{-\delta}, y, X)\; f_N(\delta_i^{c}\,|\,\hat\delta_i, s^2)}\right\} $$

where $\delta_i^{(s)}$ is the current value of δi, fN(x|m, s2) denotes the density function of the normal distribution with mean m and variance s2 evaluated at x, and (y, X) denotes the data. In effect, we have one draw for the parameter δi, i = 1, …, n.
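A minimal sketch of this Metropolis-within-Gibbs step is given below; the code is ours (it is written against the log-kernel of (8) as reconstructed above), and the numerical mode search, finite-difference curvature and illustrative inputs are assumptions for the sake of a runnable example.

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize_scalar

def log_kernel_delta(delta, R_i, sigma_v, sigma_delta, lam_delta):
    """Log of the conditional posterior kernel of delta_i in (8):
    skew-normal component for delta_i times the normal likelihood of R_it - delta_i."""
    prior = norm.logpdf(delta / sigma_delta) + norm.logcdf(lam_delta * delta / sigma_delta)
    like = -0.5 * np.sum((R_i - delta) ** 2) / sigma_v ** 2
    return prior + like

def draw_delta(delta_old, R_i, sigma_v, sigma_delta, lam_delta, rng):
    """One independence Metropolis-Hastings update with a normal proposal
    centred at the mode and variance from the curvature at the mode."""
    neg = lambda d: -log_kernel_delta(d, R_i, sigma_v, sigma_delta, lam_delta)
    mode = minimize_scalar(neg).x
    h = 1e-4
    second = (neg(mode + h) - 2 * neg(mode) + neg(mode - h)) / h ** 2  # curvature of -log kernel
    s2 = 1.0 / second                                                  # proposal variance
    cand = mode + np.sqrt(s2) * rng.standard_normal()
    log_acc = (log_kernel_delta(cand, R_i, sigma_v, sigma_delta, lam_delta)
               - log_kernel_delta(delta_old, R_i, sigma_v, sigma_delta, lam_delta)
               + norm.logpdf(delta_old, mode, np.sqrt(s2))
               - norm.logpdf(cand, mode, np.sqrt(s2)))
    return cand if np.log(rng.uniform()) < log_acc else delta_old

rng = np.random.default_rng(2)
R_i = np.array([0.4, 0.6, 0.5, 0.3, 0.7])   # illustrative R_it = y_it - x_it' beta - u_it^+ for one firm
# sigma_delta and lam_delta below correspond to sigma_alpha = 0.2, sigma_eta = 0.5
print(draw_delta(0.0, R_i, sigma_v=0.1, sigma_delta=0.54, lam_delta=2.5, rng=rng))
```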

It remains to show how to obtain random numbers from the conditional distributions of β and the scale parameters of the model. Obtaining a draw for $\sigma_\delta$ is not easy, because it enters (8) in a non-standard way, so we discuss drawing random numbers for this parameter later in the paper. Obtaining draws for β and u+, along with $\sigma_v^2$ and $\sigma_u^2$, is straightforward, as usual, conditional on the δis. For example, the posterior conditional distribution of $\sigma_v^2$ is

$$ \frac{\bar q_v + \sum_{i=1}^{n}\sum_{t=1}^{T}\left(y_{it} - x_{it}'\beta - \delta_i - u_{it}^+\right)^2}{\sigma_v^2}\;\Big|\;\Theta_{-\sigma_v}, y, X \;\sim\; \chi^2(\bar n_v + nT) $$

This notation means that a draw for $\sigma_v^2$ is obtained as $\sigma_v^2 = \left[\bar q_v + \sum_{i,t}(y_{it} - x_{it}'\beta - \delta_i - u_{it}^+)^2\right]/\chi^2$, where $\chi^2$ is a random number from the $\chi^2(\bar n_v + nT)$ distribution.

Similarly, the posterior conditional distribution of $\sigma_u^2$ is

$$ \frac{\bar q_u + \sum_{i=1}^{n}\sum_{t=1}^{T}\left(u_{it}^+\right)^2}{\sigma_u^2}\;\Big|\;\Theta_{-\sigma_u}, y, X \;\sim\; \chi^2(\bar n_u + nT) $$

Finally, for β, its posterior conditional distribution is given by

$$ \beta\,|\,\Theta_{-\beta}, y, X \;\sim\; N_k\!\left(\tilde\beta,\; \tilde V\right) $$

where $\tilde V = \left(\sigma_v^{-2}X'X + A\right)^{-1}$, $Y = y - \delta\otimes\iota_T - u^+$ and $\tilde\beta = \tilde V\left(\sigma_v^{-2}X'Y + A\bar\beta\right)$, where, as we remarked before, $\bar\beta$ is the prior mean and A is the prior precision matrix of β. In our applications we set $\bar\beta = 0_{(k\times 1)}$ and A = 10−4 ⋅ Ik. One can set A = 0(k × k), in which case the prior will be totally uninformative with respect to the elements of β. Note that from (8) we also get posterior draws for δi.
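These conjugate updates can be coded directly. The sketch below is ours (the prior settings $\bar n_v = 1$ and $\bar q_v = 10^{-4}$ are hypothetical, and the data in the usage example are simulated only so the call runs); it draws β and then $\sigma_v^2$ conditional on the current values of δ and u⁺.

```python
import numpy as np

def draw_beta_sigma_v(y, X, delta, u_plus, sigma_v, beta_bar, A, n_bar_v, q_bar_v, rng):
    """Conjugate Gibbs updates: beta from a multivariate normal and sigma_v^2 from a
    scaled inverse chi-square, conditional on delta_i and u_it^+.
    y, u_plus are (n*T,) stacked firm-by-firm, X is (n*T, k), delta is (n,)."""
    T = y.shape[0] // delta.shape[0]
    Y = y - np.repeat(delta, T) - u_plus                      # Y = y - delta (x) iota_T - u^+
    V = np.linalg.inv(X.T @ X / sigma_v ** 2 + A)             # posterior covariance of beta
    b_tilde = V @ (X.T @ Y / sigma_v ** 2 + A @ beta_bar)     # posterior mean of beta
    beta = rng.multivariate_normal(b_tilde, V)
    resid = Y - X @ beta
    sigma_v2 = (q_bar_v + resid @ resid) / rng.chisquare(n_bar_v + y.shape[0])
    return beta, np.sqrt(sigma_v2)

# illustrative usage on simulated data
rng = np.random.default_rng(3)
n, T, k = 100, 5, 2
X = np.column_stack([np.ones(n * T), rng.standard_normal(n * T)])
y = X @ np.array([1.0, 1.0]) + rng.normal(0, 0.1, n * T)
beta, sigma_v = draw_beta_sigma_v(y, X, np.zeros(n), np.zeros(n * T), 0.1,
                                  np.zeros(k), 1e-4 * np.eye(k), 1, 1e-4, rng)
print(beta, sigma_v)
```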

As described below, in an artificial example with n = 100 and T = 5, we obtain efficiency measures close to the true ones (a correlation of about 75%, with parameter estimates very close to the true values). If one is interested only in short-run inefficiency, this completes the analysis. Under the assumption that the firm-specific effects αi are small once permanent inefficiency $\eta_i^+$ is introduced, one could perhaps proceed under the approximation $\delta_i \approx \eta_i^+$. But this cannot always be the case, so we next describe a computational scheme to complete the analysis.

ξ-Parametrization

Instead of combining αi and $\eta_i^+$, one can combine αi and vit, as in standard GLS treatments of panel data models. The objective, as we explain below, is to reduce the high correlation in an MCMC scheme that would draw αi and $\eta_i^+$ sequentially in a naïve way.

In ξ-parametrization we write the model as

$$ y_{it} = x_{it}'\beta + \eta_i^+ + u_{it}^+ + \xi_{it} \qquad (9) $$

where ξit = αi + vit. If we denote ξi = [ξi1, …, ξiT]′, then ξi ∼ NT(0(T × 1), Σ), where $\Sigma = \sigma_v^2 I_T + \sigma_\alpha^2\,\iota_T\iota_T'$. In this parametrization we draw $\eta_i^+$ without having to draw αi at the same time, just as in the δ-parametrization we avoided drawing αi and $\eta_i^+$ separately by working with their sum; in both cases the objective is to reduce the overall high correlation in an MCMC scheme that would draw αi and $\eta_i^+$ sequentially in a naïve way. Avoiding this high correlation means that we want a fast-mixing MCMC scheme that moves away quickly from the initial conditions and traverses the parameter space, covering it as much as possible according to the probability mass of different configurations of the parameters.

In this parametrization, the posterior conditional distribution of $\eta_i^+$ is

$$ \eta_i^+\,|\,\Theta_{-\eta^+}, y, X \;\sim\; N^+\!\left(b_i, c_i^2\right) \qquad (10) $$

where $\tilde r_{it} = y_{it} - x_{it}'\beta - u_{it}^+$, $c_i^2 = \left(\iota_T'\Sigma^{-1}\iota_T + \sigma_\eta^{-2}\right)^{-1}$ and $b_i = c_i^2\,\iota_T'\Sigma^{-1}\tilde r_i$, and where N+ denotes the univariate normal distribution truncated to the non-negative reals.
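With $b_i$ and $c_i^2$ as reconstructed above, the update for each $\eta_i^+$ is a single univariate truncated-normal draw. The code below is an illustrative sketch of ours (the inputs and the seed are arbitrary), not the paper's implementation.

```python
import numpy as np
from scipy.stats import truncnorm

def draw_eta_plus(r_tilde_i, sigma_v, sigma_alpha, sigma_eta, seed=123):
    """Draw eta_i^+ from its conditional posterior: a normal with GLS-type mean b_i and
    variance c_i^2, truncated to [0, inf).  r_tilde_i holds y_it - x_it' beta - u_it^+ for one firm."""
    T = r_tilde_i.shape[0]
    Sigma = sigma_v ** 2 * np.eye(T) + sigma_alpha ** 2 * np.ones((T, T))  # covariance of xi_i
    iota = np.ones(T)
    Si = np.linalg.solve(Sigma, iota)                # Sigma^{-1} iota_T
    c2 = 1.0 / (iota @ Si + 1.0 / sigma_eta ** 2)
    b = c2 * (r_tilde_i @ Si)
    a = (0.0 - b) / np.sqrt(c2)                      # standardized lower truncation point
    return truncnorm.rvs(a, np.inf, loc=b, scale=np.sqrt(c2), random_state=seed)

print(draw_eta_plus(np.array([0.5, 0.4, 0.6, 0.5, 0.7]), 0.1, 0.2, 0.5))
```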

From the draws of each $\eta_i^+$, we can compute

$$ \alpha_i = \delta_i - \eta_i^+, \qquad i = 1, \dots, n, $$

as well as the following posterior conditional distributions for the scale parameters σα and ση:

$$ \frac{\bar q_\alpha + \sum_{i=1}^{n}\alpha_i^2}{\sigma_\alpha^2}\;\Big|\;\Theta_{-\sigma_\alpha}, y, X \;\sim\; \chi^2(\bar n_\alpha + n) $$

and

$$ \frac{\bar q_\eta + \sum_{i=1}^{n}\left(\eta_i^+\right)^2}{\sigma_\eta^2}\;\Big|\;\Theta_{-\sigma_\eta}, y, X \;\sim\; \chi^2(\bar n_\eta + n) $$

The parameter $\sigma_\delta$, needed to draw the δis, can then be computed as $\sigma_\delta = (\sigma_\alpha^2 + \sigma_\eta^2)^{1/2}$, which is clearly always positive. It must be noted that, as we remarked before, the posterior conditional distribution of $\sigma_\delta$ is not of the χ2 form, because of the way it enters (8).

The posterior conditional distribution of transient inefficiency, $u_{it}^+$, can be obtained from (9) as

$$ u_{it}^+\,|\,\Theta_{-u^+}, y, X \;\sim\; N^+\!\left(\frac{\sigma_u^2\,\tilde u_{it}}{\sigma_u^2 + \sigma_v^2},\; \frac{\sigma_u^2\,\sigma_v^2}{\sigma_u^2 + \sigma_v^2}\right) \qquad (11) $$

where $\tilde u = [\tilde u_{it}] = y - X\beta - \delta\otimes\iota_T$. Drawing random numbers from (11) is straightforward given previous treatments in the literature (e.g. Geweke, 1991; Kumbhakar and Tsionas, 2005a, 2005b). Effectively, the conditional distributions in (10) and (11) generalize the familiar Jondrow et al. (1982) estimator of technical inefficiency from the simple stochastic frontier model to the GTRE model. For example, conditional on the data, (10) and (11) can be used to obtain the expected values of $\eta_i^+$ and $u_{it}^+$ (Kumbhakar and Lovell, 2000, pp. 77–78) using the formula for the mean of a truncated normal distribution with arbitrary location and scale parameters. We do not need to rely on these formulae, however, since we can draw directly from the marginal posterior distributions of $\eta_i^+$ and $u_{it}^+$ using our MCMC scheme.
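For the transient component, the conditional in (11) is a univariate truncated normal, so both its mean (the JLMS-type point estimate mentioned above) and a Gibbs draw are immediate. The snippet below is an illustrative sketch in our notation (with an arbitrary numerical input), not the paper's code.

```python
import numpy as np
from scipy.stats import truncnorm

def u_plus_conditional(u_tilde, sigma_v, sigma_u):
    """Location and scale of the conditional posterior of u_it^+ in (11),
    where u_tilde = y_it - x_it' beta - delta_i = u_it^+ + v_it."""
    loc = sigma_u ** 2 * u_tilde / (sigma_u ** 2 + sigma_v ** 2)
    scale = np.sqrt(sigma_u ** 2 * sigma_v ** 2 / (sigma_u ** 2 + sigma_v ** 2))
    return loc, scale

loc, scale = u_plus_conditional(u_tilde=0.15, sigma_v=0.1, sigma_u=0.2)
a = (0.0 - loc) / scale                                                 # truncation at zero, standardized
print(truncnorm.mean(a, np.inf, loc=loc, scale=scale))                  # conditional posterior mean of u_it^+
print(truncnorm.rvs(a, np.inf, loc=loc, scale=scale, random_state=0))   # one Gibbs draw
```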

It should be noted that in the ξ-parametrization we take the estimates of β from the δ-parametrization as given in the course of MCMC computations. The order of application of the ξ- and δ-parametrizations is immaterial in reversible Markov chain sampling schemes, like that suggested here.

Computation of Efficiency Measures

The main objective of the model is to separate firm effects from persistent and transient inefficiency with a focus on measuring the last two components, which are of significant economic importance.

Expressions (10) and (11) provide the conditional distributions for $\eta_i^+$ and $u_{it}^+$, respectively. If we had consistent estimators for the parameters involved, we could obtain estimators of these inefficiencies. However, from (11) it is clear that any measure of $u_{it}^+$ will depend on δ, and from (10) it is clear that any measure of $\eta_i^+$ will depend on $u_{it}^+$. This simultaneity presents a problem. Even if it were solved, there would remain the problem of accounting for parameter uncertainty (regarding β and the scale parameters). These problems are solved by averaging out parameter uncertainty in the standard Rao–Blackwell fashion.

Suppose $\eta_i^{+(s)}$ and $u_{it}^{+(s)}$ are draws from the conditional distributions in (10) and (11) for the sth pass of an MCMC scheme. Then the posterior means of permanent and transient inefficiency are

$$ \hat\eta_i^+ = S^{-1}\sum_{s=1}^{S}\eta_i^{+(s)}, \qquad \hat u_{it}^+ = S^{-1}\sum_{s=1}^{S}u_{it}^{+(s)} \qquad (12) $$

In addition, the complete samples $\{\eta_i^{+(s)}\}$ and $\{u_{it}^{+(s)}\}$ can be used to provide inferences about the (posterior) distribution of each $\eta_i^+$ or each $u_{it}^+$ individually, marginally on parameter uncertainty.

ARTIFICIAL EXAMPLES AND SAMPLING PERFORMANCE OF BAYES ESTIMATORS

Artificial Examples

We consider an artificially generated dataset with n = 100 and T = 5. We include a constant term and a covariate generated as independent standard normal, and set σv = 0.1, σu = 0.2, σα = 0.2 and ση = 0.5. The MCMC scheme was implemented using 15,000 iterations, the first 5000 of which are discarded to mitigate start-up effects, and in the computation of all statistics we take every tenth draw to mitigate autocorrelation.
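For concreteness, the sketch below (our code; the seed, function name and data layout are arbitrary choices, not part of the original routines) generates one panel from exactly this design.

```python
import numpy as np

def simulate_gtre(n=100, T=5, beta=(1.0, 1.0),
                  sigma_v=0.1, sigma_u=0.2, sigma_alpha=0.2, sigma_eta=0.5, seed=0):
    """Generate one panel from the four-way error component (GTRE) cost frontier in (1)."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n, T))                    # single covariate
    alpha = sigma_alpha * rng.standard_normal(n)       # firm effects (two-sided)
    eta = sigma_eta * np.abs(rng.standard_normal(n))   # persistent inefficiency (half-normal)
    u = sigma_u * np.abs(rng.standard_normal((n, T)))  # transient inefficiency (half-normal)
    v = sigma_v * rng.standard_normal((n, T))          # noise
    y = beta[0] + beta[1] * x + (alpha + eta)[:, None] + u + v
    return y, x, alpha, eta, u, v

y, x, alpha, eta, u, v = simulate_gtre()
print(y.shape)   # (100, 5)
```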

First, we present the marginal posterior distributions of the scale parameters in Figure 1. The true parameter values are in the region of high posterior probability mass, as one would expect. Second, we are concerned with estimates of the efficiency measures $\eta_i^+$ and $u_{it}^+$. Such estimates, as we remarked, can be obtained from $\hat\eta_i^+ = S^{-1}\sum_{s=1}^{S}\eta_i^{+(s)}$, where $\eta_i^{+(s)}$ denotes the sth retained draw and S = 1000 in our case after the burn-in and thinning phases. Similarly, we have $\hat u_{it}^+ = S^{-1}\sum_{s=1}^{S}u_{it}^{+(s)}$. These estimates are provided in Figure 2(a), where they are plotted against the true values (generated according to the true values of the parameters).

Figure 1. Marginal posterior distributions of scale parameters

Figure 2. (a) Marginal posterior estimates of inefficiencies versus true values. (b) Summary autocorrelation function for persistent inefficiencies

Specifically, the correlation coefficient between the estimated persistent inefficiencies and their true values is 0.856, while the correlation between the estimated transient inefficiencies and their true values is 0.754. These are not the simple correlations between the point estimates and the true values but posterior means of the correlation coefficient, say ρη, or E(ρη|y, X). Specifically, for each draw s, we compute the correlation coefficient between the true η+ and the draw η+(s), which we denote by $\rho_\eta^{(s)}$. The reported correlation coefficient is $\hat\rho_\eta = S^{-1}\sum_{s=1}^{S}\rho_\eta^{(s)}$, so it fully reflects parameter uncertainty.
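A compact way to compute such a posterior-mean correlation from the retained draws is sketched below (our code; the 'true' values and stand-in draws are simulated only so the example runs).

```python
import numpy as np

def posterior_mean_correlation(draws, truth):
    """draws: (S, n) array of MCMC draws (e.g. eta_i^{+(s)}); truth: (n,) true values.
    Returns the posterior mean of the correlation coefficient, E(rho | y, X)."""
    rhos = [np.corrcoef(d, truth)[0, 1] for d in draws]
    return np.mean(rhos)

rng = np.random.default_rng(5)
truth = np.abs(rng.normal(0, 0.5, size=100))             # e.g. true eta_i^+
draws = truth + rng.normal(0, 0.2, size=(1000, 100))     # stand-in for MCMC draws
print(posterior_mean_correlation(draws, truth))
```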

The Naïve MCMC Scheme

Here we explore the behavior of what we termed the naïve Gibbs sampler. As we remarked, the naïve Gibbs sampler draws from the conditional posterior distributions of (β, σκ, α, η+, u+) given the data (y, X). Again, we use the same data-generating process, now with n = 100 and T = 10 (note that we had T = 5 before). We have a constant term and a covariate generated as independent standard normal, and σv = 0.1, σu = 0.2, σα = 0.2 and ση = 0.5. The starting values are set equal to the true parameter values and we use 150,000 iterations, of which the first 50,000 are discarded.15 We then take every tenth draw to mitigate the inherent autocorrelation. Since we do not wish to show results from a single dataset, while also keeping computational costs to a minimum, we generate 100 alternative datasets and run the naïve MCMC scheme for each one of them.

We are interested in the posterior properties of the structural parameters, the posterior properties of the inefficiency measures, and also the autocorrelation properties of the naïve scheme when every tenth draw is retained in standard fashion. The sampling properties of structural parameters and basic inefficiency measures are summarized in Table 2.

Table 2. Sampling properties of the naïve MCMC scheme

|     | True value      | Posterior mean and SD |
| β1  | 1.00            | 0.973 (0.037)  |
| β2  | 1.00            | 1.008 (0.006)  |
| σα  | 0.20            | 0.035 (0.018)  |
| ση  | 0.50            | 0.649 (0.052)  |
| σv  | 0.10            | 0.166 (0.011)  |
| σu  | 0.20            | 0.065 (0.019)  |
| α   | −0.015 (0.203)  | −0.010 (0.003) |
| η+  | 0.430 (0.324)   | 0.480 (0.0026) |
| u+  | 0.160 (0.120)   | 0.130 (0.0038) |

Note: By true value of α, η+, u+ (and their standard deviations shown in parentheses) we mean the true average values and their SD as they were generated in the Monte Carlo experiment. In the last column we report (estimated) posterior means and posterior SD using 150,000 MCMC iterations, the first 50,000 of which are discarded; from the remaining 100,000 we retain only every tenth draw, for a total of 10,000.

In Table 2 we report the true values of the structural parameters along with their posterior means and posterior standard deviations computed from the naïve MCMC scheme. For σα and σu it is apparent that the 99% Bayes probability intervals do not include the true parameter values. Posterior estimates of α, η+, and u+, although relatively close to the true averages, have substantially smaller standard deviations (by roughly a factor of 100). Therefore they cannot reproduce reliably the distributions of α, η+, and u+, and these estimates show a negligible correlation with the true values.16 The extremely low standard deviation of the draws for α, η+, and u+ can only be attributed to the poor mixing properties of the naïve Gibbs sampler.

To show this explicitly, we can compute the autocorrelation function of any key parameter and examine its behavior. The reader should keep in mind that we have already retained only every tenth draw from the MCMC scheme, so ideally there should be little autocorrelation left if the naïve MCMC explores the parameter space fairly well. As an illustration, we report here the autocorrelation function of the $\eta_i^+$. The draws for these latent variables are $\eta_i^{+(s)}$, where s = 1, …, S denotes the draw and i = 1, …, n refers to the particular unit. From the MCMC draws one can compute ACFi,l, the autocorrelation at lag l = 1, …, L. Since there are many units, we consider the summary measure $A_l = \mathrm{median}_{i}\,\mathrm{ACF}_{i,l}$, i.e. the median autocorrelation at lag l across all firms i = 1, …, n. Taking the maximum would be possible, but that could be considered excessive. For lags l = 1, …, L = 50, the summary autocorrelation function of the persistent inefficiencies is shown in Figure 2(b). It can be seen from the figure that, even at lags of order 50, autocorrelation coefficients are about 0.55, and just below 0.90 at lag order 10. Apparently the naïve MCMC scheme is not mixing well, resulting in poor exploration of the posterior distribution.
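The summary measure $A_l$ can be computed from the retained draws as the per-lag median of the unit-level autocorrelations. The sketch below is our code, applied to artificial, deliberately autocorrelated chains rather than the paper's actual draws.

```python
import numpy as np

def summary_acf(draws, max_lag=50):
    """draws: (S, n) retained MCMC draws (one column per firm).
    Returns the median autocorrelation across firms at each lag 1..max_lag."""
    centered = draws - draws.mean(axis=0)
    var = (centered ** 2).mean(axis=0)
    acf = np.empty((max_lag, draws.shape[1]))
    for l in range(1, max_lag + 1):
        acf[l - 1] = (centered[:-l] * centered[l:]).mean(axis=0) / var
    return np.median(acf, axis=1)

rng = np.random.default_rng(6)
x = rng.standard_normal((10000, 100))
for s in range(1, 10000):                 # build highly autocorrelated AR(1) chains for illustration
    x[s] = 0.95 * x[s - 1] + np.sqrt(1 - 0.95 ** 2) * x[s]
print(summary_acf(x, max_lag=10))         # slowly decaying autocorrelations, as in Figure 2(b)
```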

Simulation Experiment

We consider a data-generating process where yit = 1 + xit + εit, xit is generated from a standard normal distribution and $\varepsilon_{it} = \alpha_i + \eta_i^+ + u_{it}^+ + v_{it}$, with the components generated according to (4). We use σα = 0.2, ση = 0.5, σv = 0.1, σu = 0.2 (for n = 50), and σα = 0.1, ση = 0.5, σv = 0.1, σu = 0.5 for sample size n = 100. In our Gibbs samplers we use 5000 iterations, the first 1000 of which are discarded to mitigate the impact of start-up effects. Bayesian inference is conducted for 1000 datasets. Our results are reported in Table 3.

Table 3. Sampling behavior of Bayes estimators

|                            | α: Mean | α: Median | α: SD | η+: Mean | η+: Median | η+: SD | u+: Mean | u+: Median | u+: SD |
| n = 50, T = 5: True        | −0.014  | −0.040    | 0.205 | 0.388    | 0.289      | 0.325  | 0.163    | 0.140      | 0.125  |
| n = 50, T = 5: Estimated   | −0.014  | −0.050    | 0.125 | 0.496    | 0.403      | 0.307  | 0.150    | 0.127      | 0.091  |
| n = 50, T = 10: True       | −0.014  | −0.040    | 0.205 | 0.388    | 0.290      | 0.325  | 0.160    | 0.137      | 0.123  |
| n = 50, T = 10: Estimated  | −0.011  | −0.025    | 0.190 | 0.417    | 0.327      | 0.337  | 0.110    | 0.111      | 0.149  |
| n = 100, T = 5: True       | −0.015  | −0.036    | 0.203 | 0.431    | 0.390      | 0.324  | 0.157    | 0.134      | 0.122  |
| n = 100, T = 5: Estimated  | −0.019  | −0.044    | 0.158 | 0.492    | 0.402      | 0.285  | 0.110    | 0.115      | 0.099  |
| n = 100, T = 10: True      | −0.007  | −0.018    | 0.102 | 0.431    | 0.390      | 0.324  | 0.399    | 0.345      | 0.301  |
| n = 100, T = 10: Estimated | −0.008  | −0.022    | 0.105 | 0.425    | 0.331      | 0.257  | 0.387    | 0.327      | 0.289  |
| n = 200, T = 10: True      | −0.007  | −0.018    | 0.102 | 0.431    | 0.390      | 0.324  | 0.399    | 0.345      | 0.301  |
| n = 200, T = 10: Estimated | −0.007  | −0.018    | 0.100 | 0.359    | 0.386      | 0.340  | 0.397    | 0.326      | 0.300  |

For α, η+ and u+ we compare their true values with posterior estimates since the true posterior is not available for a complete comparison. Specifically, these random effects are compared in terms of mean, median and standard deviation. For the true values these statistics can be computed easily. For the estimated parts, we report means, medians and standard deviations of the sampling distributions of the Bayesian estimators.

It can be seen that the sampling performance of the Bayesian posterior mean estimator is very good, particularly as n and T get large. For n = 100 and T = 5 or 10, which is typical of economic panel data, the means and standard deviations of the random effects $\alpha_i$, $\eta_i^+$ and $u_{it}^+$ are matched quite well. Even when n = 50 and T = 5, the sampling performance of the Bayesian posterior mean (or median) is quite good. The message from this sampling experiment is that the Bayesian posterior mean estimator has excellent sampling-theory behavior and is thus recommended for use in practice. For n = 100 and higher and T = 10 the estimator is nearly unbiased in terms of the average value and standard deviation of all three random effects. It should be noted that the excellent behavior of the Bayes estimator is mostly due to the remarkably good performance of our proposed Gibbs sampler based on the δ- and ξ-parametrizations.

AN APPLICATION TO US BANKING

In this section we present results from the models discussed earlier using US banking data. The data come from the Reports of Income and Condition (Call Reports). We use a balanced panel of banks from 1998 to 2005, as in Feng and Serletis (2009). To address technological heterogeneity, we follow Feng and Serletis (2009), who classified the sample banks into three size categories (large, medium and small). Each of these categories is then subdivided into four categories, giving a total of 12 groups. Here we use three of their groups (group 1: the very large banks; group 6: the lower end of the medium banks; group 10: the middle of the small banks) (see Table 3 of their paper for details).

Since efficiency is intimately related to the technology, it is important that the technology be specified as flexibly as possible and that the specified technology also satisfy theoretical properties. Since banking outputs are services that cannot be stored, the standard practice is to specify the technology in terms of a dual cost function, meaning that banks minimize cost taking outputs as given. Flexible functional forms (mostly Translog) are used in the literature. Since the Translog function tends to violate the theoretical properties of a cost function (viz., concavity in input prices), Feng and Serletis (2009) used the Fourier cost function, which satisfies global regularity conditions. Here we use the popular Translog form and impose the regularity constraints using the procedure of Terrell (1996).

The other issue is the specification of inefficiency. Even if the cost function is globally well behaved, if inefficiency is not modeled properly then estimates of the economic variables of interest (returns to scale, technical change, etc.), as well as of inefficiency itself, might be wrong. For example, in specifying the cost function and inefficiency, Feng and Serletis (2009) neither controlled for bank effects nor allowed persistent (bank-specific) inefficiency. In our model, as well as in its closest cousin, the TRE model proposed by Greene (2005a, 2005b), bank effects are controlled for in estimating inefficiency. In the application, we show that efficiency results (as well as estimated technical change) differ across models, and therefore one has to consider the appropriateness of alternative models.

Specification of inputs and outputs for financial service providers like banks is not straightforward. This is because many of the services are jointly produced and prices are typically assigned to a bundle of financial services. The role of commercial banks is generally defined as collecting the savings of households and other agents to finance the investment needs of firms and consumption needs of individuals. In addition, they provide various financial services relating to fund transfer, trade, investments, etc. What banks do produce, using what, is a long-standing debate in the banking literature. Two approaches dominate: the production approach and the intermediation approach. Both approaches apply the traditional microeconomic theory of the firm to banking and differ only in the specification of banking activities. Under the production approach banks are primarily viewed as providers of services to customers. The inputs under this approach include physical variables (e.g. labor, material, space or information systems) and the outputs represent the services provided to customers. Under the intermediation approach banks produce intermediation services through the collection of deposits and other liabilities and apply them in interest-earning assets, such as loans, securities and other investments. This approach includes both operating and interest expenses as inputs, whereas loans and other major assets count as outputs. The appropriateness of each approach varies according to the issues and problems addressed. It is apparent that banks perform many activities within the broad framework of both approaches. Here we use the balance-sheet approach of Sealy and Lindley (1977), in which all liabilities (core deposits and purchased funds) and financial equity capital provide funds and are treated as inputs. Similarly, all assets (loans and securities) use bank funds and are treated as outputs. This approach is different from the intermediation approach, which is consistent with the value-added definition of output (Das and Kumbhakar, 2012).

Following Feng and Serletis (2009), we use three traditional outputs: consumer loans (y1); non-consumer loans (y2), composed of industrial and commercial loans and real estate loans; and securities (y3), which include all non-loan financial assets, i.e. all financial and physical assets minus the sum of consumer loans, non-consumer loans and equity. All outputs are deflated by the consumer price index (CPI) to the base year 1998. In addition to these traditional outputs we also include two non-traditional outputs, viz., financial equity capital (y4) and non-traditional banking activities (y5). A time trend variable (t) is included in the cost function to capture technological change. The input variables used are labor, borrowed funds and physical capital. Their prices are: the wage rate for labor (w1); the interest rate for borrowed funds (w2); and the price of physical capital (w3). Input prices are calculated by dividing total expenses on each input category by the respective quantity. For example, the wage rate is computed by dividing total salaries and benefits by the number of full-time employees. Similarly, the price of capital is total expenses on premises and equipment divided by premises and fixed assets, and the price of deposits and purchased funds equals total interest expense divided by total deposits and purchased funds. Total cost is simply the sum of the costs of these three inputs. This specification of outputs and inputs is similar to most previous banking studies (see, for example, Berger and Mester, 1997).

As mentioned earlier, Feng and Serletis (2009) emphasized that the imposition of monotonicity and concavity constraints is quite important. However, instead of using the Fourier functional form, we use the Translog function and follow the approach in Terrell (1996) to impose these constraints. Specifically, we first impose the constraints at the means, and then apply rejection sampling over the whole set of observations until at least 99% of the observations satisfy the restrictions.

We analyze the group 1 banks (the largest banking group in the sample) in greater detail and cross-check some key results with other groups. The main reason for not reporting results for all 12 groups is that the groups did not seem to give different results so far as technical inefficiency, returns to scale, etc., are concerned. For comparison (robustness check), we use (i) the traditional stochastic frontier model (SFM) used in Feng and Serletis (2009), (ii) the true random-effects (TRE) model proposed by Greene (2005a, 2005b), and (iii) the generalized TRE (GTRE) model introduced in this paper. Note that the GTRE is the most general model, followed by the TRE and the traditional SFM. Since the GTRE nests the other two models, it is possible to test empirically which model is appropriate for the data at hand.

When $\sigma_\eta = 0$ the GTRE model reduces to the TRE model, and when $\sigma_\alpha = 0$ the TRE model reduces to the traditional SFM. Although the computation of posterior odds ratios and Bayes factors can be troublesome, Verdinelli and Wasserman (1995) proposed a simple and accurate approximation for nested models of the type we have here, based on the ratio of the posterior to the prior ordinate at the point restriction, $p(\sigma^2 = 0\,|\,D)/p(\sigma^2 = 0)$, where σ2 denotes $\sigma_\eta^2$ or $\sigma_\alpha^2$, p(σ2) is the prior and p(σ2|D) is the posterior.17 If the parameter vector φ denotes all parameters of the model except σ2, the posterior ordinate can be approximated by $S^{-1}\sum_{s=1}^{S}p(\sigma^2 = 0\,|\,\varphi^{(s)}, D)$, the MCMC average of the posterior conditional density of σ2 evaluated at the restriction. Bayes factors are shown in Table 4 for groups 1, 6 and 10. Based on these results it is easy to compute the Bayes factor in favor of GTRE and against SFM (group 1) as (45.343) × (212.376) = 9629.7. The conclusion is that the data provide considerable evidence in favor of the GTRE model and against the TRE or SFM.18

Table 4. Bayes factor for model comparison

|              | Group 1 | Group 6 | Group 10 |
| GTRE vs. TRE | 45.343  | 189.212 | 212.016  |
| TRE vs. SFM  | 212.376 | 512.991 | 565.127  |

Instead of reporting returns to scale (RTS), we report the output cost elasticity ($E_{cy}$, the sum of the elasticities of cost with respect to the outputs), which is the reciprocal of RTS. Scale economies are said to exist if RTS exceeds unity (or Ecy < 1). Note that for a Translog cost function Ecy is observation-specific (i.e. it varies with bank and over time). We also report technical change (TC = ∂ ln C/∂ t), a negative value of which indicates technical progress (cost diminution over time, ceteris paribus). TC is also observation-specific (i.e. it varies with bank and over time). Posterior distributions of Ecy and TC19 are reported in Figure 3. It can be seen that the Ecy results are quite similar across the three models. This is, however, not the case with TC (reported in the lower panel of Figure 3). Estimates of TC from the traditional SFM and the GTRE are quite close and show large variations, in contrast to those from the TRE model. Given that banks in this group are quite heterogeneous in size, large variations are to be expected. Furthermore, since the specification test results favor the GTRE model, we rely more on the results from the GTRE model.
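As an illustration of how these observation-specific measures arise, consider a generic Translog cost function (a sketch of the standard form only; the exact set of interaction terms used in the paper is not reproduced here):

$$ \ln C_{it} = \beta_0 + \sum_m \beta_m \ln y_{mit} + \sum_j \gamma_j \ln w_{jit} + \tfrac{1}{2}\sum_m\sum_{m'}\beta_{mm'}\ln y_{mit}\ln y_{m'it} + \dots + \beta_t t + \tfrac{1}{2}\beta_{tt}t^2 + \sum_m \beta_{mt}\ln y_{mit}\,t, $$

so that

$$ E_{cy,it} = \sum_m \frac{\partial \ln C_{it}}{\partial \ln y_{mit}} = \sum_m\Big(\beta_m + \sum_{m'}\beta_{mm'}\ln y_{m'it} + \beta_{mt}t + \dots\Big), \qquad TC_{it} = \frac{\partial \ln C_{it}}{\partial t} = \beta_t + \beta_{tt}t + \sum_m \beta_{mt}\ln y_{mit} + \dots, $$

both of which vary with bank i and time t.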

Figure 3. Cost elasticity and technical change (group 1 banks)

In Figure 4 we present posterior distributions of overall technical inefficiency from the above-mentioned three models. The traditional SFM and the GTRE model show that inefficiency averages around 12% and 5%, respectively. In contrast, results from the TRE show an average inefficiency of close to 2.6% (calculations not shown). Further, the TRE results imply that it is quite implausible to expect inefficiency values higher than 8%. For US banks this is quite hard to believe, and it goes against almost all efficiency studies (e.g. Berger and Humphrey, 1997). Since the TRE does not allow persistent inefficiency whereas the GTRE does, it is expected that inefficiency (persistent and transient combined) from the GTRE will be higher. In other words, the TRE is likely to give low estimates of inefficiency because persistent inefficiency will be treated as bank effects, not inefficiency. It is, however, not clear whether results from the traditional SFM will be lower or higher compared to the GTRE, because the former model fails to take bank effects and persistent technical inefficiency into account. If these two effects are negatively correlated (as we find in these data), one might cancel the other, and inefficiency results might go up or down depending on whether these time-invariant effects are picked up by the inefficiency term or not.

Figure 4. Technical inefficiency across different models (group 1 banks)

One advantage of the GTRE model is that it identifies and estimates two sources of inefficiency. Inefficiency in the other two models is captured by the transient component ($u_{it}^+$) alone. Consequently, in comparing inefficiency across different models we use the overall measure of inefficiency (the sum of the persistent and transient components in the GTRE model). In Figure 5 we plot the posterior distributions of the persistent (η+, solid line) and transient (u+, broken line) inefficiency components. The persistent component does not seem to exceed 4% and the average of both components is close to 3%. Persistent inefficiency in excess of about 6% is practically impossible, although the transient component has a long right tail that allows inefficiency values as large as 20%. The broken line in Figure 4 combines the posterior probabilistic evidence for both u+ and η+ shown in Figure 5 (as opposed to a naïve algebraic approach that would simply add up the means of the two inefficiency components).

Figure 5. Distribution of inefficiency components in GTRE model (group 1 banks)

Bank-specific posterior distributions of the bank effects, α, for both the TRE and GTRE models are shown in Figure 6. It can be seen from the figure that bank effects from the GTRE are much larger (in both tails) compared to those in the TRE model. Since the TRE model does not allow persistent technical inefficiency, the resulting ‘pseudo’ bank effects in the TRE model will capture the joint effects of persistent inefficiency and bank effects. If bank effects and persistent technical inefficiency are negatively correlated, the pseudo bank effects in the TRE model might be more concentrated (as in Figure 6) and likely to underestimate the magnitude of the ‘true’ bank effects.

Figure 6. Posterior distribution of bank effects

Now we turn our attention to the other groups. For brevity we report results from group 6 and group 10 banks.20 These groups are representative of medium and small banks, respectively. Furthermore, to conserve space we focus our attention on the TRE and GTRE models. The results are summarized in Figures 7 and 8. They indicate increasing returns to scale (Ecy < 1) and technical progress (TC < 0) for most of the banks. Average persistent technical inefficiency is around 1.5% but is quite likely to be as large as 3%. Since the TRE does not allow persistent inefficiency, for a fair comparison of inefficiency between these two models we have to compare the overall inefficiency in the GTRE with the transient inefficiency (u+) in the TRE model. We find some differences in the distribution of overall inefficiency between the TRE and GTRE models for group 6 banks. The difference is very small for group 10 banks. Since the TRE model considers only transient inefficiency, it is likely that the overall inefficiency from the TRE overestimates the probability of full efficiency.

Figure 7. Cost elasticity and technical change (group 6 banks)

Figure 8. Cost elasticity and technical change (group 10 banks)

Note that in Figures 7 and 8 we are comparing distributions of overall inefficiency from the two models. Even if this difference is small, it is not clear whether the estimated inefficiencies for each bank under the two models are close to each other. For this, we report scatter-plots of posterior estimates of overall inefficiency from the GTRE and TRE models in Figure 9 (for group 6; for group 10 we obtained similar results). Whether the efficiency scores are approximately the same can be examined by looking at scatter-plots of the inefficiency estimates from the TRE and GTRE models. If the scores were similar (identical) we would expect all pairs to lie on (or close to) the 45° line. However, the scatter-plots reveal that the correlations are quite low, and therefore the ranking of banks according to GTRE inefficiency scores is likely to be quite different from that obtained from the TRE model.

Figure 9. Scatter-plot of overall inefficiency from TRE and GTRE models

CONCLUSIONS


This paper proposed a new panel data stochastic frontier model that disentangles firm effects from persistent (time-invariant/long-term) and transient (time-varying/short-term) technical inefficiency, thereby separating firm heterogeneity from persistent technical inefficiency. This class of model has recently received considerable attention, and various ingenious estimators have been proposed from a sampling theory perspective (Colombi et al., 2011). We used Bayesian methods of inference, implemented by MCMC simulation, in this four-way error component model (labeled the generalized true random-effects (GTRE) model) to provide robust and efficient estimates of the inefficiency components.
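As a concrete reminder of the error structure summarized above, the following sketch simulates a panel from a four-way error component specification of this type. The half-normal inefficiency components, the cost-frontier sign convention and all parameter values are our assumptions for illustration, not the design used in the paper's Monte Carlo experiments.

    import numpy as np

    # Four-way error components: random firm effects (alpha), persistent
    # inefficiency (eta), transient inefficiency (u) and noise (v).
    rng = np.random.default_rng(3)
    N, T = 100, 8                      # firms and time periods (illustrative)
    beta = np.array([1.0, 0.5])        # intercept and one slope (illustrative)

    x = rng.normal(size=(N, T))
    alpha = rng.normal(0.0, 0.2, size=N)            # firm heterogeneity
    eta = np.abs(rng.normal(0.0, 0.15, size=N))     # persistent inefficiency
    u = np.abs(rng.normal(0.0, 0.15, size=(N, T)))  # transient inefficiency
    v = rng.normal(0.0, 0.1, size=(N, T))           # noise

    # Cost-frontier convention: inefficiency raises (log) cost.
    y = beta[0] + beta[1] * x + alpha[:, None] + eta[:, None] + u + v
    print(y.shape)   # (N, T) panel of simulated log costs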

We showed that Gibbs sampling is a straightforward way to perform accurate posterior inference, and we proposed two reparametrizations that increase the efficiency of the Gibbs sampler substantially. We also showed that a naïve Gibbs sampler without reparametrization mixes poorly and does not deliver correct results. Furthermore, the GTRE techniques perform well in artificial experiments: mean squared errors are quite small for a wide array of parameter values and sample sizes. Finally, we applied the model to a large panel of US banks and found evidence in support of the GTRE specification. We also compared returns to scale, technical change and technical inefficiency results with those from two competing models, viz. the standard stochastic frontier model and the true random-effects model.
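The two reparametrizations themselves are developed earlier in the paper. Purely as a generic illustration of the kind of device involved (hierarchical non-centering of a random-effect component, in the spirit of the parameterization literature cited in note 9, and not necessarily the paper's exact scheme), a centered form

    y_{it} = x_{it}'\beta + \alpha_i + \varepsilon_{it}, \qquad \alpha_i \sim N(0, \sigma_\alpha^2),

can be rewritten in non-centered form as

    \alpha_i = \sigma_\alpha \tilde{\alpha}_i, \qquad \tilde{\alpha}_i \sim N(0, 1), \qquad y_{it} = x_{it}'\beta + \sigma_\alpha \tilde{\alpha}_i + \varepsilon_{it}.

Switching between such forms alters the posterior dependence between the scale parameter and the latent effects, which is what governs how well a Gibbs sampler mixes.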

ACKNOWLEDGEMENTS


This paper is dedicated to the memory of Lennart Hjalmarsson, who passed away in 2012. We thank the editor (Herman K. van Dijk) and two anonymous referees for their comments. Tsionas is indebted to Athens University of Economics and Business for partial financial support through research grants under PEBE III.

  1. See Greene (2009) for a recent survey of both cross-sectional and panel stochastic frontier models.

  2. Kumbhakar and Wang (2005) generalize their model by introducing firm-specific effects together with time-varying inefficiency. Wang and Ho (2010) generalize it further by making the temporal pattern of inefficiency firm-specific, specifying it as a function of covariates that can vary both temporally and cross-sectionally in a specific manner.

  3. Recently this model has also been proposed by Colombi et al. (2011), who used a single-step maximum likelihood method to estimate the technology parameters and technical efficiency components.

  4. The model need not be linear in x. One can easily add square and cross-product terms of the x vector to make the underlying cost function translog.

  5. Details on these steps can be found in Kumbhakar et al. (2011).

  6. In section 3.1.1 of an earlier version of the paper (Koop and Steel, 1999) the authors discuss a Bayesian fixed-effect model capable of yielding inefficiency predictions similar to the relative efficiency estimators, but they did not consider firm-specific effects. See http://www2.warwick.ac.uk/fac/sci/statistics/staff/academic-research/steel/steel_homepage/baltfin.pdf

  7. This difference is not important as far as estimation of inefficiency in the multi-step procedure considered by Kumbhakar and associates is concerned (see, for example, Kumbhakar et al., 2011). However, it might matter in a single-step maximum likelihood method, such as the one implemented in Colombi et al. (2011). Note that, although ξit = vit + αi is normally distributed, the variance–covariance matrix of ξit will be non-scalar and hence ignoring the αi term is likely to affect the parameter estimates as well as the estimates of inefficiency (see our discussion following (8)). It is not clear whether ignoring αi will lead to upward or downward bias in the estimated inefficiency. This issue will be explored elsewhere.

  8. For more details, see Satchachai and Schmidt (2010), Wang and Ho (2010) and Chen et al. (2011), as well as Arellano and Bonhomme (2006), Arellano and Hahn (2006a,2006b), Berger et al. (1999) and Bester and Hansen (2005a,2005b) for treatments of the incidental parameters problem in similar settings using the artificial-priors setup. For a modern treatment of the incidental parameter problem, see Lancaster (2000).

  9. See, for example, Hills and Smith (1992) and Papaspiliopoulos et al. (2003). In particular, see their discussion following equation (9) on p. 311. Gill and Casella (2004) present the alternative approach of tempered transitions (Neal, 1996).

  10. For an excellent review see Geweke (1999).

  11. In statistics this is explored in Gelfand and Smith (1990), Tanner and Wong (1987) and Tierney (1994). For reviews see Koop (1994) and Geweke (1999).

  12. For posterior propriety it is essential that inline image, as has been shown in Fernandez et al. (1997).

  13. In fact, the distribution is known to be skew normal.

  14. We locate the mode using standard Gauss–Newton with analytical derivatives. Execution times for finding the mode and implementing the rejection algorithm were trivial. We never observed more than ten rejections, and the modal number of rejections was one. An alternative is to use a general-purpose rejection sampler for log-concave densities (Wild and Gilks, 1993). This sampler also behaved well provided we used 50 points bounding the mode from above. Its timing properties were, however, not competitive.

  15. In all cases convergence of the MCMC is tested using Geweke's (1992) diagnostic (a sketch of the post-processing and diagnostic computation is given after these notes).

  16. These obvious results are not reported here to save space but are available upon request.

  17. For the scale parameters we use exponential priors with mean 0.10.

  18. These results are quite robust to a wide range of prior beliefs about the parameters β, σα and ση. For the scale parameters we varied the N and Q prior parameters between 1 and 10 and between 10⁻⁴ and 1, respectively. The precision matrix A for β was varied between 10⁻⁴I and 10I.

  19. For each MCMC draw, measures of output cost elasticity, technical change, inefficiency, etc. can be obtained for each bank and time period, i.e. for each observation. Monte Carlo means then provide measures for each observation, which are plotted here using standard kernel density techniques. In this application we used 150,000 iterations, the first 50,000 of which were discarded as burn-in. We kept every tenth of the remaining 100,000 draws, giving a total of 10,000 MCMC draws (see the sketch after these notes). The initial conditions were obtained through maximum likelihood estimation of the SFM. Convergence was assessed using graphical means and the Geweke (1992) diagnostic.

  20. Results from other groups follow a similar pattern and are not reported to save space.
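As referenced in notes 15 and 19, the following sketch shows one way to implement the post-processing described there: discard the burn-in, thin the chain, and compute a Geweke-style convergence z-score for a scalar sequence of draws. The chain itself, the window length and the segment fractions are illustrative assumptions; the variance of each segment mean is estimated with a simple Bartlett-window (autocorrelation-consistent) estimator.

    import numpy as np

    def acov_spectral0(x, max_lag=50):
        """Autocorrelation-consistent variance of the sample mean (Bartlett window)."""
        x = np.asarray(x, dtype=float)
        n = x.size
        xc = x - x.mean()
        s = np.dot(xc, xc) / n
        for k in range(1, min(max_lag, n - 1) + 1):
            gamma_k = np.dot(xc[:-k], xc[k:]) / n
            s += 2.0 * (1.0 - k / (max_lag + 1)) * gamma_k
        return s / n

    def geweke_z(chain, first=0.1, last=0.5):
        """Geweke (1992)-style z-score comparing early and late chain segments."""
        chain = np.asarray(chain, dtype=float)
        a = chain[: int(first * chain.size)]
        b = chain[-int(last * chain.size):]
        return (a.mean() - b.mean()) / np.sqrt(acov_spectral0(a) + acov_spectral0(b))

    # Illustrative chain: 150,000 iterations, 50,000 burn-in, keep every 10th draw.
    rng = np.random.default_rng(4)
    raw_chain = rng.normal(loc=0.5, size=150_000)
    kept = raw_chain[50_000:][::10]           # 10,000 retained draws
    print("retained draws:", kept.size)
    print("Geweke z-score:", geweke_z(kept))  # |z| well below 2 suggests no drift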

REFERENCES

  • Aigner DJ, Lovell CAK, Schmidt P. 1977. Formulation and estimation of stochastic frontier production function models. Journal of Econometrics 6: 21–37.
  • Arellano M, Bonhomme S. 2006. Robust priors in nonlinear panel data models. Working paper, CEMFI, Madrid.
  • Arellano M, Hahn J. 2006a. Understanding bias in nonlinear panel models: some recent developments. In Advances in Economics and Econometrics, Blundell R, Newey W, Persson T (eds). Ninth World Congress, Cambridge University Press: Cambridge, UK.
  • Arellano M, Hahn J. 2006b. A likelihood-based approximate solution to the incidental parameter problem in dynamic nonlinear models with multiple effects. Working paper, CEMFI, Madrid.
  • Arellano-Valle E, Azzalini A. 2006. On the unification of families of skew-normal distributions. Journal of Data Science 3: 415–438.
  • Battese G, Coelli T. 1992. Frontier production functions, technical efficiency and panel data: with applications to paddy farmers in India. Journal of Productivity Analysis 3: 153–169.
  • Berger AN, Humphrey DB. 1997. Efficiency of financial institutions: international survey and directions for future research. European Journal of Operational Research 98: 175–212.
  • Berger AN, Mester LJ. 1997. Inside the black box: what explains differences in the efficiencies of financial institutions? Journal of Banking and Finance 21: 895–947.
  • Berger J, Liseo B, Wolpert RL. 1999. Integrated likelihood methods for eliminating nuisance parameters. Statistical Science 14: 1–22.
  • Bester CA, Hansen C. 2005a. A penalty function approach to bias reduction in non-linear panel models with fixed effects. Manuscript, University of Chicago Graduate School of Business.
  • Bester CA, Hansen C. 2005b. Bias reduction for Bayesian and frequentist estimators. Manuscript, University of Chicago Graduate School of Business.
  • Chen Y-Y, Schmidt P, Wang H-J. 2011. Consistent estimation of the fixed effects stochastic frontier model. Paper presented at the EWEPA, Verona.
  • Colombi R, Kumbhakar SC, Martini G, Vittadini G. 2011. A stochastic frontier model with short-run and long-run inefficiency random effects. Department of Economics and Technology Management, University of Bergamo, Italy.
  • Das A, Kumbhakar SC. 2012. Productivity and efficiency dynamics in Indian banking: a hedonic input distance function approach incorporating quality of inputs and outputs. Journal of Applied Econometrics 27: 205–234.
  • Feng G, Serletis A. 2009. Efficiency and productivity of the US banking industry, 1998–2005: evidence from the Fourier cost function satisfying global regularity conditions. Journal of Applied Econometrics 24: 105–138.
  • Fernandez C, Osiewalski J, Steel MFJ. 1997. On the use of panel data in stochastic frontier models. Journal of Econometrics 79: 169–193.
  • Gelfand AE, Smith AFM. 1990. Sampling-based approaches to calculating marginal densities. Journal of the American Statistical Association 85: 398–409.
  • Geweke J. 1991. Efficient simulation from the multivariate normal and Student-t distributions subject to linear constraints. In Computing Science and Statistics: Proceedings of the 23rd Symposium on the Interface, Keramidas EM, Kaufman SM (eds). Interface Foundation of North America.
  • Geweke J. 1992. Evaluating the accuracy of sampling-based approaches to the calculation of posterior moments. In Bayesian Statistics 4, Bernardo JM, Berger JO, Dawid AP, Smith AFM (eds). Oxford University Press: Oxford; 169–193.
  • Geweke J. 1999. Using simulation methods for Bayesian econometric models: inference, development and communication. Econometric Reviews 18: 1–73.
  • Gill J, Casella G. 2004. Tempered transitions for exploring multimodal posterior distributions. Political Analysis 12: 425–443.
  • Gonzales-Farias G, Dominguez-Molina A, Gupta A. 2004. Additive properties of skew normal random vectors. Journal of Statistical Planning and Inference 126: 521–534.
  • Greene W. 2005a. Reconsidering heterogeneity in panel data estimators of the stochastic frontier model. Journal of Econometrics 126: 269–303.
  • Greene W. 2005b. Fixed and random effects in stochastic frontier models. Journal of Productivity Analysis 23: 7–32.
  • Greene W. 2009. The econometric approach to efficiency analysis. In The Measurement of Productive Efficiency: Techniques and Applications, Fried HO, Lovell CAK, Schmidt SS (eds). Oxford University Press: Oxford; 92–159.
  • Hills SE, Smith AFM. 1992. Parameterization issues in Bayesian inference. In Bayesian Statistics 4, Bernardo JM, Berger JO, Dawid AP, Smith AFM (eds). Oxford University Press: Oxford; 227–246.
  • Jondrow J, Lovell CAK, Materov IS, Schmidt P. 1982. On the estimation of technical inefficiency in the stochastic frontier production function model. Journal of Econometrics 19: 233–238.
  • Koop G. 1994. Recent progress in applied Bayesian econometrics. Journal of Economic Surveys 8: 1–34.
  • Koop G, Steel MFJ. 1999. Bayesian analysis of stochastic frontier. http://www2.warwick.ac.uk/fac/sci/statistics.staff/academic-research/steel/_homepage/baltfin.pdf.
  • Koop G, Steel MFJ. 2003. Bayesian analysis of stochastic frontier models. In A Companion to Theoretical Econometrics, Baltagi BH (ed.). Blackwell: Oxford; 520–537.
  • Koop G, Steel MFJ, Osiewalski J. 1995. Posterior analysis of stochastic frontier models using Gibbs sampling. Computational Statistics 10: 353–373.
  • Kumbhakar S, Wang H-J. 2005. Estimation of growth convergence using a stochastic production frontier approach. Economics Letters 88: 300–305.
  • Kumbhakar SC. 1987. The specification of technical and allocative inefficiency in stochastic production and profit frontiers. Journal of Econometrics 34: 335–348.
  • Kumbhakar SC. 1991. Estimation of technical inefficiency in panel data models with firm- and time-specific effects. Economics Letters 36: 43–48.
  • Kumbhakar SC, Heshmati A. 1995. Efficiency measurement in Swedish dairy farms: an application of rotating panel data, 1976–88. American Journal of Agricultural Economics 77: 660–674.
  • Kumbhakar SC, Hjalmarsson L. 1993. Technical efficiency and technical progress in Swedish dairy farms. In The Measurement of Productive Efficiency: Techniques and Applications, Fried HO, Lovell CAK, Schmidt SS (eds). Oxford University Press: Oxford; 257–270.
  • Kumbhakar SC, Hjalmarsson L. 1995. Labour-use efficiency in Swedish social insurance offices. Journal of Applied Econometrics 10: 33–47.
  • Kumbhakar SC, Lovell CAK. 2000. Stochastic Frontier Analysis. Cambridge University Press: Cambridge, UK.
  • Kumbhakar SC, Tsionas EG. 2005a. Measuring technical and allocative inefficiency in the translog cost system: a Bayesian approach. Journal of Econometrics 126: 355–384.
  • Kumbhakar SC, Tsionas EG. 2005b. The joint measurement of technical and allocative inefficiency: an application of Bayesian inference in nonlinear random effects models. Journal of the American Statistical Association 100: 736–747.
  • Kumbhakar SC, Lien G, Hardaker JB. 2011. Technical efficiency in competing panel data models: a study of Norwegian grain farming. Department of Economics, SUNY Binghamton, NY.
  • Lancaster T. 2000. The incidental parameters problem since 1948. Journal of Econometrics 95: 391–413.
  • Meeusen W, van den Broeck J. 1977. Efficiency estimation from Cobb–Douglas production functions with composed error. International Economic Review 18: 435–444.
  • Neal R. 1996. Sampling from multimodal distributions using tempered transitions. Statistics and Computing 6: 353–366.
  • Papaspiliopoulos O, Roberts GO, Sköld M. 2003. Non-centered parameterizations for hierarchical models and data augmentation. In Bayesian Statistics 7, Bernardo JM, Bayarri MJ, Berger JO, Dawid AP, Heckerman D, Smith AFM, West M (eds). Oxford University Press: Oxford; 307–327.
  • Pitt M, Lee L-F. 1981. The measurement and sources of technical inefficiency in the Indonesian weaving industry. Journal of Development Economics 9: 43–64.
  • Roberts GO, Smith AFM. 1994. Simple conditions for the convergence of the Gibbs sampler and Hastings–Metropolis algorithms. Stochastic Processes and their Applications 49: 207–216.
  • Satchachai P, Schmidt P. 2010. Estimates of technical inefficiency in stochastic frontier models with panel data: generalized panel jackknife estimation. Journal of Productivity Analysis 34: 83–97.
  • Schmidt P, Sickles R. 1984. Production frontiers and panel data. Journal of Business and Economic Statistics 2: 367–374.
  • Sealy CW, Lindley J. 1977. Inputs, outputs and a theory of production and cost at depository financial institutions. Journal of Finance 32: 1252–1266.
  • Tanner MA, Wong WH. 1987. The calculation of posterior distributions by data augmentation (with discussion). Journal of the American Statistical Association 82: 528–550.
  • Terrell D. 1996. Incorporating monotonicity and concavity conditions in flexible functional forms. Journal of Applied Econometrics 11: 179–194.
  • Tierney L. 1994. Markov chains for exploring posterior distributions (with discussion). The Annals of Statistics 22: 1701–1762.
  • van den Broeck J, Koop G, Osiewalski J, Steel MFJ. 1994. Stochastic frontier models: a Bayesian perspective. Journal of Econometrics 61: 273–303.
  • Verdinelli I, Wasserman L. 1995. Computing Bayes factors by using a generalization of the Savage–Dickey density ratio. Journal of the American Statistical Association 90: 614–618.
  • Wang H-J, Ho CW. 2010. Estimating fixed-effect panel data stochastic frontier models by model transformation. Journal of Econometrics 157: 286–296.
  • Wild P, Gilks WR. 1993. Adaptive rejection sampling from log-concave densities. Applied Statistics 42: 117–126.

Supporting Information


The JAE Data Archive directory is available at http://qed.econ.queensu.ca/jae/datasets/kumbhakar003/

Please note: Wiley Blackwell is not responsible for the content or functionality of any supporting information supplied by the authors. Any queries (other than missing content) should be directed to the corresponding author for the article.