Abstract
This paper provides estimates of bank efficiency and productivity in the United States, over the period from 1998 to 2005, using (for the first time) the globally flexible Fourier cost functional form, as originally proposed by Gallant (1982), and estimated subject to global theoretical regularity conditions, using procedures suggested by Gallant and Golub (1984). We find that failure to incorporate monotonicity and curvature into the estimation results in mismeasured magnitudes of cost efficiency and misleading rankings of individual banks in terms of cost efficiency. We also find that the largest two subgroups (with assets greater than $1 billion in 1998 dollars) are less efficient than the other subgroups, and that the largest four bank subgroups (with assets greater than $400 million) experienced significant productivity gains while the smallest eight subgroups experienced insignificant productivity gains or even productivity losses. Copyright © 2008 John Wiley & Sons, Ltd.
1. INTRODUCTION
In the last 25 years (from 1980 to 2005), the banking industry in the United States has been greatly transformed by numerous regulatory changes—see, for example, Lown et al. (2000), Kroszner and Strahan (2000), and Montgomery (2003) for a detailed list of regulatory changes. These changes, and particularly those related to the permission of interstate branching and combinations of banks, securities firms, and insurance companies, stimulated the decade-long consolidation in the industry, characterized by the dramatic rise in merger and acquisition activity, the rapid decline in the number of commercial banks, and the increasing concentration of industry assets among the very large banks (see Jones and Critchfield, 2005). At the same time, various innovations in technology and applied finance were widespread and intensively adopted by the US banking industry. These technological and financial innovations include, but are not limited to, information processing and telecommunication technologies, the securitization and sale of bank loans, and the development of derivatives markets. The widespread and intensive use of information technologies and financial innovation has facilitated the rapid transfer of information at low cost, increased the scope and volume of nontraditional activities, and also helped facilitate consolidation of the industry (see Berger et al., 1995; Berger, 2004).
The question of whether this unprecedented transformation has made the US banking industry more efficient has stimulated a substantial body of efficiency studies—see, for example, the surveys in Berger and Humphrey (1997) and Berger et al. (1999). One dimension of banking efficiency that attracted a lot of research interest (especially in studies prior to the 1990s) is scale and scope efficiency. The former measures whether a banking firm is producing at optimal output levels; the latter measures whether it is producing an optimal combination of outputs. The other dimension of banking efficiency, which has received increasing attention since the early 1990s, is X-efficiency. X-efficiency is called 'frontier efficiency' in Bauer et al. (1998) and 'economic efficiency' in Kumbhakar and Lovell (2003). The interested reader is referred to Kumbhakar and Lovell (2003) for an excellent discussion of the relationship between different concepts of efficiency.
X-efficiency is a combination of technical efficiency and allocative efficiency, with the former referring to the ability of a firm to produce output from a given set of inputs and the latter referring to the extent to which a firm uses the inputs in the best proportions, given their prices. X-efficiency is most commonly measured by determining an industry's best-practice frontier and measuring how far each firm deviates from this frontier. Previous studies revealed that X-inefficiency outweighs scale and scope inefficiencies by a considerable margin and, as Bauer et al. (1998, p. 86) put it, inefficiencies of this kind 'have a strong empirical association with higher probabilities of financial institution failures.' According to Berger and Humphrey (1991), cost inefficiency consumes 25% or more of total costs, whereas scale inefficiency and allocative inefficiency consume only 5% or less. Therefore, in recent years, research on the efficiency of the US banking industry has increasingly focused on X-efficiency.
The literature investigating X-efficiency in the US banking industry has been dominated by two methodologies: nonparametric Data Envelopment Analysis (DEA for short) and parametric Stochastic Frontier Analysis (SFA for short). Two other, less commonly used parametric approaches are Thick Frontier Analysis (TFA for short, see Berger and Humphrey, 1991) and the Distribution Free Approach (DFA for short, see Berger, 1993). First put forward by Charnes et al. (1978), the DEA approach is a linear programming technique in which the efficient frontier is formed as the piecewise linear combination that connects the set of best-practice observations in the dataset under analysis, yielding a convex production possibility set (see Berger and Humphrey, 1997). However, because DEA uses only the data on inputs and outputs and does not take direct account of input prices, it does not incorporate allocative inefficiency.
The SFA approach, based on the ideas of Aigner et al. (1977) and Meeusen and van den Broeck (1977), involves the estimation of a specific parameterized efficiency frontier with a composite error term consisting of nonnegative inefficiency and noise components. X-efficiency can thus be measured in terms of cost efficiency, revenue efficiency, or profit efficiency, depending on the type of frontier used. The DEA and SFA approaches generally give very different efficiency estimates. However, Bauer et al. (1998) and Rossi and Ruzzier (2000) argue that it is not necessary to have a consensus on which is the single best frontier approach for measuring efficiency. They also propose a series of criteria to evaluate whether the inefficiency estimates obtained from different approaches are mutually consistent in terms of inefficiency scores and ranks.
Cost efficiency has received the most attention in the parametric analysis of efficiency of the US banking industry. According to Berger and Humphrey (1997), 30 out of 38 studies that employed parametric techniques in the analysis of efficiency in the US banking industry used cost functions, and the rest used profit functions—among these 38 parametric studies, several employed TFA and DFA. Despite its popularity, the cost frontier used in previous studies suffers from the following two problems. First, the estimated parameters of cost frontiers frequently violate the monotonicity and concavity constraints implied by economic theory, which eventually leads to erroneous conclusions concerning efficiency levels. While permitting a parameterized function to depart from the neoclassical function space is usually fit-improving, it also causes the hypothetical best-practice firm not to be fully efficient at those data points where theoretical regularity is violated.
Second, the cost frontier suffers from a lack of flexibility. Most previous studies employ a translog functional form. Researchers have found, however, that the translog function lacks enough flexibility to model the US banking industry, which is composed of banks of widely varying sizes (see McAllister and McManus, 1993; Wheelock and Wilson, 2001). In an attempt to increase flexibility, more recent studies employ a so-called 'Fourier function', which is actually a translog function augmented with trigonometric Fourier terms. Although this so-called 'Fourier function' can improve the goodness of fit, it is not a true Fourier flexible functional form in Gallant's (1982) original sense. In particular, the original Fourier flexible functional form consists of two components, the first component being a 'reparameterized' translog function and the second a trigonometric Fourier series. It is important to note that these two components are not independent of each other. In fact, the scaled variables of outputs and input prices are used not only in the Fourier series, but also in the modified translog part. The so-called 'Fourier function', however, ignores the parametric relationship between the two components and simply includes the scaled variables of outputs and input prices in the Fourier series. While this practice makes the Fourier function much easier to use, it may be unable to attain close approximation in the Sobolev norm and may result in inconsistent parameter estimates.
Motivated by the widespread practice of ignoring the theoretical regularity conditions and not using a globally flexible functional form, as summarized in Table I, the purpose of this paper is to reinvestigate the cost efficiency of the US banking industry with more recent panel data, over the sample period from 1998 to 2005, while addressing the above two problems inherent in previous studies. In doing so, we take the SFA approach and minimize the potential problem of using a misspecified functional form by employing a globally flexible functional form—Gallant's (1982) original Fourier flexible cost functional form. It should be noted that there are two globally flexible functional forms that provide greater flexibility than locally flexible functional forms: the Fourier flexible functional form and the Asymptotically Ideal Model, introduced by Barnett et al. (1991). The former is based on a Fourier series expansion and the latter on a linearly homogeneous multivariate Müntz–Szatz series expansion. Both are globally flexible in the sense that they are capable of approximating the underlying cost function at every point in the function's domain by increasing the order of the expansion, and thus have more flexibility than most locally flexible functional forms, which theoretically attain flexibility only at a single point or in an infinitesimally small region. In this study we employ the Fourier cost functional form, which is both log-linear and globally flexible. In implementing it, we strictly follow Gallant's (1982) original specification of the functional form rather than merely including the scaled variables of outputs and input prices in the Fourier series, as previous studies did.
Table I. A summary of flexible functional form estimation of the cost efficiency of US banks

Study                         Model used                               True Fourier   Curvature imposed
Ferrier and Lovell (1990)     Translog                                 —              No
Berger and Humphrey (1991)    Translog                                 —              No
Berger (1993)                 Translog                                 —              No
Kaparakis et al. (1994)       Translog                                 —              No
Berger and Mester (1997)      Translog + Fourier trigonometric terms   No             No
Berger et al. (1997)          Translog + Fourier trigonometric terms   No             No
Peristiani (1997)             Translog                                 —              No
DeYoung (1997)                Translog                                 —              No
Mester (1997)                 Translog                                 —              No
DeYoung et al. (1998)         Translog + Fourier trigonometric terms   No             No
Stiroh (2000)                 Translog                                 —              No
Clark and Siems (2002)        Translog                                 —              No
Berger and Mester (2003)      Translog + Fourier trigonometric terms   No             No
We also estimate the Fourier flexible cost function subject to full theoretical regularity. There are three approaches to incorporating curvature and/or monotonicity restrictions into flexible functional forms: the Cholesky factorization approach, the Bayesian approach, and the nonlinear constrained optimization approach. The Cholesky factorization approach can only guarantee the negative semidefiniteness of the Hessian matrix of a cost function in a region around the reference point (that is, a data point where curvature is imposed), and satisfaction of curvature at data points far away from the reference point can only be obtained by luck (see Ryan and Wales, 2000). This is not satisfactory, especially when the sample size is large and violations of curvature are widespread. The Bayesian approach involves specifying prior distributions for parameters and inefficiency terms. However, the specification of prior distributions adds extra uncertainty to the outcome of the modelling exercise, especially when researchers have no idea of how to parameterize a priori the unknown parameters (see Diewert, 2004; Greene, 2005). The nonlinear constrained optimization approach, originally proposed by Gallant and Golub (1984) and recently used by Serletis and Shahmoradi (2005) in the context of consumer demand systems, develops computational methods for imposing curvature restrictions at any arbitrary set of points. Monotonicity can also be incorporated into the estimation of the cost function although the original Gallant and Golub (1984) paper does not do so. This method applies to any cost function as long as the Hessian matrix (or some transform of the Hessian matrix) and the firstorder conditions of the cost function can be explicitly specified. 
While the nonlinear constrained optimization method has many desirable properties, no attempt has been made in the stochastic frontier literature to use this method to impose monotonicity and curvature on parametric (cost or distance) functions.
The rest of the paper is organized as follows. Section 2 provides a brief review of stochastic cost frontiers. In Section 3 we present the Fourier cost function and detail the homogeneity, monotonicity, and curvature constraints implied by neoclassical microeconomic theory. In Section 4 we discuss the constrained nonlinear optimization methodology for imposing these constraints on the parameters of the Fourier cost function. Section 5 deals with the data description. In Section 6, we apply our model to panel data on US banks, and discuss the effect of the incorporation of monotonicity and curvature on cost efficiency, and also report our estimates on cost efficiency for 12 different bank groups. Section 7 summarizes and concludes the paper.
2. STOCHASTIC COST FRONTIER
Within a panel data framework, the cost frontier model can be written as
  C_{it} = f(X_{it}, ρ) τ_{it} ζ_{it}    (1)
This model decomposes the observed cost for firm i at time t, C_{it}, into three parts: (i) the frontier f(X_{it}, ρ), which depends on X_{it}, a vector of exogenous variables (i.e., input prices and output quantities), and ρ, a vector of parameters, and which represents the minimum possible cost of producing a given level of output at given input prices; (ii) an inefficiency factor τ_{it} ≥ 1, measuring firm-specific inefficiency; and (iii) a random error, ζ_{it}, which captures statistical noise. The deterministic kernel of the cost frontier is f(X_{it}, ρ), and the stochastic cost frontier is f(X_{it}, ρ)ζ_{it}. As required by microeconomic theory, f(X_{it}, ρ) is linearly homogeneous and concave in input prices and nondecreasing in both input prices and outputs.
We follow the common practice in this literature and assume that f(X_{it}, ρ) is a loglinear functional form. The stochastic cost function in (1) is rewritten as
  c_{it} = α + x′_{it}β + u_{it} + v_{it}    (2)
where c_{it} = lnC_{it}; α+ x′_{it}β = lnf(X_{it}, ρ); u_{it} = lnτ_{it} ≥ 0; and v_{it} = lnζ_{it}. x_{it} is the counterpart of X_{it} with the input prices and output quantities transformed to logarithms, β is a K × 1 vector of parameters, and α is the intercept. Thus the composite error term ε_{it}( = u_{it} + v_{it}) consists of two parts, with u_{it} capturing the level of firm inefficiency and v_{it} capturing statistical noise.
In an empirical exercise, assumptions are commonly made about the two error components. Usually the v_{it}s are assumed to be i.i.d. N(0, σ^{2}) and independent of the u_{it}s, an assumption we maintain throughout this paper. In the specification of the distribution for the u_{it}s we assume
  u_{it} = η_{it} u_{i}    (3)
where
  η_{it} = exp[η_{1}(t − T) + η_{2}(t − T)^{2}]    (4)
where η_{1} and η_{2} are parameters to be estimated and the u_{i}s are assumed to be independently and identically distributed nonnegative truncations of the N(μ, σ_{u}^{2}) distribution. Note that the above exponential function of time, η_{it}, is a generalization of that proposed by Battese and Coelli (1992), in the sense that it relaxes the monotonicity of the temporal variation pattern of the efficiency term by using a two-parameter specification.
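A minimal sketch of this temporal pattern (assuming, as stated above, the two-parameter form η_{it} = exp[η_{1}(t − T) + η_{2}(t − T)²]; the function names and parameter values are our own illustration, not estimates from the paper):

```python
import numpy as np

# Two-parameter temporal efficiency pattern: eta_it = exp(eta1*(t - T) + eta2*(t - T)**2),
# so u_it = eta_it * u_i.  With eta2 != 0 the pattern need not be monotone in t,
# which is the generalization over the one-parameter Battese-Coelli (1992) form.
def eta(t, T, eta1, eta2):
    """Temporal factor applied to the time-invariant inefficiency u_i; eta(T, T, ., .) = 1."""
    d = t - T
    return np.exp(eta1 * d + eta2 * d ** 2)

def u_it(u_i, t, T, eta1, eta2):
    """Time-varying inefficiency of equation (3)."""
    return eta(t, T, eta1, eta2) * u_i
```

With, say, η_{1} = −0.5 and η_{2} = −0.2 over t = 1, …, 8 and T = 8, inefficiency first rises and then falls, a pattern the one-parameter specification cannot produce.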
The cost efficiency of firm i at time t can then be defined as the ratio of minimum cost attainable in an environment characterized by exp(v_{it}) to observed expenditure, as follows:
  CE_{it} = exp(α + x′_{it}β + v_{it})/C_{it} = exp(−u_{it})    (5)
with CE_{it} ≤ 1. Notice that CE_{it} = 1 if and only if c_{it} = α+ x′_{it}β + v_{it}. For example, if a firm is 80% efficient, it could reduce costs by 20% simply by becoming fully efficient.
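The decomposition in (2) and the efficiency measure in (5) can be illustrated with a small numerical sketch (all numbers are arbitrary illustrative values, not estimates from the paper):

```python
import numpy as np

# Decomposition of observed log cost: c = alpha + x'beta + u + v, with CE = exp(-u).
alpha, beta = 0.5, np.array([0.3, 0.7])   # frontier parameters (illustrative)
x = np.array([1.0, 2.0])                  # log input prices / outputs (illustrative)
v, u = 0.05, 0.223                        # noise and inefficiency components

frontier = alpha + x @ beta               # deterministic kernel, in logs
c = frontier + u + v                      # observed log cost
CE = np.exp(-u)                           # cost efficiency: about 0.80 here
```

Since CE = exp(frontier + v − c), a bank with CE ≈ 0.80 incurs costs roughly 25% above the minimum attainable in its noise environment, i.e., it could cut costs by about 20% by becoming fully efficient.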
3. THE FOURIER COST FUNCTION
The Fourier cost function is given by

  g(z | ϑ) = u_{0} + b′z + (1/2)z′Az + Σ_{α=1}^{E}[u_{0α} + 2Σ_{j=1}^{J}(u_{jα}cos(jλk′_{α}z) − w_{jα}sin(jλk′_{α}z))]    (6)

In equation (6), z = (l, q) is the vector of scaled (log) input prices and outputs; ϑ collects the parameters; λ is a rescaling factor; and k_{α} is a multi-index—an (N + M) vector with integer components. As Gallant (1982) shows, the length of a multi-index, denoted |k_{α}|* = Σ_{i}|k_{iα}|, reduces the complexity of the notation required to denote high-order partial differentiation and the multivariate Fourier trigonometric (sine and cosine) terms. Following Gallant (1982), these indexes are constructed using the following rules (the construction of these indexes is complex and is performed using MATLAB). First, the zero vector and any k_{α} whose first nonzero element is negative are deleted. Second, every index with a common integer divisor is also deleted.
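The two deletion rules just described can be sketched as follows (our own Python sketch of the candidate-generation step; the paper performs this in MATLAB, and Gallant's (1982) selection of "elementary" indexes further narrows the candidate set to the 32 indexes of Table II):

```python
from itertools import product
from math import gcd
from functools import reduce

def candidate_multi_indexes(dim=6, max_norm=3):
    """Integer vectors k with 0 < |k|* <= max_norm surviving the deletion rules:
    drop the zero vector, drop any k whose first nonzero element is negative,
    and drop any k whose elements share a common integer divisor."""
    keep = []
    for k in product(range(-max_norm, max_norm + 1), repeat=dim):
        if sum(abs(e) for e in k) > max_norm or not any(k):
            continue                       # wrong norm, or the zero vector
        first = next(e for e in k if e != 0)
        if first < 0:
            continue                       # first nonzero element negative
        if reduce(gcd, (abs(e) for e in k)) > 1:
            continue                       # common integer divisor
        keep.append(k)
    return keep
```

For example, (1, −1, 0, 0, 0, 0) survives, while its negation (−1, 1, 0, 0, 0, 0) and the scaled copy (2, −2, 0, 0, 0, 0) are deleted, since they would duplicate the same trigonometric terms.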
As a Fourier term is a periodic function of its arguments but the cost function is not, the scaling of the data is also important. In empirical applications, to prevent the approximation from diverging from the true cost function, the data should be rescaled by a common scaling factor, λ, so that the input prices and output quantities lie in the interval [0, 2π]. The common scaling factor, λ, for input prices is defined analogously to that in Gallant (1982). The parameters E (the number of elementary multi-indexes) and J (the order of the trigonometric expansion) determine the degree of the Fourier approximation. Thus, the Fourier cost function has 1 + (N + M) + E(1 + 2J) parameters to be estimated.
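A minimal sketch of the rescaling step (an assumed per-variable implementation of a location-and-scale adjustment, with a small ε keeping values off the endpoints; the paper's λ is a common factor defined as in Gallant (1982), so the helper name and details here are our own illustration):

```python
import numpy as np

def scale_to_fourier(x, eps=1e-5):
    """Map positive data for one variable into (0, 2*pi) via logs:
    shift log(x) to be strictly positive, then rescale by lambda."""
    lx = np.log(np.asarray(x, dtype=float))
    shifted = lx - lx.min() + eps            # strictly positive
    lam = (2 * np.pi - 2 * eps) / shifted.max()
    return lam, lam * shifted                # scaled values lie in (0, 2*pi)

lam, z = scale_to_fourier([0.8, 1.0, 2.5, 7.0])
```

Without this step the periodic trigonometric terms would wrap around and the approximation could diverge from the true cost function.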
Substituting the cost frontier defined by (6) into (2), we obtain the basic panel data stochastic cost frontier model we are going to use in this paper:
 (9)
where all parameters and variables are defined as above.
3.1. Theoretical Regularity
As required by microeconomic theory, the Fourier cost function in (6) has to satisfy certain theoretical regularity conditions, i.e., homogeneity, monotonicity, and concavity. The restriction of linear homogeneity on the Fourier cost frontier can be imposed through reparameterization, as in Gallant (1982) and Gallant and Golub (1984):
  Σ_{i=1}^{N} b_{i} = 1    (10)
and
  Σ_{i=1}^{N} k_{iα} = 0,  α = 1, …, E    (11)
Restriction (10) guarantees the linear homogeneity of the firstorder terms, and (11) guarantees the linear homogeneity of both the secondorder terms and the Fourier trigonometric terms.
We now turn to the monotonicity and curvature constraints. For simplicity, the subscripts i and t for all variables are suppressed in this subsection to avoid notational clutter. Define ∇_{z}g(l, q, ϑ) = ∂[g(l, q, ϑ)]/∂z and ∇²_{zz}g(l, q, ϑ) = ∂²[g(l, q, ϑ)]/∂z∂z′, where z = (l, q) as above. By the two equations defined in (7), it can easily be shown that
  g(l, q, ϑ) = ln f(p_{1}, …, p_{N}, y_{1}, …, y_{M})    (12)
where f(p_{1}, …, p_{N}, y_{1}, …, y_{M}) = f(X_{it}, ρ) is the cost frontier corresponding to the Fourier cost function. In what follows, we use f(p, y) instead of f(X_{it}, ρ). Taking the partial derivative of both sides of (12) with respect to z, we obtain the following equation:
  ∇f(p, y) = λf(p, y)Z^{−1}∇g(l, q, ϑ)    (13)
where ∇f(p, y) = (∂f/∂p_{1}, …, ∂f/∂p_{N}, ∂f/∂y_{1}, …, ∂f/∂y_{M})′ and Z is a diagonal matrix with the unscaled input prices (p_{1}, …, p_{N}) and outputs (y_{1}, …, y_{M}) on its main diagonal. With both f(p, y) and Z^{−1} being positive, monotonicity (∂[f(p, y)]/∂p > 0) requires
  ∇g(l, q, ϑ) > 0    (14)
where ∇g(l, q, ϑ) also has to satisfy Σ_{i=1}^{N}∇_{l_{i}}g(l, q, ϑ) = 1/λ, which can be derived from the fact that the cost function is homogeneous of degree one in prices, i.e.
  λf(p, y)Σ_{i=1}^{N}∇_{l_{i}}g(l, q, ϑ) = Σ_{i=1}^{N}p_{i}∂f(p, y)/∂p_{i} = f(p, y)    (15)
In equation (15), the first equality can be obtained by using (13), and the second follows from Euler's theorem for linearly homogeneous functions.
Concavity in input prices requires that the Hessian matrix, H, of the cost frontier, f(p, y), is negative semidefinite. It can be easily shown that the element in the ith row and jth column of the Hessian matrix, H, of the cost function f(p, y) is given by (see Appendix)
  H_{ij} = ∂²f(p, y)/∂p_{i}∂p_{j} = [f(p, y)/(p_{i}p_{j})](∇_{j}s_{i} + s_{i}s_{j} − δ_{ij}s_{i})    (16)
where s_{i} is the cost share of input i, ∇_{j}s_{i} is the derivative of s_{i} with respect to the log price of input j, x_{i} = ∂f(p, y)/∂p_{i} is the demand for input i, obtained by Shephard's lemma as the first derivative of the cost function with respect to the input price p_{i}, and δ_{ij} = 1 if i = j and 0 otherwise.
Since x_{i} is positive by the property of monotonicity, ∇g(l, q, ϑ) is positive by equation (14), and p_{j} is also positive, concavity of the Hessian matrix in our particular case is equivalent to requiring (in matrix notation) that
  G = ∇S + ss′ − diag(s)    (18)

be a negative semidefinite matrix, where ∇S = [∇_{j}s_{i}] is the N × N matrix of share derivatives and s = (s_{1}, …, s_{N})′. Thus, (14) and (18) are the constraints we need to incorporate into the estimation of the Fourier cost frontier defined in (9)—the monotonicity and curvature conditions are provided in Gallant (1982) without proof.
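The share-based form of the Hessian can be checked numerically against a cost function whose curvature is known in closed form. The following sketch (our own illustration, using a Cobb-Douglas cost kernel with constant shares, so all share derivatives vanish) compares H_{ij} = [f/(p_{i}p_{j})](∇_{j}s_{i} + s_{i}s_{j} − δ_{ij}s_{i}) with a finite-difference Hessian:

```python
import numpy as np

# Cobb-Douglas kernel f(p) = prod p_i^{s_i} with constant shares s summing to 1.
# Here the share formula reduces to H_ij = (f/(p_i p_j)) (s_i s_j - delta_ij s_i).
s = np.array([0.2, 0.3, 0.5])
p = np.array([1.5, 2.0, 2.5])
f = lambda p: np.prod(p ** s)

H_formula = (f(p) / np.outer(p, p)) * (np.outer(s, s) - np.diag(s))

# Second-difference approximation of the Hessian for comparison.
h = 1e-4
H_fd = np.zeros((3, 3))
for i in range(3):
    for j in range(3):
        pij = p.copy(); pij[i] += h; pij[j] += h
        pi = p.copy(); pi[i] += h
        pj = p.copy(); pj[j] += h
        H_fd[i, j] = (f(pij) - f(pi) - f(pj) + f(p)) / h ** 2

eigs = np.linalg.eigvalsh(H_formula)   # all nonpositive: concavity in prices
```

Since the Cobb-Douglas cost kernel is linearly homogeneous and globally concave in prices, all eigenvalues of the resulting Hessian are nonpositive, which is exactly the condition imposed on G in the estimation.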
4. CONSTRAINED OPTIMIZATION
We first run an unconstrained optimization using (19) and check the theoretical regularity conditions of monotonicity and curvature. If the monotonicity and curvature conditions are not satisfied at all observations, we use the NPSOL nonlinear programming package to minimize −lnL(φ(θ)) with monotonicity and concavity imposed. Essentially, this becomes a constrained maximum likelihood problem.
While we follow Gallant and Golub (1984) and use nonlinear constrained optimization to impose curvature, we do not do so by constructing their submatrix K_{22} using a Householder transformation and then deriving an indicator function for the smallest eigenvalue of K_{22} and its derivative. Instead, we work directly with the matrix G defined in (18), restricting its eigenvalues to be nonpositive. This is because a necessary and sufficient condition for the negative semidefiniteness of G is that all its eigenvalues are nonpositive (see Morey, 1986). Compared with the Gallant and Golub (1984) approach, where a reduced matrix K_{22} is sought, the direct restriction of the eigenvalues of G to be nonpositive seems more appealing.
It is well known that an N × N real symmetric matrix has N eigenvalues, all of which are real numbers (see Magnus, 1985). Let λ = [λ_{1}, …, λ_{N}] denote the N eigenvalues of G, the real symmetric matrix defined in (18). The nonlinear curvature constraints for our constrained optimization problem can then be written as λ_{n} ≤ 0, n = 1, …, N, at every observation.
The eigenvalues of G can be obtained by solving
  det(G − λI_{N}) = 0    (20)
where I_{N} is an N × N identity matrix. Clearly, the λ_{n} (n = 1, …, N) are functions of the elements of G, denoted G_{ij}, which are in turn functions of ∇²_{zz}g(l, q, ϑ) and ∇g(l, q, ϑ), as can be seen from (18). In fact, in our case with N = 3, we have
  λ_{n} = λ_{n}(G_{11}, G_{12}, G_{13}, G_{22}, G_{23}, G_{33})    (21)
for n = 1, 2, 3, where
  G_{ij} = G_{ij}(∇²_{zz}g(l, q, ϑ), ∇g(l, q, ϑ))    (22)
for i = 1, 2, 3 and j = i, …, 3. Explicit formulas for λ_{n}(ϑ) in terms of the G_{ij} elements can easily be obtained using the symbolic toolbox in MATLAB. After substituting (22) into λ_{n}(ϑ), the eigenvalues in terms of ∇²_{zz}g(l, q, ϑ) and ∇g(l, q, ϑ) can be obtained.
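The eigenvalue-based curvature check can be sketched numerically (our own Python sketch with an illustrative helper name; numerical eigenvalues stand in for the symbolic expressions the paper derives in MATLAB):

```python
import numpy as np

def curvature_satisfied(G, tol=1e-10):
    """For a real symmetric G, negative semidefiniteness holds iff all
    eigenvalues are nonpositive (Morey, 1986). Returns (flag, eigenvalues)."""
    G = np.asarray(G, dtype=float)
    assert np.allclose(G, G.T), "G must be symmetric"
    lam = np.linalg.eigvalsh(G)        # N real eigenvalues, ascending order
    return bool((lam <= tol).all()), lam
```

For example, −I_{3} trivially satisfies the condition, while diag(1, −1, −1) violates it because of its single positive eigenvalue.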
As for the derivatives of λ_{n}(ϑ), they can be obtained using equation (21), as follows:
  ∂λ_{n}/∂ϑ = Σ_{i=1}^{3}Σ_{j=i}^{3}(∂λ_{n}/∂G_{ij})(∂G_{ij}/∂ϑ)    (23)
All of ∂λ_{n}/∂G_{ij}, ∂[∇²_{zz}g(l, q, ϑ)]/∂ϑ, and ∂[∇g(l, q, ϑ)]/∂ϑ can be easily computed. In our case with N = 3, each of the (eighteen) ∂λ_{n}/∂G_{ij} (for n = 1, 2, 3, i = 1, 2, 3, and j = i, …, 3) is calculated using the symbolic toolbox in MATLAB.
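For a symmetric matrix with a simple eigenvalue λ_{n} and unit eigenvector v_{n}, the sensitivity of λ_{n} to an entry G_{ij} is v_{n,i}v_{n,j}, which provides a numerical cross-check on the symbolic derivatives (the matrix below is an arbitrary illustrative example, not an estimated G):

```python
import numpy as np

# Eigenvalue sensitivity: for symmetric G with unit eigenvector v_n of a simple
# eigenvalue lam_n, d(lam_n)/dG_ij = v_n[i] * v_n[j] when entries vary freely.
G = np.array([[-2.0,  0.3,  0.1],
              [ 0.3, -1.0,  0.2],
              [ 0.1,  0.2, -0.5]])
lam, V = np.linalg.eigh(G)
n = 0                                   # track the smallest eigenvalue
grad = np.outer(V[:, n], V[:, n])       # d lam_n / d G_ij

# Finite-difference check: bump G_ij and G_ji together to keep G symmetric.
h = 1e-6
i, j = 0, 1
dG = np.zeros_like(G); dG[i, j] = h; dG[j, i] = h
fd = (np.linalg.eigvalsh(G + dG)[n] - lam[n]) / h   # approx grad[i,j] + grad[j,i]
```

Because the perturbation bumps the symmetric pair (G_{ij}, G_{ji}) jointly, the finite difference matches the sum grad[i, j] + grad[j, i].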
In addition to the imposition of concavity, the monotonicity constraints in (14) also need to be imposed if monotonicity is violated. The derivatives for the monotonicity constraints, ∂[∇_{l_{n}}g(l, q, ϑ)]/∂ϑ and ∂[∇_{q_{m}}g(l, q, ϑ)]/∂ϑ, can also be easily computed. Hence, our constrained maximum likelihood problem can be written as follows:
  min_{θ} −lnL(φ(θ))    (24)
subject to
  λ_{n}(θ) ≤ 0,  n = 1, …, N    (25)
  W_{j}(θ) > 0,  j = 1, …, N + M    (26)
where λ_{n} is the curvature constraint and W_{j} is the monotonicity constraint, as shown in (14), both imposed at each observation. As already noted, we can impose the regularity constraints locally (at a single data point), regionally (over a region of data points), or fully (at every data point in the sample). After estimates of u_{0}, β, γ, η_{1}, and η_{2} are obtained, the monotonicity and curvature conditions can then be checked at every observation using (14) and (18), both of which are discussed above.
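The structure of this problem can be illustrated with a toy analogue (our own sketch: `scipy.optimize.minimize` with SLSQP stands in for NPSOL, a quadratic toy objective stands in for −lnL, and the two inequality constraints mimic the curvature and monotonicity restrictions; none of this is the paper's actual likelihood or constraint set):

```python
import numpy as np
from scipy.optimize import minimize

def neg_loglik(theta):
    """Toy stand-in for -ln L(phi(theta)); minimized at theta = (2, -1)."""
    return (theta[0] - 2.0) ** 2 + (theta[1] + 1.0) ** 2

cons = [
    # "curvature"-style restriction: theta0^2 - 1 <= 0, written as g(theta) >= 0
    {"type": "ineq", "fun": lambda th: 1.0 - th[0] ** 2},
    # "monotonicity"-style restriction: theta1 > -2
    {"type": "ineq", "fun": lambda th: th[1] + 2.0},
]
res = minimize(neg_loglik, x0=np.zeros(2), method="SLSQP", constraints=cons)
# The unconstrained minimizer (2, -1) violates the first constraint, so the
# constrained solution sits on its boundary at theta = (1, -1).
```

In the paper's application the objective is the full panel log-likelihood and the constraint set contains one curvature constraint per eigenvalue and one monotonicity constraint per price and output, at every observation, which is why the problem involves thousands of highly nonlinear constraints.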
Following Battese and Coelli (1992), the minimum-mean-squared-error predictor of the cost efficiency of the ith bank at time t, CE_{it} = exp(−u_{it}), is
  CE_{it} = E[exp(−u_{it}) | ε_{i}] = {[1 − Φ(η_{it}σ*_{i} − μ*_{i}/σ*_{i})]/[1 − Φ(−μ*_{i}/σ*_{i})]}exp(−η_{it}μ*_{i} + (1/2)η_{it}²σ*_{i}²)    (27)

where

  μ*_{i} = (μσ² + σ_{u}²η′_{i}ε_{i})/(σ² + σ_{u}²η′_{i}η_{i})    (28)

  σ*_{i}² = σ²σ_{u}²/(σ² + σ_{u}²η′_{i}η_{i})    (29)

with η_{i} = (η_{i1}, …, η_{iT})′, ε_{i} the vector of composed errors for bank i, Φ(·) the standard normal distribution function, and μ and σ_{u}² the mean and variance of the normal distribution whose nonnegative truncation generates the u_{i}s.
This framework allows us to calculate the efficiency level of each bank relative to the bestpractice bank represented by the cost frontier.
While we follow Gallant and Golub (1984) in imposing the theoretical regularity conditions on the parameters of the Fourier flexible cost function, we extend Gallant's method in two ways. First, we extend Gallant's constrained nonlinear optimization approach from a traditional factor demand system framework to a stochastic frontier framework. This extension involves the use of a much more complicated loglikelihood function as the objective function, rather than the simple least squares based objective function used in Gallant and Golub (1984). This is because a composed error term is assumed in the stochastic frontier framework, whereas a simple i.i.d. N(0, σ^{2}) error term is assumed in the traditional factor demand system framework. Second, we extend Gallant's method from a time series framework to a panel data framework.
5. THE DATA
The data used in this study, obtained from the Reports of Income and Condition (Call Reports), cover the period from 1998 to 2005. We examine only continuously operating banks, to avoid the impact of entry and exit and to focus on the performance of a core of healthy, surviving institutions during the sample period. There were 10,139 banks in the US banking industry in 1998, and the number declined to 8,390 in 2005 due to industry consolidation. After deleting those observations whose input prices are negative or zero, we obtained a balanced panel of 6,010 banks over the 8 years from 1998 to 2005.
In choosing which financial accounts to specify as outputs versus inputs, we use the accounting balance-sheet approach of Sealey and Lindley (1977). All liabilities (core deposits and purchased funds) and financial equity capital provide funds and are treated as inputs. All assets (loans and securities) use bank funds and are treated as outputs. This approach is different from the intermediation approach, which is consistent with the value-added definition of output production by financial firms and with user-cost price evaluation of the services of outputs. An accurate representation of the intermediation approach can be found in Barnett (1987), Barnett and Hahm (1994), Barnett and Zhou (1994), Barnett et al. (1995), and Hancock (1991).
In this paper, three output quantities and three input prices are identified. The three outputs are: consumer loans, y_{1}; nonconsumer loans, y_{2}, composed of industrial and commercial loans and real estate loans; and securities, y_{3}, which includes all nonloan financial assets, i.e., all financial and physical assets minus the sum of consumer loans, nonconsumer loans, and equity. All outputs are deflated by the Consumer Price Index (CPI) to the base year 1998. The three input prices are: the wage rate for labor, p_{1}; the interest rate for borrowed funds, p_{2}; and the price of physical capital, p_{3}. The wage rate equals total salaries and benefits divided by the number of full-time employees. The price of borrowed funds equals total interest expense divided by total deposits and purchased funds. The price of physical capital equals expenses on premises and equipment divided by premises and fixed assets. Total cost is thus the sum of these three input costs. This specification of outputs and input prices is the same as, or similar to, that of most previous studies in this literature (see, for example, Akhigbe and McNulty, 2003; Stiroh, 2000; Berger and Mester, 2003). Thus, M = N = 3 in this paper. The three outputs and three input prices are then scaled, using the formulas specified in equations (7)–(8) of Section 3, for each of the 12 asset size classes, which we discuss in more detail below.
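The input-price construction just described amounts to simple ratios of Call Report expense items to their associated quantities; a sketch with illustrative figures (the numbers below are invented for the example, not actual data):

```python
# Illustrative Call Report items for one bank-year (invented values).
salaries, employees = 4.2e6, 120           # total salaries & benefits; full-time employees
interest_exp, funds = 9.0e6, 310.0e6       # interest expense; deposits + purchased funds
premises_exp, premises = 1.1e6, 8.5e6      # premises/equipment expense; premises & fixed assets

p1 = salaries / employees        # wage rate for labor
p2 = interest_exp / funds        # interest rate on borrowed funds
p3 = premises_exp / premises     # price of physical capital
total_cost = salaries + interest_exp + premises_exp   # sum of the three input costs
```

Observations where any of these ratios is negative or zero are the ones deleted in constructing the balanced panel described above.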
The set of elementary multi-indexes that satisfy (11) and have norm |k_{α}|* ≤ 3 is displayed in Table II—the three k_{iα} (i = 1, 2, 3) are the three elements of the k_{α} vector corresponding to the three input prices. For this set E = 32, and we take J = 1. While Chalfant and Gallant (1985) and Eastwood and Gallant (1991) have suggested that the number of parameters to be estimated should equal the number of effective sample observations raised to the power 2/3, in this paper we restrict the set of multi-indexes to those with |k_{α}|* ≤ 3 in order to keep the number of parameters manageable, given that we also have to deal with hundreds of variables and thousands of highly nonlinear constraints. Thus we have a total of 1 + (N + M) + E(1 + 2J) = 1 + (3 + 3) + 32 × (1 + 2) = 103 free parameters (that is, parameters estimated directly).
Table II. Elementary multi-indexes

α      1  2  3  4  5  6  7  8  9  10  11

l_{1}  1  1  0  0  0  0  1  1  0  1  1 
l_{2}  −1  0  1  0  0  0  −1  0  1  −1  0 
l_{3}  0  −1  −1  0  0  0  0  −1  −1  0  −1 
q_{1}  0  0  0  1  1  0  1  0  0  0  1 
q_{2}  0  0  0  −1  0  1  0  1  0  0  0 
q_{3}  0  0  0  0  −1  −1  0  0  1  1  0 
k_{α}^{*}  2  2  2  2  2  2  3  3  3  3  3 
α  12  13  14  15  16  17  18  19  20  21  22 
l_{1}  0  1  1  0  1  1  1  0  0  0  0 
l_{2}  1  −1  0  1  −1  −1  −1  1  1  0  0 
l_{3}  −1  0  −1  −1  0  0  0  −1  −1  0  0 
q_{1}  1  0  0  0  −1  0  0  0  0  1  1 
q_{2}  0  0  0  1  0  −1  0  −1  0  −2  0 
q_{3}  0  1  1  0  0  0  −1  0  −1  0  2 
k_{α}^{*}  3  3  3  3  3  3  3  3  3  3  3 
α  23  24  25  26  27  28  29  30  31  32  
l_{1}  0  0  0  0  0  0  0  0  0  0  
l_{2}  0  0  0  0  0  0  0  0  0  0  
l_{3}  0  0  0  0  0  0  0  0  0  0  
q_{1}  1  1  2  2  0  0  0  0  0  0  
q_{2}  1  1  0  1  1  1  2  2  1  0  
q_{3}  −1  0  1  0  −2  1  −1  1  0  1  
k_{α}^{*}  3  2  3  3  3  2  3  3  1  1  
However, the effective number of parameters is 85 owing to the following restrictions. The homogeneity restriction
  Σ_{i=1}^{3} b_{i} = 1    (30)
reduces the number of free parameters by one. The remaining restrictions are due to the overparameterization of the A matrix. In particular, A is a 6 × 6 symmetric matrix which satisfies three linearly independent homogeneity restrictions:
  Σ_{j=1}^{3} A_{ij} = 0,  i = 1, 2, 3    (31)
Moreover, the symmetry of the matrix A also implies
  Σ_{j=1}^{3} A_{ij} = 0,  i = 4, 5, 6    (32)
Thus A can have at most 15 free parameters, and in the parameterization
  A = −λ²Σ_{α=1}^{32} u_{0α}k_{α}k′_{α}    (33)
15 of the u_{0α} parameters are free parameters and the other 17 must be set equal to zero. The multi-indexes corresponding to these 17 zero-restricted u_{0α} parameters are listed in the last 17 columns of Table II.
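The homogeneity restrictions on A can be verified mechanically from this parameterization: because the price components of every multi-index sum to zero, A annihilates the vector (1, 1, 1, 0, 0, 0)′ and is symmetric by construction. A sketch (assuming the outer-product form A = −λ²Σ_{α}u_{0α}k_{α}k′_{α}, with arbitrary illustrative λ, u_{0α} values, and a small subset of admissible multi-indexes):

```python
import numpy as np

# Build A from a few multi-indexes whose price parts (first three elements)
# sum to zero, with arbitrary illustrative coefficients u0a.
lam = 2 * np.pi / 10.0
ks = np.array([[1, -1,  0, 0, 0,  0],
               [1,  0, -1, 0, 0,  0],
               [0,  1, -1, 1, 0,  0],
               [1, -1,  0, 0, 1, -1]], dtype=float)
u0 = np.array([0.4, -0.2, 0.1, 0.3])

A = -lam ** 2 * sum(u * np.outer(k, k) for u, k in zip(u0, ks))
iota_p = np.array([1.0, 1, 1, 0, 0, 0])   # selects the price block
```

Since each k satisfies k′ι_p = 0, every outer product k k′ annihilates ι_p, and hence so does A; this is the mechanism by which the homogeneity restrictions on A hold automatically under the parameterization.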
Following Berger and Mester (2003), we add three more variables to the Fourier cost function: financial equity capital, nontraditional banking activities, and a time trend, t. Financial equity capital is treated as a fixed net input and off-balance-sheet items are treated as a fixed net output. The time trend t is intended to capture the effect of technological change on cost. In the treatment of nontraditional banking activities, we follow Boyd and Gertler (1994) and use an asset-equivalent measure (AEM) of these nontraditional activities. We assume that all noninterest income is generated from off-balance-sheet assets, and that these nontraditional activities yield the same rate of return on assets (ROA) as traditional activities do. Thus, we transform the off-balance-sheet income into an equivalent asset. These two variables are measured in 1998 constant dollars and used in logarithmic form. When adding these variables to the Fourier cost function, we include them in linear and quadratic form, and do not interact them with the outputs and input prices, in order to reduce the number of parameters to a manageable level and to lessen the effects of multicollinearity.
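The AEM construction described above can be sketched as follows. The function name and the toy numbers are illustrative; the substance is simply that, under the equal-ROA assumption, the implied off-balance-sheet asset stock is noninterest income divided by the ROA earned on traditional assets:

```python
def asset_equivalent_measure(noninterest_income, net_income, total_assets):
    """Boyd-Gertler style AEM (names and numbers here are illustrative):
    under the assumption that nontraditional activities earn the same ROA
    as traditional ones, the implied off-balance-sheet asset stock is
    noninterest income divided by that ROA."""
    roa = net_income / total_assets      # return on traditional assets
    return noninterest_income / roa

# Toy example: $2m of noninterest income at a 1% ROA implies $200m
# of asset-equivalent nontraditional activity.
print(asset_equivalent_measure(2.0, 5.0, 500.0))  # 200.0
```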
Separating banks into asset size classes is a common approach to assessing bank performance by asset size. However, given the unique nature of the asset size distribution of commercial banks in the United States, it is difficult to categorize banks by asset size, and there is no industry standard on asset ranges. Over our sample period, from 1998 to 2005, around 85% of all commercial banks report less than $ 500 million in total assets. Over the same period, however, there exists a cluster of extremely large banks with over $ 3 billion in total assets that accounts for roughly 2.3% of all commercial banks. In this paper, we classify all banks into three groups: banks with over $ 500 million in total assets are classified as large banks, banks with assets between $ 100 million and $ 500 million as medium banks, and banks with under $ 100 million in assets as small banks.
This classification is mainly based on the standard asset size categories used by the Federal Financial Institutions Examination Council (FFIEC), as specified in forms 031, 032, 033, and 034. The only difference is that the FFIEC sets the asset cap for medium banks at $ 300 million; we change it to maintain consistency with the Financial Modernization Act and with many previous studies that use $ 500 million as the lower limit for large banks. To reduce the computation time for each of the bank subgroups and to avoid heterogeneity biases associated with asset size, we further divide each of the three bank groups into several subgroups. Specifically, we use cutoffs at $ 20 million, $ 40 million, $ 60 million, and $ 80 million within the small bank group; $ 200 million, $ 300 million, and $ 400 million within the medium bank group; and $ 1 billion and $ 3 billion within the large bank group. Table III presents the 12 bank subgroups, together with their corresponding asset ranges in 2000 dollars and in 2005 dollars, as well as the number of banks in each subgroup.
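The twelve-way size classification can be sketched with a simple binning routine (a hypothetical helper, not code from the paper), using the cutoffs listed above in millions of 2000 constant dollars and treating lower bounds as inclusive:

```python
from bisect import bisect_right

# Subgroup cutoffs in millions of 2000 constant dollars (from the text).
CUTOFFS = [20, 40, 60, 80, 100, 200, 300, 400, 500, 1000, 3000]

def size_group(assets_millions):
    """Return the subgroup number: 1 = largest (>= $3 billion), 12 = smallest.
    bisect_right counts how many cutoffs the bank meets or exceeds."""
    return 12 - bisect_right(CUTOFFS, assets_millions)

assert size_group(10) == 12     # under $20m: smallest subgroup
assert size_group(250) == 6     # $200m <= assets < $300m: group 6
assert size_group(5000) == 1    # over $3 billion: largest subgroup
```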
Table III. Bank asset size classes
Bank groups  Asset size (in millions of 2000 dollars)  Asset size (in millions of 2005 dollars)  Number of banks  Share of banks 

Large banks 
Group 1  Assets ≥ 3000  Assets ≥ 3402  141  2.3% 
Group 2  1000 ≤ assets < 3000  1134 ≤ assets < 3402  218  3.6% 
Group 3  500 ≤ assets < 1000  567 ≤ assets < 1134  381  6.3% 
Medium banks 
Group 4  400 ≤ assets < 500  453.6 ≤ assets < 567  201  3.3% 
Group 5  300 ≤ assets < 400  340.2 ≤ assets < 453.6  321  5.3% 
Group 6  200 ≤ assets < 300  226.8 ≤ assets < 340.2  602  10.0% 
Group 7  100 ≤ assets < 200  113.4 ≤ assets < 226.8  1262  21.0% 
Small banks 
Group 8  80 ≤ assets < 100  90.72 ≤ assets < 113.4  477  7.9% 
Group 9  60 ≤ assets < 80  68.04 ≤ assets < 90.72  597  9.9% 
Group 10  40 ≤ assets < 60  45.36 ≤ assets < 68.04  669  11.1% 
Group 11  20 ≤ assets < 40  22.68 ≤ assets < 45.36  813  13.5% 
Group 12  assets < 20  assets < 22.68  328  5.5% 
Total    6010  100% 
It is to be noted, however, that this classification keeps the asset ranges fixed from year to year. Fixed asset ranges raise a serious question about the usefulness of the results when a long sample period, such as this study's, is under examination. To deal with this problem, we use an approach similar to that laid out in the Financial Modernization Act (FMA). In particular, we define a community bank to be an institution with average total deposits over the preceding 3 years of no more than $ 500 million. Each subsequent year, the asset cap is adjusted upward by the growth in the CPI (for all urban consumers), unadjusted for seasonal variation, for the previous year (see Federal Register, 2000). The cap for each year is published in the Federal Register early in the year, along with the inflation rate used in the adjustment. For example, the official asset cap for community banks in 2005 is adjusted to $ 567 million (see Federal Register, 2005). Consistent with this approach, all the asset size cutoffs are set in 2000 constant dollars and adjusted upward by the growth in the CPI.
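The inflation adjustment of the cutoffs can be sketched as follows. The `adjusted_cutoff` helper and any growth rates fed to it are illustrative assumptions; the cumulative 2000–2005 factor of 1.134, however, is implied by the published adjustment of the $ 500 million cap to $ 567 million:

```python
def adjusted_cutoff(base_cutoff_2000, annual_cpi_growth):
    """Carry a year-2000 cutoff forward by compounding annual CPI growth.
    (This helper and the growth rates passed to it are illustrative; the
    official caps are published in the Federal Register.)"""
    factor = 1.0
    for g in annual_cpi_growth:
        factor *= 1.0 + g
    return base_cutoff_2000 * factor

# The cumulative 2000-2005 adjustment factor implied by the published caps
# is 567/500 = 1.134, e.g. the $500m community-bank cap becomes $567m:
print(round(500 * 1.134, 1))  # 567.0
```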
6. EMPIRICAL RESULTS
We use the TOMLAB/NPSOL toolbox with MATLAB to estimate the model using panel data for each of the bank subgroups. For each subgroup, the model is estimated under four different sets of constraints: with no constraints imposed; with only the curvature constraint imposed; with only the monotonicity constraint imposed; and with both the monotonicity and curvature constraints imposed. For each of the latter three cases, we impose curvature and/or monotonicity in a stepwise manner: first locally, and then globally if regularity is not satisfied under local imposition. Tables IV–XV summarize the results for each of the 12 subgroups in terms of parameter estimates, together with the percentages of monotonicity and curvature violations. Due to space limitations, we report only the intercept, u_{0}, the coefficients on the first-order terms, b, the coefficients on the second-order terms, u_{0α}, and the coefficients on the time trend, nontraditional activities, and equity variables.
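As a rough illustration of how the violation percentages in Tables IV–XV are tallied, the sketch below checks monotonicity (nonnegative fitted cost shares) and concavity in input prices at each observation for a two-price translog-style log cost function. The functional form, the names, and the 2 × 2 negative-semidefiniteness test are illustrative simplifications; the paper works with the Fourier form, three input prices, and three outputs:

```python
# Illustrative regularity check for a two-price translog-style log cost
# function: ln C = a0 + sum_i b_i x_i + 0.5 * sum_ij A_ij x_i x_j, x = log prices.
# Monotonicity: fitted cost shares s_i must be nonnegative.
# Curvature: A + s s' - diag(s) must be negative semidefinite (concavity in prices).

def shares(b, A, x):
    """Fitted cost shares s_i = b_i + sum_j A_ij x_j."""
    return [b[i] + sum(A[i][j] * x[j] for j in range(2)) for i in range(2)]

def concave_at(b, A, x):
    """2 x 2 negative-semidefiniteness test: diagonal <= 0 and det >= 0."""
    s = shares(b, A, x)
    H = [[A[i][j] + s[i] * s[j] - (s[i] if i == j else 0.0)
          for j in range(2)] for i in range(2)]
    det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    return H[0][0] <= 0.0 and H[1][1] <= 0.0 and det >= 0.0

def violation_rates(b, A, sample):
    """Percentage of observations violating monotonicity and curvature."""
    mono = sum(any(si < 0.0 for si in shares(b, A, x)) for x in sample)
    curv = sum(not concave_at(b, A, x) for x in sample)
    n = len(sample)
    return 100.0 * mono / n, 100.0 * curv / n

# A Cobb-Douglas special case (A = 0, constant shares 0.5/0.5) is regular
# at every data point, so both violation rates are zero:
print(violation_rates([0.5, 0.5], [[0.0, 0.0], [0.0, 0.0]],
                      [[0.1, 0.2], [1.0, 2.0]]))  # (0.0, 0.0)
```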
Table IV. Parameter estimates for group 1
Parameter  Unconstrained  Curvature only  Monotonicity only  Both monotonicity and curvature 

u_{0}  7.3739  6.6017  7.7557  7.3661 
b_{1}  0.6699  0.6858  0.9193  0.6671 
b_{2}  0.0818  0.0949  −0.0971  0.0824 
b_{3}  0.1557  0.1188  0.1955  0.1353 
b_{4}  −0.4922  −0.5293  −0.0912  −0.2805 
b_{5}  0.5639  0.6936  0.5790  0.3820 
u_{01}  0.3094  0.2655  0.0954  0.3084 
u_{02}  −0.2870  −0.2473  −0.2998  −0.2855 
u_{03}  0.3850  0.3496  0.2252  0.3825 
u_{04}  0.0021  −0.0460  −0.0041  0.0089 
u_{05}  −0.0701  −0.0773  −0.0645  −0.0712 
u_{06}  0.5085  0.2820  0.5583  0.3658 
u_{07}  0.0112  0.0097  0.1887  0.0163 
u_{08}  0.0837  0.0816  0.1284  0.0911 
u_{09}  −0.0450  −0.0436  −0.0559  −0.0520 
u_{010}  −0.0577  −0.0657  −0.0733  −0.0640 
u_{011}  −0.1521  −0.1403  −0.4515  −0.1577 
u_{012}  −0.3178  −0.2892  −0.2849  −0.3127 
u_{013}  −0.3056  −0.2670  −0.2984  −0.3022 
u_{014}  0.4326  0.3922  0.4842  0.4286 
u_{015}  0.0062  0.0068  0.1567  0.0105 
t  −0.1080  −0.1050  −0.1021  −0.1076 
t^{2}  0.0045  0.0043  0.0033  0.0045 
Nontrad  −0.0910  −0.0622  −0.1062  −0.0886 
Nontrad^{2}  0.0136  0.0093  0.0171  0.0136 
Equity  0.1726  0.2953  0.1996  0.1820 
Equity^{2}  −0.0051  −0.0089  −0.0056  −0.0054 
 0.1719  0.1690  0.1862  0.1716 
γ  0.9473  0.9486  0.9543  0.9472 
η_{1}  0.0574  0.0438  0.0309  0.0560 
η_{2}  0.0449  0.0403  0.0436  0.0446 
Loglikelihood  611.5  586.9  593.4  572.1 
Curvature violations  1.4%  0  37.5%  0 
Monotonicity violations  0.1%  5.7%  0  0 
Mean efficiency  0.8171  —  —  0.8219 
Table V. Parameter estimates for group 2
Parameter  Unconstrained  Curvature only  Monotonicity only  Both monotonicity and curvature 

u_{0}  12.8594  12.3481  13.2671  12.9243 
b_{1}  0.8502  0.8710  0.8531  0.8048 
b_{2}  0.0493  0.0484  0.0427  0.1095 
b_{3}  0.1119  0.0972  0.1144  0.0718 
b_{4}  −0.0299  −0.0214  0.0503  0.1488 
b_{5}  0.3419  0.3793  0.3318  0.3613 
u_{01}  −0.0274  −0.0434  −0.0449  −0.0734 
u_{02}  −0.0518  −0.0229  −0.0408  0.0338 
u_{03}  0.0174  0.0213  −0.0002  −0.0169 
u_{04}  −0.0314  −0.0265  −0.0345  −0.0147 
u_{05}  −0.0050  −0.0042  0.0042  −0.0024 
u_{06}  −0.0540  −0.0515  −0.0603  −0.0633 
u_{07}  0.0013  −0.0060  0.0207  0.0250 
u_{08}  0.0261  0.0318  0.0242  0.0248 
u_{09}  −0.0178  −0.0203  −0.0163  −0.0204 
u_{010}  −0.0262  −0.0273  −0.0240  −0.0262 
u_{011}  0.0345  0.0492  0.0122  0.0144 
u_{012}  0.0316  0.0456  0.0263  0.0473 
u_{013}  0.0133  0.0295  0.0082  0.0326 
u_{014}  −0.0268  −0.0489  −0.0210  −0.0535 
u_{015}  −0.0450  −0.0501  −0.0232  −0.0162 
t  −0.0812  −0.0849  −0.0822  −0.0896 
t^{2}  0.0043  0.0048  0.0044  0.0052 
Nontrad  0.0371  0.0371  0.0389  0.0370 
Nontrad^{2}  −0.0026  −0.0026  −0.0031  −0.0022 
Equity  −0.6221  −0.6221  −0.8316  −0.8418 
Equity^{2}  0.0299  0.0299  0.0389  0.0389 
 0.0769  0.0769  0.0771  0.0745 
γ  0.8851  0.8851  0.8866  0.8794 
η_{1}  0.1523  0.1523  0.1819  0.1282 
η_{2}  0.0794  0.0794  0.0923  0.0682 
Loglikelihood  934.6  926.9  933.2  919.4 
Curvature violations  12.6%  0  13.2%  0 
Monotonicity violations  6.0%  6.2%  0  0 
Mean efficiency  0.8852  —  —  0.8820 
Table VI. Parameter estimates for group 3
Parameter  Unconstrained  Curvature only  Monotonicity only  Both monotonicity and curvature 

u_{0}  11.6880  11.9781  11.7559  11.2504 
b_{1}  0.6484  0.6503  0.6456  0.6142 
b_{2}  0.0696  0.0798  0.0657  0.0936 
b_{3}  0.0552  0.0948  0.0782  0.0956 
b_{4}  0.2464  0.2113  0.1904  0.7056 
b_{5}  0.7217  0.7357  0.5611  0.5993 
u_{01}  −0.1411  −0.1574  −0.0803  −0.2350 
u_{02}  0.1323  0.1550  0.0685  0.2363 
u_{03}  −0.1258  −0.1401  −0.0657  −0.2247 
u_{04}  −0.0275  −0.0332  −0.0127  −0.0135 
u_{05}  −0.0181  −0.0175  −0.0104  −0.0120 
u_{06}  −0.0316  −0.0207  −0.0222  −0.0234 
u_{07}  −0.0395  −0.0430  −0.0539  −0.0909 
u_{08}  0.0511  0.0374  0.0379  0.0340 
u_{09}  −0.0344  −0.0225  −0.0266  −0.0220 
u_{010}  −0.0428  −0.0335  −0.0318  −0.0289 
u_{011}  0.0318  0.0406  0.0486  −0.1009 
u_{012}  0.1832  0.1929  0.1292  0.1390 
u_{013}  0.1852  0.1964  0.1283  0.1394 
u_{014}  −0.1932  −0.2029  −0.1406  −0.1495 
u_{015}  −0.0228  −0.0303  −0.0366  0.1070 
t  −0.0953  −0.0954  −0.0878  −0.0869 
t^{2}  0.0045  0.0045  0.0039  0.0038 
Nontrad  −0.0249  −0.0288  −0.0581  −0.0600 
Nontrad^{2}  0.0063  0.0068  0.0104  0.0105 
Equity  −1.0046  −1.0320  −0.9691  −1.0279 
Equity^{2}  0.0450  0.0462  0.0429  0.0456 
 0.0679  0.0673  0.0670  0.0664 
γ  0.8095  0.8073  0.8022  0.7996 
η_{1}  0.1468  0.1687  0.1791  0.1917 
η_{2}  0.0834  0.0896  0.0998  0.1047 
Loglikelihood  1230.1  1227.5  1212.8  1211.4 
Curvature violations  3.9%  0  1.9%  0 
Monotonicity violations  14.2%  13.0%  0  0 
Mean efficiency  0.8964  —  —  0.9038 
Table VII. Parameter estimates for group 4
Parameter  Unconstrained  Curvature only  Monotonicity only  Both monotonicity and curvature 

u_{0}  16.3202  13.4177  13.7549  13.9571 
b_{1}  0.8400  0.8932  0.8297  0.8821 
b_{2}  0.0051  −0.0047  0.0699  0.0065 
b_{3}  −0.1996  −0.1471  0.1099  −0.0925 
b_{4}  0.6406  0.7147  0.8458  0.7259 
b_{5}  −0.3814  −0.4059  0.0530  −0.3640 
u_{01}  0.1010  0.0541  −0.1530  0.0817 
u_{02}  −0.1280  −0.1116  0.1309  −0.1301 
u_{03}  0.1540  0.1168  −0.0963  0.1366 
u_{04}  0.0198  0.0152  −0.0123  0.0235 
u_{05}  −0.0175  −0.0204  −0.0096  0.0105 
u_{06}  −0.1723  −0.1521  −0.0834  −0.0881 
u_{07}  0.1640  0.1770  0.2031  0.1396 
u_{08}  0.1137  0.1006  0.0421  0.0993 
u_{09}  −0.1032  −0.0920  −0.0285  −0.0784 
u_{010}  −0.1216  −0.1030  −0.0380  −0.0845 
u_{011}  −0.1620  −0.1587  −0.1988  −0.1310 
u_{012}  −0.1940  −0.1791  −0.0568  −0.1841 
u_{013}  −0.2006  −0.1820  −0.0645  −0.1827 
u_{014}  0.2257  0.1921  0.0778  0.1969 
u_{015}  0.1479  0.1590  0.1804  0.1266 
t  −0.0676  −0.0649  −0.0678  −0.0600 
t^{2}  0.0020  0.0020  0.0021  0.0026 
Nontrad  0.0037  −0.0001  −0.0221  −0.0126 
Nontrad^{2}  0.0044  0.0053  0.0071  0.0053 
Equity  −1.6658  −1.1140  −1.4599  −1.3593 
Equity^{2}  0.0828  0.0568  0.0727  0.0671 
 0.0798  0.0777  0.0792  0.0908 
γ  0.9076  0.9021  0.9048  0.8900 
η_{1}  −0.0478  −0.0074  −0.0210  0.1715 
η_{2}  0.0449  0.0604  0.0532  0.0678 
Loglikelihood  985.9  972.6  975.5  885.0 
Curvature violations  27.4%  0  26.0%  0 
Monotonicity violations  8.9%  5.3%  0  0 
Mean efficiency  0.8948  —  —  0.8856 
Table VIII. Parameter estimates for group 5
Parameter  Unconstrained  Curvature only  Monotonicity only  Both monotonicity and curvature 

u_{0}  1.1076  1.7548  2.1774  1.1061 
b_{1}  0.8642  0.8134  0.8777  0.8547 
b_{2}  0.1917  0.1211  0.1995  0.1784 
b_{3}  0.3425  0.1699  0.0957  0.3717 
b_{4}  1.2590  1.2956  0.7711  1.2594 
b_{5}  0.3305  0.2992  0.1345  0.3497 
u_{01}  −0.5118  −0.4297  −0.2394  −0.5000 
u_{02}  0.4670  0.3949  0.1955  0.4597 
u_{03}  −0.4237  −0.3655  −0.1489  −0.4161 
u_{04}  −0.0886  −0.0785  −0.0432  −0.0818 
u_{05}  0.0153  0.0222  0.0154  0.0191 
u_{06}  −0.0727  −0.0928  −0.0430  −0.0337 
u_{07}  0.3283  0.3347  0.1879  0.3356 
u_{08}  −0.0537  0.0054  0.0190  −0.0405 
u_{09}  0.0478  −0.0037  −0.0208  0.0431 
u_{010}  0.0563  0.0017  −0.0118  0.0481 
u_{011}  −0.3064  0.0017  −0.1673  −0.3173 
u_{012}  0.0579  0.0394  −0.0043  0.0578 
u_{013}  0.0730  0.0486  0.0080  0.0578 
u_{014}  −0.0795  −0.0509  −0.0106  −0.0649 
u_{015}  0.3066  0.3250  0.1610  −0.0649 
t  −0.0704  −0.0708  −0.0735  −0.0703 
t^{2}  0.0041  −0.0884  0.0043  0.0038 
Nontrad  −0.1196  −0.0884  −0.1454  −0.1237 
Nontrad^{2}  0.0196  0.0158  0.0223  0.0203 
Equity  0.7371  0.6224  0.7959  0.7308 
Equity^{2}  −0.0402  −0.0343  −0.0433  −0.0401 
 0.0586  0.0609  0.0579  0.0605 
γ  0.8155  0.8179  0.8115  0.8132 
η_{1}  0.2249  0.2020  0.2393  0.2228 
η_{2}  0.0967  0.0916  0.1004  0.0996 
Loglikelihood  1246.2  1224.6  1238.3  1210.8 
Curvature violations  23.4%  0  20.9%  0 
Monotonicity violations  5.1%  3.2%  0  0 
Mean efficiency  0.8975  —  —  0.8961 
Table IX. Parameter estimates for group 6
Parameter  Unconstrained  Curvature only  Monotonicity only  Both monotonicity and curvature 

u_{0}  7.7279  6.7066  7.3981  6.3933 
b_{1}  0.7662  0.6810  0.8205  0.7750 
b_{2}  0.0372  0.1106  −0.0014  0.0679 
b_{3}  0.1921  0.1692  0.1683  −0.0108 
b_{4}  0.2053  0.3839  0.3832  −0.6297 
b_{5}  0.1310  0.1047  0.2217  0.1336 
u_{01}  −0.0341  −0.0749  −0.0900  0.2667 
u_{02}  0.0112  0.0898  0.0497  −0.2775 
u_{03}  −0.0282  −0.0642  −0.0832  0.2919 
u_{04}  −0.0406  −0.0347  −0.0183  0.0069 
u_{05}  −0.0100  −0.0098  −0.0073  0.0003 
u_{06}  −0.0758  −0.0744  −0.0384  −0.0234 
u_{07}  0.0147  0.0637  0.0455  −0.2440 
u_{08}  −0.0200  −0.0091  0.0066  0.0493 
u_{09}  0.0222  0.0145  0.0082  −0.0413 
u_{010}  0.0077  0.0021  −0.0023  −0.0456 
u_{011}  −0.0180  −0.0705  −0.0451  0.2499 
u_{012}  −0.0311  −0.0299  0.0067  −0.0138 
u_{013}  −0.0318  −0.0337  0.0049  −0.0179 
u_{014}  0.0505  0.0419  0.0103  0.0264 
u_{015}  0.0161  0.0650  0.0453  −0.2500 
t  −0.0822  −0.0840  −0.0841  −0.0819 
t^{2}  0.0047  0.0050  0.0049  0.0049 
Nontrad  0.0484  0.0531  0.0427  0.0413 
Nontrad^{2}  −0.0021  −0.0028  −0.0013  −0.0016 
Equity  −0.2235  −0.0551  −0.2071  0.3698 
Equity^{2}  0.0123  0.0037  0.0112  −0.0187 
 0.0669  0.0665  0.0691  0.0719 
γ  0.8736  0.8697  0.8766  0.8762 
η_{1}  0.0585  0.0735  0.0647  0.0938 
η_{2}  0.0552  0.0591  0.0561  0.0686 
Loglikelihood  1579.3  1559.5  1567.6  1528.5 
Curvature violations  17.4%  0  17.4%  0 
Monotonicity violations  8.1%  5.8%  0  0 
Mean efficiency  0.8878  —  —  0.8890 
Table X. Parameter estimates for group 7
Parameter  Unconstrained  Curvature only  Monotonicity only  Both monotonicity and curvature 

u_{0}  6.0068  9.2679  5.6587  9.2110 
b_{1}  0.6884  0.7039  0.7227  0.7683 
b_{2}  0.1302  0.1014  0.1308  0.1054 
b_{3}  0.2685  0.1194  0.2281  0.1085 
b_{4}  0.8774  −0.7582  1.0423  −0.6621 
b_{5}  0.1013  −0.1570  0.1091  −0.0876 
u_{01}  −0.2710  0.2711  −0.2858  0.2395 
u_{02}  0.2489  −0.2925  0.2555  −0.2766 
u_{03}  −0.2164  0.3174  −0.2247  0.2989 
u_{04}  −0.0634  −0.0529  −0.0388  −0.0269 
u_{05}  −0.0215  −0.0148  −0.0091  −0.0010 
u_{06}  −0.0034  −0.0145  0.0007  0.0046 
u_{07}  0.1988  −0.2022  0.2111  −0.2064 
u_{08}  −0.0233  0.0189  −0.0190  0.0130 
u_{09}  0.0310  −0.0126  0.0231  −0.0097 
u_{010}  0.0091  −0.0310  0.0098  −0.0194 
u_{011}  −0.2084  0.1982  −0.2181  0.2070 
u_{012}  −0.0124  −0.0966  −0.0087  −0.0749 
u_{013}  −0.0075  −0.0910  −0.0026  −0.0662 
u_{014}  0.0137  0.0986  0.0097  0.0772 
u_{015}  0.1954  −0.2077  0.2077  −0.2154 
t  −0.0577  −0.0572  −0.0593  −0.0584 
t^{2}  0.0031  0.0030  0.0032  0.0031 
Nontrad  0.0717  0.0756  0.0653  0.0711 
Nontrad^{2}  −0.0021  −0.0028  −0.0017  −0.0025 
Equity  −0.2653  −0.2342  −0.2424  −0.2603 
Equity^{2}  0.0125  −0.2342  0.0108  0.0117 
 0.0596  0.0605  0.0606  0.0613 
γ  0.7285  0.7289  0.7309  0.7300 
η_{1}  0.5445  0.5318  0.5369  0.5202 
η_{2}  0.2418  0.2375  0.2391  0.2347 
Loglikelihood  2165.9  2152.7  2156.9  2143.1 
Curvature violations  10.8%  0  8.0%  0 
Monotonicity violations  22.1%  20.3%  0  0 
Mean efficiency  0.9181  —  —  0.9178 
Table XI. Parameter estimates for group 8
Parameter  Unconstrained  Curvature only  Monotonicity only  Both monotonicity and curvature 

u_{0}  5.0832  5.5241  5.4229  5.6218 
b_{1}  0.7878  0.8074  0.7354  0.7780 
b_{2}  0.1347  0.1106  0.1638  0.1307 
b_{3}  0.1606  0.1013  0.1548  0.1355 
b_{4}  0.5805  0.6484  0.5695  0.5083 
b_{5}  0.1429  0.2829  0.2858  0.3314 
u_{01}  −0.1841  −0.2200  −0.2056  −0.1941 
u_{02}  0.1418  0.1735  0.1757  0.1544 
u_{03}  −0.1301  −0.1600  −0.1569  −0.1377 
u_{04}  −0.0495  −0.0415  −0.0420  −0.0389 
u_{05}  0.0096  0.0153  0.0038  0.0060 
u_{06}  −0.0187  −0.0157  −0.0128  −0.0108 
u_{07}  0.1069  0.1312  0.1037  0.0922 
u_{08}  −0.0034  0.0038  −0.0035  −0.0020 
u_{09}  0.0166  0.0032  0.0098  0.0061 
u_{010}  0.0062  −0.0065  0.0040  −0.0006 
u_{011}  −0.1040  −0.1284  −0.1049  −0.0903 
u_{012}  −0.0059  0.0235  0.0285  0.0356 
u_{013}  0.0173  0.0443  0.0458  0.0530 
u_{014}  −0.0106  −0.0295  −0.0446  −0.0436 
u_{015}  0.1031  0.1244  0.1027  0.0864 
t  −0.0502  −0.0500  −0.0487  −0.0486 
t^{2}  0.0021  0.0022  0.0019  0.0020 
Nontrad  0.0894  0.0879  0.0793  0.0775 
Nontrad^{2}  −0.0088  −0.0085  −0.0076  −0.0073 
Equity  0.2311  0.0942  0.0885  0.0754 
Equity^{2}  −0.0109  −0.0034  −0.0035  −0.0026 
 0.0690  0.0684  0.0692  0.0686 
γ  0.8195  0.8165  0.8188  0.8161 
η_{1}  0.3468  0.3527  0.3441  0.3507 
η_{2}  0.1807  0.1811  0.1818  0.1829 
Loglikelihood  1307.2  1302.2  1303.3  1298.7 
Curvature violations  19.2%  0  16.6%  0 
Monotonicity violations  8.4%  6.9%  0  0 
Mean efficiency  0.9145  —  —  0.9129 
Table XII. Parameter estimates for group 9
Parameter  Unconstrained  Curvature only  Monotonicity only  Both monotonicity and curvature 

u_{0}  3.6871  4.9381  4.0945  5.3413 
b_{1}  0.6942  0.6576  0.7199  0.7237 
b_{2}  0.0930  0.1085  0.0792  0.0698 
b_{3}  0.3937  0.2634  0.3275  0.1084 
b_{4}  0.4650  −0.3527  0.3046  −0.6066 
b_{5}  0.0291  −0.2014  0.0867  −0.0221 
u_{01}  −0.1841  0.1708  −0.1242  0.2453 
u_{02}  0.1683  −0.1748  0.1023  −0.2700 
u_{03}  −0.1607  0.1874  −0.0976  0.2700 
u_{04}  −0.0352  −0.0182  −0.0266  0.0001 
u_{05}  −0.0178  −0.0177  −0.0123  −0.0130 
u_{06}  −0.0149  0.0108  −0.0160  0.0026 
u_{07}  0.0902  −0.1468  0.0360  −0.2222 
u_{08}  −0.0672  −0.0269  −0.0442  0.0251 
u_{09}  0.0682  0.0274  0.0484  −0.0194 
u_{010}  0.0466  0.0055  0.0292  −0.0387 
u_{011}  −0.0873  0.1434  −0.0344  0.2233 
u_{012}  −0.0147  −0.0841  −0.0039  −0.0413 
u_{013}  −0.0055  −0.0755  0.0057  −0.0312 
u_{014}  0.0072  0.0810  −0.0016  0.0393 
u_{015}  0.0956  −0.1413  0.0415  −0.2191 
t  −0.0585  −0.0567  −0.0597  −0.0579 
t^{2}  0.0036  0.0036  0.0036  0.0037 
Nontrad  0.1336  0.1374  0.1303  0.1374 
Nontrad^{2}  −0.0157  −0.0163  −0.0156  −0.0166 
Equity  0.4182  0.4907  0.3689  0.4779 
Equity^{2}  −0.0210  −0.0257  −0.0194  −0.0264 
 0.0948  0.0934  0.0933  0.0909 
γ  0.8429  0.8382  0.8392  0.8316 
η_{1}  0.3910  0.3988  0.3900  0.4058 
η_{2}  0.1875  0.1893  0.1864  0.1908 
Loglikelihood  1305.1  1292.3  1300.5  1280.9 
Curvature violations  17.3%  0  15.6%  0 
Monotonicity violations  5.6%  6.2%  0  0 
Mean efficiency  0.8916  —  —  0.8936 
Table XIII. Parameter estimates for group 10
Parameter  Unconstrained  Curvature only  Monotonicity only  Both monotonicity and curvature 

u_{0}  7.7178  7.7385  7.0927  7.4785 
b_{1}  0.4581  0.5502  0.5566  0.6193 
b_{2}  0.3386  0.2337  0.2494  0.1889 
b_{3}  0.0949  0.1325  0.1248  0.0854 
b_{4}  0.1616  0.2084  0.2485  0.2047 
b_{5}  −0.0204  0.0220  0.1263  0.0414 
u_{01}  −0.0765  −0.0842  −0.1249  −0.0646 
u_{02}  0.0955  0.0868  0.1189  0.0469 
u_{03}  −0.0226  −0.0365  −0.0701  −0.0126 
u_{04}  −0.0215  −0.0260  −0.0184  −0.0163 
u_{05}  −0.0085  −0.0118  −0.0088  −0.0070 
u_{06}  −0.0089  −0.0129  −0.0145  −0.0104 
u_{07}  −0.0014  0.0173  0.0180  0.0156 
u_{08}  −0.0366  −0.0386  −0.0238  −0.0098 
u_{09}  0.0180  0.0240  0.0145  0.0018 
u_{010}  0.0039  0.0118  0.0058  −0.0041 
u_{011}  −0.0208  −0.0305  −0.0363  −0.0281 
u_{012}  −0.0202  −0.0182  0.0087  −0.0204 
u_{013}  −0.0125  −0.0104  0.0150  −0.0120 
u_{014}  −0.0045  0.0073  −0.0270  0.0104 
u_{015}  0.0107  0.0228  0.0315  0.0215 
t  −0.0436  −0.0422  −0.0440  −0.0421 
t^{2}  0.0039  0.0038  0.0039  0.0038 
Nontrad  −0.0343  −0.0455  −0.0516  −0.0562 
Nontrad^{2}  0.0066  0.0083  0.0085  0.0095 
Equity  −0.3892  −0.3857  −0.3324  −0.3070 
Equity^{2}  0.0211  0.0218  0.0170  0.0166 
 0.0593  0.0600  0.0604  0.0610 
γ  0.8038  0.8029  0.8065  0.8054 
η_{1}  0.4957  0.4924  0.4878  0.4878 
η_{2}  0.1885  0.1880  0.1865  0.1874 
Loglikelihood  1276.0  1261.9  1270.1  1257.2 
Curvature violations  27.5%  0  22.4%  0 
Monotonicity violations  8.4%  7.1%  0  0 
Mean efficiency  0.9003  —  —  0.9001 
Table XIV. Parameter estimates for group 11
Parameter  Unconstrained  Curvature only  Monotonicity only  Both monotonicity and curvature 

u_{0}  2.4555  2.4245  2.2933  2.3250 
b_{1}  0.7153  0.7153  0.7035  0.6845 
b_{2}  0.0126  0.0123  0.0365  0.0321 
b_{3}  −0.0280  −0.0280  0.0622  0.0609 
b_{4}  −0.3451  −0.3451  0.2483  0.2173 
b_{5}  0.7115  0.7115  0.3907  0.3571 
u_{01}  −0.0803  −0.0803  −0.1619  −0.1266 
u_{02}  0.0098  0.0097  0.0971  0.0768 
u_{03}  −0.0078  −0.0080  −0.0875  −0.0696 
u_{04}  −0.0338  −0.0345  0.0008  0.0003 
u_{05}  −0.0045  −0.0105  0.0011  0.0016 
u_{06}  −0.0474  −0.0458  −0.0358  −0.0323 
u_{07}  −0.1212  −0.1211  0.0282  0.0233 
u_{08}  0.0268  0.0266  0.0045  0.0060 
u_{09}  −0.0328  −0.0327  −0.0082  −0.0094 
u_{010}  −0.0477  −0.0446  −0.0199  −0.0201 
u_{011}  0.1241  0.1241  −0.0247  −0.0194 
u_{012}  0.1461  0.1461  0.0574  0.0486 
u_{013}  0.1506  0.1506  0.0581  0.0495 
u_{014}  −0.1382  −0.1376  −0.0487  −0.0389 
u_{015}  −0.1085  −0.1084  0.0348  0.0279 
t  −0.0315  −0.0315  −0.0342  −0.0339 
t^{2}  0.0031  0.0029  0.0033  0.0033 
Nontrad  −0.1056  −0.1056  −0.1309  −0.1292 
Nontrad^{2}  0.0143  0.0142  0.0169  0.0167 
Equity  0.8097  0.8097  0.7047  0.7264 
Equity^{2}  −0.0571  −0.0571  −0.0512  −0.0521 
 0.0706  0.0706  0.0728  0.0722 
γ  0.8438  0.8438  0.8438  0.8421 
η_{1}  0.4668  0.4668  0.4526  0.4605 
η_{2}  0.2068  0.2068  0.2052  0.2071 
Loglikelihood  1283.9  1266.8  1258.9  1257.6 
Curvature violations  3.8%  0  2.3%  0 
Monotonicity violations  21.4%  25.1%  0  0 
Mean efficiency  0.9034  —  —  0.9024 
Table XV. Parameter estimates for group 12
Parameter  Unconstrained  Curvature only  Monotonicity only  Both monotonicity and curvature 

u_{0}  9.9718  8.9411  8.1803  11.0359 
b_{1}  0.8387  0.7629  0.7857  0.6458 
b_{2}  0.0200  0.0677  0.0479  0.1558 
b_{3}  0.0051  −0.0150  0.0378  0.2757 
b_{4}  −0.0320  0.0840  0.1272  1.4352 
b_{5}  0.3304  0.1658  0.2147  0.4760 
u_{01}  −0.0956  −0.0606  −0.0784  −0.5649 
u_{02}  0.0018  0.0027  0.0084  0.5311 
u_{03}  −0.0036  0.0116  −0.0010  −0.4995 
u_{04}  −0.0754  −0.0664  −0.0203  −0.0495 
u_{05}  −0.0191  −0.0112  −0.0017  −0.0163 
u_{06}  −0.0472  −0.0437  −0.0116  −0.0158 
u_{07}  0.0135  0.0438  0.0171  0.3617 
u_{08}  0.0451  0.0469  0.0220  −0.0450 
u_{09}  −0.0385  −0.0422  −0.0206  0.0439 
u_{010}  −0.0536  −0.0538  −0.0294  0.0389 
u_{011}  0.0005  −0.0307  −0.0021  −0.3706 
u_{012}  0.0554  0.0095  0.0340  0.1043 
u_{013}  0.0571  0.0113  0.0331  0.1016 
u_{014}  −0.0513  −0.0087  −0.0363  −0.1090 
u_{015}  −0.0055  0.0258  −0.0025  0.3537 
t  −0.0517  −0.0512  −0.0563  −0.0637 
t^{2}  0.0044  0.0043  0.0055  0.0062 
Nontrad  0.0007  0.0111  0.0277  −0.0078 
Nontrad^{2}  0.0047  0.0036  0.0002  0.0037 
Equity  −1.4230  −1.1720  −1.0914  −2.6916 
Equity^{2}  0.0871  0.0704  0.0590  0.1637 
 0.0679  0.0706  0.0579  0.0632 
γ  0.8513  0.8537  0.8083  0.8155 
η_{1}  0.2559  0.2358  0.3856  0.3318 
η_{2}  0.1018  0.0982  0.1357  0.1149 
Loglikelihood  660.6  651.8  627.5  603.1 
Curvature violations  34.1%  0  24.9%  0 
Monotonicity violations  46.5%  46.6%  0  0 
Mean efficiency  0.8867  —  —  0.8856 
A parametric bootstrapping method is usually used in constrained optimization to obtain statistical inference for the estimated parameters, or for nonlinear transformations of these parameters such as elasticities or efficiency measures (see Gallant and Golub, 1984). This involves using Monte Carlo methods to generate a sample from the distribution of the inequality constrained estimator that is large enough to obtain a reliable estimate of the relevant sampling distributions. However, the feasibility of Monte Carlo methods depends on the complexity of the problem in question. For a simple problem, where the objective function is simple and the number of observations and constraints is small (like the traditional factor demand problem with 24 observations in Gallant and Golub, 1984), a few hundred simulations are easily affordable in terms of computing time. Unfortunately, this is not the case with our problem: the complicated objective function and the large number of observations and constraints render the Monte Carlo method almost unaffordable. In particular, it takes at least 1 hour of CPU time on a Pentium 4 PC to run the optimization problem once, so 500 simulations would take at least 500 hours and, with 12 bank subgroups, over 6000 hours of CPU time would be needed to obtain standard errors for all the groups. This is certainly unaffordable at present. Therefore, only point estimates are provided for the estimated parameters in Tables IV–XV.
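The computing-time arithmetic in this paragraph can be restated explicitly (the figures are those quoted in the text):

```python
# CPU-time arithmetic quoted in the text: at least 1 hour per optimization
# run, 500 bootstrap replications, 12 bank subgroups.
hours_per_run = 1
replications = 500
subgroups = 12

hours_per_subgroup = hours_per_run * replications
hours_total = hours_per_subgroup * subgroups
print(hours_per_subgroup, hours_total)  # 500 6000
```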
When neither monotonicity nor curvature is imposed (see the second column of each table), both conditions are violated for each of the 12 subgroups, with the percentage of curvature violations ranging from 1.4% to 34.1% across subgroups and that of monotonicity violations ranging from 0.1% to 46.5%. Since regularity is not achieved for any of the 12 bank subgroups, we first impose curvature alone on the parameters of the cost function. The imposition of curvature alone reduces the percentage of curvature violations to zero for each of the 12 bank subgroups; however, it does not guarantee the satisfaction of monotonicity at every data point (see the third column of each table). In particular, the percentage of monotonicity violations still ranges from 3.2% to 46.6% across bank subgroups when only curvature is imposed. We further notice that, while the imposition of curvature alone eliminates curvature violations for all 12 bank subgroups, it may also induce monotonicity violations that otherwise would not have occurred. Taking bank subgroup 1 (see Table IV) as an example, the percentage of monotonicity violations is 0.1% when no constraints are imposed, but increases to 5.7% when curvature alone is imposed. This confirms Barnett's (2002, p. 202) argument that ‘imposition of curvature may increase the frequency of monotonicity violations. Hence equating curvature alone with regularity, as has become disturbingly common in this literature, does not seem to … be justified.’
Similarly, the imposition of monotonicity alone reduces the percentage of monotonicity violations to zero for each of the 12 bank subgroups, but it does not guarantee the satisfaction of curvature at every data point (see the fourth column of each table). In particular, the percentage of curvature violations still ranges from 1.9% to 37.5% across subgroups when only monotonicity is imposed. We also notice that the imposition of monotonicity alone may induce curvature violations that otherwise would not have occurred (see, for example, bank subgroup 1). This further confirms the argument of Barnett and Pasupathy (2003, p. 135) that ‘regularity requires satisfaction of both curvature and monotonicity conditions. Without both satisfied, the second order conditions for optimizing behavior fail and duality theory fails.’ We thus followed the procedures discussed in Sections 3 and 4 and imposed both curvature and monotonicity on the parameters of the Fourier cost function for each of the 12 bank subgroups. As expected, regularity is satisfied at every data point once curvature and monotonicity are globally imposed (see the fifth column in each of Tables IV–XV).
A common practice in this literature is to derive cost efficiency measures from cost functions without theoretical regularity imposed. While permitting a parameterized function to depart from the neoclassical function space is usually fit-improving (as can be seen from the decrease in the log-likelihood values as constraints are imposed), it also causes the hypothetical best-practice firm not to be fully efficient at those data points where curvature and/or monotonicity are violated. In particular, a violation of curvature at a data point (p_{jt}, y_{jt}) implies that the demand for some inputs increases as their own prices increase (holding other things constant), and a violation of monotonicity at that data point implies that total cost decreases as some input price or output quantity increases (holding other things constant). In either case the best-practice firm is not minimizing its cost at (p_{jt}, y_{jt}). Therefore, cost efficiency, which is supposed to be measured relative to a cost-minimizing best-practice bank, is mismeasured for all 12 bank subgroups when monotonicity and curvature are not imposed. In fact, we find that the difference in the 8-year mean efficiency between the unconstrained models and their corresponding curvature- and monotonicity-constrained versions ranges from −0.73% to 0.92% (see Table XVI).1 Hence, the failure to impose monotonicity and curvature can produce misleading estimates of cost efficiency.
Table XVI. Differences in average efficiency between unconstrained and regularity constrained models
Bank group  Difference in average efficiency 

Large banks 
Group 1  −0.48% 
Group 2  0.32% 
Group 3  −0.73% 
Medium banks 
Group 4  0.92% 
Group 5  −0.05% 
Group 6  −0.12% 
Group 7  0.03% 
Small banks 
Group 8  0.16% 
Group 9  −0.20% 
Group 10  0.01% 
Group 11  0.10% 
Group 12  0.11% 
Another issue of particular interest is whether failure to impose theoretical regularity affects the ranking of individual banks in terms of cost efficiency. We calculate the Spearman rank correlation coefficient between unconstrained models and their corresponding (curvature and monotonicity) constrained versions, using the following formula:
R = 1 − [6 ∑_{j=1}^{n_{k}} (Rank_{j1} − Rank_{j2})²] / [n_{k}(n_{k}² − 1)]  (34)
where n_{k} is the number of banks in the subgroup, Rank_{j1} is the rank of bank j based on the constrained version of the model, and Rank_{j2} is the rank of the same bank based on the unconstrained version of the model.
If R = −1, there is perfect negative correlation; if R = 1, there is perfect positive correlation; and if R = 0, there is no correlation. As can be seen in Table XVII, all 12 rank correlation coefficients are different from 1, indicating that the ranking of banks in terms of cost efficiency changes when theoretical regularity is imposed.
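Equation (34) is straightforward to evaluate from the two rank vectors. A minimal sketch (the function name and example rankings are ours, assuming integer ranks 1..n with no ties):

```python
def spearman_rank_corr(ranks_constrained, ranks_unconstrained):
    """Spearman rank correlation as in equation (34):
    R = 1 - 6 * sum(d_j ** 2) / (n_k * (n_k ** 2 - 1)),
    where d_j is the rank difference for bank j and n_k is the
    number of banks in the subgroup. Assumes ranks 1..n, no ties.
    """
    n = len(ranks_constrained)
    d2 = sum((r1 - r2) ** 2
             for r1, r2 in zip(ranks_constrained, ranks_unconstrained))
    return 1.0 - 6.0 * d2 / (n * (n ** 2 - 1))


# Identical rankings give R = 1; a fully reversed ranking gives R = -1.
print(spearman_rank_corr([1, 2, 3, 4], [1, 2, 3, 4]))  # 1.0
print(spearman_rank_corr([1, 2, 3, 4], [4, 3, 2, 1]))  # -1.0
```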
Table XVII. Spearman rank correlation coefficients between unconstrained and constrained models

Bank group  Rank correlation coefficient
Large banks 
Group 1  0.9997 
Group 2  0.9861 
Group 3  0.9792 
Medium banks 
Group 4  0.9460 
Group 5  0.9809 
Group 6  0.9742 
Group 7  0.9869 
Small banks 
Group 8  0.9963 
Group 9  0.9918 
Group 10  0.9911 
Group 11  0.9772 
Group 12  0.8684 
Roughly speaking, the rank correlation coefficient between unconstrained and (theoretical regularity) constrained models is negatively related to the percentage of monotonicity and curvature violations. For example, bank subgroup 1, which has the lowest percentage of monotonicity violations (0.1%) and the lowest percentage of curvature violations (1.4%), has the highest rank correlation coefficient (0.9997); bank subgroup 12, which has the highest percentage of monotonicity violations (46.5%) and the highest percentage of curvature violations (34.1%), has the lowest rank correlation coefficient (0.8684). Hence, we alert researchers to the potential problems caused by failure to check for and impose (if necessary) theoretical regularity.
6.1. Cost Efficiency and Productivity of US Banks
We now turn to the discussion of cost efficiencies by asset size class, reported in Table XVIII. The mean efficiency for each of the 12 subgroups ranges from 82.19% to 91.78%, implying that about 8–18% of incurred costs over the sample period can be attributed to cost inefficiency relative to the best-practice banks. These results are similar to earlier estimates for commercial banks. Berger and Humphrey (1997), for example, report mean cost efficiency of 84% with a standard deviation of 6% across 50 studies of US banks using parametric frontier techniques. Likewise, Berger and Mester (1997) report average cost efficiency of 87% using a large dataset of almost 6000 US commercial banks in continuous operation over the six-year period from 1990 to 1995.
Table XVIII. Cost efficiency (in %) per asset group

Bank group  Mean  Min.  Max.  5th percentile  95th percentile
Large banks 
Group 1  82.19  42.50  98.95  66.68  97.42 
Group 2  88.20  72.52  99.21  77.04  98.36 
Group 3  90.38  72.22  99.45  81.71  98.08 
Medium banks 
Group 4  88.56  71.86  98.58  77.89  97.86 
Group 5  89.61  74.17  99.48  79.77  97.71 
Group 6  88.90  72.90  99.40  78.97  98.20 
Group 7  91.78  78.52  98.85  83.87  97.90 
Small banks 
Group 8  91.29  77.16  98.79  83.11  98.12 
Group 9  89.36  75.28  99.30  75.42  98.07 
Group 10  90.01  75.56  98.96  80.48  97.93 
Group 11  90.24  75.53  98.97  80.99  97.85 
Group 12  88.56  70.32  98.97  78.18  97.58 
There are several findings that emerge from Table XVIII. First, the largest two subgroups are less efficient than the other 10 subgroups. In particular, the very largest subgroup (with assets greater than $ 3000 million) is about 5.6% less efficient than the second largest subgroup and 7.8% less efficient than the third largest subgroup; the same subgroup is 6.3–9.6% less efficient than the medium-sized and small banks. The second largest subgroup (with assets between $ 1000 million and $ 3000 million) is 1.2% less efficient than the third largest subgroup, and ranges from 0.9% to 3.9% less efficient than the medium-sized and small bank subgroups. Second, in general, cost efficiency falls with bank size for banks with assets above $ 100 million, except for subgroup 3 (with assets between $ 500 million and $ 1 billion) and subgroup 5 (with assets between $ 300 million and $ 400 million); however, cost efficiency increases with bank size for banks with assets below $ 200 million, except for subgroup 9 (with assets between $ 60 million and $ 80 million) and subgroup 10 (with assets between $ 40 million and $ 60 million). These findings are partially consistent with Kaparakis et al. (1994), who applied a translog cost function to a dataset of 5548 commercial banks in the United States. In particular, Kaparakis et al. (1994) also find that banks with assets greater than $ 1000 million are less efficient than smaller banks; however, they find that average efficiency increases with bank size for banks with assets less than $ 500 million.
We are also interested in the time patterns of cost efficiency of the different bank subgroups, plotted in Figure 1. Several conclusions emerge. First, all the bank subgroups experienced a drastic decline in cost efficiency over the period from 1998 to 2004, followed by an improvement in 2005. For example, the cost efficiency of the largest bank subgroup (with assets greater than $ 3000 million) declined from 94.79% in 1998 to 73.65% in 2004, and then recovered slightly to 73.9% in 2005. The most efficient subgroup (with assets between $ 100 million and $ 200 million) shows a decline in cost efficiency from almost full efficiency in 1998 to 80.12% in 2004, followed by a rebound to 84.47% in 2005. Second, the largest bank subgroup is consistently less efficient than the other bank subgroups, and the gap in cost efficiency between the largest bank subgroup and the others has widened: the largest banks were 3.34% less efficient than the second largest bank subgroup in 1998, but 7.24% less efficient in 2005.
The drastic decline in cost efficiency for all asset size classes during the first 7 years of our sample period can be partially justified by the failure of banks to adjust to the rapid technological change of the best practice cost frontier. Figure 2 plots the technological change of the best practice cost frontier for all the size classes. Clearly, all 12 asset size classes have shown rapid technological change, with large banks being more favored by the technological change. In particular, the largest size subgroup (with assets greater than $ 3000 million) has seen the fastest technological change of around 6.71% per year; and even the second smallest size subgroup (with assets between $ 20 million and $ 40 million)—which has shown the lowest technological change—has also seen a technological change of around 1% per year. Rapid technological change, which makes feasible the production of given levels of outputs with fewer inputs (or, equivalently, the production of more outputs with given levels of inputs), could result in lower average bank efficiency, even if banks became increasingly productive over time. This can be clearly seen from equation (5).
The second reason may lie in unmeasured improvements in service quality and variety. Banks have provided an improved array of services (e.g., mutual funds, derivatives, online services) that increased bank costs, but at the same time were able to raise revenues to more than cover these costs. This is consistent with the strong improvement in profitability over the sample period. Another partial explanation for the decline in cost efficiency for the very large banks (those with assets greater than $ 1 billion) is that many of them have engaged in geographical diversification and product diversification. The passage of the Riegle–Neal Interstate Banking and Branching Efficiency Act of 1994 undoubtedly helped spur large banks to spread across state lines and to grow. This development helped create large, geographically diversified branch networks that stretch across large regions and even coast to coast. The Gramm–Leach–Bliley Financial Services Modernization Act of 1999 allowed the largest banking organizations to engage in a wide variety of financial services, acquiring new sources of noninterest income and further diversifying their earnings. While these geographical and product diversifications increased the large banks' profits, they also greatly increased their costs.
Finally, one point needs clarification: lower cost efficiency does not imply lower productivity growth. To illustrate this, we calculate the average productivity growth for each bank subgroup over the sample period. Within a cost frontier context, productivity growth can be decomposed into four components: a technological change term, a technical efficiency change term, an input allocative efficiency change term, and a scale effect term (see Kumbhakar and Lovell, 2003, for more details). For simplicity, we ignore the last term and call the productivity growth composed of only the first three terms 'net' productivity growth (NTFPG). Following Kumbhakar and Lovell (2003), we express the net productivity change as
NTFPG = −∂ln C(y, p, t)/∂t + ∂ln CE/∂t  (35)
where the first term is the technological change of the best practice cost frontier and the second term is the change in cost efficiency, including both technical and allocative efficiency changes. The average annual net productivity growth for each of the 12 subgroups is plotted in Figure 3. Generally speaking, the net productivity growth rate increases with asset size, with the largest four bank subgroups (with assets greater than $ 400 million) experiencing significant productivity gains (NTFPG > 1%) and the smallest eight subgroups (with assets less than $ 400 million) experiencing insignificant productivity gains (NTFPG < 1%) or productivity losses (NTFPG < 0). In particular, the largest size subgroup, which has the lowest cost efficiency, shows the fastest average annual net productivity growth of 3.3% whereas subgroup 7, which has the highest cost efficiency, shows a moderate average annual net productivity growth of 0.4%. This finding is also consistent with the view expressed by Berger (2003), Bernanke (2006) and others that technological advances have favored larger banks at the expense of small lenders. However, these productivity gains by larger banks are mainly due to technological advances rather than cost efficiency gains.
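The decomposition in equation (35) is simple to evaluate for two adjacent periods. The sketch below is illustrative only (the function name and the numbers are ours, not the paper's estimates): it adds the frontier's technological change to the discrete log change in cost efficiency.

```python
import math

def net_tfp_growth(tech_change, eff_prev, eff_curr):
    """Net productivity growth per equation (35): the technological
    change of the best-practice frontier (-dlnC/dt) plus the change
    in log cost efficiency between two periods. Efficiency scores
    are assumed to lie in (0, 1]."""
    efficiency_change = math.log(eff_curr) - math.log(eff_prev)
    return tech_change + efficiency_change


# A 5% frontier shift can outweigh a drop in cost efficiency from
# 0.90 to 0.88, leaving net productivity growth positive -- which is
# why the least efficient subgroup can still post the fastest growth.
print(net_tfp_growth(0.05, 0.90, 0.88) > 0)  # True
```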
7. CONCLUSION
The estimation of stochastic cost frontiers is popular in the analysis of bank efficiency. However, the theoretical regularity conditions (especially those of monotonicity and curvature) required by neoclassical microeconomic theory have been widely ignored in the literature. In this paper, and for the first time in this literature, we use the globally flexible Fourier functional form, as originally proposed by Gallant (1982), together with the estimation procedures suggested by Gallant and Golub (1984), to impose the theoretical regularity conditions on the Fourier cost function. Hence, we provide estimates of bank efficiency in the United States using (for the first time) parameter estimates that are consistent with global theoretical regularity.
We find that failure to incorporate monotonicity and curvature into the estimation results in mismeasured magnitudes of cost efficiency and misleading bank rankings in terms of cost efficiency. Regarding cost efficiencies from our regularity-constrained models, we find that the largest two subgroups are less efficient than the other subgroups. We also find that all 12 asset size classes show a decline in cost efficiency from 1998 to 2004, followed by a slight improvement in 2005. This decline in cost efficiency can be the result of slow adjustment to rapid technical progress of the best-practice cost frontier or of unmeasured improvements in service quality and variety. For the very large banks, the decline in cost efficiency may also reflect their engagement in geographical and product diversification after deregulation. Further, we find that the largest four bank subgroups (with assets greater than $ 400 million) experienced significant productivity gains (NTFPG > 1%) and the smallest eight subgroups (with assets less than $ 400 million) experienced insignificant productivity gains (NTFPG < 1%) or productivity losses.
In estimating bank efficiency and productivity in the United States, we have also highlighted the challenge inherent with achieving economic regularity and the need for economic theory to inform econometric research. Incorporating restrictions from economic theory seems to be gaining popularity as there are also numerous recent papers that estimate stochastic dynamic general equilibrium models using economic restrictions (see Aliprantis et al., 2007). With the focus on economic theory, however, we have ignored econometric regularity. In particular, we have ignored unit root and cointegration issues, because the combination of nonstationary data and nonlinear estimation in large models like the ones in this paper is an extremely difficult problem. Dealing with these difficult issues is an area for potentially productive future research.