Non‐Discriminatory Trade Policies in Panel Structural Gravity Models: Evidence from Monte Carlo Simulations

This paper provides Monte Carlo (MC) simulation evidence on the performance of methods used for identifying the effects of nondiscriminatory trade policy (NDTP) variables in panel structural gravity models. The benchmarked methods include a fixed effect (FE) estimator that utilizes data on intra national trade flows, the bonus&#8208;vetus (BV) and the two&#8208;stage fixed effect (FE&#8208;2S) estimator. The results indicate that only the FE estimates are unbiased and consistent under very general assumptions of the data generating process. The favourable asymptotic properties of the FE estimator unfold as the number of period T increases.


| 857
SELLNER which implicitly determines p it and Y it , given α and the joint distribution of τ ijt and H it (see Egger & Staub, 2016). Substituting this price back into the import demand equation and defining P jt = ∑ N i = 1 (p it ijt ) as the CES price index results in the well-known structural gravity model with Y it and Y jt being production and expenditures, whereas for simplicity balanced trade is assumed. From the definition of the multilateral resistance terms (MRT) given in Equations 4 and 5 it follows, that E[Π it τ ijt ] ≠ 0, E[P jt τ ijt ] ≠ 0 and E[Π it P jt ] ≠ 0. Addressing this correlation structure will be a central feature of the DGP that will be used for the Monte Carlo simulations. To be consistent with a structural gravity model, prices p it need to be determined endogenously from the structure of the model. Production and expenditure in turn depend on the prices and, along with the trade costs, determine the trade flows. Equations 1 and 2 can therefore be used to generate the endogenous variables p it , Y it , and the structural part 4 of the trade flows, consistent with the economic structure of the model.

| Econometric specification
Specifying the unobserved trade costs ijt and introducing a stochastic term, the model given by Equations 3 to 5 can be re written in a form that can be econometrically estimated. Denoting e it = ln Y it ∕Π it , m jt = ln Y jt ∕P jt and specifying the trade costs by a vector of observable variables D ijt + NIP jt + ij = ln ijt , leads to To keep the model simple, the trade costs are specified by time-varying discriminatory variable D ijt , a time-varying nondiscriminatory import protection measure NIP jt and a time-invariant directional pairwise trade cost component θ ij .
The assumptions about the multiplicative error η ijt will determine which model will be identified. For instance, OLS on a log-linearized version of Equation 6 will be identified if E( ln ijt |e it , m jt , D ijt , NIP jt ) = 0, that is, there is no dependence between the covariates and the logarithm of the error term. Assuming E( ijt |e it , m jt , D ijt , NIP jt ) = 1, that is, the errors are mean-independent of the covariates, permits identification via a GLM estimator 5 . Santos Silva and Tenreyro (2006) showed that the errors in gravity models of trade will in general be heteroskedastic and dependent on the covariates. OLS estimation on a log-linear version of Equation 6 will then yield biased estimates owing to Jensen's inequality. Furthermore, the presence of zero flows will provoke ad hoc solutions in log-linear models. Since the bulk of the current empirical gravity models are estimated in X ijt = exp e it + m jt + D ijt + NIP jt + ij ijt . 858 | SELLNER multiplicative form, mostly by PPML following the arguments brought in Santos Silva and Tenreyro (2006), the focus of this paper will be on GLM estimation with a logarithmic link function and a Poisson family. GLM estimates will be consistent if the conditional mean function is correctly specified. The efficiency of the estimator will depend on correct specification of the error variance, that is, the linear exponential family of the density chosen.
There are three different approaches towards a theory-consistent estimation of Equation 6 that controls for the structure imposed by e it and m jt . The most common procedure is to include N * T exporter-period and (N−1) * T importer-period fixed effects for e it and m jt (see, Feenstra, 2004;Harrigan, 1996). A second option is to apply a structurally iterated estimator (Egger & Staub, 2016;Head & Mayer, 2014). Given a set of starting values for e it and m jt , Equation 6 is estimated via GLM, restricting the coefficients on e it and m jt to one. The results are used to obtain an estimate of ijt , which is employed to solve for new estimates of e it and m jt . This procedure is repeated until convergence. A third option is the quasi-differences estimation, that uses two products of country-pairs to net out e it and m jt . This can be done either by using just-identified set of moment conditions (as described in Charbonneau, 2012) or apply a simply ratio-of-ratios estimator as outlined in Head and Mayer (2014) or Egger and Staub (2016). The latter can be implemented by GLM estimation on the sets of transformed variables X st = X ikt X ljt /X lkt X ijt and d st = (d ikt + d ljt )−(d lkt + d ijt ) with d being a trade cost variable 6 .

| Identification via intra national trade flows
Given the conditional mean is correctly specified, fixed effects, structurally iterated, and quasi-differences estimation will yield consistent estimates for bilaterally varying trade cost variables. If data on intra national trade flows is available, these methods may also be applied to identify the effects of nondiscriminatory trade policies in a structural gravity model. This identification strategy has just recently been introduced by Heid et al. (2017) and the approach is further discussed in Piermartini and Yotov (2016). As Piermartini and Yotov (2016) mention, the intra national dimension turns the monadic nondiscriminatory trade policy variables into dyadic variables. The key identifying assumption is that nondiscriminatory trade policies do not affect intra national trade. Heid et al. (2017) suggested a fixed effects panel model that can be formulated as follows for a cross-sectional setting: The unobserved terms e it and m jt are captured by the fixed effects FE it and FE jt . Furthermore, the timeinvariant directional pairwise trade costs, such as distance or common language, can be captured by directional pairwise fixed effects FE ij . The nondiscriminatory trade policy, per definition, affects all trading partners equally, but will not exert an impact on goods and services that are domestically produced and consumed (see Heid et al., 2017), that, is X iit . Thus, NIP jt is interacted with an international border dummy INT ij that is one if i ≠ j and zero otherwise. Identification is possible since the MRT will be identified over all periods and country-pairs, including intra national trade, while the nondiscriminatory trade policies can be identified by exploiting the variation between intra national and international flows.
Note that Equation 7 may also be estimated via the structurally iterated and the quasi-differences approach outlined above. However, the fixed effects estimator has several advantages. In contrast to the structurally iterated estimator, the FE estimator imposes no specific structural assumptions on the MRT. The fixed effects will capture the MRT even if some key variables in determining the trade costs are omitted or unobserved. Furthermore, the structurally iterated estimator is more prone to convergence failures than the fixed effects estimator (see Egger & Staub, 2016). Compared with the

| 859
SELLNER quasi-differences approach, the fixed effects estimator is computationally less demanding 7 and leads to lower standard errors (see Egger & Staub, 2016). A potential disadvantage of the FE estimator may be that, while the coefficients are consistently estimated, the asymptotic variance might be affected by the incidental parameter problem. This problem will in particular be present in cross-sectional analysis. Cameron and Trivedi (2005) showed that in a multiplicative model with one-way fixed effects, the incidental parameter problem can be avoided by using a concentrated likelihood function, but in the case of a two-way fixed effect model the problem would persist. Egger and Staub (2016) showed in their Monte Carlo simulations using cross-sectional data that the incidental parameter problem strongly affects the t statistics on their trade cost measure. As a solution to the problem for cross-sectional analysis, Jochmans (2017) derived a GMM estimator for two-way multiplicative gravity models. Regarding panel data applications, Fernández-Vala and Weidner (2016) introduced a correction for non-linear models, including a Poisson model, with two fixed effects. Furthermore, Pfaffermayr (2017Pfaffermayr ( , 2018 showed that his proposed constrained PPML, that exploits the theoretical SGM constraints instead of including exporter-period and importerperiod dummy variables as fixed effects, is not affected by the incidental parameter problem.
Another disadvantage, especially when interested in product-level analysis, is the availability of intra national trade flows. Readily available high quality data sources are still scarce 8 . Researchers are thus often forced to construct their own datasets based on available international trade flow data and national gross output statistics, with the latter data typically covering fewer countries and having longer publication lags. Furthermore, the probability of encountering zero flows will increase with the level detail of product groups. Some countries may not or only very infrequently export or import certain products at all or with specific trading partners. In such cases convergence problems of the estimator may arise or some of the exporter-period, importer-period, or time-invariant directional pairwise fixed effects may not be identified at all.
Finally, when interested in estimating the effects of more than one NDTP variable, that is, a nondiscriminatory importer-period-specific measure of import protection (NIP) and a nondiscriminatory exporter-period-specific export subsidy (NES) measure, the corresponding parameters may not be jointly identified. Heid et al. (2017) demonstrated a case of perfect collinearity of the fixed effects and the NDTP variables if a specific functional dependence between the NDTP measures across countries exists. Oberhofer, Pfaffermayr, and Sellner (2018) showed in their appendix that if the same countryperiod-specific measure is included as an importer-and exporter-specific NDTP, i.e. NIP ct = NES ct for c, ⋯, C, or if these two measures are highly correlated, (near) collinearity will occur.

| Identification methods without intra national trade flows
A challenge for empirical estimation occurs if data on intra national flows is not available. This may be the case because they are not published for the countries of interest, or they are reported with a severe time lag and do not match the periods for which the data on nondiscriminatory trade policies is reported. Since in such a case the nondiscriminatory trade policy measures will not vary over either the exporter or importer dimension, they will be perfectly collinear to the exporter-period or importer-period fixed effect and cannot be identified. Furthermore, the collinearity will lead to convergence failure in the structurally iterated estimates, since m jt and NIP jt cannot be jointly identified. The quasi-differences procedure is also not applicable, since d ikt + d ljt = d lkt + d ijt and thus d st = 0 for all s, thus none of the three methods discussed above will work.
One way to avoid collinearity is to simply construct a new dyadic variable from two monadic variables, taking into account that the functional form avoids collinearity. Examples from the literature include Lee and Park (2007) or Moïsé, Orliac and Minor (2011) that analyzed the effects of 860 | SELLNER trade facilitation indicators. However, as Head and Mayer (2014) note, most of the dyadic indicators constructed this way may not have a straightforward interpretation. Another ad hoc solution to the problem, noted in Piermartini and Yotov (2016), is to approximate the MRT by remoteness indexes, which has been the common practice in papers prior to Eaton and Kortum (2002) and Anderson and Van Wincoop (2003). Head and Mayer (2014) discussed various frequently used remoteness measures and found that none of them can capture the structure imposed by the theory. In a similar manner Piermartini and Yotov (2016) do not recommend the use of such remoteness measures to substitute MRT or fixed effects.
As a more promising solution to the problem, Piermartini and Yotov (2016) and Head and Mayer (2014) propose a two-stage fixed effects procedure. First the coefficients on the time-varying discriminatory trade cost variables are estimated along with exporter-period, importer-period and directional pairwise fixed effects. Then the respective fixed effects of the first stage are regressed on the NDTP and other country-period-specific variables. This two-step procedure has been frequently used in the literature. For instance, Kortum (2001, 2002) used this set-up to identify key parameters of their trade models, Melitz (2008) estimated the effects of the literacy rate and a measure of linguistic diversity on trade, and Gylfason, Martínez-Zarzoso, and Wijkman (2015) adapted this procedure on a panel with time-varying non bilateral data on corruption and democracy. Given its frequent use, this estimator will be included in the Monte Carlo simulations. Both stages will be estimated via GLM: The flaws of this procedure for obtaining an estimate of γ 2S under fixed effects assumption have been thoroughly outlined in the discussion that followed the fixed effects vector decomposition model introduced in Plümper and Troeger (2007). As Greene (2011) noted, it is not possible to identify the fixed effects and collinear variables under the assumptions of a fixed effects model. Only the preceding linear mixture of the two is estimable. The parameter γ 2S is only separately identified if the fixed effects assumption is violated. Another related approach to identify collinear variables was given in Egger (2005), who extended the Hausman and Taylor (1981) model (HTM) to the gravity framework. Besides being a log-linear model, the HTM imposes strong exogeneity assumptions on the discriminatory variables used as instruments for the identification of the effects of the NDTP measures 9 .
Another commonly applied approach to avoid the collinearity problem is the bonus vetus (BV) method (see Baier & Bergstrand, 2009), that approximates the multilateral resistance terms by a first-order log-linear Taylor series expansion around a symmetric trade cost world. To implement this method, compute The normalized trade flows X ijt /(Y it Y jt ) are then regressed on the transformed variables x MRS ijt . A more robust version for estimation purposes is to replace the GDP-weights with equal weights 1/N (see Baier & Bergstrand, 2010), which assumes a symmetric world in trade costs and economic sizes. While the BV method was originally derived for OLS on log-linearized model of Equation 6, BV-transformed variables haven often been used in PPML estimation (see, Bratt, 2017;Hoekman & Nicita, 2011;Moïsé & Sorescu, 2013). Given its continuing popularity in empirical applications, the BV method (in its more robust, equally weighted specification) is included in the Monte Carlo simulations:

SELLNER
In cases of no missing data and assuming a log-linear model with homoskedastic errors, Head and Mayer (2014) showed in Monte Carlo simulations that the BV method yields unbiased estimates for discriminatory trade policy variables in a cross-section. It will also be demonstrated later, that under certain conditions the BV method also yields unbiased estimates on the NDTP variable in a panel setting. The use of the BV method is often motivated on grounds of collinearity in cases where data on intra national flows is missing. It is therefore illustrative to assess the performance of the BV method in a GLM estimation framework without data on intra national trade flows.
Estimates of such a model are subject to several sources of bias. First, under a multiplicative non linear model with heteroskedastic errors, the log-linearized BV-transformations will lead to biased estimates, as shown in Santos Silva and Tenreyro (2006). Since the BV method resembles a doubledemeaning procedure, their GLM counterpart would invoke concentrating out the mean effects of exporters and importers or applying some quasi-differences approaches as described above. Such a procedure would strongly complicate the use of this method, whose popularity is a result of its simplicity for application. A second source of bias comes from missing observations, a problem that is usually encountered in empirical applications of the gravity model. As the MC results of Head and Mayer (2014) showed, this bias will be negligible for discriminatory continuous variables. The third source of bias can be attributed to the correlation between the latent inward multilateral resistance term m jt and the importer-period-specific NDTP variable NIP jt .
For comparison with the above models, a naive gravity model is added to the MC simulations: The naive model relates to the basic Newtonian model with exporters and importers GDP's as mass variables. Estimates of γ obtained by the naive gravity model are biased since they omit the MRT that are correlated with the NDTP variable. Prior to the structural gravity model, researchers usually extended the naive gravity model by remoteness indexes. As mentioned before, none of the remoteness variables suggested captures the structure implied by the MRT. In order to avoid selecting an arbitrary remoteness index for an augmented gravity model, the naive gravity model is used as a benchmark.

| MONTE CARLO SIMULATIONS
Monte Carlo simulation studies based on a DGP that is consistent with structural gravity theory are still scarce. Baier and Bergstrand (2009) employed Monte Carlo simulations to motivate that their bonus vetus OLS method yields results similar to the customized non-linear least squares estimator of Anderson and Van Wincoop (2003). They used observed data on GDP, distance, and borders of the McCallum (1995) Canadian-United States province trade dataset for the simulation analysis, fixing the coefficients on distance and border to a true value. The obtained trade costs were used to solve for the MRT, which were in turn used to construct the expected trade flows of their log-linear model. Prior to estimation they added a log-normal error term with a variance such that a non structural gravity regressions yields an R 2 of between 0.7 to 0.8. Head and Mayer (2014) adopted a similar approach, using actual data on GDP, distance and regional trade agreements. In contrast to Baier and Bergstrand (2009) they included the log-normal error term in the trade cost, prior to calculation of the MRT, rather than prior to estimation as in Equation 6.

SELLNER
They emphasize this difference in design, noting that they pursued a structural instead of a statistical approach to the error term 10 . The variance of their error term was calibrated to the root mean squared error of a least squares dummy-variable regression on their data. Similar to Baier and Bergstrand (2009), they restrict their analysis to log-linear models and benchmarked a series of linear estimators, including fixed effects, structurally iterated, quasi-differences, and BV.
As the focus of this paper is on GLM estimation, the simulation approaches of Baier and Bergstrand (2009) and Head and Mayer (2014) for log-linear models cannot be applied. The DGP that will be used in the following, is based on the design introduced in Egger and Staub (2016). As in Baier and Bergstrand (2009) their DGP consists of an deterministic structural part that determines the expected trade flows and a stochastic part that introduces noise subsequently.

| Data generating process
The following Monte Carlo experiments are applied to empirical regressors as well as to fully simulated regressors. The DGP is the same for both datasets and obeys the restrictions imposed by the structural gravity model outlined in Section 2. Using empirical data on the regressors and effect sizes of the trade cost variables from the empirical literature has the advantage of producing data properties that are likely to be encountered in practice. In contrast, a full simulation of all regressors enables the researcher to fine tune dispersion and correlation properties of the data and latent parameters.

| Empirical regressors
As empirical regressors we use data on gross domestic product (GDP), the simple mean of the applied most-favored nation tariff rate (MFN) and a dummy variable for regional trade agreements for 150 countries 11 over 3-year intervals between 1994 and 2015. Data on GDP in constant U.S.$ million and tariffs is obtained from the World Bank Database and data on regional trade agreements is taken from "Mario Larch's Regional Trade Agreements Database" (see Egger & Larch, 2008). Missing values in the most-favored nation tariff variable were interpolated by the conditional mean prediction from a Poisson regression on the logarithm of GDP, dummy variables indicating the income group definition of the World Bank, a full set of importer country dummies and year dummies 12 . The deterministic part of the trade flows is then generated by setting the linear index of the trade costs to solving for the multilateral resistances in Equations 4 and 5 using the empirical measures for GDP and substituting back to Equation 3. The resulting deterministic trade flows will be realized in expectation E(X ijt ) = μ ijt . This deterministic or structural part is then augmented by a stochastic component, as will be described later. The coefficient on regional trade agreements D ijt = RTA ijt and the most-favored nation tariff NIP jt = ln (1 + MFN jt ) are set to β = 0.5 and γ = −5, which corresponds to values 13 found in the empirical literature (see, Head & Mayer, 2014). Since there is no natural empirical benchmark on how to choose the directional pairwise trade cost components θ ij , we draw them from a normal distribution with mean −0.6 and standard deviation of 0.3 to resemble key moments of the distribution of 0.5 × RTA ijt −5 × ( ln (1 + MFN jt ) × INT ij ), that is, the sum of the linear trade costs index of the other two components. Setting θ ij this way specifies equal contributions of observed and unobserved components in expectation.

| Fully-simulated regressors
Our second series of MC experiments is based on a fully simulated data set, extending the framework of Egger and Staub (2016) to a panel including an importer-specific NDTP variable and directional pairwise trade cost components. Again, the first part of the DGP is the deterministic component, that obeys the structural gravity theory and will be realized in expectation. For the initial period t = 1, we draw four correlated variables from a multivariate normal distribution with covariance matrix with the superscripts H, D, NIP and θ denoting the auxiliary variable for endowment, the time-varying discriminatory, the time-varying nondiscriminatory and the directional pairwise trade cost component. Means, standard deviations and correlations are set in a similar manner as in Egger and Staub (2016) N∕4, σ D = σ NIP = 5, and σ θ = 3. In the baseline scenario the variance scaling factor v z is set to 0.1. The endowments and the trade cost variables for the initial period t = 1 are then obtained by with α = −4 in the baseline scenario, following Anderson and Van Wincoop (2003). The variables are constructed for the full set of N 2 trade flows including intra trade flows. As in Egger and Staub (2016), increasing N will increase the mean and variance but will not affect the coefficient of variation or the endowment share of a country relative to the world. The variables for the periods t > 1 are generated by applying growth factors drawn from uniform distributions: The total trade cost component T ijt = ijt is then specified by: The coefficients of the DGP are set to β = 0.5 and γ = −5 to facilitate comparison with the DGP of the empirical dataset described above. The structural part is derived using H it and T ijt to solve for p it in Equation 2. Recognizing that Equation 1 only holds in expectation, Y it = p it H it and T ijt = τ α are then used to obtain E(X ijt ) = μ ijt .
In the next step, the observed trade flows-for both the empirical and the fully simulated dataset-are obtained from the structural part by multiplying the deterministic μ ijt with the stochastic η ijt component The errors η ijt with E(η ijt ) = 1 are drawn from the heteroskedastic log-normal distribution The simulation exercises are carried out assuming 2 , ijt = ln 1 + −1 ijt in the baseline setting. In this setting, GLM with a Poisson variance function, that is, PPML, will lead to an asymptotically efficient estimator. In the Monte Carlo simulations, the deterministic components are derived from the empirical data or constructed by drawing the simulated dataset once. In all but one experiment, the full sample size of N = 150 countries and T = 8 periods is employed for solving the model whereas only the largest n = 50, 100, 150 countries in terms of GDP or endowment and the first t = 3, 5, 8 periods are selected for the respective the sample sizes. Within each experiment, the stochastic part is added in each replication as described in Equations 23 and 24.

| Description of MC scenarios
The first two Monte Carlo experiments are conducted on the data generating process using the empirical data. Table 1 shows the data properties of the empirical data generating process. Since the largest n countries in terms of GDP are considered, the dispersion-given by the coefficient of variation CVin trade flows increases with N and decreases in T. Also note that the correlation-denoted by ρ(·, ·)between the discriminatory and nondiscriminatory variable decreases with increasing N and T. In a second experiment, the robustness of the PPML estimator with respect to a different assumption of the variance process is assessed. In a GLM the choice of the linear exponential family distribution will lead to a particular variance function, under which the estimator is asymptotically efficient. In the baseline scenario the errors are drawn from Equation 24 with 2 , ij = ln 1 + −1 ij , resulting in V(X ijt ) = μ ijt , that is, a Poisson family. While the PPML is most frequently used in empirical research, other applied PML estimators include Gamma (see, Martínez-Zarzoso, 2013; Santos Silva & Tenreyro, 2006) and negative binomial (see Burger, Van Oort, & Linders, 2009;Head, Mayer, & Ries, 2009). Estimators of these families will be optimal in terms of efficiency if 2 , ij = ln (2) and 2 , ij = ln 2 + −1 ij , respectively. As a robustness check for the PPML, it will be applied to a DGP with a Gamma variance  Note. CV(X ijt ) is the mean coefficient of variation over 5,000 replications.

|
SELLNER function 14 for the stochastic component, which results in V(X ijt ) = 2 ijt . Changes in the error distribution will only affect the data properties via the dispersion in the generated trade flows. Specifying a Gamma family, increases the dispersion in trade flows compared with the baseline, see last row in Table 1.
An additional three MC experiments are conducted on the DGP using the fully simulated dataset. The data properties of the above outlined baseline setting of the fully simulated DGP are given in Panel (a) of Table 2. Data for different N and T are generated from the same full sample of N = 150 and T = 8 in each experiment, including the 50, 100, and 150 countries with the largest endowment in the initial period and the 3, 5, and 8 earliest periods. Thus, by construction the dispersion in endowments and trade flows will increase with increasing N. Note that there is a significant degree of correlation between the nondiscriminatory trade policy variable NIP jt and the latent inward multilateral resistance term m jt .
In the first scenario we break this correlation by slightly modifying the DGP described above. In particular, NIP jt is replaced by a variable N IP jt drawn from a standard normal distribution such that ρ(NIP jt , m jt ) = 0 15 . Then we derive To ensure that (NIP jt , m jt ) = 0 for each N = 50, 100, 150 and T = 3, 5, 8, we generated each sample size separately rather than generating the largest sample size once and then reducing it as in the other experiments. Hence, there is no dependence of the dispersion in endowments and trade flows on the sample size. The modified DGP increases the dispersion of D ijt , and introduces a high correlation between the D ijt and NIP jt (see Panel (b) in Table 2). The coefficient of variation is suppressed since NIP jt has zero mean in this scenario.
The next two scenarios are taken from Egger and Staub (2016). The third scenario simulates the effects of an increased dispersion in endowments and trade flows, by setting v z = 0.3. As shown in Panel (c) of Table 2 this effectively increases the dispersion in all components of the structural part of the DGP compared with the baseline scenario. In the fourth scenario, a higher elasticity of substitution of α = −9 is assumed, which substantially increases the dispersion in the total trade costs T ijt and trade flows X ijt , see Panel (d).
Each scenario is simulated for the following four estimators outlined above: (1) fixed effects model (FE), (2) Two-stage fixed effects estimator (FE-2S), (3) bonus-vetus with equal country weights 1/N (BV), and (4) a non-structural gravity model (naive). Models (2) to (4) are estimated on both the full sample covering N 2 × T and the reduced sample covering only (N 2 −N) × T observations of international trade flows, whereas the fixed effects model (1) as given in Equation 7 is only identified in the full sample with intra national trade flows X iit included. All methods are estimated by PPML, including the scenario in which the DGP is specified via a GLM with a Gamma variance. In each scenario, S = 5, 000 replications of the DGP for a small (N = 50), medium (N = 100), and large (N = 150) country sample size are performed. Furthermore, we vary the time periods over a short (T = 3), medium (T = 5) and long (T = 8) panel.
As the largest panel dataset with N = 150 and T = 8 leads to 180,000 observations with N × T = 1,200 identified exporter-period fixed effects λ it , (N−1) × T = 1, 192 identified importerperiod fixed effects χ jt , and N 2 −N = 22,350 identified directional pairwise fixed effects θ ij , we are confronted with high-dimensional fixed effects and computations that are not computationally feasible using standard software for GLM models. We thus employ the algorithms introduced in Stammann (2018) that utilize a weighted version of the Frisch-Waugh-Lovell theorem with alternating projections in a Netwon-Raphson algorithm for GLM models. The routines used are available in the Rpackage "alpaca" and output all N × T × 2 + N 2 fixed effects, that is, including the ones that would be unidentified when using a dummy variable regression framework. Thus, for the two-stage fixed effects estimator on the full sample including intra national observations, we recovered the corresponding   Table 3 shows the normalized (in percent of true value) bias and standard deviation as well as the convergence ratio (percentage of successful convergences) of MC simulations on the DGP based on the empirical data. Panel (a) depicts the results on the full data set, that is, including intra national trade flows. The fixed effects estimator (FE) shows no bias up to the third digit in all sample sizes for the discriminatory (D ijt ) as well as the nondiscriminatory (NIP jt ) trade cost variable. The standard deviation on both estimates decreases with increasing T (read table from left to right), but remains constant over increasing N (read table from top to bottom). The two-stage fixed effects estimator (FE-2S) employed on the full sample shows a small but consistent bias of between 1% and 1.2% in the discriminatory trade cost variable that is estimated in the first stage. This bias vanishes when only data on international trade flows-see Panel (b)-is utilized, since the nondiscriminatory variable, which is omitted in the first stage of this estimator, exerts a different impact on domestic and international trade flows. Hence, the first stage leads to biased estimates on the time-varying discriminatory variable, if domestic trade flows are included. In both the full and reduced sample, the second stage produces inconsistent estimates on the NDTP measures that are biased downwards by between 70% and 104%.

| Simulation results
Contrary to the FE-2S, the BV method produces biased estimates for the coefficients on both the discriminatory and nondiscriminatory trade cost measure and both the full and reduced dataset. The bias on NIP jt in Panel (a) is substantially smaller than for the FE-2S, however, reducing the sample to intra national observations the reverse is true for most of the sample sizes. Finally, the naive gravity specification yields biased and inconsistent estimates on both variables and all experiments in Table 3.
Next, we employ the empirical dataset ot test the robustness to a different distributional assumption of variance of the GLM, the Gamma distribution. The results are summarized in Table 4. Most notably in the results on the full dataset given in Panel (a), the estimates on NIP jt are biased for all estimators and sample sizes. In the small samples with either N = 50 or T = 3 the BV estimates are least biased, however, similar to the estimates of the FE-2S, the bias increases with N and T. Only the FE estimates show decreasing bias and standard deviation with increasing T. The reduction in bias progresses only slowly, that is, reducing from a bias of 25% for N = 150 and T = 5 to a bias of 23% for N = 150 and T = 8. Also note that increasing number of countries N-and thus including smaller countries that introduce more variation-has only little and ambiguous impact on the bias. Compared with the estimators on the dataset excluding domestic observations, shown in Panel (b), the FE estimator produces the smallest bias in percentage of the true parameter values.
As a next step, we analyze if the correlation structure between the nondiscriminatory trade policy variable NIP jt the latent inward multilateral resistance term m jt causes the large bias in the FE-2S and BV estimators. The following simulations are based on the fully simulated DGP described above 17 . Table 5

|
SELLNER with decreasing bias in T and decreasing standard deviation with increasing N and T. The MC results illustrate that applying the BV or FE-2S method leads to biased and inconsistent results, even if the latent and the observed importer-time component are uncorrelated. While the bias on NIP jt is smallest for the FE-2S when applied to the reduced sample, a substantial bias remains even in larger samples and there is no monotonic decrease in bias when increasing the number of periods. These findings are especially illuminating, since a violation of the fixed effect assumption could have been expected to improve the performance of this estimator. The results of this MC experiment, however, indicate that ρ(NIP jt , m jt ) = 0 is not a sufficient condition for unbiasedness of the FE-2S method.
Since the BV method is originally based on a log-linear model, Table 6 shows the results for the BV method estimated via GLM and OLS assuming a homoskedastic error, that is, setting 2 , ijt = 0.3. Panels (a) and (b) summarize the MC results for the baseline scenario with correlation between NIP jt and the latent multilateral resistance term m jt for the full and reduced sample respectively, and Panels (c) and (d) depict the results for the no correlation scenario, ρ(NIP jt , m jt ) = 0. In the baseline scenario, the estimates of the log-linear estimator BV-LogLin on the full sample are only slightly biased, but this bias does not systematically decrease with either increasing the number of countries or periods, see Panel (a). In the no-correlation scenario, bias and standard errors decrease quickly with increasing sample size and the estimates on NIP jt are virtually unbiased, as shown in Panel (c). However, excluding intra national observations in the no-correlation scenario in Panel (d), a bias of around 3% remains in the largest sample size.
Finally, the MC results on the two scenarios that increase the dispersion in trade costs and flows are shown in Table 7 for the FE estimator, along with the baseline results for comparison. Increasing the dispersion in endowments and trade costs has little effect on the properties of the FE estimator compared with the baseline scenario. Standard errors of the estimates decrease with increasing sample size along both dimensions and bias on the coefficient of NIP jt strictly decreases with increasing T. We obtain similar results with somewhat smaller standard errors for the scenario with a higher elasticity of substitution α = −9. Table 8 summarizes the results of the estimations. The estimates for the MFN in the full sample vary between −4 and −20, all being statistically significant on conventional levels. The naive gravity model results in estimates very similar to the BV, while the FE-2S shows estimates that are well below the values found in the empirical literature. The FE estimator results in an estimate of roughly −9 which corresponds with the estimates obtained in Heid et al. (2017), who used a similar dataset. The FE, naive and 2S-FE arrive at nearly similar direct effects of regional trade agreements with coefficients between 0.14 and 0.17. The BV method, using the transformed trade cost variables, yields small, negative, and a statistically insignificant parameter estimate on RTA.
Since, the MC results indicated that the PPML might be biased when the stochastic component of the true DGP corresponds to a GLM with Gamma variance function, we additionally estimated a Gamma GLM model. This estimation is constrained by the unavailability of statistical software routines 18 for the Gamma PML estimator that accounts for high-dimensional fixed effects. Therefore we employed the high-dimensional fixed effect GLM routine feglm (see Stammann, 2018) which is only defined for strictly positive dependent variables for the Gamma family. Results for this Gamma GLM (GGLM) estimator are summarized in column (6). To enable comparison, column (5) shows the results of the PPML estimator on the same sample. Dropping the zero flow observations has a negligible impact on the PPML estimates. The coefficient on RTA is very similar for the PPML and the GGLM estimator, but the effect of the MFN decreases in size from −9 to −6.8.
Panel (b) shows the results when intra national trade flows and the according observations are not available to the researcher. In this case, the FE is not applicable since it identifies the nondiscriminatory trade cost variable along that dimension of the data. The resulting estimates of the naive gravity Overall, the results of this empirical exercise demonstrate that the BV and FE-2S approaches result in implausible coefficients on either the discriminatory, the nondiscriminatory, or both trade cost variables. Comparing the results between the full and reduced sample size, the naive gravity model results are more robust than the results obtained from these two methods. In this specific application, disregarding the zero flow observations had only a minor impact on the FE estimates. When interpreted as a trade elasticity, the Gamma GML estimate on the most-favored nation tariff would be closer to the values given in Head and Mayer (2014) than the PPML estimate.  (6), which is estimated by Gamma GML. Heteroskedasticity robust standard errors in parentheses. * p < 0.1; ** p < 0.05; *** p < 0.01. 1 Estimated on X ijt > 0.

| EMPIRICAL APPLICATION
This section illustrates the methods discussed above in an empirical application. It should be noted that it is not the purpose of this application to derive exact estimates on the effects of the trade cost variables used, but to demonstrate the impact of the methods on the estimates of the NDTP variable. For a thorough analysis of the effects of trade policy measures on trade flows in structural gravity model, the challenges summarized in Piermartini and Yotov (2016) should be adequately addressed. Hence, we employ a panel dataset including intra national observations of 3-year intervals, apply a PPML and control for directional pairwise fixed effects.
The following application is based on the dataset used in Oberhofer and Pfaffermayr (2017) augmented by data on an importer-based NDTP variable. The dataset covers 65 countries between 2000 and 2012 in 3-year intervals, that is, five periods. Trade flows cover an aggregate of manufacturing industries and are based on OECD-STAN, UNIDO's INDSTAT database, CEPII and WIOD-Oberhofer and Pfaffermayr (2017), for a more detailed description of the dataset. In this dataset trade flows are normalized by total world production at time t such that ∑ i ∑ j X ijt = 1. The respective production and expenditures are given by PROD it = ∑ j X ijt and EXP jt = ∑ i X ijt . The dataset also includes information on the formation of regional trade agreements that is taken from "Mario Larch's Regional Trade Agreements Database" (see Egger & Larch, 2008). As a measure of nondiscriminatory trade policy the most-favored nation (MFN) tariff was selected. We use the simple average of the applied tariff rate over all products obtained from UNCTAD's Trade Analysis Information System (TRAINS) accessed via World Integrated Trade Solution (WITS). Data was cross-checked with the WTO Tariff Reports, and the most plausible value was taken 19 . Missing data values for some periods were linearly interpolated between the most recent values before and after the period of the missing value. As is common practice with data on tariffs, the variable will be included as ln (MFN ij + 1). The descriptive statistics of the data are summarized in Table A1 in the Appendix.
Prior to the estimation we removed all observations of directional country-pairs that have missing or recorded zero flows over all periods, since these observations cannot be identified by the directional pairwise fixed effects. From 21,125 observations, a total of 65 are removed according to this procedure, leaving us with 21,060 observations. Estimations are performed on the full sample including intra national observations and on a reduced sample covering only the with 21,060−(65 * 5) = 20,735 international observations. The estimators for the (1) naive, (2) BV, (3) FE-2S, and (4) FE are applied as described in Section 2, including all transformations of the data. Hence, the FE is estimated using the interaction with the international border dummy NIP jt × INT ij . Table 8 summarizes the results of the estimations. The estimates for the MFN in the full sample vary between −4 and −20, all being statistically significant on conventional levels. The naive gravity model results in estimates very similar to the BV, while the FE-2S shows estimates that are well below the values found in the empirical literature. The FE estimator results in an estimate of roughly −9 which corresponds to the estimates obtained in Heid et al. (2017), who used a similar dataset. The FE, naive and 2S-FE arrive at nearly similar direct effects of regional trade agreements with coefficients between 0.14 and 0.17. The BV method, using the transformed trade cost variables, yields small, negative, and a statistically insignificant parameter estimate on RTA.
Since, the MC results indicated that the PPML might be biased when the stochastic component of the true DGP corresponds to a GLM with Gamma variance function, we additionally estimated a Gamma GLM model. This estimation is constrained by the unavailability of statistical software routines 20 for Gamma PML estimator that accounts for high-dimensional fixed effects. Therefore we employed the high-dimensional fixed effect GLM routine feglm (see Stammann, 2018) which is only | 883 SELLNER increasing N as well. Results of the MC scenario with an error distribution corresponding to Gamma-GLM indicated biased estimates on the NDTP parameter even for the largest sample size with N=150 and T=8. However, the bias is again decreasing in T and standard errors decrease in both N and T.
Results of the baseline scenario indicate strongly biased and inconsistent estimates on the NDTP variable obtained by the BV and FE-2S on samples including or excluding intra national observations. Assuming homoskedastic errors, we illustrated that the bias of the BV method by PPML is due to a combination of functional form misspecification, correlation between the latent multilateral resistance term (MRT) and the NDTP variable, and missing intra national flow observations. Since the assumption of no correlation between the MRT and NDTP variable will not hold in practice, the BV method can be expected to produce biased coefficient estimates on NDTP variables. The FE-2S method produces persistently biased and inconsistent estimates, independent of assumptions regarding the correlation between the MRT and the NDTP variable. This finding is particularly revealing, since one may have expected the two-stage FE procedure to produce unbiased results given the fixed effects assumption is violated.
Three recommendations for practitioners may be derived from the results of this paper. First and foremost, assuming that NDTP impact only border-crossing trade flows, we recommend to use the approach outlined Heid et al. (2017) for econometric estimation of a theory-consistent SGM. The MC results on bias and consistency showed, that this estimator has favorable asymptotic properties under very general assumptions. In contrast, the BV and FE-2S methods that have been frequently employed in past empirical research, often because domestic trade flows were unavailable, yield biased and inconsistent point estimates. Given a DGP, similar to the one analyzed in this paper, researchers should refrain from ad hoc solutions to tackle collinearity between NDTP variables of interest and fixed effects that capture the structure of the model. The results of the empirical application emphasize this finding. Gathering data on intra national trade flows introduces an additional variation that can be exploited for proper identification, while still conforming to the structure of the theoretical model. Following this recommendation may prove to be challenging, since readily available datasets on intra national trade are still scarce and can have substantial publication lags. Alternatively, the domestic trade flows maybe constructed by the researcher using trade flows and national gross production statistics as, for example, in Oberhofer and Pfaffermayr (2017) or Heid et al. (2017).
Secondly, the MC results showed that the favorable asymptotic properties of the FE in terms of bias on the NDTP estimate work through the number of periods T included. Hence, given a trade off between the number of countries N and periods T when constructing a dataset, we recommend researchers to favor the latter over the further when interested in the effects of NDTP measures. However, maximizing the data coverage with respect to the periods included will be a particular problem when following the recommendations in Piermartini and Yotov (2016) of using 3-or 5-year intervals to allow for adjustments of trade flows to policy changes. Moving from aggregate manufacturing data to industry-level or product-group data further aggravates the problem. Moreover, analysis on this level will potentially be confronted with zero export or import flows of a product and country or on some dyadic pairs for all periods. In such cases, not all importer-period, exporter-period or directional pairwise fixed effects will be identified.
Third, the MC results illustrated that, assuming a GLM with a Gamma variance process as the true DGP, the FE estimated by PPML-while showing consistent behavior-yielded biased estimates for the NDTP variable, even for the longest panel (with T=8) considered in the simulations. We therefore recommend researchers to accompany their Poisson PML results with a Gamma PML estimates, especially when confronted with small sample sizes. Compared to the Poisson PML, the Gamma PML puts even less weight on observations with large conditional mean (see Santos Silva & Tenreyro, 2006), which in some cases may improve the model fit. Model selection could then be based on information 884 | SELLNER criteria, specification tests or prediction error statistics (see Martínez-Zarzoso, 2013;Santos Silva & Tenreyro, 2006). Endeavors to follow this recommendation might be hindered by the availability of statistical software routines that support both a Gamma PML-and thus zero trade flows-as well as high-dimensional fixed effects.
The findings of this paper are subject to several limitations that are open to further research. The MC results are focussed on the bias and consistency of the coefficient estimates, but did not examine the efficiency of the estimators or if and how strongly the estimated standard errors are affected by the incidental parameter problem. Future research could build upon the work of Pfaffermayr (2017Pfaffermayr ( , 2018, who showed that his derived constrained PPML-which leads to identical point estimates as a dummy variable PPML in cases of no missing data-yields asymptotically unbiased standard errors. Furthermore, in the DGP employed in this paper, we made ad hoc assumptions on the distribution of the time-invariant directional pairwise trade costs, since there is no clear empirical benchmark for the size of those effects. Further analysis on the impact of different correlation structures or dispersion properties of these effects would be insightful. Another limitation relates to the performance when confronted with a large number of zero trade flows in the data, since the theory-consistent DGP used in this paper does not generate such flows. Hence, augmenting this DGP, with a process that specifically produces zero flows (as, e.g., in Santos Silva & Tenreyro, 2011), while still obeying the structure of the model, could provide further insights.

1
Note that empirical research increasingly employs panel data sets and/or covers specific sectors or product groups. Exploiting the time variation via a panel data structure is strongly recommended in Piermartini and Yotov (2016), whereas they suggest to include every third or fifth year to permit adjustment of trade flows to policy changes. 2 For simplicity, the exporter-specific preference parameter from the original specification of Anderson and Van Wincoop (2003) is omitted. 3 Various types of models motivated from the demand or supply side suggested in the literature yield a model of the type presented here (see, e.g., Anderson & Van Wincoop, 2003;Chaney, 2008;Eaton & Kortum, 2002;Melitz, 2003). These models only differ in their interpretation of the elasticity of trade flows with respect to trade costs α. 4 The structural part can be obtained by assuming that Equation 1 only holds in expectation, replacing X ijt by E(X ijt ). 5 Figueiredo, Lima and Orefice (2016) argued that if one assumes that the log-linear model is identified, then the GLM estimates will be severely biased. Since the researcher cannot know which model is true, they proposed a robust quantile estimation approach. 6 Given i ≠ l and j ≠ k there are numerous sets making estimation computationally demanding. To speed up the computation, the estimation may be performed on only a subset of all possible combinations, leading to efficiency losses. 7 The dummy variable FE approach will in particular be computationally less demanding when optimization methods for handling high-dimensional fixed effects are used (as, e.g., Larch, Wanner, Yotov, & Zylkin, 2018). 8 Available datasets include CEPII's "TradeProd" (see De Sousa, Mayer, & Zignago, 2012), that covers approximately 150 countries over the years 1980 to 2006 at a three-digit level of the ISIC Revision 2 manufacturing industries and the World Input Output Database (see Timmer, Dietzenbacher, Los, Stehrer, & De Vries, 2015) covering 43 countries over the years 2000 to 2014 at 56 ISIC Rev. 4 industries.

| 885
SELLNER 12 A total of 384 values were interpolated this way. 13 Since the most-favored nation tariff is a direct price shifter, the parameter can be interpreted as the trade elasticity or elasticity of substitution. 14 Note that Head and Mayer (2014) advised against the use of a negative binomial GLM, because of the scale dependency of its estimates, as shown in Bosquet and Boulhol (2014). Thus we restrict our analysis to a Gamma variance function. 15 First, orthogonalize a standard normal variable (z jt ) to the centered and scaled effect m jt , then scale both vectors to length one (m jt , z jt ) and compute ̃j t =z jt + 1∕ tan ( arccos (r)) * m jt for an exact sample correlation r, with r = 0. 16 For the two-stage fixed effects estimator on the reduced sample excluding intra national observations, we additionally impose the identifying restrictions θ Nj = θ iN = θ N−1, N−2 = 0. 17 The baseline results of the simulated DGP are summarized in Table A2 in the Appendix. Compared with the baseline results of the empirical DGP, the parameter estimates on NIP jt of the naive, BV and FE-2S estimators are even more biased, but the main results regarding the asymptotic properties are similar. 18 The glm routine in Stata for a Gamma family with a logarithmic link function and the full set of fixed effects did not converge. 19 For example, Switzerland is reported to have a most-favored nation tariff of zero for all products according to the TRAINS database. However, the WTO Tariff Report show a positive value for the simple average of the MFN tariff. 20 The glm routine in Stata for a Gamma family with a logarithmic link function and the full set of fixed effects did not converge.