Instrument choice, implementation structures, and the effectiveness of environmental policies: A cross‐national analysis

While both the economics and political science literature have gained considerable knowledge on why some environmental policies work better than others, we still lack a clear and consistent picture of the determinants of environmental policy effectiveness. This is primarily because influences of the policy design and the implementation process have often been studied in isolation from one another. This article intends to close this gap by systematically examining how different instrument types and the implementation structures interact. By analyzing the air pollutant emissions of 14 OECD countries over a period of 25 years (1990 to 2014), it is revealed that only command-and-control (C&C) regulations that are put into practice through well-equipped and -designed implementation structures can be associated systematically with reductions in air pollutant emissions. Softer, so-called new environmental policy instruments (NEPIs), in turn, are found to have no significant influence on the outcome variable, regardless of how they are executed and enforced. In essence, these findings indicate that (i) market- and information-based policy instruments seem not to be as self-implementing as often argued in the existing literature and that (ii) C&C regulations and market- and information-based instruments need quite different prerequisites to

Introduction
All governments in the industrialized world face societal demands to combat problems of environmental pollution and degradation. The central means through which governments respond to such demands is the adoption of environmental policies (Sommerer & Lim 2016). As a result, the study of environmental policies and their effectiveness has received pronounced scholarly attention over the last decades. While the policy design literature has focused on whether or not the choice of certain instruments makes a difference for the policy outcome, the implementation literature has mainly concentrated on the analysis of factors undermining the proper working of environmental policies during their practical application. Due to the progress made in these strands of literature, we have gained substantial knowledge about the assets and drawbacks of different instrument types (Fankhauser et al. 2010; Sterner & Coria 2013; Somanathan et al. 2014) as well as about the number of obstacles standing in the way of smooth and effective policy implementation (Jordan & Tosun 2012). The identified obstacles include, among others, the available administrative capacities (Shimshack 2014), the way and the venues through which environmental policies are put into practice (Gollata & Newig 2017), as well as the nature of the affected target group (Weaver 2014; Knoepfel 2018).
Despite this progress, however, little is known about how the different instrument types and the determinants located at the implementation stage interact. In this context, it has been frequently argued that it is in particular the hierarchical forms of governmental intervention that require strong administrative capacities due to the need for constant monitoring and enforcement, while market- and information-based policy instruments are often considered somewhat self-executing (Holzinger & Knoepfel 2000; Howlett et al. 2009, p. 175). This proposition is, in fact, very much open to question and empirical testing: environmental taxes, for instance, also need to be collected (Cohen & Shimshack 2017; Bontems & Bourgeon 2005; Osterkamp 1978; Knill & Lenschow 2000), and the placement of information labels must be controlled and, if necessary, enforced (Haq & Weiss 2016). This article intends to take a first step in closing this gap. More precisely, it examines (i) which factors condition the effectiveness of environmental policies during their practical application; and (ii) whether the influence of these factors varies by the chosen instrument type.
In this context, the analysis distinguishes between two groups of policy instruments that governments can choose from to combat environmental pollution and degradation. These are the so-called command-and-control (C&C) regulations and the new environmental policy instruments (NEPIs), comprising both market- and information-based instruments. To capture the influence of the implementation process, the analysis concentrates on the core features of the implementation structures through which the respective policies are executed and enforced. The determinants considered are capacity, coordination, control, and cooperation.
Empirically, the analysis covers 14 OECD member countries between 1990 and 2014, including both EU member and nonmember states. It examines the relationship between environmental policy outputs and outcomes by analyzing the annual changes in air pollutant emissions from both road traffic and industrial combustion for the four most common and conventional air pollutants, namely, nitrogen oxides (NOx), sulfur dioxide (SO2), carbon monoxide (CO), and coarse dust particles with a diameter of 10 μm (PM10). The overall logic of the research design is a large-N comparison. The relationship between clean air policies and changes in air pollutant emissions is examined by using standard techniques for the analysis of time-series cross-section (TSCS) data.
In essence, the article finds that NEPIs cannot be linked systematically to changes in air pollutant emissions. C&C regulations, by contrast, lead to a significant reduction in air pollutant emissions, but only when they are adequately executed and enforced. The article finds strong evidence that C&C regulations are more effective when they are put into practice through well-equipped and -designed implementation structures.
The remainder of this article is structured as follows. Section 2 introduces several theoretical propositions about the connection between environmental policy outputs and outcomes and deduces causal mechanisms from the literature, which then guide the development of the hypotheses. Subsequently, the article turns to the measurement of the dependent and independent variables, before testing the relationship between environmental policy outputs and outcomes statistically. The final section summarizes the article's main findings and presents some concluding remarks.

Explaining the effectiveness of environmental policies
Policy effectiveness can be defined as the degree to which environmental policy measures (policy outputs) benefit the environment (outcomes). Policy outputs are the immediate result of the decision-making process. Policy outcomes, on the other hand, refer to the more general societal or environmental developments that are usually assumed to be at least in part the result of a given set of policy outputs (Adam et al. 2018). A typical example for an output in the field of environmental governance could be the adoption of new environmental programs or regulations that set out new standards for water and air quality. The corresponding outcomes would then be an increase or decrease of the biodiversity in rivers and lakes or of the concentration of certain pollutants in the air. A policy can be considered effective if there is a significant relationship between policy outputs and outcomes and if the policy measures taken have a positive influence on the quality of the environment. The degree of effectiveness, in turn, depends on the exact size of the effect.
Yet, although the relationship between environmental policy outputs and outcomes seems to be straightforward, explaining policy effectiveness is far from trivial. While the common belief is that more ambitious and stricter policy measures lead to better results, one should not assume a straight and mechanistic link between means and ends. On the contrary, several intervening factors can render even the most ambitious policy largely ineffective (Knill et al. 2012). In the context of this article, the focus is on the influence of the instrument choice, the implementation structures through which the instruments are put into practice, as well as on their interaction.

Policy effectiveness and instrument choice
When speaking of environmental policy instruments, it has become customary to differentiate between two broader instrument types. These are the so-called command-and-control (C&C) regulations and the new environmental policy instruments (NEPIs) (Meckling & Jenner 2016). The latter group consists of both market- and information-based instruments. When engaging in traditional regulation, the government directly commands the reduction of pollution levels and controls the compliance with the performance or technological standards. By contrast, market-based instruments incentivize polluters (positively or negatively) to reduce their environmental externalities. This can be done, for instance, by taxing harmful activities or by establishing tradable emission permit schemes such as in the case of the EU emissions trading system (EU ETS). Information-based instruments, in turn, aim at stimulating more environmentally friendly behavior by providing businesses and individuals with information on environmental issues (Hobman & Ashworth 2013).
The scholarship on the effectiveness of the different instrument types is quite inconsistent. Most studies suggest that hierarchical intervention is a more appealing policy option when it comes to policy effectiveness as it provides a rather straightforward connection between policy outputs and outcomes (Baldwin & Cave 1999). There are, however, also other works that suggest no significant differences between the instrument types, or even attribute a superior performance to NEPIs as opposed to C&C measures (Andersen & Sprenger 2000; Li et al. 2017). Critics of regulatory instruments often point to the fact that they are too inflexible. They emphasize the lack of incentives for firms to reduce their environmental harm beyond the policy standard set and the necessity of substantial public enforcement capacities for such instruments to become effective (Schwartz 2003). NEPIs, by contrast, aim at altering the costs or benefits of the target group (market-based instruments) or the priorities and significance attached to environmental issues (information-based instruments) (Mickwitz 2003). They are thus said to work not through the cost- and resource-intensive process of hierarchical enforcement but by encouraging certain behavior through market signals or moral suasion.
Hence, existing scholarship does not univocally point in the direction of generally favoring traditional regulation over NEPIs (or vice versa). Rather, it is argued that the effectiveness of a policy instrument is strongly determined by both the circumstances under which it is applied and the capacities available (Faure 2012). Governments that possess sufficient administrative capacities to monitor and enforce legal compliance should rely more on hierarchical forms of intervention. Governments that lack these capacities might be better off when applying a market- or information-based policy approach (Holzinger & Knoepfel 2000; Howlett et al. 2009; Kostka 2016). The logical implication of this reasoning is that NEPIs are usually considered to be rather self-executing, whereas the effectiveness of regulatory means seems to be more prone to shortcomings during the implementation stage. This reasoning gives rise to the following hypothesis.
Hypothesis 1a: The effectiveness of NEPIs is less dependent on the implementation context than the effectiveness of C&C regulations.
In technical terms, hypothesis 1 implies that, when no other factors are considered, it should be easy to detect a main effect of policy outputs on environmental outcomes for NEPIs. As institutional demands are higher when governments choose to rely on more traditional forms of governmental intervention, in turn, it should be more difficult to detect an unconditional effect of policy outputs on environmental outcomes for C&C regulation. Obviously, this does not imply that the influence of NEPIs on the outcome dimension is entirely unconditional. The success of green taxes in reducing environmental harm, for instance, depends on the costs of technological alternatives. When taxes are high and the costs of technological improvements are low, regulated firms have a stronger incentive to acquire pollution abatement equipment than in constellations in which potential tax savings hardly offset the investments in retrofitting existing facilities and processes. The underlying argument is thus not that the effectiveness of NEPIs is completely independent of additional factors but simply that the influence of such factors is weaker than in the case of C&C regulations.

Policy effectiveness and implementation structures
All policies require some institutional arrangements to convert abstract policy ideas and desires into real-world actions designed to alter the policy addressees' behavior (May 2015). The design of these structures can be distinguished according to the features of the involved implementation authorities, their relationship to each other, as well as to their coordinating ministry (Mayntz 1978). Moreover, implementation structures differ in the extent to which they encompass (networks of) private entities (Feiock 2013). To capture the influence of the implementation process, it is thus reasonable to focus on four core features of the implementation structures through which the respective environmental policies are executed and enforced: coordination, capacity, control, and cooperation.
Modern law enforcement usually brings together various public authorities to work in concert in order to achieve a set of objectives (O'Toole & Montjoy 1984). This requires the (I) coordination of activities both along the hierarchy (vertical coordination) and across administrative entities at the same level (horizontal coordination). The need for coordination increases with the number of authorities and divisions that are involved in the process of policy implementation (Pressman & Wildavsky 1984). Existing studies reveal that both intra- and inter-organizational boundaries hamper the free flow of information. Scharpf (1978), for instance, concludes that organizational boundaries substantially complicate conflict resolution in the case of a policy or jurisdictional conflict. This is true even in the case of an asymmetric relationship between the involved authorities. While the apparently dominant implementation body may exercise hierarchical power over subordinate units, it might still be fully dependent on them due to existing information asymmetries and a lack of expertise. Thus, the literature suggests that, from the outset, having a single rather than multiple implementing authorities avoids potential maladministration and -implementation caused by interorganizational coordination processes (Hepburn 2010).
Moreover, few provisions are as crucial to the ultimate goal achievement as adequate administrative (II) capacities (Holzinger & Knoepfel 2000). The exact capacity requirements of an implementing authority depend upon its responsibilities (Tobin 1990). In broad terms, authorities typically need financial and personnel resources as well as technical knowledge to fulfill the tasks they were established for (Knill et al. 2008).
Here, more specialized agencies are usually considered to have a stronger record when it comes to the efficient and effective policy execution due to their greater specialization on the issue at stake (Potoski 2002). In this context, Scharpf (1978) argues that with decreasing levels of specialization, it becomes increasingly unlikely that policy implementation can be optimized without connecting to the outside. This again leads to a stronger need for coordination at the price of also risking encountering the problems associated therewith (see above).
In addition, much depends on the question of whether the design of implementation structures allows politicians and central public managers to exercise strong (III) control over the subordinate authorities (De Mesquita & Stephenson 2007). Implementing authorities have substantial decisional discretion during the execution of policies. They may make use of this discretion in cases where their regulatory preferences diverge from those of central policy-makers. Similarly, "bureaucratic" or "regulatory capture" can lead to gaps between the intended policy goals and the actual outcomes (Lodge 2014). The terms refer to situations in which the implementing authorities are more beholden to the companies they are supposed to monitor than to their own formal mandate or the public interest (Dal Bó 2006). The literature supposes that top-down control is a potent way to limit bureaucratic drift and that oversight becomes easier the less the execution of policy programs is decentralized and the fewer levels of government are involved in the implementation process (Weibust 2009).
Despite the strong necessity for administrative capacities, however, governments have lost much of their central steering capabilities in environmental matters over the last decades. The promotion and proliferation of concepts such as private governance, governance without government, public-private partnerships, and others have led to the outsourcing of key implementation tasks to nonstate parties and thus to a much stronger need for (IV) cooperation between public implementing authorities and private sector actors (Mol 2016). This increased cooperation, however, has rarely met the expectations of those arguing in favor of contracting out and public-private partnerships (Mayer & Gereffi 2010). Rather, most implementation studies suggest that the success of public-private cooperation strongly depends on the degree to which public authorities have actually maintained the resources that are necessary to supervise their private counterparts (see inter alia Brown et al. 2013). From this perspective, any involvement of nonstate actors in the implementation process must be considered as adding new risks and challenges to the implementation process. Given the abovementioned theoretical considerations, we can formulate four different hypotheses, all describing the influence of the implementation structure on the effectiveness of environmental policies. They read as follows:
Hypothesis 2a: The effectiveness of environmental policies is higher if fewer authorities are involved in the implementation process (coordination).
Hypothesis 2b: The effectiveness of environmental policies is higher if the implementing authority/authorities is/are specialized in enforcing environmental policies (capacity).
Hypothesis 2c: The effectiveness of environmental policies is higher if the implementing authority/authorities is/are located at the central level of government (control).
Hypothesis 2d: The effectiveness of environmental policies is higher if key implementation tasks are not outsourced to private sector actors (cooperation).
At this point, it is important to highlight that, although maybe paradoxical at first sight, hypotheses 1 and 2a-2d are neither competing nor mutually exclusive, since the effectiveness of NEPIs might be less dependent on the implementation context overall and still vary from one institutional arrangement to another.

The research design and operationalization
The analysis covers 14 OECD countries comprising both EU member states and nonmember countries. The sample includes Austria, Belgium, Canada, Denmark, France, Germany, Ireland, Italy, Portugal, Spain, Sweden, Switzerland, the United Kingdom, and the United States. The case selection is motivated by the goal to observe sufficient variation in the key variables. Within the broader sample of highly industrialized democracies, the analysis employs a "diverse case" selection strategy (Gerring 2008). The sample includes countries that are usually portrayed as having relied heavily on C&C regulation such as Germany and Austria (Héritier et al. 1996). At the same time, it covers a number of countries that started quite early to experiment with eco-taxes and tradable permits such as Denmark, the United Kingdom, and Sweden (Jordan et al. 2003). The investigation period ranges from 1990 to 2014.

The dependent variable: Assessing environmental outcomes
The empirical analysis focuses on the variation in air pollutant emissions as one major outcome of environmental governance. Air pollution is a prominent and still highly relevant environmental policy issue. Moreover, data on air pollutant emissions is well documented across different countries and years and thus more easily and extensively available than other environmental outcome indicators such as water pollution or soil contamination. The relevant pollutants under scrutiny are NOx, SO2, CO, and PM10. All of these pollutants adversely affect either the health of human beings or the integrity of ecosystems. Greenhouse gas (GHG) emissions are intentionally excluded from the analysis given the "perplexing trade-offs" (Bryner & Duffy 2012, p. 12) between reductions in GHG emissions, in particular CO2 emissions, and the conventional pollutants.
Emissions of NOx, SO2, CO, and PM10 mainly originate from combustion processes related to power generation, road transport, and industrial production. The dependent variable is thus operationalized as changes in air pollutant emissions from two different sectors, namely, from road transport and industrial combustion. All pollutant emission levels are standardized using 1990 as the base year for total emissions (100 percent). This procedure allows pooling the different pollutants into one single dependent variable (clustered in years and country-sector-pollutants). In fact, it is only through this procedure that it is possible to gather a sufficiently high number of instances of NEPI changes that allows for a comparison of the two instrument types. The repeated observation of 14 countries over 25 years for two sectors and four pollutants leads to a maximum of about 2,800 observations.
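The base-year standardization and the resulting panel size can be illustrated with a minimal sketch. The emission figures below are invented for illustration, the function name `index_to_base_year` is hypothetical, and computing the annual change as a simple first difference of the indexed series is an assumption; only the 1990 base year and the panel dimensions (14 countries, 25 years, two sectors, four pollutants) are taken from the text.

```python
def index_to_base_year(series, base_year=1990):
    """Express each year's emissions as a percentage of the base-year level."""
    base = series[base_year]
    return {year: 100.0 * value / base for year, value in series.items()}


# Hypothetical emission totals (kilotonnes) for one country-sector-pollutant.
raw = {1990: 250.0, 1991: 240.0, 1992: 225.0}
indexed = index_to_base_year(raw)

# Annual change in percentage points of the 1990 level (assumed first difference).
years = sorted(indexed)
annual_change = {
    year: indexed[year] - indexed[previous]
    for previous, year in zip(years, years[1:])
}

# Upper bound on observations: countries x years x sectors x pollutants.
max_observations = 14 * 25 * 2 * 4
```

Because every series is expressed relative to its own 1990 level, pollutants measured on very different absolute scales become comparable and can be pooled into one dependent variable.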

The independent variable: Assessing environmental policy outputs
To assess environmental policies and their ambitiousness, the analysis relies on a concept proposed by Knill et al. (2012; see also Steinebach et al. 2019). Knill et al. (2012) assess the ambitiousness of policy outputs by the evaluation of three distinct components: changes in policy targets, policy instruments, and instrument settings (comprising both instrument level and scope).
Policy targets refer to the specific objectives addressed in a certain policy field and thus focus on the question of who or what exactly is regulated by the government. Policy instruments are an indicator of which specific tools, out of a range of options, are used by policy-makers in order to achieve their targets. They thus refer to the question of how the government intends to solve its environmental problems. A specific policy target is often addressed by the use of various instruments. The instrument level, in turn, refers to the exact calibration of a policy instrument. For instance, in the case of an obligatory emission standard, the level prescribes the maximum admissible volume (for cars: 95 g CO2 per km). The instrument scope, in turn, covers the specific cases or addressees targeted by a certain policy instrument. It can prescribe, for example, whether a regulation relates to all conventional vehicles in road traffic or only to newly registered ones. Figure 1 presents the different policy components and how they link together. Every policy target has at least one instrument, which in turn has some level and scope. Therefore, every structural change entails a change to the instrument settings, but not vice versa.
As a result of the presented hierarchical structure among the components, changes can be weighted differently by a simple logic of aggregation. By definition, changes in policy targets have to involve changes in at least one policy instrument and its calibration (value "4"). Following the same logic, the establishment of a new policy instrument inevitably leads to changes of an instrument's level and scope (value "3"). By contrast, level (value "1") or scope (value "1"), or both, may change without any implication for the existence of either policy targets or instruments.
Given this paper's interest in the difference that the instrument choice can make, the analysis broadly distinguishes between two types of policy instruments, namely, traditional "command-and-control" instruments and "new environmental policy instruments". In this context, obligatory policy standards, bans, and technological prescriptions represent C&C regulations. Environmental taxes, subsidies, liability schemes, and information-based measures, in turn, can be regarded as NEPIs. It must be noted, however, that there is no commonly accepted positive definition of NEPIs. Rather, NEPIs must be considered a catch-all category for instruments that have not been used traditionally to target environmental pollution.
The data used within this paper were compiled within the CONSENSUS project financed by the European Commission, with the help of national experts, and then coded by the project members. In addition to the original dataset, it was necessary to (i) collect information on the dates at which the policies have entered into force (as opposed to their adoption dates); (ii) apply the coding scheme also to EU Regulations; and (iii) extend the existing data to the years 2006 to 2014. The policy change assessment relies on a comprehensive data collection encompassing all relevant national legal documents (laws, decrees, and regulations) in the policy area under review. The legislation was collected through national legal repositories, secondary literature, and scholarly analyses. Legislation emanating from the subnational level was excluded from the data collection process. To capture the policy targets and instruments in the area of environmental policy, the data collection process was carried out by using an encompassing list of predefined policy targets and instruments that could be derived from the existing literature. A full list of all policy targets and instruments under scrutiny is provided in the appendix (Tables A1 and A2).
As shown in Figure 2, about 530 instances of policy output changes could be observed during the investigation period. About 93 percent of these policy outputs related to (changes in) C&C regulations, the remaining 7 percent to NEPIs. The observed instances of NEPI changes cover a range of different instruments such as green taxes (see, for instance, the French and Danish tax on NOx or SO2 pollution from industry), levies (see, for instance, the German highway toll that is based on the trucks' emission class), allowance market systems (the US Acid Rain Program), as well as different labeling schemes. Here, it seems necessary to emphasize that after the policy changes have been assessed individually, they are again summed up for each year. For instance, in the year 2001, Austria put new NOx emission limits for both passenger cars (level change) and heavy-duty vehicles (level change) into effect and prescribed the use of certain emission abatement technologies for passenger cars (instrument change). This adds up to a value of "5" for the respective country-sector-pollutant and year.
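The weighting and yearly summation described above can be sketched as follows. The weights (4, 3, 1, 1) and the Austrian example come from the text; the record layout, the country and sector codes, and the function name `yearly_output_scores` are hypothetical choices made for illustration only.

```python
from collections import defaultdict

# Weights taken from the aggregation logic described in the text.
WEIGHTS = {"target": 4, "instrument": 3, "level": 1, "scope": 1}


def yearly_output_scores(changes):
    """Sum weighted policy changes per (country, sector, pollutant, year)."""
    scores = defaultdict(int)
    for country, sector, pollutant, year, kind in changes:
        scores[(country, sector, pollutant, year)] += WEIGHTS[kind]
    return dict(scores)


# The Austrian example from the text: two level changes and one
# instrument change entering into force in 2001.
austria_2001 = [
    ("AT", "road", "NOx", 2001, "level"),       # passenger car limit
    ("AT", "road", "NOx", 2001, "level"),       # heavy-duty vehicle limit
    ("AT", "road", "NOx", 2001, "instrument"),  # prescribed abatement technology
]
scores = yearly_output_scores(austria_2001)
```

Under these assumptions, the two level changes (1 + 1) and the instrument change (3) sum to the value of 5 reported for Austria in 2001.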

The moderating variable: Assessing implementation structures
Gathering reliable cross-sectional and -temporal data on implementation structures is a highly challenging and time-consuming endeavor. This is particularly the case, as implementation structures do not only vary across countries and over time but also among policies within the same country. This issue was resolved by concentrating on the manifest and most visible patterns of implementation structures. As elaborated in more detail below, each of the discussed determinants is operationalized by a single indicator, which can take on one of two values. The indicators take on a value of "0" if challenges with regard to the key determinants of the implementation structures can be expected to be present; and "1" if these challenges can be deemed to be largely absent.
The aspect of (I) coordination is captured by measuring whether a policy is implemented by a single or by several implementing authorities. In line with the theoretical debate laid out in the previous section, the analysis focuses less on whether implementing authorities have developed adequate coordination mechanisms than on whether there is a general need for coordination. (II) Capacity is a multidimensional concept (Wu et al. 2015). A crucial dimension of both an organization's "analytical" and "operational" capacities is whether bureaucracies possess sufficient expertise, that is, knowledge on the subject matter as well as on the clients and conditions in the policy domain that they serve (Pattyn & Brans 2015, p. 192; Peters 2015, p. 220). All else being equal, sectoral and specialized agencies can be expected to find it easier to develop and acquire such expertise. Thus, in the scope of this paper, capacities are captured by whether the executing authority is specialized in executing the task at hand, that is, the enforcement of environmental policies, or serves various functions (single-purpose vs. multi-purpose implementing authority). Obviously, the degree of specialization is only a rough approximation of an authority's actual capacities given that an agency might be specialized but still lack the resources to do its job effectively (see, for instance, the US EPA under Trump (Bomberg 2017)). Applying an alternative indicator focusing on the authority's budget or the administrative headcount, however, is not feasible given the lack of available information for most countries in the sample. The extent of (III) control depends on whether the implementing authority is established at the central (federal) or the regional (local) level.
While this operationalization does not exclude the possibility that local authorities are also under strong pressure and control from central policy-makers, it can be expected that there are "on average" differences that are captured by the proposed coding procedure. (IV) Cooperation is assessed by whether nonstate actors are involved in the policy implementation process. Private actors can be involved in execution and enforcement activities such as vehicle safety and emission inspections, the collection of toll charges, or the verification of industrial plants' emission reports. To avoid potential misunderstandings, it must be made clear that, in the context of this article, private sector involvement only refers to the practical application of environmental policies with the help of or directly through private actors. This explicitly excludes other forms of cooperation with the public sector such as public hearings, consultations, or voluntary agreements (Jordan et al. 2003). Table 1 summarizes the proposed indicators.
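The binary coding of the four determinants can be sketched as a small data structure. This is an illustrative sketch only: the class and field names are hypothetical, and the example case is invented. The coding rule itself (1 if the challenge is largely absent, 0 if it can be expected to be present) follows the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImplementationStructure:
    """Observed features of the structure through which a policy is enforced."""
    single_authority: bool  # one implementing authority rather than several
    single_purpose: bool    # authority specialized in environmental enforcement
    central_level: bool     # authority established at the central (federal) level
    public_only: bool       # no nonstate actors involved in implementation

    def indicators(self):
        """Return the four binary indicators (1 = challenge largely absent)."""
        return {
            "coordination": int(self.single_authority),
            "capacity": int(self.single_purpose),
            "control": int(self.central_level),
            "cooperation": int(self.public_only),
        }


# Hypothetical case: one specialized regional agency relying on private inspectors.
example = ImplementationStructure(
    single_authority=True,
    single_purpose=True,
    central_level=False,
    public_only=False,
)
coded = example.indicators()
```

Keeping the four indicators separate, rather than summing them into an index, makes it possible to examine each determinant's moderating role on its own.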
The data on implementation structures were gathered from various sources. These were primarily the information made available online by the responsible authorities. In addition, other sources such as academic contributions and the reports of international organizations (in particular by the OECD and the European Environment Agency (EEA)) were consulted. If necessary, the responsible authorities were contacted directly and asked for clarification.
In the process of coding, some specifications had to be made. First, policies are often enforced through various processes and different implementation structures in parallel. For instance, vehicles' emission standards can be checked both prior to new vehicle types' approval and later on during inspection of vehicles in use. Likewise, the quality of fuels is controlled upon importation (usually by the customs authorities), at the refineries, and at the refueling stations. There is not much cross-country variation in the implementation when it comes to the approval of new vehicle types or the importation of fuels. In both cases, we usually find central national or even European authorities in charge of carrying out the key monitoring and enforcement activities. Thus, during the assessment of the implementation structures for the respective environmental policy outputs, the focus is primarily on vehicle inspection programs and the control of diesel sulfur contents at the point of sale, that is, the refueling stations. Another issue pertains to federal states such as Germany or the US, where implementation structures for national policies may vary across the different states and regions. In these cases, the analysis refers only to the implementation structures in a country's most populous states, regions, or provinces. Lastly, it must be mentioned that, in practice, private actor involvement cannot always and universally be considered as adding new risks and challenges to the implementation process. In some cases, the involvement of private sector actors actually constitutes an additional enforcement and monitoring mechanism. For instance, the so-called "emissions badge" ("Feinstaubplakette") in Germany is primarily controlled by traffic wardens (for parked vehicles) and policemen (for moving vehicles).
In addition to this, state-certified inspecting organizations such as DEKRA and TÜV can deny vehicle inspections in the case that passenger cars or heavy-duty vehicles are not labeled correctly. In this case, the actions of the private vehicle inspection companies contribute to rather than detract the proper implementation of the policy. Under these conditions, they are coded as "1" rather than as "0".
In sum, in about 88 percent of the implementation structures examined, there was a single authority in charge of carrying out the key implementation tasks. In about 85 percent, the public authority served a single purpose, that is, it was specialized in environmental matters. In 58 percent of the cases, the implementing authority was established at the central level. In slightly less than half of the cases (48 percent), policy implementation was carried out exclusively by public actors, that is, without nonstate actor involvement.

Alternative explanations
[Table fragment, coding of implementation structures: the implementing authority(-ies) is/are established at regional/local level (=0); (IV) Cooperation: no involvement of non-state actors in the implementation process (=1), involvement of non-state actors in the implementation process (=0).]
Apart from environmental policies, their design, and the way policies are executed, several other factors may affect changes in air pollution. As a consequence, one needs to test the relationship between environmental policy outputs and outcomes against several other variables. The "Environmental Kuznets Curve" hypothesis posits that pollution and economic growth relate to each other in a nonlinear fashion. It assumes that while environmental damages initially increase with countries' economic development, they reach a peak point before ultimately decreasing (Grossman & Krueger 1991). Thus, one must account for this potential influence on changes in pollutant emission levels by incorporating the natural log of GDP per capita in the analysis (OECD 2018a). However, it is not only the absolute level of economic development that may affect emission levels, but also more short-term and temporary changes in countries' economic performance. Emissions may fluctuate strongly with the economic situation and are highly correlated with GDP and energy consumption (Giedraitis et al. 2010). Accordingly, the analysis takes account of these short-term changes by including the GDP growth rate in the analysis (OECD 2018a). In addition, there could be confounding effects related to demographic changes. Consequently, and in accordance with previous studies, changes in the total and the urban population are included in the analysis. The first aspect is controlled for by the growth rate of the population (OECD 2018b); the second is measured as changes in the share of the population living in urban areas (World Bank 2018b). Moreover, it has been found that the structural composition of a country's economy affects air pollutant emissions.
Thus, the analysis controls for changes in the size of the industrial and energy sector via their contribution to total GDP (World Bank 2018a). Last, it is necessary to account for the influence of the system of interest intermediation. At the heart of this concept lies the distinction between neo-corporatism and pluralism. Previous studies found that more corporatist arrangements generally tend toward an overall higher environmental performance due to their stronger tendencies to internalize externalities (Scruggs 2001). The majority of variables can be readily derived from the OECD, the IEA, and the World Bank databases (see Endnote 1). To account for the influence of the system of interest intermediation, the corporatism index as developed by Jahn (2016) is used, as it includes both the most important conceptual aspects of corporatism and the changes in corporatist arrangements over time. The dependent and independent variables are summarized in Table S1 in the online appendix. The sample is balanced, apart from some data points missing for PM10 for the early 1990s.

The analytical model
The analysis uses an ordinary least squares (OLS) regression with a first-difference dependent variable in which the independent variables enter as either first differences or levels. Given that changes in air pollutant emissions are measured as relative deltas, that is, annual differences as a percentage of the 1990 base value, the changes in the outcome variable are percentage point (pp) changes. As has frequently been discussed in the literature on international environmental regimes (Hovi et al. 2003), an environmental policy is effective to the extent to which it reduces the level of pollution below what would otherwise have been achieved. In the context of this analysis, all data points (years and country-sector-pollutants) without prior policy output changes are used as counterfactuals/baseline scenario while controlling for all changes in the outcome dimension that can be traced to other influences (for a similar approach see Ringquist & Kostadinova 2005).
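The construction of the dependent variable described above can be illustrated with a minimal sketch. The function name and the emission figures are hypothetical and serve only to show how annual differences are expressed as a percentage of the 1990 base value.

```python
def relative_deltas(emissions, base_year=1990):
    """Annual emission changes as a percentage of the base-year value.

    `emissions` maps year -> emission level (hypothetical units).
    Returns a dict year -> change relative to the previous year,
    expressed in percentage points of the base-year level.
    """
    base = emissions[base_year]
    years = sorted(emissions)
    deltas = {}
    for prev, curr in zip(years, years[1:]):
        deltas[curr] = (emissions[curr] - emissions[prev]) / base * 100.0
    return deltas


# Hypothetical series for one country-sector-pollutant combination.
series = {1990: 200.0, 1991: 190.0, 1992: 194.0, 1993: 170.0}
deltas = relative_deltas(series)
```

Because every annual difference is divided by the same 1990 level, a given value means the same thing in every year of a series, which is why the outcome changes can be read as percentage point changes.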
The standard errors are corrected to account for clusters in the data structure. More precisely, the analysis uses Driscoll and Kraay's covariance matrix estimator to correct the standard errors (Hoechle 2007). Although Driscoll and Kraay's standard errors tend to be slightly more "optimistic" than other estimators, in our specific case they tend to produce more robust results than other approaches, as the cross-sectional dimension (N) is relatively large compared to the temporal one (T) (Hoechle 2007). Moreover, there is a certain risk that regression disturbances are correlated not only over time but also between the different spatial units due to (i) potential trade-offs in simultaneously reducing different air pollutant emissions and (ii) the possibility of plant relocations to other countries with more lenient policy standards and lower production costs.
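The logic of the Driscoll and Kraay correction can be sketched for the simplest possible case. The sketch below assumes a single regressor without an intercept, consecutive integer periods, and Bartlett kernel weights; the function name, the data, and the lag truncation are hypothetical, and the sketch is not the estimation routine used for the reported models.

```python
import math
from collections import defaultdict


def driscoll_kraay_se(rows, max_lag):
    """Slope and Driscoll-Kraay standard error for y = beta * x + e.

    `rows` is a list of (unit, period, x, y) tuples. Simplifying
    assumptions: one regressor, no intercept, consecutive periods.
    """
    sxx = sum(x * x for _, _, x, _ in rows)
    beta = sum(x * y for _, _, x, y in rows) / sxx

    # Cross-sectional sum of the moment conditions x * residual per period.
    h = defaultdict(float)
    for _, period, x, y in rows:
        h[period] += x * (y - beta * x)
    h_t = [h[p] for p in sorted(h)]

    # Newey-West type long-run variance of h_t with Bartlett weights.
    s = sum(v * v for v in h_t)
    for j in range(1, max_lag + 1):
        weight = 1.0 - j / (max_lag + 1)
        omega_j = sum(h_t[t] * h_t[t - j] for t in range(j, len(h_t)))
        s += 2.0 * weight * omega_j

    return beta, math.sqrt(max(s, 0.0)) / sxx


# Hypothetical panel: two units observed over three periods.
rows = [
    ("A", 1, 1.0, 1.0), ("B", 1, 1.0, 3.0),
    ("A", 2, 1.0, 2.0), ("B", 2, 1.0, 4.0),
    ("A", 3, 1.0, 0.0), ("B", 3, 1.0, 2.0),
]
beta, se_lag0 = driscoll_kraay_se(rows, max_lag=0)
_, se_lag1 = driscoll_kraay_se(rows, max_lag=1)
```

Summing the moment conditions across units within each period before computing the variance is what makes the estimator robust to correlation between spatial units; the weighted autocovariances then account for serial correlation.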
To deal with serial correlation and to model dynamic processes, a lagged dependent variable is included in the analysis (Beck & Katz 2011). All other variables are lagged by one year. The only exceptions are the levels of and the changes in countries' economic output (GDP pc and GDP growth), which are assumed to exert an immediate effect on environmental outcomes. By contrast, one must opt for a more inductive approach in specifying the lag structure between policy outputs and outcomes. Given that we have no prior knowledge on when, if at all, clean air policy outputs make a difference for the outcome, it seems reasonable to allow the time lags to vary using the R-square value as the efficiency criterion. In the case of new emission standards for vehicles, for instance, it may take years until a sufficiently high number of old vehicles have been replaced by new ones and before a true difference in traffic emissions is both noticeable and significant. It is virtually impossible to specify these lagged effects deductively for all cause-effect relationships in the model ex ante. This lag structure also ensures that the policy output changes antecede (possible) changes in the outcome dimensions, thus reducing issues of reverse causality.
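The inductive choice of the lag structure can be illustrated with a simplified sketch. It assumes a bivariate regression with an intercept and a single time series, whereas the actual models include further covariates; the function names and the data are hypothetical.

```python
def r_squared(x, y):
    """R-squared of a simple OLS regression of y on x with an intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy ** 2 / (sxx * syy)


def best_lag(policy, outcome, max_lag):
    """Return (lag, R-squared) for the policy lag that fits the outcome best.

    All candidate lags are evaluated on the same observations
    (t >= max_lag) so that their R-squared values are comparable.
    """
    scores = {}
    for lag in range(1, max_lag + 1):
        x = [policy[t - lag] for t in range(max_lag, len(outcome))]
        y = outcome[max_lag:]
        scores[lag] = r_squared(x, y)
    lag = max(scores, key=scores.get)
    return lag, scores[lag]


# Hypothetical series: the outcome reacts to policy changes three years later.
policy = [0, 0, 1, 0, 0, 0, 2, 0, 0, 0, 0, 0]
outcome = [0, 0, 0, 0, 0, -2, 0, 0, 0, -4, 0, 0]
lag, fit = best_lag(policy, outcome, max_lag=4)
```

Only strictly positive lags are considered, so the policy change always precedes the outcome change it is compared with.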
A substantial share of the unit-specific variation is already eliminated by (i) using 1990 as base year and (ii) by referring to year-to-year differences. However, given that the dependent variable encompasses air pollutant emission changes not only across different countries but also across different sectors and pollutants, there still might be some unobserved factors that generally (dis-)favor outcome changes and which are not yet covered by the analysis. For instance, it could be the case that some sectors or pollutants generally have a higher reduction potential and are thus simply more prone to changes than others. To control for this unobserved unit-level heterogeneity and to ensure that it is not specific sectors or pollutants driving the regression results, dummy variables for each sector (industry and road traffic) and pollutant (NOx, SO2, CO, and PM10) are applied.
Time (fixed) effects are intentionally not included in the analysis. The only plausible way in which the time factor could be expected to affect air pollutant emissions is through technological advancement, that is, a constant trend affecting all countries in the panel both similarly and simultaneously. This dynamic, however, is already captured by the level of economic development, which has more or less constantly increased across all countries in the sample.

Examining the relationship between clean air policy outputs and outcomes
The first model, reported in Table 2, examines whether C&C regulations and NEPIs can be directly linked to changes in air pollutant emissions. The results reveal that neither C&C regulations nor NEPIs turn out to be significant predictors of changes in air pollutant emissions in general. It is possible to argue, however, that today's environmental policies were never meant to function in isolation but must be considered as parts of broader instrument mixes (Gunningham & Sinclair 1999). This view implies that policy instruments are intended to work in combination with each other. It is thus worth testing for the interaction effect between C&C regulations and NEPIs. As presented in model II, however, again no significant relationship between policy outputs and outcomes can be found. This finding suggests that, if no other factors are taken into account, there is always some gap between policy outputs and outcomes (for similar findings see Knill et al. 2012). This essentially implies that the proper operation of both C&C regulations and NEPIs seems to be strongly dependent on the implementation context and that, therefore, neither instrument type is more self-executing than the other. As a result, hypothesis 1 cannot be confirmed given that the null hypothesis cannot be rejected. Given this finding, let us now turn to the analysis of the factors that are hypothesized to affect the implementation of the different instrument types.
To recall, implementation structures are necessarily policy-specific, that is, they vary from policy to policy. Models III to VI thus present different interaction terms: one for the interaction between C&C regulations and the respective determinants of the implementation structures and one for NEPIs and the corresponding implementation structures. Given that the implementation structures per se cannot be expected to benefit the environment if there are no environmental policies to apply, the main effects of the key determinants of implementation structures are excluded from the analysis. Formally, this approach can be stated as "x + x:y" instead of "x*y". However, when using the ordinary approach (namely, a two-way interaction that includes both main effects as well as their interaction; Brambor et al. 2006), all of the findings also remain unchanged (see Table S2 in online annex).
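The difference between the two specifications can be made explicit with a small sketch. The function names, column labels, and coefficient values are hypothetical; the sketch only shows which regressors enter under "x + x:y" as opposed to "x*y", and how the effect of a policy change then depends on the implementation structure.

```python
def design_row(x, y, include_main_effect_of_y):
    """Regressors for one observation.

    x: policy output change (for example, a C&C regulation change)
    y: implementation-structure dummy (0 or 1)
    "x + x:y" keeps x and the product of x and y;
    "x*y" additionally keeps the main effect of y.
    """
    row = {"x": x, "x:y": x * y}
    if include_main_effect_of_y:
        row["y"] = y
    return row


def marginal_effect(b_x, b_xy, y):
    """Effect of a one-unit policy change given the structure dummy y."""
    return b_x + b_xy * y


restricted = design_row(2.0, 1, include_main_effect_of_y=False)  # "x + x:y"
full = design_row(2.0, 1, include_main_effect_of_y=True)         # "x*y"
effect_central = marginal_effect(b_x=-0.5, b_xy=-1.0, y=1)
effect_regional = marginal_effect(b_x=-0.5, b_xy=-1.0, y=0)
```

In the restricted form, an observation without any policy change contributes nothing through the implementation structure, which mirrors the argument that structures cannot benefit the environment if there are no policies to apply.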
The analysis reveals that C&C regulations have a stronger negative effect on air pollutant emissions when the implementing authorities are located at the national rather than the subnational level and thus are in the direct reach of the central government (Model V). Although in line with the theoretical expectation formulated above, this finding is remarkable as it contradicts arguments from the bottom-up view that decentralized structures are more effective given that the authorities are "closer" to the polluters. A possible explanation lies in the fact that environmental policy implementation is often of a rather "technical" nature and characterized by relatively low policy ambiguity and conflict. In these cases, Matland (1995) speaks of "administrative implementation" (p. 160), that is, processes that primarily require resources and strict hierarchical control to function properly. Moreover, it needs to be mentioned that a possible alternative explanation that cannot be fully ruled out on the basis of the analysis is that lower levels of government simply have less capacity than senior ones (for discussion see Howlett & Wellstead 2012).
In contrast to the other models, the findings of Model VI are somewhat tricky to interpret as, when including the interaction term, the main effect of the NEPI variable is both negative and significant without the interaction effect reaching levels of statistical significance. Given that there are no policies that are not executed and enforced through some sort of implementation structure, the coefficient of the main effect refers to all instances for which no NEPI changes are observed and thus actually points toward an overall ineffectiveness of NEPIs. Figure 3 presents the predicted values for the interaction between C&C regulation and the central control dummy. The x-axis indicates the ambitiousness of the policy measures taken, while the y-axis presents their effect on air pollutant emissions when all other variables are held constant, that is, kept at their mean. As indicated by the confidence intervals' upper and lower bounds, the plots reveal that it is only for the combination of C&C regulation and implementing authorities being located at the central level that we can safely conclude that the effect of policy output on outcomes is negative and significantly different from zero. All in all, hypothesis 2c is thus the only one that can be confirmed on the basis of the presented analysis, while hypotheses 2a, 2b, and 2d cannot. However, hypothesis 2c also requires some modification as the theoretical expectation does not hold true for both instrument types but only for C&C regulations.
With regard to the confounding variables, the results confirm the discussed expectations from previous research. Across all models, the most robust predictor of emission changes is GDP growth. Periods of strong economic growth increase air pollutant emissions. The coefficients for the fixed effects are not reported in the different regression tables. However, it seems important to discuss them briefly. Despite some variation across the models, emissions from road transport are particularly difficult to reduce, while SO2 emissions (from both road transport and industry) have a reduction potential that is constantly higher than that of the other pollutants in the sample (see also Jahn 2016, p. 102).
Given that the results of the analysis do not seem to substantially correspond to the theoretical expectations, one promising way to think of the different determinants of the implementation structures could be to consider them as a set of individually necessary but only jointly sufficient conditions for policy effectiveness. This logic suggests that only when most, or even all, of the challenges discussed above are absent, clean air policies might have the intended effects. A separate analysis of each of the determinants is blind to such compound effects. To check for this, it makes sense to simply sum up the values of the different indicators. The highest possible value of "four" thus represents the empirical manifestation of a single central specialized authority without private sector actor involvement. The value "zero", in turn, exemplifies multiple decentralized unspecialized authorities with private sector actor involvement. It seems most plausible to suggest indicator substitutability as the dominant logic for defining the "gray zone" between these two "poles" or "extremes". This implies that the absence of one feature can be compensated by the existence of another (Goertz 2006), which corresponds to Wittgenstein's (1953) logic of family resemblance.
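The additive index can be written down in a few lines. The indicator names below are hypothetical labels for the four binary determinants described in the text (single authority, specialization, central control, and absence of private sector involvement).

```python
INDICATORS = ("single_authority", "specialized", "central", "public_only")


def implementation_index(structure):
    """Sum four binary indicators (each coded 0 or 1) into an index from 0 to 4."""
    for name in INDICATORS:
        if structure[name] not in (0, 1):
            raise ValueError(f"{name} must be coded 0 or 1")
    return sum(structure[name] for name in INDICATORS)


# The two poles described in the text.
strongest = dict(single_authority=1, specialized=1, central=1, public_only=1)
weakest = dict(single_authority=0, specialized=0, central=0, public_only=0)

# Substitutability: different profiles can reach the same index value.
decentralized = dict(single_authority=1, specialized=1, central=0, public_only=1)
with_private = dict(single_authority=1, specialized=1, central=1, public_only=0)

scores = [implementation_index(s)
          for s in (strongest, weakest, decentralized, with_private)]
```

Because the indicators are simply added, the index does not distinguish which feature is missing, only how many are present. This is the substitutability assumption stated above.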
Model VII in Table 3 presents the interactions between C&C regulations/NEPIs and the aggregated values for the respective implementation structures. Again, NEPIs have no significant influence on the outcome variable. C&C regulations, in turn, appear to have a stronger negative effect on air pollutant emissions the higher the aggregated value of the implementation structures is. The relationship between C&C regulations and implementation structures is shown in Figure 4. Here, it becomes clear that it is only for implementation structures with a value greater than "two" that we can safely conclude that the effect of policy output on outcomes is negative and significantly different from zero. The most ambitious policy outputs that are put into practice through implementation structures with an index value of "three" decrease air pollutant emissions on average by roughly 2.5 percentage points. In implementation structures with an index value of "four", policy measures with a similar degree of ambitiousness even lead to a 4.9 percentage point decrease in air pollutant emissions.
There are several aspects of both the conceptual and methodological approach chosen that could affect the empirical findings. A potential reason why the presented hypotheses could not be (fully) confirmed based on the analysis might be that using the rather broad category of NEPIs could obfuscate nuanced differences between economic- and information-based instruments. It could be the case that, for instance, information-based instruments are simply unsuitable to change people's consumption behavior, no matter how well or badly they are actually implemented and enforced. Thus, it makes sense to (re-)divide the group of NEPIs and only run the analyses for economic-based instruments. As presented in Table S3 in the online annex, all key findings remain robust. Another potential caveat to the use of the proposed index to capture the ambitiousness of the policy output changes observed is that, in reality, the differences between changes in policy targets, instruments, and instrument settings (level and scope) may be either greater or smaller than indicated. To take account of this, one must check for different index specifications. This was performed by raising the initial index value to the power of "2" (x^2) and "0.5" (x^0.5). The former specification comes with a stronger distance between policy targets, instruments, and instrument settings, while the latter represents a smaller one (see Tables S4 and S5 in the online appendix). Another issue that might affect the results is the choice of the base year. If, for instance, the level of emissions of a certain type was unusually high in one of the countries in 1990, this would affect the variable for the rest of the time series. To prevent potential analytical distortions resulting from the chosen base year, the analysis is re-run using different baseline years (see Tables S6 and S7 in the online appendix).
Other models, in turn, control for the influence of energy prices (Table S8 in the online appendix). In Table S9 in the online appendix, the analysis is replicated using ordinary panel-corrected standard errors as proposed by Beck and Katz (1995). As presented in the online annex, none of these modifications alters the key results. Moreover, it makes sense to run separate analyses for each pollutant individually. While the effects of C&C regulations and implementation structures consistently point in the same direction as in the pooled analysis, they do not reach levels of statistical significance for PM10 (see Table S10 in the online appendix). Moreover, NEPIs do seem to make some difference in combination with implementation structures in the context of CO emissions. This finding, however, has to be treated with great caution given that, in the case of the disaggregated analysis, it relies on the analysis of only five instances of NEPI changes.
Last, it seems necessary to highlight that the relatively low (adjusted) R-square value (about four percent) in the regression models is not a particular problem given the X-centered research design. In other words, the analysis does not intend to explain the whole variance of air pollutant changes but primarily the impact that is caused by environmental policies. It thus shares the relatively low (adjusted) R-square with other scholarly contributions that intend to explain the same phenomenon (see Knill et al. 2012).

Conclusion and discussion
This article examined how different instrument types (namely, C&C regulations and NEPIs) and determinants located at the implementation stage interact. By means of quantitative analysis techniques, it was detected that only C&C regulations that are put into practice through well-equipped and -designed implementation structures can be systematically associated with reductions in air pollutant emissions. NEPIs, in turn, do not make any difference for the outcome dimension, no matter how they are executed and enforced. While the first finding is very much in line with (theoretical) propositions made in the existing literature, the latter is more surprising. While it would be premature to condemn NEPIs as entirely ineffective on the basis of the presented analysis, two aspects in particular stand out: first, NEPIs seem to be not as self-executing as often argued in the literature and, second, the factors guaranteeing the proper working of C&C regulations are not the same ones that can be used to explain the effectiveness of NEPIs.
For now, the key findings of this article must be read as a clear plea for a "green leviathan" (Duit et al. 2016) and for strong governmental intervention in the area of environmental policy. While the regulators should generally be congratulated on their choice of instruments (about 90 percent of all policy changes are related to C&C regulation), there is still much to gain at the implementation stage. Here, only about 50 percent of the implementation structures are designed in a way that ensures a forceful application of environmental policies. From this perspective, public and scientific discourses about using "new" policy means to combat problems of environmental pollution and degradation are somewhat misguided. Rather, governments should invest more in their enforcement capacities (for a similar finding see Adam et al. 2019).
Future research should delve more deeply into the question of which specific factors condition the implementation of NEPIs. As these determinants could not be found within the administrative and bureaucratic structure, it seems most promising to look for more external factors such as the availability of alternative technologies, the industry's and citizens' willingness to pay for them, as well as the degree of market concentration (for a first theoretical political economy framework relevant to environmental policy design see Jenkins 2014). In this context, it could also be advisable for future research to extend the scope of the presented analysis to GHG emissions, which appear to be targeted by NEPIs more frequently than the conventional air pollutants analyzed in this article. These aspects should not imply that no more investigation is necessary with regard to the optimal design of implementation structures. This paper was a first attempt to study the influence of different instrument types and implementation structures in a comparative fashion. To enable comparison, the analysis focused on the manifest and most visible patterns of implementation structures. In reality, however, there are far more nuanced and multidimensional differences between public entities in charge of overseeing and carrying out implementation. In this context, it seems most promising to look for more fine-grained indicators that allow assessing public authorities' implementation capacities.
Moreover, I wish to thank the four Regulation & Governance referees for their very helpful suggestions.
Endnote
1 The analysis initially did not include energy prices in the main model as they are said to be strongly correlated with economic growth. Moreover, the available dataset as provided by the IEA (and available via the homepage of the UK government (2018)) suffers from some missing data.

Supporting information
Additional Supporting Information may be found in the online version of this article at the publisher's web-site:
[...] including the main effect of the policy change and the implementation structure variables.
[...] raising the initial index value to the power of "2" (x^2).
[...] raising the initial index value to the power of "0.5" (x^0.5).