Conceptualizing and Measuring Autocratization Episodes

Autocratization affects democracies and autocracies with gradual setbacks in democratic qualities. The current debate on autocratization lacks a comprehensive and systematic overview of different autocratization concepts and empirical measures. Addressing this gap, this research note identifies and discusses different strategies of operationalizing autocratization periods with continuous democracy data from Freedom House, Polity IV and the Varieties of Democracy (V-Dem) project. Evidence for 26 different autocratization measures for 1900-2019 reveals major inconsistencies between different measures. Our findings suggest that autocratization episodes should be measured with V-Dem’s Electoral Democracy Index (EDI) or with V-Dem’s Liberal Democracy Index (LDI), which provide fine-grained data and the possibility to test for measurement noise. A 10% threshold reduces the risk of conceptual stretching and enables researchers to detect both autocratization episodes that do and do not result in regime breakdown. We also recommend that researchers additionally test empirical findings with different carefully selected thresholds.


Introduction
The global trend of "autocratization" has been identified and described for some time now. Autocratization is widely considered to be a threat to democracy in many countries in both the global south and the global north (Cassani and Tomini 2019; Mechkova et al. 2017; Waldner and Lust 2018). In contrast to competing concepts such as "democratic backsliding" (Bermeo 2016; Haggard and Kaufman 2016; Waldner and Lust 2018) or "democratic deconsolidation" (Foa and Mounk 2016; Shin 2020), autocratization broadly refers to the decline in democratic qualities of any democratic regime that may result in the weakening or the breakdown of democracy, but also to the recession of democratic characteristics in authoritarian regimes. Lührmann and Lindberg define autocratization as a "substantial de-facto decline of core institutional requirements for electoral democracy" (Lührmann and Lindberg 2019: 1096). Substantively, this includes three different modes: (a) democratic recession refers to a gradual decline in the democratic quality of a given democratic regime; (b) democratic breakdown describes the collapse of a democratic regime towards an authoritarian regime; and (c) autocratic consolidation denotes the erosion of democratic qualities in autocracies, that is, a process of "autocratic hardening" (Levitsky and Way 2020).
The recent wave of studies on autocratization waves (Skaaning 2020; Tomini 2021) has greatly improved our understanding of the spatial and historical dimensions of the decline or decay of democratic institutions in the late twentieth and early twenty-first centuries. The often gradual nature of democratic erosion, however, raises serious conceptual and methodological challenges. These include informed decisions on necessarily arbitrary thresholds and the question of when autocratization periods start and end. Some studies operationalize autocratization episodes as connected periods of time with a substantial decline in the electoral democracy dimension (Laebens and Lührmann 2019), while others operationalize autocratization as a decline in the liberal democracy dimension. Dealing effectively with these challenges is one of the key tasks in current debates on autocratization and democratization. This research note aims to make an original contribution to this task. It discusses different definitions of autocratization, compares different operationalizations of autocratization and demonstrates how measurement choices affect assessments of how extensive autocratization is. We begin with a discussion of different strategies for conceptualizing and measuring autocratization with V-Dem, Freedom House and Polity IV data. Thereafter, we discuss the empirical implications of different autocratization measures. In sum, we suggest an autocratization measure that builds on the existing operationalization by Lührmann and Lindberg (2019) but adds confidence intervals to account for measurement choice.

Measuring Autocratization
Any conceptualization of autocratization has to begin with a clear understanding of what democracy is or is not. Although there is no single, universally accepted understanding of democracy in political science (Collier et al. 2006), the two procedural conceptions of democracy most prominent in empirical democracy research are electoral and liberal democracy. The electoral conception of democracy captures the degree to which a country observes Robert Dahl's institutional minima of a polyarchy. The liberal conception of democracy also incorporates civil liberties, rule of law and horizontal accountability as well as minority rights (Coppedge et al. 2011: 253).
Building on this dual conceptualization of democracy, we follow Lührmann and Lindberg (2019) and take an episode approach to measure the concept of autocratization rather than modelling democratic breakdown as a discrete outcome (see also Boese et al. 2020; Edgell et al. 2020). Democracies tend to erode gradually, and the detection of the start and end dates of such episodes enables scholars to analyze democratic breakdowns and resilience as a two- or multi-stage process of erosion. In addition, by taking an episode approach we are also able to display discrete outcomes.
To measure gradual decline in democratic qualities, we need nuanced cross-national time series data on different aspects of electoral and liberal democracy. To distinguish between autocratization in democracies and in autocracies, we need high-quality regime type data. For both, the Varieties of Democracy (V-Dem) dataset (Coppedge et al. 2020b) is a valuable source. In addition, this research note contrasts measurements with V-Dem data with measurements based on data from Polity IV and Freedom House. The Freedom House index largely measures the concept of liberal democracy, even though, strictly speaking, it measures political rights and civil liberties, but not democracy (Boese 2019: 119; Munck and Verkuilen 2002). The Polity IV index (Marshall et al. 2019) mostly measures the concept of electoral democracy, though it relies on the de jure rather than the de facto implementation of democracy and, in addition, includes the accountability dimension of (liberal) democracy (Boese 2019; Coppedge et al. 2020a: 28; Munck and Verkuilen 2002). V-Dem's Electoral Democracy Index (EDI) and the Liberal Democracy Index (LDI) measure electoral and liberal democracy.
Our review of the current scholarship identifies at least five varieties of autocratization measurement. Figure 1 portrays our autocratization coding scheme and shows the step-by-step decision-making guidelines to identify autocratization episodes with different increments at the start of a period as well as the cumulative drop of democratic qualities. Overall, we conceptualize 26 different ways to code autocratization episodes with four different democracy indices. The first set of measures relies on the EDI from the V-Dem dataset. It ranges on a continuous scale between 0 and 1, with higher values indicating a fuller realization of electoral democratic principles. For example, Lührmann and Lindberg (2019: 1100) define an autocratization episode as a drop of 0.1 or more on the EDI. An episode starts with a decline on the EDI of 0.01 points or more. A period ends when there is a temporary stagnation on the EDI with no further decline of 0.01 points in four years or when the EDI increases by 0.02 points. Laebens and Lührmann (2019) use a lower threshold and define autocratization episodes as a drop of 0.05 or more on the EDI. The other conditions remain, but Laebens and Lührmann add that the confidence intervals must not overlap between the year before autocratization and the last year in the autocratization period.
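The episode rules described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the coding script used for this research note: the function name `find_episodes`, its parameter names and the example series are hypothetical, changes are read as year-on-year changes, the end of an episode is taken to be the last year with a qualifying decline, and the cumulative drop is measured from the year before the episode starts.

```python
from typing import List, Tuple


def find_episodes(
    scores: List[float],
    start_drop: float = 0.01,   # annual decline that starts or extends an episode
    rebound: float = 0.02,      # annual increase that ends an episode
    stagnation_years: int = 4,  # years without further decline that end an episode
    total_drop: float = 0.10,   # cumulative decline required to count the episode
    eps: float = 1e-9,          # tolerance for floating-point comparisons
) -> List[Tuple[int, int, float]]:
    """Return (start_index, end_index, cumulative_drop) for each episode.

    Assumption: `scores` holds one democracy score per consecutive year
    on a 0-1 scale for a single country.
    """
    episodes = []
    n = len(scores)
    i = 1
    while i < n:
        if scores[i - 1] - scores[i] >= start_drop - eps:
            start = end = i
            j = i + 1
            while j < n:
                change = scores[j] - scores[j - 1]
                if change >= rebound - eps:
                    break                    # recovery ends the episode
                if -change >= start_drop - eps:
                    end = j                  # further decline extends the episode
                elif j - end >= stagnation_years:
                    break                    # stagnation ends the episode
                j += 1
            drop = scores[start - 1] - scores[end]
            if drop >= total_drop - eps:
                episodes.append((start, end, round(drop, 3)))
            i = end + 1
        else:
            i += 1
    return episodes


# Illustrative (made-up) series: four years of decline, then stagnation.
example = [0.80, 0.78, 0.74, 0.70, 0.69, 0.69, 0.69, 0.69, 0.69, 0.69]
print(find_episodes(example))
```

Passing `start_drop=0.05` and `rebound=0.08` corresponds to the coarser start rule discussed for the right path in Figure 1. An episode that is still running in the last observed year is returned as if it ended there, which is the right-censoring problem that all specifications share.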
By using the confidence intervals of the V-Dem measurement model, Laebens and Lührmann ensure that any autocratization episode is not a result of measurement noise (2019: 5). In a third empirical specification, we use the 0.1 threshold from Lührmann and Lindberg (2019) and add the requirement that confidence intervals should not overlap, as introduced by Laebens and Lührmann (2019). In a fourth specification, we set the threshold to 0.15, while in a fifth specification we set the threshold to 0.2. In both specifications we test for overlapping confidence intervals. The next three specifications follow the right path in Figure 1 and set the start of a period at a decline on the EDI of 0.05 points or more. A period stops when there is a temporary stagnation on the EDI with no further decline of 0.05 points in four years or when the EDI increases by 0.08 points. We set the cumulative thresholds to 0.1, 0.15 and 0.2 and test for overlapping confidence intervals. By setting the period start to a decline on the EDI of 0.05, we take into account that the smallest possible step on the 21-point integer scale of Polity IV is 0.05 after rescaling. All empirical specifications share the problem of right-censored cases.

Figure 1 notes: [...] Table 1, Coppedge et al. (2020b). Green boxes indicate coding steps that are only possible with V-Dem democracy indices, which provide measurement errors. *A 1% decrease is set by Lührmann and Lindberg (2019) and Edgell et al. (2020). **Due to the integer scale between -10 and 10 in the Polity IV project (and the scale between 1 and 7 with 0.5 steps in the Freedom House data), we set the start of an episode at a 0.05 decrease. For the Polity scale the smallest annual change can be 0.05 (Freedom House: 0.083).

The second empirical approach relies on the Liberal Democracy Index from the V-Dem dataset. It also ranges on a continuous scale between 0 and 1. From a theoretical standpoint, the LDI is appropriate for measuring autocratization episodes when researchers want to include measures that cover the erosion of rule of law, erosion in respect for civil liberties, and the executive aggrandizement of legislative and judiciary powers. We use the same thresholds as before for the LDI and propose eight different operationalizations of autocratization as a decline in the LDI (see Table 1, Figure 1).
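The non-overlap condition on confidence intervals can be illustrated with a short Python sketch. It assumes that each country-year comes with a lower bound, a point estimate and an upper bound for the index; the names `Estimate` and `is_distinguishable_decline` and all numbers are hypothetical and serve illustration only.

```python
from typing import NamedTuple


class Estimate(NamedTuple):
    """Index value for one country-year with its uncertainty bounds."""
    low: float
    point: float
    high: float


def is_distinguishable_decline(before: Estimate, last: Estimate) -> bool:
    """True if the intervals do not overlap and the later one lies lower.

    `before` is the year before the episode starts, `last` is the final
    year of the episode.
    """
    return last.high < before.low


# Made-up values: the first decline clears the noise test, the second does not.
pre = Estimate(low=0.78, point=0.82, high=0.86)
print(is_distinguishable_decline(pre, Estimate(0.64, 0.70, 0.76)))  # True
print(is_distinguishable_decline(pre, Estimate(0.70, 0.75, 0.80)))  # False
```

In the second call the point estimate still falls by 0.07, but the upper bound of the later year (0.80) exceeds the lower bound of the earlier year (0.78), so the decline could not be separated from measurement noise under this rule.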
In addition to the operationalization with V-Dem data, we replicate the results with data from Freedom House (2018) and Polity IV (2019). First, we rescale both democracy indices between 0 and 1 to operate with the same autocratization definitions as before. We follow the right path in Figure 1 and operationalize autocratization periods with Polity IV data by defining an autocratization period as a drop of 0.1 that starts with a decline of 0.05 and ends with an increase of 0.08 or a four-year stagnation. As before, we also use a lower threshold and define autocratization episodes as a drop of 0.05 or more on Polity IV. We also set the threshold to a drop of 0.15 and 0.2. In addition, we test whether the number of autocratization episodes is affected when following the left path in Figure 1 and setting the period start at 0.01. These specifications result in seven different Polity IV measures of autocratization.
The Freedom House operationalization poses a challenge because the combined political rights and civil liberties score runs from 1 to 7 in 0.5 steps, which restricts the possible thresholds. However, because the smallest possible annual change (0.083 after rescaling) exceeds the start thresholds of both the left and the right path in Figure 1, every annual decline in the Freedom House index qualifies as the starting point of an episode. Taking this into account, we set the cumulative drop at 0.1, 0.15 and 0.2 to compare the results with the other autocratization measures.
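The step sizes of 0.05 and 0.083 follow directly from rescaling both indices to the 0-1 range. The sketch below shows the arithmetic. It assumes a simple linear rescaling and, for Freedom House, an inversion so that higher values mean more freedom (Freedom House ratings run from 1, most free, to 7, least free); the function names are hypothetical and the exact rescaling used for the replication may differ.

```python
def rescale_polity(polity_score: float) -> float:
    """Map the Polity scale (-10 to 10) linearly onto 0-1."""
    return (polity_score + 10) / 20


def rescale_freedom_house(combined_rating: float) -> float:
    """Map the combined Freedom House rating (1 to 7, 0.5 steps) onto 0-1.

    Assumption: the scale is inverted so that 1 (most free) becomes 1.0
    and 7 (least free) becomes 0.0.
    """
    return (7 - combined_rating) / 6


# Smallest possible annual change on each rescaled index.
polity_step = rescale_polity(6) - rescale_polity(5)                # 1/20
fh_step = rescale_freedom_house(2.0) - rescale_freedom_house(2.5)  # 0.5/6

print(round(polity_step, 3), round(fh_step, 3))
```

Because the Freedom House step of roughly 0.083 is larger than both start thresholds (0.01 and 0.05), any annual decline on that index opens an episode, while a single one-step decline stays below the 0.1 cumulative threshold.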
One problem of the literature on incremental regime changes is the atheoretical nature of different thresholds (Coppedge 2017; Edgell et al. 2020; Tomini and Wagemann 2018). This is not only a challenge for measuring autocratization episodes but affects a multitude of measurement issues, in particular the measurement of democracy: "the decision of where to place the cut off points is rarely, if ever, justified in a theoretical manner" (Clark et al. 2017: 221) but primarily by measurement logic (cf. Munck and Verkuilen 2002).
Figure 2 compares different autocratization measures by looking at the global trend of autocratization. It delineates a global trend line of autocratization and shows the similarities and differences of the 26 autocratization operationalizations defined previously. In addition, we created a Shiny web application that uses interactive graphs and a world map of autocratization to unravel the spatial and temporal interrelations of autocratization periods (see https://larspelke.shinyapps.io/AutocratiaztionMeasures/). Substantively, the first row in Figure 2 shows that the EDI and LDI operationalizations at the 0.05, 0.1, 0.15 and 0.2 thresholds yield very similar global time trends. Unsurprisingly, operationalizations with a stricter threshold identify far fewer autocratizing countries than measures with a lower threshold. Compared to the V-Dem operationalizations, Polity IV-based operationalizations count substantially fewer autocratization events, both in the number of autocratization episodes and in the number of countries affected by backsliding. Finally, Freedom House operationalizations count substantially more autocratization periods between 1972 and 2018 than the V-Dem and Polity IV measures of autocratization.
In the second row of Figure 2, we compare those autocratization episodes that start with a 0.05 initial decline at the beginning of an episode. The second row indicates that the annual differences in the number of autocratization episodes between Polity IV-based and V-Dem-based measures are clearly reduced. This finding indicates that the periods detected by the first row's V-Dem measures are longer and reflect early democratic backsliders, while Polity IV-based approaches are unable to identify such incremental starting points. Thus, setting the threshold for the start of an autocratization episode at a 5% decline results in a similar number of country-years for both V-Dem- and Polity IV-based approaches. This means that different thresholds, especially for the starting point of an episode, strongly affect substantive interpretations of the incremental nature and recent trends of autocratization processes worldwide (e.g. Lührmann and Lindberg 2019; Skaaning 2020).
To illustrate how much the different operationalizations influence the results regarding which regime types are affected by autocratization, Figures 3 and 4 present different operationalizations by regime types according to Regimes of the World (Lührmann et al. 2018). We see that the distribution of autocratization across democratic and autocratic country-years depends on the autocratization definition. All operationalizations find that most autocratization-years occurred under electoral autocracies. Moreover, Figure 3 shows that Polity IV operationalizations are relatively blind to autocratization in liberal and electoral democracies, while the EDI 0.1 and LDI 0.05 operationalizations note the most autocratization-years under liberal democracy. We therefore recommend proceeding with caution when using different conceptions of autocratization and running robustness tests on different operationalizations to obtain valid findings.
To further elaborate how the choice of measurement strategy affects the number of countries that autocratize and to add to the recent discussion on a third wave of autocratization (Lührmann and Lindberg 2019; Skaaning 2020), we calculate two related indicators of congruence between autocratization measures. We first construct an indicator of the percentage of congruence between autocratization measures on the country-level. Different autocratization measures operating with the same overall threshold should be able to detect the same autocratization cases even though different measures vary in the recorded start and end dates, the length of an episode and/or the degree of democratic erosion. We calculate the congruence between different autocratization measures operating with the same overall threshold as the share of the n measures that detect a given autocratization period a, where n = the number of autocratization measures with the same overall threshold, and a is an autocratization period that was identified by at least one of the 26 measures. For example, if all nine autocratization measures with a threshold of 0.1 identify an autocratization case, the congruence measure is 1 (100%); if only one measure identifies an episode, the congruence is 11.1%. In the next step, we calculate the mean of this congruence indicator by country over all thresholds (see Figure 5). Figure 5 indicates that a clear regional trend is not evident, while the congruence in detecting autocratization episodes varies widely between countries. Unexpectedly, both long-established democracies, such as Denmark, and countries that were long under authoritarian rule, such as Namibia, Zambia, Laos, Ukraine and Moldova, show low levels of congruence between different measures.
The lack of congruence between Polity IV-, Freedom House- and V-Dem-based measures of autocratization in these countries may be driven by the coarser scales of the Polity IV and Freedom House democracy indices and their inability to detect changes in the democratic quality of consolidated democracies as well as hard autocracies.
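The first congruence indicator can be reproduced with a few lines of Python. The sketch below is a hedged illustration based only on the worked example in the text (nine measures at the 0.1 threshold, congruence of 1 or 11.1%): the function name `congruence`, the measure labels `m1` to `m9` and the case labels are hypothetical, and matching the same episode across measures is assumed to have been done beforehand.

```python
from typing import Dict, Set


def congruence(detections: Dict[str, Set[str]]) -> Dict[str, float]:
    """Share of measures that detect each autocratization case.

    `detections` maps each measure (all with the same overall threshold)
    to the set of cases it identifies. A case enters the result if at
    least one of the supplied measures identifies it.
    """
    n = len(detections)
    all_cases = set().union(*detections.values())
    return {
        case: sum(case in found for found in detections.values()) / n
        for case in sorted(all_cases)
    }


# Made-up example with nine measures: case "A" is found by all of them,
# case "B" only by the first one.
measures = {f"m{i}": {"A"} for i in range(1, 10)}
measures["m1"] = {"A", "B"}

for case, share in congruence(measures).items():
    print(case, f"{share:.1%}")
```

Averaging these shares by country over all thresholds would then give a country-level summary of the kind mapped in Figure 5.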
As a second indicator of congruence, we identify those backslider countries that are only detected by some but not all of the following measures of autocratization (Bogaards 2012): EDI 0.1, LDI 0.1, Polity IV 0.1 or Freedom House 0.1. Table A1 in the Online Appendix shows the 423 autocratization episodes that at least one of the four measures fails to detect. In contrast, only 60 autocratization episodes were detected by all four measures, though the duration or the start and end year of these episodes may still differ. Of these 423 autocratization episodes, 177 were found by the EDI 0.1 measure, 102 by the LDI 0.1 measure, 140 by the Polity IV 0.1 measure and 202 by the Freedom House 0.1 measure.

Illustrations: Autocratization in the United States and in Hungary
The effects of measurement choice can be illustrated by two cases, both of which play a prominent role in the current debate on autocratization: the United States and Hungary (Bogaards 2018 Ziblatt 2018). By comparing these two potentially backsliding cases, we can show important differences between autocratization measures using the same threshold, which may result from different definitions of democracy as well as from the quality of the underlying democracy data. Figure 6 shows that all democracy indices record a decline in democratic qualities for the United States after 2015. However, only the EDI, LDI and Polity IV measures identify an autocratization episode, while Freedom House reports a one-point decrease in the political rights dimension (equivalent to a 0.083 decrease) that does not result in the classification as an autocratization episode. Yet the literature suggests that the United States is in a period of democratic erosion due to, among other factors, toxic polarization (Abramowitz and McCoy 2019; Graham and Svolik 2020) and a populist president in power (Freedom House 2020). V-Dem reports major declines in freedom of expression (-0.07 between 2015 and 2019) and in the clean elections dimension (-0.093) of electoral democracy, as well as in the legislative constraints on the executive (-0.098), a component of liberal democracy. In contrast, Freedom House documents a decline of one point in the electoral process and a two-point decline in the functioning of government between 2017 and 2018, which results in a one-point decline in the political rights dimension. That is, Freedom House rates the decline of democracy in the US as less severe than V-Dem does.
For Hungary, which V-Dem describes as "the most extreme recent case of autocratization" (Lührmann et al. 2020: 13)

Conclusion
The discussion in this research note shows that different operationalizations of autocratization lead to non-conforming classifications of autocratization periods (cf. Table A2, Figure 5). We recommend using V-Dem data to operationalize the gradual processes of autocratization and suggest the application of the Liberal Democracy Index or the Electoral Democracy Index and a 0.1 threshold (10% decline on a scale between 0 and 1) with testing for overlapping confidence intervals. Two reasons lead to this recommendation. First, testing for overlapping confidence intervals prevents the detection of autocratization periods that are largely based on measurement noise rather than real democratic erosion processes. Second, the 0.1 threshold reduces the risk of conceptual stretching of the autocratization concept and is also able to detect meaningful autocratization episodes that fall short of the tremendous decline in democratic qualities required by the 0.15 and 0.2 thresholds. However, for detecting cases of autocratization periods, the lower 0.05 threshold and the 0.15 and 0.2 thresholds also seem suitable.
In addition, to identify the starting point of an autocratization episode, we need fine-grained democracy scores. Therefore, Polity IV and Freedom House measurements of autocratization episodes seem to be inappropriate for covering the initial incremental phase of democratic erosion. Moreover, in contrast to V-Dem, Freedom House and Polity IV democracy scores do not provide confidence intervals. By using the confidence intervals of the V-Dem measurement model, we can ensure that the identification of an autocratization episode is not the result of measurement noise. This validation is not feasible with Polity IV and Freedom House data. Thus, we caution researchers against using the Freedom House and Polity IV autocratization measurements.
Finally, this research note shows that only 60 autocratization episodes were detected by all four illustrative measures, while 423 potential autocratization episodes were not detected by at least one of those measures. Due to the research interest and space limitations, this research note did not provide a systematic test of possible causes and consequences of autocratization. Still, the comparison of different conceptualizations supports the argument made by Skaaning (2020) [...]. An implication for future research is that scholars of democratic regression and autocratization who are interested in systematic quantitative tests of possible causes or consequences of democratic decline should use different measures of such episodes.

[Figure: democracy scores for Hungary; axes: Year, Democracy Score]

Data Availability Statement
The data that support the findings of this study are openly available in the Open Science Framework: https://doi.org/10.17605/OSF.IO/F9N23, reference number F9N23.