EU Cohesion Policy under the Media Spotlight: Exploring Territorial and Temporal Patterns in News Coverage and Tone

Carlos Mendez et al.

This article explores the territorial and temporal patterns of EU cohesion policy media coverage. The topic content and tone of news are analysed using topic modelling and sentiment analysis techniques, which are applied to a new corpus of over 4,000 English and Spanish news stories from the period 2010 to 2017 across three territorial levels. In line with our theoretical expectations, we found significant differences in the tone used across territorial levels, with national and transnational levels being more negative than the regional level. While national and transnational media place relatively more emphasis on politicized EU topics, subnational media focus more on substantive policy topics corresponding with EU policy objectives. Furthermore, media reporting on cohesion policy evolved significantly over time and reacted to external events, such as the euro and migration crises, as well as internal, country-specific events, such as Brexit in the UK and corruption scandals in Spain. However, the tone of cohesion policy news is positive overall, suggesting that the media can, in principle, contribute to public support for the policy and the EU more generally.


Introduction
The mass media play a critical role in the European public sphere by informing citizens about the EU. Research shows that the media not only present EU news but also redefine and reshape it, which can affect citizens' attitudes to the EU (de Vreese and Semetko, 2004; Vliegenthart et al., 2008), their European identity (Bruter, 2003, 2009) and voting behaviour (Banducci and Semetko, 2003; Giebler et al., 2017). The Europeanization of national media also has important normative consequences. By contributing to the development of a European public sphere and demos, the media can, in principle, enhance the legitimacy of the EU and buttress European integration democratically (Trenz, 2008; Risse, 2014).
Despite the increasing acknowledgement of the significance of the mass media for EU integration, there is scant research on media coverage of specific EU policies, with the exception of economic policies (de Vreese et al., 2001; Meyer, 2005; Jackson, 2011) and common foreign and security policy (de Vreese and Kandyla, 2009). Furthermore, much of the comparative literature on mainstream news has focused on transnational or national news without paying attention to regional news sources (cf. Perez, 2013; Hepp et al., 2016). This is surprising, given that EU institutions have made it a priority to communicate in partnership with national, regional and local opinion leaders and stakeholders since the 2002 White Paper on European Governance and, more recently, through a range of strategic initiatives to reconnect with citizens locally in the aftermath of the crisis.
This study aims to uncover and explain territorial and temporal patterns in the thematic coverage and tone of EU cohesion policy news. Drawing on media studies literature, we hypothesized that the geographical proximity of media outlets to their audiences would affect cohesion policy topic prevalence and sentiment. Research suggests that EU news coverage by regional media outlets concentrates more than that of national media outlets on content that has regional significance and an impact on readers' everyday lives, driven by the target reader orientation and resource restrictions of subnational media outlets (Hepp et al., 2016; Offerhaus et al., 2014). In the area of cohesion policy, evaluations of programme publicity strategies have found that regional and local media focus more on regional policy issues and what is actually delivered on the ground than national media (The Evaluation Partnership [TEP], 2013). Developing these insights, we expected regional newspapers to focus on positive topics about the substantive content and outcomes of cohesion policy investments and the implications for citizens' daily lives, and national news to emphasize negative news associated with controversial and politicized EU topics, such as noncompliance, misuse of funding and conditionality. We also hypothesized that temporal patterns in cohesion policy news coverage would be shaped by Europeanization dynamics, politicized events and policy debates at EU and national levels.
Cohesion policy is a highly relevant case for investigating media coverage across space and time, given its high visibility, subnational reach and political salience. Firstly, it is one of the most visible EU policies with a direct impact on people's daily lives through investments in infrastructure, business grants and training for people across all regions of the EU. Secondly, with its pioneering multilevel governance model and partnership principle, cohesion policy is credited with encouraging the participation and empowerment of subnational governments in regional policy and EU decision-making, as well as local and civic mobilization and networking at all levels.
Finally, cohesion policy accounts for a major share of the EU budget (around a third of the €1 trillion total in 2014-20) and is a classic 'redistributive' policy involving transfers of funding from richer states to poorer states (Bachtler et al., 2013). The periodic negotiations on the EU's multiannual financial framework (MFF) and cohesion budget heading are particularly salient in the media, as they expose political conflicts between net payer countries and net beneficiaries over the size and distribution of the EU budget.
To address our research aims, innovative techniques (structural topic modelling and sentiment analysis) were applied to an original dataset of online news at various territorial levels. The news corpus covers over 4,000 cohesion-policy-related news stories in English and Spanish over the period 2010-2017 at three territorial levels: the transnational, the national and the subnational. Importantly, this is the first quantitative study of news reports about the EU to include a sizeable sample of regional news sources and to estimate the effect of territorial levels on topic coverage and tone.
The article is organized as follows. The next section sets out our theoretical expectations of the territorial and temporal patterns of media coverage of cohesion policy. The research design and methodology are then explained. The empirical results are presented in the penultimate section, while the conclusion discusses the theoretical and policy implications.

News Variation across Time and Territory: Theoretical Expectations
The core of this study was to identify variations in news coverage of cohesion policy topics across time and territory and to explain their differences. To do so, we derived hypotheses on the coverage and tone of news topics from the theoretical and empirical literature. The two strands of hypotheses and theoretical expectations are identified and discussed in turn.

Territoriality Hypotheses
The literature on the Europeanization of media discourses and public spheres provides useful insights for investigating cross-national patterns and trends in news coverage. While there are no scholarly articles on news of EU cohesion policy, media studies of the EU's Lisbon agenda for growth and jobs are pertinent, as the agenda provided the strategic reference framework for cohesion policy interventions in 2007-13. In their study of the Europeanization of media reporting on the Lisbon agenda, de la Porte and van Dalen (2016) formulate rival hypotheses about the thematic focus of news coverage across EU member states. The thematic dissimilarity hypothesis suggests that broad EU socioeconomic strategies, such as the EU's Lisbon agenda, are likely to receive varied topic coverage across national media, reflecting the diverse themes addressed and divergent domestic priorities and interests (de la Porte and van Dalen, 2016; Meyer, 2005). This in turn implies fragmented and nationalized discourses and a less Europeanized public sphere in terms of the emphasis on European issues and actors (Koopmans and Pfetsch, 2006). Reviews of EU media studies confirm the national orientation of EU media coverage in terms of topic and actor focus (Machill et al., 2006), while an emerging Europeanization of the media or public sphere in national colours has been reported in recent studies (Segesten and Bossetta, 2019).
By contrast, if the topics reported are similar across countries and transnational media, this would provide evidence of a Europeanized media discourse and public sphere. As noted, most studies do not find strong evidence of the Europeanization of national media coverage of the EU and the public sphere (cf. Koopmans and Statham, 2010). Closer to the domain of interest, this thematic similarity (or 'synchronization') hypothesis has been confirmed in the media coverage of the EU's Lisbon agenda driven by EU priorities (de la Porte and van Dalen, 2016). Other media-related mechanisms supporting the Europeanization of the media and the public sphere include the transnational dissemination of EU stories by news agencies, formal cooperation between media outlets, mutual observation by journalists and opinion-related press reviews of European news (Erbe, 2005).
Similar expectations can be applied to media coverage of EU cohesion policy, as it is a key EU instrument for delivering the EU's Lisbon agenda (or 'Europe 2020' strategy in 2014-20). On the one hand, the reinforcement of common and increasingly prescriptive EU requirements to concentrate EU funding on the EU's thematic objectives, notably those of research and innovation, the competitiveness of small and medium enterprises (SME), information and communications technology and transport infrastructure, a low-carbon economy, employment and social inclusion, would imply a Europeanization of thematic focus in cohesion policy strategies and interventions and in the associated news coverage. On the other hand, research on EU governance architectures and the Lisbonization of cohesion policy has found that the breadth of EU objectives and the discretion available to tailor the objectives to the socioeconomic and policy contexts of each member state have led to wide variations in funding priorities across and within member states (Mendez, 2011). This in turn would lead us to expect variations in the media coverage of topics across countries compared with the transnational media, given its more international and less country-specific outlook:
H1a Thematic dissimilarity. The thematic focus of media reporting on cohesion policy varies across national and transnational media sources.
H1b Thematic similarity. The thematic focus of media reporting on cohesion policy is similar across national and transnational media sources.
The second dimension of territoriality concerns the national and subnational levels. While most studies of EU media have focused on cross-national differences in EU news coverage (for example, Alarcón, 2010; Boomgaarden et al., 2010; de Vreese et al., 2001), there is some evidence of subnational differences in media coverage of the EU. News coverage of the EU in regional sources tends to focus on content that has a regional significance and impact on the everyday lives of readers, driven by the Europeanization of regions, and the specific 'reader' orientation and restricted resources of subnational newspapers (Hepp et al., 2016; Offerhaus et al., 2014, p. 2). These studies did not analyse the news coverage of specific EU policies or its tone, and only one regional news source was examined per country, raising questions about the representativeness and robustness of the findings. Nevertheless, these findings are consistent with studies of news coverage patterns outside of the EU, which find that local news outlets pay more attention to topics that are 'closer to home' (Dunaway et al., 2010; Martin, 1988) and that have a human connection, compared with national news (Chandelier et al., 2018).
A territorial perspective on the media coverage of cohesion policy is particularly pertinent given the policy's strong focus on regions and its emphasis on local engagement and empowerment. In line with the above research, comparative evaluation research confirms that regional and local media focus more on cohesion policy substance and content issues than the national media do (TEP, 2013), although these findings are based on insights gained from interviews with EU journalists and policy stakeholders rather than from the direct empirical analysis of news items.
In line with EU media research, the evaluation of cohesion policy publicity strategies also found that regional and local news outlets are more interested in reporting about cohesion policy topics that have a connection with local realities than national news (TEP, 2013). Related to this, regional media are more likely to emphasize local governments and authorities in their cohesion policy stories than national media because local actors tend to interact more with media outlets at the subnational level to publicise their economic development activities (TEP, 2013). Economic development policies with a strong subnational dimension in terms of territorial jurisdiction or governmental competence include transport infrastructure, urban and rural development and local regeneration, community development and cultural heritage policies (Ismeri, 2010). Based on these considerations, we formulated the following hypothesis:
H2 Subnational local focus. Media reporting on cohesion policy by subnational media is more likely to focus on local realities and themes than the national media.
Classic news value theories identify 'negativity' as an important factor determining the selection of news content by journalists (Galtung and Ruge, 1965), in line with the dictum that 'bad news is good news'. There is a vast literature on political communication providing evidence of a negativity bias in the media, especially in relation to US political campaigning (for an overview, see Soroka and McAdams, 2015). The existence of a negativity bias in the national news coverage of the EU is also well documented, as is the focus on politicized decision-making (Alarcón, 2010; Norris, 2000; Schuck et al., 2011). Whether regional or local newspaper outlets are less prone to a negativity bias remains an open question, owing to the lack of research. However, given that we expect subnational news outlets to focus on cohesion policy stories about substantive policy content and outcomes, the news coverage is likely to be less negative than national or transnational news, which have a more political and conflictual slant:
H3 Subnational negativity. Media reporting on cohesion policy by subnational media is less negative in tone and topic focus than national media.

Temporal Hypotheses
Previous research has found that news coverage of EU policies is low (Hobolt and Tilley, 2014). However, Europeanization theories suggest that the increased impact and scope of EU policies over time will attract increased media attention (de la Porte and van Dalen, 2016; Koopmans and Statham, 2010; Meyer, 2005). Implied in the concept of a European public sphere is the idea that a public space will emerge for shared political discussions about the EU in which an increasingly Europeanized media plays a key role. Empirical studies provide clear evidence of increased coverage of EU news in national media over time (reviewed in Risse, 2014; Walter, 2017), for example, Vliegenthart et al. (2008), Boomgaarden et al. (2010), Koopmans and Statham (2010), Schuck et al. (2011) and Hepp et al. (2016), among others. Yet studies of temporal dynamics in the media coverage of EU policies remain scarce and are non-existent in relation to cohesion policy, although increased news coverage has been reported in relation to EU agricultural and especially monetary policies (Koopmans and Statham, 2010). The dynamics of media Europeanization therefore generate the following hypothesis:
H4 Europeanization. Media reporting on cohesion policy increases over time.
Temporal patterns in media coverage are unlikely to be linear, at least in the short term. Media research suggests that news coverage is often episodic (Iyengar, 1991), cyclical and event-driven (de Vreese et al., 2001; Gleissner and de Vreese, 2005; Jackson, 2011; Meyer, 2005), characterized by sudden shifts in media coverage in line with the emphasis of agenda setting theory on punctuated equilibrium dynamics (Boydstun, 2013, p. 56). Media Europeanization and blame attribution theories suggest that media coverage is likely to be influenced by changes in the strength of the EU's sanctioning regime to discourage non-compliance with EU rules (Meyer, 2005). For instance, longitudinal analysis of news media across several EU member states shows that the euro crisis and EU responses have intensified newspaper coverage of the EU (Hepp et al., 2016). De la Porte and van Dalen (2016) find that temporal patterns in media reporting on the EU's Lisbon agenda were driven mainly by EU-level changes in the strategy by EU institutional actors.
These insights are particularly relevant in accounting for cohesion policy news coverage patterns, given the policy's dynamic regulatory and discursive framework shaped by wider EU budgetary bargains, overarching growth strategies and sanctioning regimes (Mendez, 2011). The political and time-specific intergovernmental conflicts surrounding negotiations of the EU MFF are important because cohesion policy takes up a major share of the EU budget and plays a key role in determining net budgetary balances, given its redistributive nature. As the EU's paradigmatic redistributive policy, it is characterized by politicized bargaining over funding, which often involves side-payments to facilitate wider EU goals and package deals. Moreover, the direct involvement of heads of state in agreeing budget deals and the visibility of financial winners and losers make 'high politics' MFF events salient in the media (Bachtler et al., 2013, p. 263).
The impact of the economic crisis and EU policy response is also likely to be pertinent to cohesion policy. Evaluation research suggests that the volume of media coverage of cohesion policy declined in the immediate aftermath of the crisis as other, more urgent EU priorities attracted media attention (TEP, 2013). The logical implication is that subsequent shifts in EU priorities and policy debates surrounding cohesion policy would lead to a corresponding shift in media attention. For instance, part of the EU's response to the fears of contagion from the sovereign debt crisis was to utilize cohesion policy as a corrective sanctioning tool through reinforced macro-conditionality rules requiring the suspension of cohesion policy funding for non-compliance with EU economic governance rules. A qualitative analysis of a small sample of 'transnational' news items (from Politico, Euractiv and the Financial Times) highlighted the highly politicized and polarized debates between member states and across EU institutions on the introduction and enforcement of conditionality in cohesion policy (Coman, 2018). These considerations inform the 'EU politicization' hypothesis:
H5 EU politicization. Media reporting on cohesion policy increases in focus on politicized EU topics in response to EU events and policy debates.
Temporal patterns in news coverage may share common patterns across EU member states but may also exhibit distinctive features, given the different national contexts in which the media operate (de Vreese et al., 2001). There is evidence that temporal trends in the news coverage of EU policies show more variation across countries than news relating to EU institutions (Hepp et al., 2016). As suggested by classic functionalist approaches to media behaviour, the media in any given country are strongly influenced by their political and economic environments and respond to domestic political and social pressures (McQuail, 1994, p. 121).
The two cases studied here exhibit distinctive domestic political and economic contexts that were likely to play an important role in media coverage of the EU and cohesion policy in recent years. Spain was among the EU member states most adversely affected by the economic crisis, particularly in terms of the rise in unemployment, exposing the underlying weaknesses in its economic model and policies, as well as the widespread corruption scandals associated with the misappropriation of public funds. Brexit has been the central issue dominating domestic and EU political affairs in the UK since the 2016 referendum, which is likely to be reflected in EU and cohesion policy media coverage. A historical analysis of EU media coverage in the UK's national press confirms that media coverage tends to rise during historic EU events that have challenged national sovereignty and led to rifts within and across political parties over their position on the EU (Copeland and Copsey, 2017). The final 'national politicization' hypothesis is:
H6 National politicization. Media reporting on cohesion policy increases in focus on politicized national topics in response to national events and policy debates.

Research Design, Methods and Data
This section presents the research design and data collection and processing. It also discusses the main methodologies used to analyse the corpus of documents, both of which are established techniques of natural language processing. In view of word constraints, the extended technical details are included in a supplementary online appendix.

Design and Data Collection
The cases selected for this study are Spain, the UK and the transnational media. The choice of the UK and Spain was motivated by several reasons. Firstly, they are both large countries with a high level of political decentralization and significant regional autonomy in cohesion policy decision-making, implying a strong regional media presence. Secondly, they are old EU member states with long experience of managing cohesion policy and have both received substantial EU funding, especially in the less-developed regions, suggesting that funding is of relatively high visibility, although this is more so in Spain. The establishment of the European Regional Development Fund in 1975 was due in part to UK demands to improve its net budget balance with the EU, and the UK has pioneered the application of governance innovations such as the partnership principle, especially in Scotland (Bache, 1999). Among the less-developed member states, Spain was the largest recipient of cohesion policy funding throughout the 1990s and 2000s following its accession in 1986.
The rationale was also to compare media coverage and sentiment between a country that is a net beneficiary of EU funding and has relatively pro-EU attitudes (Spain) and a net payer country that has a relatively eurosceptic public opinion (the UK). Finally, the transnational media were selected for comparison with the national media as they provide a control case for testing our hypotheses. In addition, the transnational media tend to focus on EU-related matters in a more consistent manner than the national media, aiming to cover issues that concern all member states.
Having identified the cases, the next task was to design a data collector. We implemented our own Python-based web crawler for data collection. The data collector takes specific keywords and media sources as inputs and executes a series of web crawlers to collect and process media content (the technical details of the data collector are described in Appendix 1). The first step involved the identification of keywords related to EU cohesion policy for both languages (listed in Table A1). It should be noted that for each word (or group of words) we included different word endings (for example, fund, funds or funding). The next step was to identify relevant online media. Where possible, we strove to ensure that the media sources varied in terms of ideological profile by including both right and left-leaning sources. Crucially, we also identified media sources that were rooted at different territorial levels, that is, targeting mostly national or regional audiences. Thus, we included both left and right-leaning national media sources (for example, The Telegraph and The Guardian for the UK, and El País and El Mundo for Spain) as well as regional sources (for example, The Scotsman or La Voz de Galicia) for each case. In addition, we have a transnational category of sources whose audience is more international in scope (for example, the Financial Times, Politico, Euractiv). Table A2 in the Appendix shows the complete list of news media grouped by level.
Having collected our corpus, we then developed a metric, which we call an EU relevance metric, for filtering the corpus of data. A filter is necessary because a keyword search can generate many non-relevant articles. The EU relevance metric was based on the following definitions, each computed per article:
(1) the intersection of the keywords and the content of the article;
(2) the length of the filtered content, $\mathrm{Length}_{\mathrm{contentFiltered}}$, that is, the content of the article after removing the stop words (common words and prepositions);
(3) the total number of instances in which the word Europe (including 'EU', 'European' and so on) shows up in the body content of the article.
Using these definitions, we formally calculated our EU relevance metric value for each article. By using higher thresholds of our EU relevance metric we were able to calibrate the filter so as to improve precision (that is, the probability that a given article is relevant), even at the expense of filtering out some potentially relevant articles. This trade-off was acceptable because it was more important for our analysis that the corpus was relevant to the topic of cohesion policy.
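To make the filtering logic concrete, the following is a minimal sketch of how the three quantities defined above could be computed and thresholded. The keyword set, the stop word list and, in particular, the way the quantities are combined into a single score are illustrative assumptions; they are not the keyword list of Table A1 or the formula used in the study.

```python
import re

# Hypothetical inputs: stand-ins for the study's keyword list and stop word list.
KEYWORDS = {"cohesion", "fund", "funds", "funding", "regional"}
STOP_WORDS = {"the", "a", "an", "of", "in", "on", "for", "to", "and", "is", "by"}
EUROPE_PATTERN = re.compile(r"\b(?:eu|europe|european)\b", re.IGNORECASE)


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def relevance_components(body):
    """Return the three per-article quantities described in the text."""
    filtered = [t for t in tokenize(body) if t not in STOP_WORDS]
    keyword_hits = set(filtered) & KEYWORDS  # intersection of keywords and content
    return {
        "keyword_hits": len(keyword_hits),
        "filtered_length": len(filtered),
        "europe_mentions": len(EUROPE_PATTERN.findall(body)),
    }


def relevance_score(body):
    """ASSUMED combination (not the study's formula): hits per filtered token."""
    c = relevance_components(body)
    if c["filtered_length"] == 0:
        return 0.0
    return (c["keyword_hits"] + c["europe_mentions"]) / c["filtered_length"]


def filter_corpus(articles, threshold):
    """Keep only articles whose score reaches the threshold."""
    return [a for a in articles if relevance_score(a) >= threshold]
```

Raising `threshold` in this sketch discards more borderline articles, which mirrors the precision-oriented calibration described above.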
Applying the filter to the documents collected returned N = 4,418 documents over the 2010-17 period. This constituted our corpus of documents. The distribution was highly skewed, however, with a very large number of relevant articles identified for the Spanish case (N = 3,217). The numbers for the UK and the transnational media were N = 692 and N = 509, respectively.
Before any computational text analysis can be conducted, extensive pre-processing of the corpus is required. Details of the pre-processing are provided in the Appendix. The most important aspects were standard pre-processing techniques for text analysis: tokenization, stop word removal, bigram detection and lemmatization.
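The four pre-processing steps can be illustrated with a short, self-contained sketch. It is not the pipeline used in the study: the stop word list is a toy example, the lemmatizer is a crude suffix-stripping stand-in for a proper language-specific lemmatizer, and the order of the steps and the `min_count` threshold for bigrams are assumptions.

```python
import re
from collections import Counter

# Toy stop word list (assumption); real lists are language-specific and much longer.
STOP_WORDS = {"the", "a", "an", "of", "in", "on", "for", "to", "and", "is", "are"}


def tokenize(text):
    """Split lower-cased text into alphabetic tokens (including Spanish letters)."""
    return re.findall(r"[a-záéíóúüñ]+", text.lower())


def lemmatize(token):
    """Crude suffix stripping, standing in for a real lemmatizer."""
    if token.endswith("ies") and len(token) > 5:
        return token[:-3] + "y"
    if token.endswith("ing") and len(token) > 5:
        return token[:-3]
    if token.endswith("s") and not token.endswith(("ss", "is", "us")) and len(token) > 3:
        return token[:-1]
    return token


def find_bigrams(docs, min_count=2):
    """Return adjacent token pairs that occur at least min_count times in the corpus."""
    counts = Counter(pair for doc in docs for pair in zip(doc, doc[1:]))
    return {pair for pair, n in counts.items() if n >= min_count}


def merge_bigrams(tokens, bigrams):
    """Join detected bigrams into single tokens, scanning left to right."""
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) in bigrams:
            merged.append(tokens[i] + "_" + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


def preprocess(corpus, min_count=2):
    """Tokenize, drop stop words, lemmatize, then merge frequent bigrams."""
    docs = [[lemmatize(t) for t in tokenize(text) if t not in STOP_WORDS]
            for text in corpus]
    bigrams = find_bigrams(docs, min_count)
    return [merge_bigrams(doc, bigrams) for doc in docs]
```

For example, `preprocess(["The cohesion policy funds regions.", "Cohesion policies are funding the region."])` reduces both sentences to the same two merged tokens, which is the kind of normalization that makes word co-occurrence counts comparable across articles.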

Methods
Our main methodology for analysing the documents collected relied on various natural language processing techniques that allow for a relatively rapid, highly automated analysis of large collections of text data. At the same time, we implemented various validation steps involving human coders to validate the results of the computational text analysis. Topic modelling is the main methodology for analysing the corpus of documents. It is a popular technique for an exploratory analysis of documents in computer science and, increasingly, in the social sciences. The aim of topic modelling is to detect the most relevant topics from a given corpus of data, which in our case consists of media articles related to cohesion policy. One of the most widely used techniques for conducting this type of analysis is a generative probabilistic model called Latent Dirichlet Allocation (Blei et al., 2003). The technique allowed us to see how words that co-occur form clusters of topics.
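As an illustration of how co-occurring words are grouped into topics, the sketch below fits a Latent Dirichlet Allocation model with a basic collapsed Gibbs sampler. It is a teaching example written for this purpose, not the software used in the study; the hyperparameters `alpha` and `beta`, the number of iterations and the toy documents in the usage note are all assumptions.

```python
import random


def fit_lda(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA on a list of tokenized documents."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_dk = [[0] * num_topics for _ in docs]                # topic counts per document
    n_kw = [[0] * len(vocab) for _ in range(num_topics)]   # word counts per topic
    n_k = [0] * num_topics                                 # total tokens per topic

    # Random initial topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            n_dk[d][k] += 1
            n_kw[k][word_id[w]] += 1
            n_k[k] += 1
        assignments.append(z_doc)

    # Resample each token's topic conditional on all other assignments.
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v, k = word_id[w], assignments[d][i]
                n_dk[d][k] -= 1
                n_kw[k][v] -= 1
                n_k[k] -= 1
                weights = [
                    (n_dk[d][t] + alpha) * (n_kw[t][v] + beta)
                    / (n_k[t] + beta * len(vocab))
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][v] += 1
                n_k[k] += 1

    # Smoothed topic proportions per document and the top words per topic.
    theta = [[(c + alpha) / (len(doc) + alpha * num_topics) for c in counts]
             for doc, counts in zip(docs, n_dk)]
    top_words = [
        [vocab[v] for v in sorted(range(len(vocab)), key=lambda v: -n_kw[k][v])[:3]]
        for k in range(num_topics)
    ]
    return theta, top_words
```

Calling `fit_lda` on a handful of tokenized documents, for instance two about transport and two about financial irregularities, returns one row of topic proportions per document and the most frequent words per topic, which is the raw material that researchers then label by hand.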
One increasingly popular variant of topic modelling is the structural topic model (STM) (Roberts et al., 2014, 2019). The application of STM in political science has been growing over recent years with applications in diverse areas (Grimmer and Stewart, 2013; Lucas et al., 2015; Mildenberger and Tingley, 2017; Triga et al., 2019). Part of the attraction of STM is that it allows researchers to incorporate contextual variables during the model fitting process. This is an important feature for social scientists as we are generally interested in group effects, such as ideology (for example, left-wing versus right-wing) or size (small versus big). An STM allows us to preserve contextual meta-information for the subsequent analysis of estimated effects of such grouping variables. For our analysis, the main contextual variable was the territorial level of the news source. An STM allowed us to estimate effects at different territorial levels across our corpus. Furthermore, as there was a temporal dimension we were also interested in changes over time and how they were driven by events.
In a second step, we applied a sentiment analysis (also known as opinion mining) to the corpus and clusters of topics identified by STM. Sentiment analysis leverages techniques from natural language processing and computational linguistics to determine the polarity (positive, negative or neutral) of a text (see Pang and Lee, 2008 for an extended discussion). Typically, most sentences in a document do not express any opinion and are considered neutral (that is, objective). On the other hand, sentences that are subjective express an opinion. This opinion (or sentiment) can be positive or negative depending on the polarity of the words used. Sentiment analysis can be conducted at various levels: the document, the paragraph, the sentence or even the word.
Given our research objectives, only one level was meaningful for analysing the sentiment in our corpus. A document-level (that is, the media article) analysis could lead to an erroneous interpretation. Most media articles are not about cohesion policy per se, but instead mention cohesion policy in relation to other topics. As an example, a media article on the refugee crisis will most likely contain negative sentiment at the document level. However, the sentence related to cohesion policy may be framed positively. A document-level (or indeed a word-level) analysis would therefore misclassify the sentiment. In order to enhance the precision of our sentiment classification we focused on the sentences in which the cohesion policy text appears in a document. We applied a lexicon-based approach for conducting the sentiment analysis due in large part to its simplicity. Additionally, in order to increase the effectiveness of our method we included a rule-based approach to manage negation words, idioms, intensification or emoticons. The end result is a categorical variable with three values (negative, neutral and positive) for every document.
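The sentence-level, lexicon-based approach with negation and intensification rules can be sketched as follows. The word lists, the target terms used to select cohesion policy sentences and the two-token look-back window are hypothetical choices for illustration only; the lexicon and rules used in the study are not reproduced here, and idioms and emoticons are not handled.

```python
import re

# Toy lexicon and rule lists (assumptions for illustration).
POSITIVE = {"benefit", "success", "improve", "growth", "support"}
NEGATIVE = {"fraud", "crisis", "misuse", "scandal", "fail"}
NEGATIONS = {"not", "no", "never"}
INTENSIFIERS = {"very": 1.5, "highly": 1.5}
TARGET_TERMS = ("cohesion policy", "structural funds", "cohesion fund")


def split_sentences(text):
    """Split text into sentences at whitespace that follows ., ! or ?."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def sentence_score(sentence):
    """Sum word polarities, applying negation and intensifier rules."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    score = 0.0
    for i, tok in enumerate(tokens):
        polarity = 1.0 if tok in POSITIVE else -1.0 if tok in NEGATIVE else 0.0
        if polarity == 0.0:
            continue
        window = tokens[max(0, i - 2):i]  # the two preceding tokens
        for prev in window:
            polarity *= INTENSIFIERS.get(prev, 1.0)
        if any(prev in NEGATIONS for prev in window):
            polarity = -polarity
        score += polarity
    return score


def classify_document(text):
    """Classify a document using only the sentences that mention a target term."""
    relevant = [s for s in split_sentences(text)
                if any(term in s.lower() for term in TARGET_TERMS)]
    total = sum(sentence_score(s) for s in relevant)
    return "positive" if total > 0 else "negative" if total < 0 else "neutral"
```

With this sketch, an article that opens with a negative sentence about the refugee crisis but describes cohesion policy funding as improving local transport is classified as positive, because only the cohesion policy sentence contributes to the score.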

Empirical Analysis
This section presents the results of the computational text analyses and assesses the findings in relation to the hypotheses. The first part focuses on the results of the STM and their implications for the thematic similarity or dissimilarity hypotheses. The next subsection addresses the territorial hypotheses about the nature and tone of the specific topics at different territorial levels through topic modelling and sentiment analysis. The final section explores temporal patterns in cohesion policy news topics to assess the Europeanization and politicization hypotheses.

Thematic Analysis
The STM returned a reasonably coherent cluster of topics to which the labels in Figure 1 were assigned (details of the model fitting and the validation are provided in Appendix 3). The model allowed us to derive estimates for the topic proportions for the three cases, as shown in Figure 1. This shows a degree of topic convergence in both Spanish and English (transnational and the UK) in terms of the main topics discussed. In all cases, we identified major topics that centred on EU thematic objectives for cohesion policy: low-carbon economy; R&D and innovation; employment, training and entrepreneurship; transport and infrastructure; and local and cultural heritage. However, the specific topics varied across our cases. The employment and urban and local themes were the only dominant themes across all cases, whereas the other thematic areas of intervention were unique to each case. The transport and infrastructure topic was found only in the Spanish case, no doubt reflecting the much higher level of EU funding allocated to this type of expenditure historically as a less-developed cohesion country. This contrasts with the UK, where interventions in the business (SME) and energy sectors received distinctive news coverage in line with policy priorities and did not feature in the Spanish or transnational media.
We also see that cohesion policy was frequently mentioned in connection with broad EU political themes: EU affairs, primarily about the eurozone crisis or the migration crisis; EU budgetary politics related to issues such as cohesion policy conditionality; and financial irregularities, the latter being among the top three topics in all our cases. Again, the relative importance of these topics varied across the cases. Conditionality was a dominant topic only in the transnational media. It did not feature as a dominant topic in either the UK or Spanish media, where it was subsumed under the wider topic of EU affairs. Similarly, Brexit was a UK-specific topic in cohesion policy news stories.
Overall, these findings provide strong support for the thematic dissimilarity hypothesis (H1a). The thematic focus of media reporting on cohesion policy does vary across the national and transnational media sources in terms of topics covered and their relative importance, implying nationalized media discourses and public spheres. Notwithstanding the different emphases attributed to topics, we found some overall similarities in terms of the focus on EU objectives, spending irregularities and broader EU politics.

Topic Prevalence
We now turn our attention to within-case differentiation in topic prevalence based on the territorial level of the media outlet. To get an insight into the impact of territoriality on media reporting we now introduce our contextual variable into the model fitting. This is where an STM is particularly useful for gauging differences in the emphasis given to topics across different groups. Our analysis controlled for the territorial level in which the media sources were rooted, distinguishing between regional and national sources when comparing the UK and Spain. We found significant differences in the estimated proportions of topics discussed in the media when controlling for territorial level. These differences can be visualized in Figure 2, which shows the logit estimates for each topic of a change from one territorial level to the other. Positive coefficients on the x axis with 95 per cent confidence intervals that do not cross the zero line indicate a significant difference in terms of the prevalence of a topic at the national level. Negative coefficients indicate the opposite; that is, the topic is more prevalent at the regional level.
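The reading rule for Figure 2 can be expressed as a short Python sketch. This is an illustration only: it assumes a normal approximation for the 95 per cent interval (estimate plus or minus 1.96 standard errors), and the function name and the numerical inputs are hypothetical rather than taken from the fitted model.

```python
def prevalence_difference(estimate, std_error, z=1.96):
    """Return the 95% interval and the level, if any, at which a topic is more prevalent.

    A positive estimate whose interval excludes zero points to the national level,
    a negative one to the regional level (the sign convention described in the text).
    """
    lower = estimate - z * std_error
    upper = estimate + z * std_error
    if lower > 0:
        level = "national"
    elif upper < 0:
        level = "regional"
    else:
        level = "no significant difference"
    return lower, upper, level


# Hypothetical estimates, for illustration only.
for topic, est, se in [("irregularities", 0.08, 0.02),
                       ("cultural heritage", -0.05, 0.01),
                       ("employment", 0.01, 0.02)]:
    print(topic, prevalence_difference(est, se)[2])
```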
By and large, there were different topic foci across the levels. The national media tended to focus on higher level issues such as EU affairs, budgetary politics and irregularities while the regional media's focus was more congruent with the EU's policy objectives and priorities, discussing topics such as energy and the environment, and local and cultural heritage investments. As the thematic analyses revealed, the higher level and politicized EU affairs issues about EU funding, conditionality and irregularities were even more prominent in the transnational news sources.
Overall, the STM findings provide supporting evidence that media reporting on cohesion policy by subnational media is more likely than reporting by the national media to focus on local realities and themes relating to policy objectives and content that have a more direct impact on the daily lives of readers (Hypothesis 2). The plots in Figure 2 for the two country cases specifically suggest that the national media give prominence to topics that have a negative slant, such as scandals associated with irregularities or fraud, or debates about enforcing compliance with wider EU objectives through conditionality (Hypothesis 3). To test this hypothesis we turn to the sentiment analysis of content.

Sentiment Analysis
To analyse potential differences in news tone across territorial levels we performed sentiment analysis on the corpus. As noted, the sentiment analysis was performed at the sentence level, as a document-level approach could seriously misrepresent the tone, for instance in cases where a document is about negative topics such as a crisis or irregularities yet cohesion policy is mentioned positively. The sentiment analysis is based on the threefold classification shown in Figure 3, which depicts the distribution of sentiment for the three cases. We can see in Figure 3 that in all cases the positive category constitutes the largest proportion while the negative category has the lowest proportion, albeit very marginally so for the transnational media. The difference between the positive and negative proportions is 21.1, 32.8 and 50.4 percentage points in the positive direction for the transnational, Spanish and UK cases, respectively.
The relatively high positive levels for the UK are striking. To explore this further we created a binary variable to distinguish between the pre- and post-Brexit campaign periods. We found a significant change in sentiment between the two periods, with an increase in positive tone and a decrease in the observed neutral and negative sentiment in the post-Brexit phase compared with the pre-Brexit period. 2 Nonetheless, our primary concern is with potential differences between territorial levels.
To test whether territorial level and content sentiment were associated, we used a χ² test of independence for the two relevant cases. A significant relationship did emerge, which was more pronounced in the UK case, χ²(2, N = 666) = 33.73, p < 0.001. In Spain, the relationship remained significant, although to a much lesser degree: χ²(2, N = 3,869) = 8.87, p < 0.05. Post hoc inspection of the residuals reveals the factors driving the result.
The residuals indicate that significant discrepancies between expected and observed counts occur only in the negative sentiment cells. The mosaic plots in Figure 4 allow us to see clearly where the significant deviations from expected counts occur. As a rule of thumb, a residual with an absolute value of 2 or more signals a significant deviation. These cells are coloured in Figure 4. The UK offers the clearest illustration: the regional level shows much less negative sentiment than expected, while the inverse is the case for the national level, with a much higher negative tone than expected. In the Spanish case, the result is driven by a more negative tone in the national media. This provides some support for Hypothesis 3. We found evidence that the national media are more negative than expected in Spain and the UK. However, in neither Spain nor the UK did we find that the regional media were more positive than expected.
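The test and the residual inspection can be sketched in plain Python as follows. This is an illustrative sketch, not the code used for the analysis: it assumes Pearson residuals, (observed - expected) / sqrt(expected), and the contingency table is hypothetical. The p-value helper is valid only for two degrees of freedom, which is the case for a table of two territorial levels by three sentiment categories.

```python
import math


def chi_square_independence(table):
    """Pearson chi-square test for an r x c table of observed counts.

    Returns (statistic, degrees of freedom, matrix of Pearson residuals).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    residuals = []
    for i, row in enumerate(table):
        res_row = []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            res = (observed - expected) / math.sqrt(expected)
            res_row.append(res)
            statistic += res ** 2  # the statistic is the sum of squared residuals
        residuals.append(res_row)
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof, residuals


def p_value_two_dof(statistic):
    """Upper-tail p-value of a chi-square distribution with exactly 2 degrees of freedom."""
    return math.exp(-statistic / 2)


# Hypothetical counts. Rows: national, regional. Columns: negative, neutral, positive.
observed = [[30, 40, 80],
            [10, 50, 90]]
stat, dof, res = chi_square_independence(observed)
print(round(stat, 2), dof, round(p_value_two_dof(stat), 4))
flagged = [(i, j) for i, row in enumerate(res) for j, r in enumerate(row) if abs(r) >= 2]
print(flagged)  # cells whose residual meets the rule-of-thumb threshold
```

In this hypothetical table only the two negative-sentiment cells pass the threshold, mirroring the pattern described above in which the deviations are concentrated in the negative category.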
It is also instructive to compare sentiment at the topic level (see Figure 5). The breakdown in Figure 5, which reports the percentage of negative sentiment per topic, shows that the most negative sentiment can be found in news on policy process topics. Examples include the top two bars for each case in Figure 5, all of which are topics relating to financial compliance (irregularities), the EU budget and wider EU institutional and policy decision-making (EU affairs). These topics can be contrasted with those with much lower levels of negativity (the bars at the bottom of Figure 5 for each case) that are focused on policy intervention themes. In both Spain and the UK irregularities and EU affairs have the highest negativity scores. As demonstrated earlier, these themes are more dominant in national news than regional news, providing further evidence of relatively greater negativity in national news (H3). For the transnational media the irregularities topic also has high negativity values, although the two topics with the most negative slant are on the highly politicized issues of EU funding conditionality and budget negotiations (for a review of the political debates, see Bachtler et al., 2013; Coman, 2018). The Eastern Europe topic also has a relatively high negative slant in the transnational media, linked to stories about redistributive politics, policy ineffectiveness and misuse of funding.
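The quantity plotted per topic, the share of negative sentiment, can be computed as in the following Python sketch. The function name and the records are hypothetical; the sketch assumes each classified text unit has been assigned a single dominant topic, which is a simplification for illustration.

```python
from collections import Counter


def negative_share_by_topic(records):
    """records: iterable of (topic, sentiment) pairs.

    Returns a list of (topic, percentage negative), most negative topic first.
    """
    totals = Counter()
    negatives = Counter()
    for topic, sentiment in records:
        totals[topic] += 1
        if sentiment == "negative":
            negatives[topic] += 1
    shares = [(topic, 100.0 * negatives[topic] / totals[topic]) for topic in totals]
    return sorted(shares, key=lambda pair: pair[1], reverse=True)


# Hypothetical records, for illustration only.
records = [("irregularities", "negative"), ("irregularities", "negative"),
           ("irregularities", "positive"), ("energy", "positive"),
           ("energy", "neutral"), ("energy", "positive"), ("energy", "negative")]
for topic, share in negative_share_by_topic(records):
    print(f"{topic}: {share:.1f}% negative")
```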
There are also country-specific differences. Brexit is a distinct topic in the UK news on cohesion policy with high negativity. The policy intervention themes with the lowest negative slant also vary across the cases. For instance, Spanish news on the role of cohesion policy in addressing youth unemployment and supporting R&D and innovation (well-known weaknesses in the Spanish socioeconomic model) has particularly low negativity, while the topics of energy and of local and cultural heritage in the UK case have the lowest negative ratings. Again, the previous analysis showed that these policy content topics are more prevalent in regional than national news, lending further support to the territoriality hypothesis about relatively lower subnational negativity (H3).

Temporal Analysis
In Figure 6 we present the temporal distribution of articles for the collected sample. At the aggregate level an upward-sloping curve can be seen over time for all cases, especially in the case of Spain. At face value this would provide support for the hypothesis that media reporting on cohesion policy has increased significantly over time, in line with the expectations of the Europeanization theory (H4). However, it is important to qualify this finding by acknowledging that part of the increase is likely to be due to the better performance of the web crawler in more recent periods. Some media sources, especially regional sources, would certainly have had a less developed web presence a decade ago than today. Notwithstanding these limitations, it does appear that there is an upward trend, most clearly in the Spanish and UK cases.
A more nuanced strategy would focus on changes over time for specific topics. As suggested in Hypothesis 5, news coverage is likely to follow episodic patterns that are influenced by events. Shifts in EU priorities and policymaker debates surrounding cohesion policy would lead to a corresponding shift in media attention. The punctuated attention to certain topics can be clearly seen in Figure 7, suggesting that media reporting on cohesion policy responds to politicized topics and prominent events and policy debates that vary at the transnational and national levels. For instance, there has been increased coverage and focus of transnational media on politicized EU topics, such as the budget negotiations, in response to EU events and debates at particular junctures (in line with Hypothesis 5). The budget example in transnational media in Figure 7 shows a spike during the EU's final 2014-20 budget negotiations. The Greek crisis line plot also shows how the topic dropped in attention from its early peak to receive a new impetus in the lead-up to the 2015 bailout referendum. The refugees and migration topic rises following the Syrian refugee crisis, which led to EU debates about making cohesion policy funding conditional on the acceptance of migration quotas and using the funds to address migration integration challenges. A similar punctuated pattern can be seen in the national media discussion of the broader category of EU affairs.
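One simple way to look for such punctuated attention is to track each topic's share of articles per year, as in the Python sketch below. This is an assumption-laden simplification and not the procedure behind Figure 7: it treats each article as having one dominant topic, and the function name and records are hypothetical.

```python
from collections import Counter, defaultdict


def yearly_topic_share(records):
    """records: iterable of (year, topic) pairs.

    Returns {topic: {year: share of that year's articles assigned to the topic}}.
    """
    per_year = Counter()
    per_topic_year = defaultdict(Counter)
    for year, topic in records:
        per_year[year] += 1
        per_topic_year[topic][year] += 1
    return {topic: {year: counts[year] / per_year[year] for year in sorted(per_year)}
            for topic, counts in per_topic_year.items()}


# Hypothetical records, for illustration only.
records = [(2012, "budget"), (2012, "employment"), (2012, "employment"),
           (2012, "employment"), (2013, "budget"), (2013, "budget"),
           (2013, "budget"), (2013, "employment")]
shares = yearly_topic_share(records)
peak_year = max(shares["budget"], key=shares["budget"].get)
print(peak_year, shares["budget"][peak_year])
```

Using shares rather than raw counts partly guards against the overall growth in collected articles, which the text attributes in part to crawler performance.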
In line with Hypothesis 6, national media reporting on cohesion policy has increased in coverage and focus on politicized national topics in response to national events and policy debates. The dominance of the financial irregularities theme in all three cases can be explained by the audit explosion in EU cohesion policy following the financial mismanagement scandal and resignation of the European Commission in the early 2000s, which led to a major and sustained increase in audit activity across all EU member states (Mendez and Bachtler, 2017). However, the different temporal patterns in each case reflect domestic and European events and relationships. In the case of Spain, the increased coverage of spending and irregularities topics reflects a post-crisis landscape dominated by political conflict over austerity and corruption scandals that even led to the downfall of the prime minister in 2018. In particular, the crisis led to the exposure of funding scandals linked to fraudulent public spending, notably the 'Gürtel' and Andalusian 'ERE' cases, and to more media attention to wasteful vanity projects, some of which had received EU co-funding.
For the UK, the centrality of the Brexit topic is clear. Perceived EU funding waste and excessive UK contributions to the EU budget were dominant campaign themes used by the leave coalition and were, as a consequence, reported in media coverage of the debates. However, as demonstrated in the sentiment analysis section, the overall tone of newspaper coverage of cohesion policy in the UK became more positive during and after the referendum campaign as a result of an increase in stories highlighting the benefits of the policy. Other topics are more stable over time for the national media, depicted by flatter lines in Figure 7. These include core themes of cohesion policy, such as employment for both national cases (albeit linked to 'youth' employment in the Spanish case, reflecting the high level of youth unemployment since the outset of the crisis in 2007), SME development (UK) and territorial cohesion (Spain). In the transnational media we can find some topics, such as eastern Europe, that remain rather stable over time, in line with the policy's rationale as a redistributive policy for less-developed member states, which have been highly concentrated in the east since the 2004/2007 enlargements.

Conclusion
This article investigated EU cohesion policy news coverage in the media by applying innovative techniques to a new corpus of over 4,000 English and Spanish news stories from the period 2010 to 2017 covering three territorial levels of news media (transnational, national and subnational). This is the first quantitative study of EU media to include a sizeable sample of regional news sources alongside national sources, and to estimate the effects of this territorial dimension on EU news topic coverage and tone.
In line with theoretical expectations, we found significant differences in the topic focus and tone across territorial levels, with the national and especially transnational level being more negative in tone than the regional level news. National and transnational media placed relatively more emphasis on politicized EU topics (irregularities, conditionality, budgetary politics), while subnational media focused more on substantive policy topics in line with local realities. This may be because national and transnational media pursue more sensationalist stories and relate EU cohesion policy to 'high politics' EU agendas, issues and decision-making dynamics. Another contributory factor may be the weaker emphasis placed on engaging with national media compared with local and regional media in the communication strategies of cohesion programme managers and communicators (TEP, 2013).
The differences in thematic focus across countries and with transnational media imply nationalized discourses in media coverage of cohesion policy and a less Europeanized public sphere. Media reporting on cohesion policy reacts both to external events, such as the euro and migration crises, and internal country-specific events, such as Brexit in the UK and corruption scandals in Spain. However, the overall similarities across the three cases (Spain, the UK and transnational media) in the focus on EU thematic objectives and on politicized topics relating to spending irregularities and EU affairs suggest a potential for contributing to a European public sphere.
Moreover, the contribution of cohesion policy media coverage to the legitimization of the EU is likely to be positive, given the overwhelmingly positive sentiment in news stories across all cases. What is striking is the high level of positivity in the UK, where one would expect more negativity given the high level of anti-European public opinion and negative media coverage of the EU more generally (Copeland and Copsey, 2017). This finding was the result of a fairly sophisticated parsing of the document texts, in which sentiment analysis was performed only on the sentences in which cohesion policy was mentioned. This allowed us to capture cases in which a news document addressed a fairly negative topic but framed cohesion policy positively.
More generally, these findings demonstrate that the well-known negativity bias in EU news does not apply to cohesion policy (see also de Vreese and Kandyla, 2009). Cohesion policy news is positive in tone overall, albeit with variations across topics. This implies the need for EU media studies to take a more fine-grained approach in studying EU news coverage and its effects by taking into consideration the variations in news tone across different policy domains.
Methodologically, our findings suggest that computational text analysis techniques provide a useful tool to study territorial and temporal patterns in European news coverage and tone. The existing literature in EU media studies remains strongly focused on qualitative analysis techniques through the human coding of relatively small samples of news stories. While hand-coded media analysis will continue to provide valuable insights and is a cornerstone of framing analysis, computational techniques open up opportunities to do big data analysis in future EU media studies. Further, STM provides a valuable methodological alternative to other forms of topic modelling by allowing the impact of contextual variables (such as territorial level) on topic coverage to be modelled statistically.
Several policy implications can be drawn from the findings. Firstly, unlike much of the media coverage of EU institutions and elections, the tone of EU cohesion policy news is overwhelmingly positive. The implication is that efforts to improve cohesion policy communication could contribute to raising public awareness of the benefits of the EU in citizens' daily lives and the EU's aim to reconnect with citizens. Secondly, in an era of big data and fake news, computational text analysis provides a cost-efficient method for EU and national policymakers to monitor media topics and tone across international, national and subnational levels, thereby allowing the development of targeted media strategies and campaigns. For instance, the identification of negative coverage in specific locations and news outlets would allow policymakers to launch targeted media campaigns to publicize counter-narratives, increase understanding and encourage objective and balanced reporting. For the 2021-27 period, the European Commission proposes to increase the visibility of cohesion policy through greater communication of the results of EU-funded projects of strategic importance and by requiring the development of social media outreach plans (European Commission, 2018), which could employ similar media analysis techniques to those used here for evaluation purposes.
A controversial reform proposal by the Commission is the introduction of a cohesion policy conditionality on the rule of law. While this has a justifiable rationale in the context of debates about breaches of democratic values in Hungary, Poland and some other less-developed states, the evidence from this article suggests that increasing policy conditionality is likely to lead to increasing negative media coverage. This could in turn lead to public resentment in member states in breach of the rule. Whether and how cohesion policy news impacts on public attitudes to the EU is a key question for future studies. The positive tone of cohesion policy news overall in the cases investigated in this article suggests that the media can, in principle, contribute to increasing public support for the policy and the EU's legitimacy more generally.

Funding Information
This article is part of the COHESIFY project funded by the European Union's Horizon 2020 Research and Innovation Programme under Grant Agreement No. 693427.

Supporting Information
Additional supporting information may be found online in the Supporting Information section at the end of the article. Table A1.