Ranking Economics Journals Using Data from a National Research Evaluation Exercise

This paper describes an algorithm for creating a ranking of economics journals, using data from the 2014 UK Research Excellence Framework (REF) exercise. The ranking generated by the algorithm can be viewed as a measure of the average quality of the papers published in the journal, as judged by the REF Economics and Econometrics sub-panel, based on the outputs submitted to the REF.


I. Introduction
Research evaluations are increasingly used to inform the allocation of public research funding. A recent large-scale example is the 2014 Research Excellence Framework (REF), which evaluated a sample of research conducted by researchers in UK higher education institutions. As well as research outputs, which count for 65% of the overall score, departments were also evaluated according to their research environment (15%) and their non-academic impact (20%). The latter is the main difference between the REF and the 2008 Research Assessment Exercise (RAE), as the RAE did not evaluate departments on the basis of non-academic impact.
The quality of each submitted output (i.e. a journal article, working paper, book chapter or authored book) was assessed by the members of one of 36 sub-panels and given an individual score. Of the 2,600 outputs submitted to the Economics and Econometrics sub-panel, 28% were classified as 'world-leading' (4*), 49% as 'internationally excellent' (3*), 20% as 'recognised internationally' (2*) and 3% as 'recognised nationally' (1*). 1 The scores were made publicly available at the level of the department only, along with a list of the outputs submitted by each department. 2

This paper uses the publicly available data on research outputs from the REF 2014 to construct a ranking of economics journals. This is done using a simple algorithm which allocates a rank to each submitted output based on the journal it was published in and compares the predicted shares of outputs in the different categories at the departmental level to the actual shares. The algorithm systematically changes the ranks of the journals to find the combination that best reproduces the actual department-level shares. To my knowledge, this is the first attempt to construct a ranking of economics journals using data from the REF, or any other research evaluation, complementing analyses of such data in other fields (see e.g. Varin, Cattelan and Firth, 2016).
The ranking generated by the algorithm can be viewed as a measure of the average quality of the papers published in the journal, as judged by the REF sub-panel, based on the outputs submitted to the REF. Since the outputs were assessed individually it should not be viewed as an attempt to construct an 'official' UK ranking of economics journals. It is likely that the actual scores given to the outputs by the sub-panel members varied among papers published in the same journal, which is why the ranking is best viewed as an attempt to infer the average quality of the papers published in the journal.
The paper is organized as follows. Section II describes the algorithm used to construct the journal ranking and section III presents the ranking along with some robustness checks. Section IV offers some concluding remarks.

II. Methodology
Each of the 2,600 outputs submitted to the Economics and Econometrics REF sub-panel is given an initial rank. Using the initial ranks assigned to each paper, we can predict the proportion of 4*, 3*, 2* and 1* submissions for each department. We then calculate the squared difference between the predicted and actual proportions for each category and sum the squared differences over departments (i = 1, 2, …, 28) and categories (j = 1, 2, 3, 4). The sum of squared differences (SSD), weighted by the number of submissions from each department (N_i), is given by

SSD = \sum_{i=1}^{28} \sum_{j=1}^{4} N_i (p_{ij} - \hat{p}_{ij})^2,

where p_{ij} is the actual proportion of j-star submissions in department i, and \hat{p}_{ij} is the predicted proportion:

\hat{p}_{ij} = \frac{1}{N_i} \sum_{n=1}^{N_i} I(r_{ni} = j),

where r_{ni} is the rank assigned to output n submitted by department i, and I(·) is the indicator function, equal to one if the expression in parentheses is true and zero otherwise.

The journals are then sorted in random order and the following algorithm is run:

1. Starting with the first journal, re-calculate the SSD after temporarily assigning the journal each of the four possible ranks.
2. Assign the journal the rank which leads to the lowest SSD in step 1.
3. Repeat steps 1 and 2 for all the journals in the ranking.
4. Repeat steps 1 to 3 until the algorithm converges. Convergence is declared when an iteration (i.e. a run through steps 1 to 3) decreases the SSD by less than 0.0001.
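The procedure amounts to a coordinate descent over journal ranks. The following is a minimal sketch, not the paper's actual code; the data structures (`outputs`, `actual`) and all function names are illustrative assumptions:

```python
import random

def predicted_shares(journals, rank, n_cats=4):
    """Shares of a department's outputs in each category implied by `rank`."""
    counts = [0] * n_cats
    for j in journals:
        counts[rank[j] - 1] += 1
    return [c / len(journals) for c in counts]

def ssd(rank, outputs, actual):
    """Sum of squared differences between predicted and actual shares,
    weighted by the number of submissions N_i from each department."""
    total = 0.0
    for dept, journals in outputs.items():
        pred = predicted_shares(journals, rank)
        total += len(journals) * sum((a - p) ** 2
                                     for a, p in zip(actual[dept], pred))
    return total

def rank_journals(outputs, actual, init_rank, tol=1e-4, seed=0):
    """Steps 1-4: sweep the journals in random order, give each journal the
    SSD-minimizing rank, and repeat until an iteration improves the SSD by
    less than `tol`."""
    rank = dict(init_rank)
    order = sorted(rank)
    random.Random(seed).shuffle(order)    # random sort order
    best = ssd(rank, outputs, actual)
    while True:
        for j in order:
            scores = {}
            for r in (1, 2, 3, 4):        # temporarily try each rank
                rank[j] = r
                scores[r] = ssd(rank, outputs, actual)
            rank[j] = min(scores, key=scores.get)
        new = ssd(rank, outputs, actual)
        if best - new < tol:              # convergence criterion
            return rank
        best = new
```

In a toy example whose department shares are exactly reproducible, the search recovers the generating ranks; with real data it stops at a local minimum that can depend on the sort order, which is why the analysis below re-runs it many times.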
The criterion for including a journal in the ranking is that at least five papers published (or forthcoming) in the journal were submitted to the REF. Together, the papers published in the resulting 96 journals constitute 82% of the 2,600 outputs submitted. This threshold is somewhat arbitrary, but represents a trade-off between the desire to include as many journals as possible in the ranking and having a reasonable basis for ranking a journal. Note that all the submitted outputs are included when calculating \hat{p}_{ij}, but the rank is held constant at the initial value for journals not included in the ranking.
In the next section, this approach is used to generate a journal ranking using the Keele list 3 to assign the initial ranks. The sensitivity of the results to the chosen starting values and the inclusion criterion is then explored, and the generated ranking is compared with two existing rankings of economics journals.

III. Results

Journal ranking
The initial rank is set equal to the rank of the journal the paper was published in according to the Keele list. Eighty-five per cent of outputs are classified in this way. The remaining outputs are either working papers (6%), book chapters or authored books (2%), or papers published in journals not covered by the Keele list (7%). Working papers, book chapters and authored books are assigned a rank equal to the modal rank of the outputs submitted by the respective department minus one. 4 Papers published in journals not in the Keele list are given a rank of 1, with a number of exceptions listed in Table 1.
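The assignment rules can be sketched as a small function. The record fields (`type`, `journal`, `dept`) and the flooring of the modal rank at 1 are illustrative assumptions of the sketch, not details from the paper:

```python
def initial_rank(output, keele_rank, dept_modal_rank, exceptions):
    """Initial rank for one submitted output.

    keele_rank:      journal -> Keele-list rank (1-4)
    dept_modal_rank: department -> modal rank of its submitted outputs
    exceptions:      non-Keele journals given a rank other than 1 (Table 1)
    """
    if output['type'] in ('working paper', 'book chapter', 'book'):
        # modal rank of the department's submissions, minus one
        # (floored at 1 -- an assumption made for this sketch)
        return max(dept_modal_rank[output['dept']] - 1, 1)
    journal = output['journal']
    if journal in keele_rank:
        return keele_rank[journal]     # covers 85% of outputs
    return exceptions.get(journal, 1)  # non-Keele journals default to rank 1
```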
For several journals, the final rank depends on the initial sort order. The algorithm is therefore run 1,000 times, each time with a different random sort order. Typically the algorithm converges in 5-10 iterations and decreases the SSD from 125.5 to about 38-45 depending on the sort order, a reduction of about 64-70%. The results from the 1,000 runs are summarized in Appendix Table A1. Before proceeding it is worth noting that while most of the journals are consistently classified in either just one category or in two adjacent categories, a small number of journals are more erratic in terms of their ranking. For example, the Journal of Economic Geography is ranked as 3* just under one third of the time and 1* about two thirds of the time, while the Journal of Economics and Management Strategy is ranked as 4* about 40% of the time and 1* just under 60% of the time. This may reflect that the papers published in these journals were considered to be of more variable quality by the REF panel, but it could also be a consequence of these journals having a small number of REF submissions.
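Summarizing the 1,000 runs amounts to tallying, for each journal, how often it ends up in each category. A sketch, assuming each run is recorded as a journal-to-rank dict (an illustrative data layout):

```python
from collections import Counter

def rank_proportions(runs):
    """runs: list of {journal: final_rank} dicts, one per run.
    Returns {journal: {rank: proportion of runs with that rank}}."""
    tallies = {}
    for run in runs:
        for journal, r in run.items():
            tallies.setdefault(journal, Counter())[r] += 1
    n = len(runs)
    return {j: {r: c / n for r, c in cnt.items()}
            for j, cnt in tallies.items()}
```

These per-journal proportions are the input to the classification rule described next.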
The approach suggested by Hudson (2013) is used to convert the results into a ranking, where each journal is categorized according to the likelihood that it belongs to a certain category. Specifically, a journal is considered to be 4* if the proportion of times the algorithm ranks it as 4* is greater than or equal to 0.65. Journals with a proportion greater than or equal to 0.5 but lower than 0.65 are classified as 'probable 4*', while journals with a proportion greater than or equal to 0.35 but lower than 0.5 are classified as 'possible 4*'. This is then repeated for the other categories, except that now the cumulative proportion is used. For example, a journal is considered to be 3* if the proportion of times the algorithm ranks it as 3* or higher is greater than or equal to 0.65 and it has not already been ranked as 4*. As discussed by Hudson (2013), the choice of cutoffs is arbitrary, but this approach avoids a sharp distinction between the categories. The results are presented in Table 2. It can be seen from the table that the journals typically considered to be the top-5 economics journals 5 (see e.g. Card and DellaVigna, 2013) are all ranked unambiguously as 4*, which lends the results some validity. In addition, a number of high-quality field journals are ranked as 3*.
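The labelling rule can be sketched as follows, with the cutoffs from the text (0.65, 0.5, 0.35) applied to cumulative proportions; the function name and dict layout are illustrative:

```python
def classify(props):
    """props maps rank (1-4) to the share of runs ending in that rank,
    e.g. {4: 0.55, 3: 0.45}. Returns a label such as '4*', 'probable 4*'
    or 'possible 3*'."""
    cum = 0.0
    for star in (4, 3, 2):
        cum += props.get(star, 0.0)   # share of runs ranked `star` or higher
        if cum >= 0.65:
            return f'{star}*'
        if cum >= 0.5:
            return f'probable {star}*'
        if cum >= 0.35:
            return f'possible {star}*'
    return '1*'
```

For the 4* category the cumulative proportion equals the plain proportion, so the first pass of the loop reproduces the rule exactly as stated for 4*.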
The algorithm ranks the relatively new American Economic Association journals 6 very highly, either as 4* or probable/possible 4*. The same goes for Quantitative Economics, the relatively new Econometric Society journal. Also of interest is that two economic history journals (Journal of Economic History and Explorations in Economic History) are ranked as 4* and a third (Economic History Review) as possible 4*.
While the findings in general seem plausible, there are some unexpected results. For example, while the Journal of Environmental Economics and Management is generally considered the top field journal in environmental economics, Environmental and Resource Economics performs better in the ranking. 7 Another surprise is the relatively low rank of

Starting values
To explore the sensitivity of the results to the starting values chosen for the algorithm, the analysis was re-run using randomly chosen starting values for the 96 journals in the ranking. 8 All other aspects of the methodology remain unchanged. The results are presented in Appendix Table A2. The ranking is remarkably stable: the correlation 9 between the rankings is 0.94, which suggests that the final ranking is not very sensitive to the choice of starting values. In most cases the journals either keep the same rank or move from being unambiguously ranked in one category to being possibly or probably ranked in the same or an adjacent category (or vice versa). A small number of journals change rank more dramatically: Econometric Theory, for example, is 'upgraded' from 3* to 4* and the Review of Income and Wealth is 'downgraded' from possible 4* to 2*. The latter journal has only six submissions to the REF, which suggests that its rank should be treated with some caution.

8 At the suggestion of a referee, an additional analysis was run which distinguishes between regular issues and conference issues of the American Economic Review (AER) and the Economic Journal (EJ). The conference issue of the AER receives a lower (possible 4*) rank, while the conference issue of the EJ is somewhat unexpectedly ranked higher than the ordinary issue (4* vs. 3*). This result should be tempered by the fact that only five and eight papers published in a conference issue of the AER and EJ, respectively, were submitted to the REF. The ranking of the remaining journals was largely unaffected by treating conference issues of the AER and EJ separately from the regular issues. When treated separately, both regular and Feature issues of the EJ receive a 3* rank.

9 To compute the correlation, possible/probable 4* was coded as 3.35/3.65 and possible/probable 3* as 2.35/2.65.
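The coding described in footnote 9 (possible/probable 4* as 3.35/3.65, possible/probable 3* as 2.35/2.65) can be sketched together with a plain Pearson correlation. The paper does not say which correlation measure is used, so a Pearson version is shown as one plausible reading; the codes for the unambiguous categories (face value 1-4) are also an assumption:

```python
# Numeric codes from footnote 9; unambiguous categories are assumed to be
# coded at face value, which the footnote does not state explicitly.
CODES = {'4*': 4.0, 'probable 4*': 3.65, 'possible 4*': 3.35,
         '3*': 3.0, 'probable 3*': 2.65, 'possible 3*': 2.35,
         '2*': 2.0, '1*': 1.0}

def correlation(labels_a, labels_b):
    """Pearson correlation between two rankings given as lists of labels."""
    x = [CODES[l] for l in labels_a]
    y = [CODES[l] for l in labels_b]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5
```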

Inclusion criterion
As discussed in section II, the criterion for including a journal in the ranking is that at least five papers published (or forthcoming) in the journal were submitted to the REF. Since this threshold is somewhat arbitrary it is useful to explore the sensitivity of the results to increasing the threshold to 10 papers. This reduces the number of journals included in the ranking from 96 to 60. As before, the rank is held constant at the initial value for journals not included in the ranking. The results are presented in Appendix Table A3. Again the results are very stable: the correlation between the rankings of the 60 journals with 10 or more submissions is 0.95. Nearly all of the journals keep the same rank or move from being unambiguously ranked in one category to being possibly or probably ranked in the same or an adjacent category (or vice versa). Only one journal is affected more dramatically: the Journal of Agricultural Economics is 'downgraded' from 2* to 1*. These results suggest that the ranking for journals with 10 or more submissions is fairly robust to changing the inclusion criterion.

Simulation results
To investigate the validity of the methodology, a simulation experiment was carried out in which each of the journals in the ranking was assigned a hypothetical true rank according to the ranking presented in Table 2. 10 Each output published in one of the journals in the ranking was assigned the rank of its journal, while the remaining outputs were assigned a rank according to the assumptions made in the analysis above. Based on the true ranks, the department-level scores were calculated, and the algorithm was run as before.
The correlation between the ranking produced by the algorithm and the true ranking is 0.86, and this increases to 0.93 when restricting the sample to the journals with 10 or more submissions to the REF. Most journals are either unambiguously assigned the correct rank or ranked as probably/possibly belonging to the correct category. This is the case for 66% of the journals overall and 80% of the journals with 10 or more REF submissions. When the algorithm assigns an incorrect rank, this is nearly always the rank immediately above or below the true rank.
These findings are encouraging overall, but they reinforce the earlier warning that the ranks of journals with only a small number of submissions should be treated with some caution.

Comparison with existing rankings

Table 3 shows the pairwise correlations between the rankings based on the REF data, the Keele ranking and the recent rankings from the Academic Journal Guide (AJG) 2015. 11 Like the Keele list, the AJG 2015 divides the journals into four quality categories, plus an additional 'journal of distinction' category, which is re-coded as 4* for the purpose of computing the correlations. It can be seen that the correlation between the rankings based on the REF data is much higher than the correlations between these rankings and the Keele and AJG 2015 rankings.

10 Journals ranked as probable 4* were considered to be 4* and those ranked as possible 4* to be 3* (and analogously for journals ranked probable/possible 3*) for the purposes of assigning the true ranks.

11 The guide can be downloaded at www.charteredabs.org (subject to free registration). We use the rankings in the 'Economics, Econometrics and Statistics' and 'Business History and Economic History' categories.

IV. Concluding remarks
This paper has described an algorithm for creating a ranking of economics journals using data from the 2014 UK REF exercise. The ranking generated by the algorithm can be viewed as a measure of the average quality of the papers published in the journal, as judged by the REF Economics and Econometrics sub-panel, based on the outputs submitted to the REF.
The ranking produced is broadly in line with citation-based rankings of economics journals, such as Kalaitzidakis, Mamuneas and Stengos (2011). This finding complements the work by Clerides, Pashardes and Polycarpou (2011), who find that the rankings of economics departments in the earlier 1996 and 2001 RAEs are largely in agreement with metrics-based rankings. It is also consistent with the results from an official Higher Education Funding Council for England analysis of the 2014 REF data (HEFCE, 2015), which finds that the ranks of the outputs submitted to the Economics and Econometrics sub-panel correlate highly with a range of citation-based metrics.
The ranking can be described as 'egalitarian' rather than 'elitist' in the sense of Neary, Mirrlees and Tirole (2003), as most of the journals are ranked relatively highly. This needs to be interpreted in the context of the REF being the driver of funding for UK universities, as well as of the perceived prestige of different disciplines. On the one hand, departments have a strong incentive to submit only their best research to the REF; on the other, sub-panels may have an incentive to give relatively high scores to the outputs evaluated, as suggested in newspaper commentary after the results were made public (see e.g. Marginson, 2014).
Some cautions are in order. First, the ranking of the journals is based on a sample of the papers published in them, in some cases as few as five, and the submitted papers may not be representative of the population of papers published in the journal. Second, the ranking is based on the subjective judgments of a small group of economists, albeit a very influential one in a UK context. Third, if fewer than five papers published in a journal were submitted to the REF, the journal is omitted from the ranking regardless of its quality. For these reasons, the ranking should be considered a complement to existing rankings of economics journals, and not a stand-alone measure of the quality of the papers published in a journal.

Note: The cumulative proportion is the proportion of times the algorithm has classified the journal in the assigned category or higher.