Herding Cats? Management and University Performance

Using a tried and tested measure of management practices which has been shown to predict firm performance, we survey nearly 250 departments across 100+ UK universities. We find large differences in management scores across universities and that departments in older, research-intensive universities score higher than departments in newer, more teaching-oriented universities. We also find that management matters in universities. The scores, particularly with respect to provision of incentives for staff recruitment, retention and promotion, are correlated with both teaching and research performance conditional on resources and past performance. Moreover, this relationship holds for all universities, not just research-intensive ones.


Sarah Smith
The Centre for Market and Public Organisation (CMPO), 2 Priory Road, Bristol, BS8 1TX
Email: sarah.smith@bristol.ac.uk
For further Discussion Papers by this author see: www.cepr.org/pubs/new-dps/dplist.asp?authorid=163041
*This research was funded by The Economic and Social Research Council through the Centre for Market and Public Organisation. Thanks to two anonymous referees and the editor, and to Nick Bloom, Clare Callender and Wendy Larner for helpful comments, to the student interviewers (Eloise Pollard, Kamahl Hoque, Rob Cole, Doris Flores, Eric Larson, Alice Chester-Master), to Salomo Hirvonen for research assistance and to all the university managers who gave us their time.

Introduction
The publication of the latest national and international university league tables typically makes UK newspaper headlines. The performance of universities, in both research and teaching, matters. Higher education is a strategically important sector and there is evidence that investments in research-type education pay off in areas that are close to the world technological frontier (Aghion et al., 2010, and Acemoglu, 2006). In a number of countries, government funding for universities is explicitly linked to performance metrics, including research outputs (in the UK) and negotiated performance targets (in the Netherlands). Many universities now compete in global markets for both students and staff, who are likely to pay close attention to how different institutions perform.
This raises the important question of what contributes to universities' success. Beyond the obvious importance of resources, Aghion et al (2010) identified the external environment faced by universities in the US and Europe, as measured by the degree of competition and autonomy from central government control of decision making, as an important driver of performance in world rankings. In this paper, we focus on the internal environment (arguably something that universities can better control) and examine whether the quality of management within universities affects their performance.
This follows a growing body of research that has demonstrated that good management practices improve firm performance (for a recent summary see Bloom and Van Reenen 2012). The underlying premise is that there are universally "good" and "bad" management practices and that these practices matter in a meaningful way for how an organisation performs. This has been supported by empirical findings showing that there is a wide dispersion in the quality of management practices and that differences in (measured) managerial practices can explain part of the long-standing heterogeneity between organisations in performance (see Bloom and Van Reenen, 2007, Bloom et al, 2011, Black and Lynch 2001). In this paper we examine whether the same is true of universities. There is a commonly expressed view, illustrated by the quotes above, that managing academics is, like herding cats, either impossible or pointless. Academics are seen as differing from workers in most other organisations in ways that may make management tools less effective. One difference is that academics are thought to have a high degree of intrinsic motivation in relation to their work (i.e. they care directly about their research and/or teaching). Besley and Ghatak (2003) and Benabou and Tirole (2006) have emphasized that sharp incentives may not be as important or effective when agents are motivated. 2 Even when it comes to extrinsic motivations among academics, many of these (such as academic status) are determined by a wider peer group in the academic community, rather than by an academic's department, faculty or university managers. This may make internal management tools less effective. These perceived differences motivate our interest in looking directly at management and performance in universities.
To collect information on management practices, we adopt the same tried and tested survey tool originally developed in Bloom and Van Reenen (2007). We use this to examine the relationship between management scores and a number of externally collected measures of performance, covering both research and teaching. Our focus is on a single country, the UK, in order to control for cross-country differences in the institutional context. The UK provides a good 'test bed' for several reasons. First, the university sector is important in the UK in terms of revenue, exports and contribution to innovation. 3 While US universities dominate global league tables, UK institutions perform well compared to those outside the USA. In the recent ARWU ranking, 11 out of the top 100 universities were in the UK, compared to 58 in the US, but only 3 in France and 5 in Germany. At the very top of the international league table, the top ten universities are split eight to two between the US and the UK. This performance is in spite of the fact that in the UK (private plus public) spending on tertiary education as a percentage of GDP (1.3%) is below the OECD average (1.6%) and half the level that it is in the US (2.6%). Second, in comparison to many European universities, those in the UK compete intensely for both students and research funding (Aghion et al 2010) and there are ongoing major reforms to funding for many of the UK's universities which are only likely to increase the degree of competition between institutions. Third, the performance of UK universities has been subject to a high degree of external measurement and benchmarking for nearly two decades. Performance measures cover both research and student satisfaction and these measures are widely disseminated across producers and consumers and are linked to public funding.
Fourth, there is considerable diversity in the type of provider within the university sector in the UK, a by-product of successive governments' attempts to expand the uptake of higher education to lower income individuals.
To date, there has been relatively little quantitative evidence on university management practices. 4 A number of papers have looked at the cost efficiency of administration in universities (for example, Case and Thanassoulis, 2006; Bayraktar et al, 2013; and Lu, 2012). Aghion et al (2010) examine autonomy in decision-making from local or central government control, but in their cross-national sample cannot separate this out from competition. Moreover, their focus is on the external environment rather than the internal organisation. Possibly closest to our study, Goodall (2006, 2009) explores the role of leaders in universities and, in particular, expert leaders. She finds evidence that the appointment of strong academics at the top of the organisation is associated with improved research performance at the university level. We do not rule out the potential importance of leadership but our focus is on a set of core operations-oriented management practices (monitoring of performance, setting targets and use of incentives).
We examine the academic discipline (departmental) level, which enables us to examine variation in management practice scores both across and within universities, and look at how the scores correlate with external measures of teaching and research performance, controlling for resources and past performance.
Our survey data reveal a number of interesting findings. We find a very low degree of correlation in management practice scores across departments, compared to other multi-plant firms and hospitals that have been studied. In other words, management practices appear to be relatively heterogeneous within universities, although we find no significant differences by academic discipline. When looking at the relationship with performance we find that management scores at the department level are more important than management scores at the university level. We find clear differences across universities, particularly by university type (older, research-intensive compared to newer, more teaching-oriented). Management structures vary by type, particularly in the degree to which management practices are decentralised. And management scores vary by university type. Departments in older and more research-intensive universities tend to be better managed than departments in newer and more teaching-focused universities. The biggest difference is in managerial practices with respect to incentives for recruitment and retention of staff.
4 Bloom et al (2011) find that high school management is associated with better performance and, at the level of higher education, Aghion et al. (2010) and Aghion (2008) provide descriptive evidence that university autonomy and competition are associated with better outcomes in terms of research rankings.
We also find that the management scores are strongly positively correlated with externally-assessed measures of performance in both research and teaching. This correlation is robust to including a number of controls including those for the level of resources and past performance. We cannot rule out that both management and current performance (conditional on past performance) are related to some unobservable event, but we can rule out anything that might affect all aspects of management since the relationship with performance is driven primarily by the quality of management practices on one dimension: with respect to provision of incentives. Universities with high incentive scores perform well in terms of both research and teaching but performance management and, in particular, targets are not related to measured outcomes.
Finally, we find that the relationship between management scores and performance holds for both research-intensive and newer, more teaching-focused universities. We surmise that one reason why newer universities do not adopt the research-intensive universities' model may be limited competition between university types.
We describe our sample and survey methodology in Section 2. Section 3 presents some preliminary descriptive statistics, while Section 4 contains the main results on the relationship between management and performance. Section 5 concludes.

Institutional setting, sample and methodology

2.1 The institutional setting
The UK university sector comprises 158 institutions that have degree-awarding powers. Most of these are not-for-profit. 5 All undertake both research and teaching, but the balance between these activities varies. The main divide is between "old universities" (founded pre-1992), which are typically more research focused, and "new universities", granted university status post-1992 as part of a government drive to increase participation in degree-level education. But there is also arguably a further divide between the 24 most research-intensive older universities (known as "the Russell Group" 6, which account for around 15% of the sector but 75% of all research income) and other older universities, and also between newer universities that were former polytechnics (which offered higher diplomas and degrees, often in more technical subjects, that were governed and administered at the national level) and those that were previously further education colleges.
Our analysis therefore separates four groups of universities. These are (1) the "Russell Group"; (2) "Other Old" universities, founded before 1992; (3) "Former Polytechnics"; and (4) "Other New" universities (primarily former further and higher education colleges and specialist colleges). We show below that there are meaningful differences across the four groups. Full details of the institutions in our sample and the four groupings are given in Table A1.
In an international comparison, Aghion et al (2010) identified UK universities as having a high level of autonomy from government over budgets and hiring and a high level of competition for funding for both research and teaching. Going forward, this level of competition is set to increase. Recent reforms have allowed UK universities to charge differential fees and at the same time reduced the student-based subsidies provided to universities and eased the caps on (UK resident) undergraduate student numbers. 7 Arguably, however, the nature of the competition varies across universities. Responses to our survey reveal that the research-intensive universities see themselves competing in international and national markets (for staff and students) while newer universities focus more on local markets.
Undergraduate degrees in the UK typically involve three years full-time study (four in Scotland) across all these university types. Currently around 35 percent of UK resident individuals attend university. Attendance at university is not a right for all individuals who complete high school, but is conditional on performance in national exams taken at age 18 (17 in Scotland). 8 In common with the USA, but in contrast to much of Continental Europe, many UK resident students study away from home. Entry standards vary considerably between universities and competition for places is very strong, particularly at the elite research-orientated universities. Students from outside the UK make up a significant proportion of the student body (around 14% of undergraduates and over 60% of postgraduates, UKCISA) and competition for these students is worldwide.
6 So-called because the first informal meetings of the group took place in Russell Square in London.
7 Postgraduate student numbers are not capped.

Our sample
The population for our study consisted of universities that made a submission to the most recent Research Assessment Exercise (RAE), carried out in 2008. This RAE involved (the latest in a series of) peer-review assessments of the research outputs of academic staff within a department, designed to produce a quality profile of the department for the purposes of allocating research funding (more details in section 2.4).
We selected only RAE-submitting institutions in order to provide an external performance measure relating to research. This will tend to bias our sample towards universities with at least some research-active staff, relative to the full population of institutions with degree-awarding powers, but, as shown in Table 1, our relevant population covers all four types of universities (Russell Group, Other Old, Former Polytechnics and Other New).
UK universities are generally organised into faculties covering broad groups of related academic disciplines (for example, medicine, sciences, social sciences, arts) and, within this, departments, which contain discipline-specific academics. Interviews were carried out with Heads of Departments. We selected Heads of Department since their key responsibilities include recruitment and retention of staff and deployment of staff and other resources.
Rather than spreading our sample thinly across a large number of different academic departments with relatively few observations for each, we deliberately focused on four academic subjects: Psychology, Computer Science, Business & Management, and English. These were chosen to cover the full range of disciplines (Science, Humanities and Social Sciences) and because, as shown in Table 1, a relatively large number of universities made an RAE submission in these subjects (76+), allowing us to obtain reasonable sample sizes across the four university types. If we had chosen Economics, for example, the relevant population would have consisted of only 35 departments concentrated among Russell Group and Other Old universities. Business & Management gives us a larger and more diverse population of 90 departments. 9 In fact, we show in our analysis that there are no significant differences in the management scores across academic departments (university type is more important in explaining variation across our sample). We therefore think it is likely that surveying a different set of academic departments would yield similar results.
8 Compulsory schooling ends at age 16 in the UK. High school ends at 18 and students wishing to go to University have to achieve (high) standards in the exams taken at the end of high school (known as 'A' levels).
As shown in Table 1, a total of 120 universities had at least one of these four academic departments submitting to the RAE 2008. We also surveyed human resource (HR) departments in all the submitting universities in order to look at the relative importance of management practices at the department and university level. For each university, this gives a potential maximum of five observations, although it is clear from Table 1 that older universities typically have a higher number of RAE submitting departments than newer universities. Our final sample contains information on management practices in 248 departments (including the HR department) within 112 UK universities. Our sample includes 34 universities for which we observe only one department, 38 for which we observe two, 25 for which we observe three, 12 for which we observe four and three for which we observe all five.
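The sample breakdown above can be checked with a few lines of arithmetic. The sketch below uses only the counts quoted in the text; the variable names are illustrative.

```python
# Number of universities for which k departments are observed (counts from the text).
universities_by_departments_observed = {1: 34, 2: 38, 3: 25, 4: 12, 5: 3}

# Total universities: sum of the counts.
n_universities = sum(universities_by_departments_observed.values())

# Total departments: each university contributes k observations.
n_departments = sum(k * n for k, n in universities_by_departments_observed.items())

print(n_universities)  # 112
print(n_departments)   # 248
```

Both totals agree with the 112 universities and 248 departments reported for the final sample.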

The management practices survey
To measure the quality of management practices we use an existing methodology that has been used in manufacturing (Bloom and Van Reenen, 2007), health and the social care sector (Delfgaauw et al, 2011). Using an existing methodology has a number of advantages. First, the survey has been extensively tried and tested, having been used successfully to survey several thousand organizations in more than 20 different countries. Second, following the same methodology and using a common set of indicators allows us to set our results in a wider context.
9 There are 60 universities which have a Business & Management department and no Economics department submitted to the RAE. For these, it is likely that the Business and Management department includes some economists who would have been assessed by the Economics and Econometrics RAE subpanel. Their outputs and scores will have been taken into account in the overall departmental Business & Management RAE score.
The focus of the management survey is a set of operations-focused management practices. The survey does not cover leadership or values, although these are likely to also be important in explaining performance variation (Goodall, 2006, 2009, Stephan et al, 2012). At the core of our survey is a set of 17 indicators of management practices, grouped into four subcategories: operations, monitoring, targets and incentives.
The interviews were carried out during Summer 2012 by six students (five from the University of Bristol and a recent graduate from Boston University), including one first year undergraduate, three third year undergraduates, one student with a Masters and one doing a PhD, spread across a number of departments (Law, Classics, Management, Administration and Biology). They undertook a two-day training programme which had been designed by the original management interview team at the London School of Economics. This training programme, together with paired practice interviews, helped to ensure a consistent approach. The interview process was project managed on a day-to-day basis by McCormack.
The interviews were independently double-scored by two interviewers, one conducting the interview, the other listening in. 10 Any differences in scores were discussed and reconciled at the end of the interview. If the difference in scores was two or more (which was the case for 17 out of a total of 3,757 indicator scores), there was a discussion with the project manager. In 974 cases the scores differed by one point, with no obvious patterns across interviewers or indicators. These smaller differences were discussed and resolved by the two scorers. The double-scoring was intended to ensure that the interviews and scoring were comparable across interviewers, although our regression analysis additionally controls for interviewer fixed effects, as well as the time of day the interviews were done.
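The reconciliation rule and the disagreement rates described above can be sketched as follows. This is an illustration only: the function name and the returned labels are hypothetical, and the share of identical scores assumes that every pair not counted in the two disagreement figures matched exactly.

```python
def reconcile(score_a, score_b):
    """Classify one pair of indicator scores given by the two interviewers."""
    gap = abs(score_a - score_b)
    if gap == 0:
        return "agreed"
    if gap == 1:
        # Small differences were settled by the two scorers themselves.
        return "resolved by the two scorers"
    # Differences of two or more points went to the project manager.
    return "discussed with the project manager"

# Counts quoted in the text.
total_scores = 3757
large_gaps = 17       # difference of two or more points
one_point_gaps = 974  # difference of exactly one point

share_large = large_gaps / total_scores
share_one_point = one_point_gaps / total_scores
# Assumption: all remaining pairs were scored identically.
share_identical = 1 - share_large - share_one_point

print(f"{share_large:.1%} {share_one_point:.1%} {share_identical:.1%}")
```

On these figures, roughly a quarter of the indicator scores differed by one point and under half a percent differed by two or more.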
To ensure unbiased responses, interviews were conducted by telephone without the respondents being aware in advance that they were being scored, making it more likely that the interviews genuinely captured actual management practices. In addition, the interviewers were not given any metrics on the universities' performance in advance of the interview, nor were survey respondents asked for this information. Performance data were matched in from independent sources after the interviews were finished.

Measures of performance
UK universities are monitored by independent public regulators on the basis of their research and teaching performance, with much of this monitoring related to performance at the departmental level. In addition, there are several independent rankings which combine this performance information with other indicators. We use three of these performance metrics in our analysis, focusing on departmental level rankings. This is in contrast to earlier research that used only university level rankings (e.g. Goodall, 2006, 2009; Aghion et al, 2010).
First, the Research Assessment Exercise (now Research Excellence Framework) provides an assessment of the quality of research output of the academic staff members at departmental (discipline) level. These quality profiles were intended to provide objective, comparable measures of the department's research performance as assessed by peer reviewers. In principle, these measures are potentially comparable across academic disciplines, but we focus on relative performance within disciplines. RAE results are available from 2008 and 2001. We use ranking information, reversing the rankings such that a higher number indicates a more highly ranked department.
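Reversing a ranking so that a higher number means a better department can be done in several ways. The minimal sketch below assumes the simplest one, mapping rank r among n departments to n + 1 - r. The paper does not state the exact transformation used, and the department names and ranks here are hypothetical.

```python
def reverse_ranks(ranks):
    """Map rank 1 (best) to the highest number and the worst rank to 1."""
    n = max(ranks.values())
    return {dept: n + 1 - rank for dept, rank in ranks.items()}

# Hypothetical within-discipline ranks, where 1 is the best department.
rae_ranks = {"Dept A": 1, "Dept B": 2, "Dept C": 3}

reversed_ranks = reverse_ranks(rae_ranks)
print(reversed_ranks)  # {'Dept A': 3, 'Dept B': 2, 'Dept C': 1}
```

With this convention a positive regression coefficient on the management score corresponds to a better ranking.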
Second, we use an assessment of student satisfaction. This is measured by the National Student Survey (NSS) satisfaction score, which has been collected annually since 2005. We focus on responses to the question "Overall, I am satisfied with the course" (scored 1 to 5, where 1 indicates completely disagree and 5 indicates completely agree). We normalise at the department level to allow for differences across disciplines.
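One way to read the normalisation step is as a z-score computed within each discipline, so that a department is compared only with departments in the same subject. The sketch below follows that reading, which is an assumption: the text does not spell out the formula, and the universities and scores are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev

def normalise_within_discipline(rows):
    """rows: list of (university, discipline, nss_score) tuples.

    Returns the same rows with the score replaced by its z-score
    relative to all departments in the same discipline.
    """
    by_discipline = defaultdict(list)
    for _, discipline, score in rows:
        by_discipline[discipline].append(score)
    # Mean and (population) standard deviation per discipline.
    stats = {d: (mean(v), pstdev(v)) for d, v in by_discipline.items()}
    return [(u, d, (s - stats[d][0]) / stats[d][1]) for u, d, s in rows]

# Hypothetical data: English scores are higher on average than Psychology.
rows = [
    ("Uni 1", "English", 4.0), ("Uni 2", "English", 4.2), ("Uni 3", "English", 4.4),
    ("Uni 1", "Psychology", 3.8), ("Uni 2", "Psychology", 4.0), ("Uni 3", "Psychology", 4.2),
]
normalised = normalise_within_discipline(rows)
for row in normalised:
    print(row)
```

After normalisation, the top English department and the top Psychology department receive the same score even though their raw satisfaction scores differ, which is the point of adjusting for discipline.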
Third, we use the ranking of the department according to independent university guides. There are several of these in the UK: we focus on the Complete University Guide, where rankings information is available at the department level over the period 2008-2013. 11 These rankings are weighted indices covering research outputs, student satisfaction, student outcomes and measures of resources. Since this reflects both teaching and research at departmental level this is our key measure of output. Again, we reverse the rankings such that a higher number reflects a better performance.

Other controls
We include in our analyses controls for resources at departmental and University level, including the number of staff, students and expenditure. This includes both academic spending per staff member which is a reasonable measure of (average) salary within a department and other spending, which we normalise by number of students. These measures of resources are derived from sources external to our respondents to the management survey (mainly from the government regulators). Details are provided in Table A2. It is important to be able to control for resources in exploring the link between management and performance. It means, for example, that better management in a department is not simply picking up a higher level of resources. When it comes to the incentives scores, we can also rule out that better hiring and promotion practices are simply allowing Heads of Department to compete more aggressively in terms of the academic salaries they can offer.

Differences across types of university
In Table 2 we show that the groupings of universities that we identify above (Russell Group, Other Old, Former Polytechnics and Other New) are meaningful, in terms of there being significant differences in the performance of the departments, the resources available, the markets they operate within and the management structures. Second, there are clear differences in the level of resources across the university types. 12 As expected, the research intensive universities have a higher level of resources as measured by academic spending per staff member and other spending per student. Staff-student ratios are also lower in these elite universities.
Third, there are differences in the markets in which the Universities operate. The older universities see themselves as competing internationally and nationally, while the newer ones primarily see themselves as competing with other institutions locally. 13 Fourth, the summary statistics in Table 2 (collected as part of the survey) give some indication of differences in management structures across the types of institutions. In both groups of older universities, management is more typically a part-time role (where the rest of the time is for academic activities) and fixed-term. While managers (Heads of Department) at these universities are only slightly more likely to come directly from an academic position than those at the new universities, they are much more likely to return to being an academic (rather than a management role). This highlights alternative routes to becoming Head of Department in the UK. In the first route (more common in older universities) being Head is a temporary administrative responsibility that rotates among senior academic members of the Department. In the other route (more common in newer universities) being Head of Department is the first step of a management career, on the way to a more senior faculty-or university-level position. These two types of managers may have very different objectives. In the first case, the managers may try to minimise the cost of being Head but also focus more on what they think will enhance the academic environment of the department. In the second case, the manager may pay closer attention to university-level management policies. In the next section we look at whether manager characteristics are reflected in different scores.
Finally, there are differences in the extent to which aspects of management are centralised within universities. Across all types of university, operations (the organisation of research and teaching) are largely left to departments. However, there are clear differences with respect to "incentives", where the old universities have more decentralised processes than the new. In our analysis, we find there to be a strong link between decentralisation of incentives and the quality of this dimension of management practices. Since we cannot differentiate between decentralisation and quality, our interpretation is that good management practices in relation to incentives involve decentralisation to the department level.

Variation in the management scores
We begin our analysis by describing the variation in the management scores.

Comparison with other sectors
By applying essentially the same survey to universities as was used to measure management practices in manufacturing and hospitals we can make some high-level comparisons across these industries. Focusing on 15 of the 17 individual indicators that are the most directly comparable, 14 we find universities score relatively highly (mean score = 3.24, SD = 0.476) compared to both manufacturing (mean score = 3.03, SD = 0.642) and hospitals (mean score = 2.45, SD = 0.612). We find the greatest differences with manufacturing in relation to targets, possibly related to the high level of benchmarking information in the UK higher education sector, and incentives, which may reflect the importance of individual talent in research. However, we do not put too much weight on this cross-sectoral comparison. Although we have gone to some length to attempt comparability in scores across studies, we cannot completely rule out some differences in scoring.
One difference that is more meaningful is the high degree of heterogeneity in scores within universities compared to manufacturing firms and hospitals. In previous studies when several "plants" were sampled from the same organisation, subsequent analysis showed a high level of correlation between management scores within the same organisation (0.530 for hospitals and 0.734 for manufacturing firms). Thus multi-plant sampling acted as a check on scoring of management quality but the analysis focused on the organisation-level average. In the case of universities, however, the degree of correlation in scores across departments within the same institution is very low (0.086). This is even the case when we look at departments within an institution interviewed by the same interviewer (0.036). The high degree of heterogeneity may well arise because departments within institutions essentially operate in separate labour markets, some being national and others international, depending on the academic standing of the department. Compared to manufacturing or hospitals, many staff within a university department do not have skills that are transferable to other departments within the same institution (as distinct from moving to the same department in another institution). Further, in some universities, departments also operate in different markets for students. Thus we focus our analysis on departments. 15

Variation by department and university type
Across departments (Table 3, panel a), HR departments score more highly overall than the academic departments. But this higher overall score masks differences across the subcomponents of the management scores. The biggest positive gap in favour of HR departments is in "targets"; in "operations" (processes for research and teaching) the academic departments score higher. This ties in with the fact that universities appear to decentralise these operations processes to the departments (as shown in Table 2). Within academic departments, Business departments typically score highest and English departments lowest, but these inter-departmental differences are not significant within Universities (i.e. controlling for university fixed effects).
By contrast, there are sizeable differences in scores across university types which are statistically significant, controlling for department type (Table 3, lower panel). In terms of the overall score, departments in research-intensive universities (the Russell Group) score significantly higher than the rest, with an average score of 3.48. This is followed by Other Old, then Former Polytechnics and finally the Other New, where the average score is 3.19. Figure 1 shows that the management scores among the two types of new universities are more dispersed with several departments performing quite poorly in terms of their overall management score.
The higher overall management score among the research-intensive group universities is not driven by consistently better performance across all sub-groups of scores. There is no significant difference by university type in "targets" and "operations" and while there are significant differences in "monitoring", it is the Other New universities that score highest with a score of 3.43. The higher score in the research-intensive universities is driven by ratings on the "incentives" component of the management practices scores.
The incentives scores are the most dispersed across the university types and there is a more than one standard deviation difference in mean incentive scores between Russell Group and Other New Universities. Table 4 presents the scores for the individual indicators by university type. The table shows there is only one incentives indicator where there is no difference between the research intensive university departments and the rest, and that is "removal of poor performers". And for this score Russell Group departments under-perform relative to their scores on other incentive indicators, suggesting this is an outlier to otherwise higher scores in this group of management practices.
15 In our analyses we cluster standard errors by university.

Accounting for variation in management scores
What explains variation in management scores across universities? To what extent do different scores across university types simply reflect differences in their other characteristics, for example, resources or the pressure felt by departments with respect to volume of students?
To explore this, we estimate a linear regression in which the dependent variable, $M_{ij}$, is the management z-score for department $i$ in university $j$. We focus only on academic departments (i.e. we exclude the HR departments). We run separate regressions for the overall score and for each of the main components (operations, monitoring, targets and incentives). $Z_1$ is a vector of controls at the department level including characteristics of the manager (female, whether full-time manager, years' tenure and likely next role) and measures of departmental level resources (the number of staff, the number of students, spending on academic staff and other spending). Previous studies have shown competition to be an important determinant of management practices so we explore this by including measures of competition (these are self-reported and were collected as part of the survey). $Z_2$ is a vector of controls at the university level, including the number of cost centres as defined by the University regulator (to allow for the spread of the university across academic disciplines) and an indicator for London, as several of the London Universities share central administration for degree awarding functions. We also include departmental and interviewer fixed effects and cluster standard errors at the University level.
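The regression described above can be written out as follows. This is a reconstruction from the verbal description and from the notation of equation (2), not a form confirmed by the text: the interviewer fixed effect $\text{int}_k$ and the error term $e_{ij}$ are labels introduced here for illustration.

```latex
% Assumed form, following the notation of equation (2).
% int_k (interviewer fixed effect) and e_{ij} (error term) are hypothetical labels.
M_{ij} = \alpha + \gamma' Z_{1ij} + \delta Z_{2j}
       + \text{Uni\_type}_j + \text{dept}_i + \text{int}_k + e_{ij} \qquad (1)
```

Under this form, the coefficients on the university-type indicators measure the gap in management z-scores relative to the reference group, holding the controls fixed.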
The results are reported in Table 5. Column (I) has no controls other than university type, departmental and interviewer fixed effects. For university type, the reference group is the research intensive group of universities (Russell Group). This column shows this group scores around 0.5 standard deviations higher than all the other three types on overall management score.
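The coefficients in Table 5 are read in standard-deviation units because the dependent variable is a z-score. A minimal sketch of that standardisation follows. The raw scores and the function name `z_scores` are illustrative assumptions, not survey data, and the text does not state whether a population or sample standard deviation was used.

```python
from statistics import mean, pstdev


def z_scores(raw):
    """Standardise raw scores to mean 0 and standard deviation 1."""
    mu = mean(raw)
    sigma = pstdev(raw)  # population SD; an assumption, see the lead-in
    return [(x - mu) / sigma for x in raw]


# Hypothetical raw management scores (1-5 scale) for six departments.
raw = [2.0, 3.0, 3.0, 4.0, 4.0, 5.0]
z = z_scores(raw)

# The gap between the best and worst department, in standard-deviation units.
gap_in_sd = z[5] - z[0]
print(f"gap between departments: {gap_in_sd:.2f} standard deviations")
```

A coefficient of 0.5 on a university-type indicator would then correspond to half of one such standard deviation on the raw scale.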
Moving from column (I) to (II) shows the effect of adding controls for manager characteristics, resources, competition and London location. A number of the manager characteristics variables enter significantly. Female heads of department score lower overall and, specifically, lower in relation to incentives. This is an interesting finding, although we do not know for sure whether this reflects a genuine difference or a gender difference in reporting. Full-time managers score lower, particularly in relation to operations and incentives. Because this is a self-reported measure referring to the time spent doing the job, one possibility is that worse managers spend longer on management tasks. The manager's next role also affects how they perform. Managers who are likely to return to academia (the default category) score lower in terms of their overall management score. The difference is most pronounced in relation to operations and targets. The latter in particular are likely to reflect university management policies, and our findings are thus consistent with managers anticipating a return to academia having fewer upward-looking career concerns. The number of years as Head of Department is not significantly correlated with any of the management scores.
Looking at the other controls, we find some significant variation in management scores with our measures of resources, although this is not systematically the case for all of the dimensions of management. London-based institutions score lower on average.
Including these controls, we find no significant differences across types of universities in the quality of management with respect to "operations" (second set of columns), "monitoring" (third set of columns) or "targets" (fourth set of columns). But on "incentives", the research intensive Russell Group score significantly better than the other university types even with controls for resources. The new universities each score over 1.3 standard deviations below the research intensive, with the other old group of universities having a score between the most research intensive and the new universities (0.74 standard deviations lower than the most research intensive). The difference in management quality on "incentives" drives significant differences in overall scores between the research intensive and the Other Old and Former Polytechnics. In summary, the results in Table 5 Column (II) confirm that Russell Group universities score better on incentives and that this does not simply reflect their higher level of resources.
Column (III) of Table 5 reports an additional specification in which we explore the link between management practices and the extent to which management processes are centralised within the university (this information was collected as part of our survey in addition to the management practices questions). We focus on centralisation of three aspects of management: monitoring, operations and targets. The questions were not asked in all cases, so the sample sizes are considerably smaller. We therefore run a simpler specification excluding the controls for manager characteristics, resources, competition and London location. The results in Column (III) suggest that centralisation, and what is centralised, is important. Centralisation of "operations" has a positive effect on the overall scores, raising them by just over 1.5 of a standard deviation.
But centralisation of "incentives" has the opposite effect: it reduces the departmental scores, and this reduction is significant for the incentives scores (where it lowers them by nearly three quarters of a standard deviation), the operations scores (a reduction of 0.36 of a standard deviation) and the overall management practices score (a reduction of 0.731). Universities that decentralise incentives to the department level score more highly and this decentralisation is more common in the elite universities than in other types of universities. Our interpretation of these findings is that the quality of incentives management within universities is inherently linked to decentralised incentives processes. 16 This finding echoes the earlier findings from Aghion et al (2010), which looked across, rather than within, countries.

Does management matter?
We have shown that there are significant differences in management scores across universities. We now turn to the key question of whether this matters for performance. While we cannot establish causality in a single cross section, we control for observable differences in resources and condition on past performance, allowing us to control for university- and departmental-level factors which have a time-invariant effect on output.
16 Our centralised management score is from the survey. Given this, an alternative explanation is that University managers who are poor managers blame this on centralised management. But this interpretation is not supported by the difference in the association of different aspects of management with the centralisation measures.

Allowing for differences in resources and controlling for past performance
To explore the relationship between management and performance further, we control for differences in resources and attempt to mop up unobserved heterogeneity by additionally controlling for past performance.
We estimate the following regressions:

$$Y_{ijt} = \alpha + \phi M_{ij} + \gamma' Z_{1ij} + \delta Z_{2j} + \gamma Y_{ij,t-5} + \text{Uni\_type}_j + \text{dept}_i + u_{ij} \qquad (2)$$

where $Y_{ijt}$ refers to a performance measure. We run separate regressions for the CUG ranking, the RAE ranking and the NSS score. In each case, we use the most recent measure, although in the case of the RAE ranking, this was last available for 2008. We include the same controls as before ($Z_1$ and $Z_2$). We estimate (2) without and with lagged performance, the latter specification allowing us to control for unobservable department- and university-level factors which have a time-invariant effect on performance. We choose the five-year lag to allow management to be correlated with changes in performance over a reasonable period. Choosing other lags yields similar results.
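The role of the lagged term can be illustrated with a small simulation. The sketch below is not the authors' estimation: it uses synthetic data, omits the controls and fixed effects, and every name and parameter value (`quality`, the true management effect of 1.0, the noise levels) is an assumption chosen for illustration. An unobserved, time-invariant department quality drives management, past performance and current performance, so a regression without the lag overstates the management coefficient, while adding the lag absorbs much of that confounding.

```python
import random


def ols(y, columns):
    """Least-squares coefficients for y on an intercept plus the given columns.

    Solves the normal equations (X'X) b = X'y by Gauss-Jordan elimination.
    """
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(X[r][a] * X[r][b] for r in range(n)) for b in range(k)]
         + [sum(X[r][a] * y[r] for r in range(n))]
         for a in range(k)]
    for p in range(k):
        # Partial pivoting for numerical stability.
        best = max(range(p, k), key=lambda r: abs(A[r][p]))
        A[p], A[best] = A[best], A[p]
        pivot = A[p][p]
        A[p] = [v / pivot for v in A[p]]
        for r in range(k):
            if r != p:
                factor = A[r][p]
                A[r] = [vr - factor * vp for vr, vp in zip(A[r], A[p])]
    return [A[a][k] for a in range(k)]


rng = random.Random(0)
n = 2000
# Unobserved, time-invariant department quality (hypothetical confounder).
quality = [rng.gauss(0, 1) for _ in range(n)]
management = [0.5 * q + rng.gauss(0, 1) for q in quality]
# Performance five years earlier, driven by quality only.
perf_lag = [2.0 * q + rng.gauss(0, 0.5) for q in quality]
# Current performance: true management effect is 1.0 by construction.
perf_now = [1.0 * m + 2.0 * q + rng.gauss(0, 0.5)
            for m, q in zip(management, quality)]

phi_without_lag = ols(perf_now, [management])[1]
phi_with_lag = ols(perf_now, [management, perf_lag])[1]
print(f"phi without lag: {phi_without_lag:.2f}, with lag: {phi_with_lag:.2f}")
```

Because past performance is only a noisy proxy for the unobserved quality, the lagged specification reduces the bias rather than removing it entirely, which is consistent with the caution against a strict causal reading.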
The main results are summarized in Table 6. Column (I) shows the correlations including only interviewer fixed effects. Column (II) adds controls for manager, department and university characteristics, as well as indicators for university type. The results confirm that, even within university type and conditional on resources, the management score has a significant and positive effect for CUG and RAE rankings. For NSS scores, the coefficient is positive but not significant. Column (III) adds a further control for lagged performance. In this specification, the overall management z-score is now positive and significant in regressions for all three performance measures. Controlling for both university type and past performance, a one standard deviation improvement in management score is correlated with a 2.74 improvement in the CUG ranking, a 2.49 improvement in the RAE ranking and a 0.14 standard deviation improvement in the NSS score. While we cannot give this a strict causal interpretation, these results clearly signal that management is at least part of the story for why departments perform well.

4.3 What level and aspects of management seem to matter for performance?
These results strongly indicate that it is management at the department level that matters for measures of department performance. We explore this further looking at whether there is an association between central management practices and university-level performance measures. We use the ARWU ranking of world universities and the CUG ranking of universities in the UK. We regress these performance measures on two alternative university-level management scores. The first reflects the departmental scores and is the average of the management scores among the academic departments. The second is a university level measure and is the HR department management score. Table 7 contains the results. The columns labelled (I) present the former, the columns labelled (II) the latter. The results for the ARWU ranking show that management at the academic department level is relatively more important than university-level management in explaining (positive) performance. The former is associated with a 27.4 point increase in the world ranking, while the latter is associated with a 43.4 point fall. For the CUG ranking, the university level score is associated with a 4.3 point fall in the position in the rankings. These findings suggest that what the HR department does is not associated with increases in performance.
We now turn to the association between the measures of department-level performance and individual sub-groups of management practice scores. Figure 4 shows that the overall university ranking (our preferred measure since it combines both research and teaching) is most strongly associated with the use of incentives. Table 8 examines this in a regression framework. We run the same specification as before (equation (2)) to look at the relationship between performance and management scores, but we now include all four sub-groups of the overall management practices score in a "horse race" to see which has the strongest association with performance. All regressions include the full set of controls as in Table 6 column (II) but we show only the coefficients on the management scores.
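Written out, the "horse race" replaces the single management score in equation (2) with the four sub-group scores entered jointly. The notation below is an assumption made for illustration: the superscripts and the coefficients $\phi_1, \dots, \phi_4$ are labels introduced here, and the lagged-performance term is left out because the text specifies the controls of Table 6 column (II).

```latex
% Assumed notation: ops = operations, mon = monitoring,
% tar = targets, inc = incentives.
Y_{ijt} = \alpha
        + \phi_1 M^{ops}_{ij} + \phi_2 M^{mon}_{ij}
        + \phi_3 M^{tar}_{ij} + \phi_4 M^{inc}_{ij}
        + \gamma' Z_{1ij} + \delta Z_{2j}
        + \text{Uni\_type}_j + \text{dept}_i + u_{ij}
```

Because the four scores enter together, each $\phi_k$ captures the association of that sub-group with performance holding the other three fixed.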
The results confirm that the sub-group of scores for incentives is the most consistently associated with performance. The incentive score enters positively and significantly for both the CUG and RAE rankings, increasing these by 3.8 and 3.5 respectively. There is evidence that operations also matter. The operations score is positive and significant for the RAE ranking (though smaller than for incentives) and it has the highest (though not significant) coefficient for NSS scores. The coefficients on monitoring and targets are negative (albeit insignificant) for all outcomes. These aspects of management practices do not appear to matter for performance in either research or teaching.

Do different aspects of management matter in different types of university?
The results so far have shown that better management practices at departmental level are associated with better performance and that practices with respect to incentives matter most. But it is possible that for the newer universities, where international reputation is less important than local reputation and teaching is more important for income than research, freedom to recruit and retain matter less and perhaps other aspects of management matter more. These universities have historically been subject to greater central control and less autonomy at both departmental level and university level, as many of these were previously part of local government and adopted faculty level structures sooner than the older universities. It may be that monitoring and targets have greater returns in these settings.
We explore this in Table 9, which presents the associations between the management scores (both overall and sub-groups) and the three sets of outcomes for different types of university. We estimate the same specification as before (equation (2)) but we include an additional interaction term between the management score and an indicator for "new universities", combining both former polytechnics and other new universities. We include the full set of controls as in column II of Table 6. In this table, the coefficient on the management score captures the association between management and performance for older (pre-1992) universities. The interaction term captures any difference in the association for newer universities.
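The interpretation of the two coefficients can be made explicit. In the sketch below, $\text{New}_j$ (an indicator equal to one for former polytechnics and other new universities) and $\theta$ (the interaction coefficient) are labels introduced here, and the remaining terms follow equation (2) with the column (II) controls. The main effect of $\text{New}_j$ is assumed to be absorbed by the university-type indicators.

```latex
Y_{ijt} = \alpha + \phi M_{ij} + \theta \,(M_{ij} \times \text{New}_j)
        + \gamma' Z_{1ij} + \delta Z_{2j}
        + \text{Uni\_type}_j + \text{dept}_i + u_{ij}

\frac{\partial Y_{ijt}}{\partial M_{ij}} =
\begin{cases}
\phi          & \text{if } \text{New}_j = 0 \ \text{(pre-1992 universities)} \\
\phi + \theta & \text{if } \text{New}_j = 1 \ \text{(newer universities)}
\end{cases}
```

A positive $\theta$ therefore means the management score is more strongly associated with performance in newer universities, and a negative $\theta$ means it is less so.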
None of the results provides any support for the idea that incentives matter less in newer universities. There is little clear difference between old and new universities in the association between overall score and performance. Operations scores matter less and targets matter significantly less for teaching in newer universities. But incentives appear to matter more in newer universities than in older universities. The coefficient on the interaction term between incentives and being a new university is 5.2 points higher for the CUG ranking and 2.2 points higher for the RAE ranking, though neither is significantly different from zero.
This raises the question of why newer universities do not adopt the same model as the more successful older universities. One plausible explanation is the fact that there is relatively limited competition across university types. The markets that Russell Group importance of competition and autonomy in driving performance.

Discussion and conclusions
This paper has examined whether management differences between universities are associated with differences in their performance. Using the UK as a test bed and a tried and tested measure of management performance, we have shown wide variation in the management quality across universities. In particular, we have shown differences in scores between older, research-intensive universities and newer, more teaching-oriented universities. In addition, we have shown that these differences are associated with differences in performance. Higher management scores are associated with better performance on externally validated measures of both research and teaching (often seen, in this sector, as orthogonal to each other). These results are robust to controls for resources (academic and non-academic spending and staff/student ratios) and to lagged performance.
We find significant differences in the management practices at the 'plant' level within the firms: one department within a university might be well managed whilst another is not. And we also find that the management of the central administration, as measured by the human resources department, is very weakly correlated with better output at departmental and university level. Management in universities is also relatively heterogeneous compared with other organisations (e.g. manufacturing, hospitals).
We also find significant differences between aspects of good management practices. Good practice with respect to incentives (the freedom to retain, attract and reward good performers) is the most important correlate of good performance. The setting of targets and monitoring has a much weaker association with good performance. Further, the relationships we find hold for both world leading research intensive universities and those more focused on teaching. We suggest that limited competition between university types may explain why newer universities do not adopt the management model of elite research-orientated universities. Our findings therefore build directly on Aghion et al (2010) who found that market incentives, in the shape of competition, and autonomy from central government control, mattered for universities across Europe and the US. Our results suggest that management structures which allow freedom to use incentives and autonomy at the plant (departmental) level matter for output in this sector; competition may be a factor in adoption of this model.
We have only a single cross-section so do not claim causality, but we are able to condition on resources and past performance to deal with unobserved heterogeneity that might jointly explain both management score and current performance. In addition, two aspects of our findings suggest that the strongly patterned set of associations we find may be robust to endogeneity bias. First, the fact that the different aspects of management practice correlate differently with performance suggests that shocks to performance do not lead to the adoption of the whole 'new management' set of practices including monitoring, target setting and use of incentives. Second, if better management were put in as a response to negative shocks, this might explain our findings of a positive association between changes in performance and better use of incentives. But this would mean that departments with negative shocks (poor student performance or poor research performance) were given greater freedom to decide how they retained, recruited and dealt with poor performers, whilst not having any changes to the extent to which performance was monitored or targeted. This seems somewhat unlikely. Thus, in summary, we think our results are not driven by reverse causality, but point to the importance of good management in the use of incentives at the plant (departmental) level to motivate academics. This contrasts with the commonly held view that these individuals are impervious to good (or bad) management.
Table A2 in the Appendix. P-value refers to equality of means across university types, controlling for department and clustering standard errors at the university level.
Table A2 in the Appendix. Regressions also include department type and interviewer fixed effects.
Whether management role is fixed term | Survey
Likely next job is academic | Sees self next - academic role | Survey
Likely next job is management | Sees self next - management role | Survey
Likely next job is retirement | Sees self next - retirement | Survey

MANAGEMENT PRACTICE INTERVIEW GUIDE: UNIVERSITIES

LEAN MANAGEMENT (1) Standardisation of processes
This question focuses on research and aims to understand the standardisation of processes within the department/ university a. Can you briefly outline what processes you have in place within the Department/ University for facilitating the development of research ideas into published research? For example, can staff apply for internal research grant money to help them with funding for research or conference travel, do you provide support in bidding for external funding, do you have regular work in progress seminars? b. Do you have a mentoring scheme for young academics? How formal is the process? What is the role of the mentor with respect to their mentees?

Scoring grid:
Score 1: Unable to articulate any clearly defined process.
Score 3: Processes are in place, but they are not well structured.
Score 5: There are clearly defined and well structured processes.

(2) Continuous improvement
This question has a wider focus on processes for research and teaching and aims to understand whether there is continuous improvement/ whether there is a process for learning and for innovating a. Thinking more generally about processes you have in place for improving both research and teaching, how do you know that the processes are working? Do you carry out regular reviews of the processes for potential areas of improvement? b. Can you give me an example of a recent improvement to research or teaching processes? How did the change come about? c. To what extent are members of the department/university involved in suggesting improvements to processes? Can you think of any examples where a staff idea was taken forward?

Scoring grid:
Score 1: Processes are not reviewed in terms of performance. Process improvements, if made at all, are made when problems occur. Limited involvement of staff; suggestions from staff/ carers are not sought/ developed.
Score 3: Process review and improvements occur at irregular meetings; some attempt to develop ideas from the bottom up, but not systematic.
Score 5: Reviewing processes and exposing problems in a structured way is integral to individuals' responsibilities. Staff are centrally involved in developing improvements.

PERFORMANCE MANAGEMENT (3) Performance tracking
Tests whether the overall performance of the organisation (department/ university) is tracked using meaningful metrics and with appropriate regularity a. What kind of performance/ quality indicators do you use to keep track of how the department/ university is performing? (prompt if necessary -for example, numbers of students, research outputs, student satisfaction, teaching performance) b. How frequently do you look at and review performance using these measures? Who gets to see the performance information?

Scoring grid:
Score 1: No clear idea of how overall performance is measured. Performance measurement is ad-hoc.
Score 3: Most important performance indicators are tracked formally; tracking is overseen by senior staff.
Score 5: Performance is continuously tracked and communicated against most critical measures, both formally and informally, to all staff using a range of visual management tools.

(4) Performance review
Tests whether performance of individual members of staff is reviewed in a comprehensive and systematic way. a) Do you have a process for reviewing the performance of individual members of academic staff? Who is involved? b) How frequently do the reviews take place? c) What aspects of performance are reviewed?

Scoring grid:
Score 1: Performance is reviewed infrequently or in an un-meaningful way e.g. only success or failure is noted.
Score 3: Performance is reviewed periodically with both successes and failures identified. Only some aspects of performance are considered.
Score 5: Performance is continually reviewed, based on the indicators tracked. All aspects are reviewed to ensure continuous improvement.

(5) Performance dialogue
Tests the quality of review conversations a) How are these performance review meetings structured? Do they have a clear structure and set agenda? b) Do these reviews involve performance metrics (such as those discussed in indicator 3)? How is this data used in the performance review? c) When a problem is discussed during these meetings, how do you identify the root cause? d) What sort of follow-up plan would there be after such a meeting? Would it be very detailed? Would there be specific action points? Would there be a review to ensure that the action points had been followed up?

Scoring grid:
Score 1: The right information for a constructive discussion is often not present or the quality is too low; conversations focus overly on data that is not meaningful. Clear agenda is not known and purpose is not explicitly.

(Offa)) or linked to a process of performance assessment done by outside bodies? d) Do you have any targets that are not linked to an external process of performance assessment? (how) Do these internal targets link to the external targets?

Scoring grid:
Score 1: Only targets are those directly set by external bodies.
Score 3: As well as explicit targets set by the regulator, there are other internal targets that link to external performance assessment and also some internal targets.
Score 5: Comprehensive range of internal targets covering a number of dimensions, both external performance assessment and own performance assessment.

(8) Target inter-connection
Tests how well targets and goals cascade down the organisation and the extent to which they are responsive to individual department needs a. Are the targets set centrally and cascaded down? b. Are targets set uniformly for different departments/ faculties? What if a department had different needs and the targets were not appropriate, could they be modified? c. If there are centrally-set targets, do faculties/ departments (additionally) set their own targets? Is this because the central targets are not appropriate or is it seen as appropriate that local units have some autonomy? How do the locally-set targets fit with central targets?
Scoring grid:
Score 1: Goals do not cascade down the organisation. Departments/ faculties may set goals but these are done on a purely individual and ad hoc basis.
Score 3: Goals do cascade, but there may be a concern that they are imposed too rigorously. Individual departments may set additional targets but these do not link well to central targets.
Score 5: There is a cascading of targets but the centre recognises varying individual needs of departments. Departments may have some autonomy to set their own goals, but this is done within an overarching framework.

(9) Time horizon of targets
Tests whether organisation has a rational approach to planning and setting targets and the extent to which the organisation is actively engaged in pursuing long-term goals a) What kind of time scale do your targets cover? Are they based purely on the latest regulatory cycle (e.g. the next REF process or the next student satisfaction survey) or do you also have longer-term goals? b) Which goals receive the most emphasis, the short-term goals or long-term? c) To what extent are the long-term and short-term goals linked together? Could you meet all your short-run goals but miss your long-run goals?

Scoring grid:
Score 1: The only focus is on short-term targets based on the current regulatory cycle. There are no long-term goals (or the organisation is prepared to miss long-term goals in order to achieve short-term ones).
Score 3: There are short and long term goals for all levels of the organisation. But the goals do not link well together and the organisation does not have a coherent strategy in terms of trading off short-term and long-term goals.
Score 5: The organisation has clear long-term goals that are translated into specific short term targets so that short term targets become a 'staircase' to reach long term goals.

(10) Target stretch
Tests whether targets are appropriately difficult to achieve a) How tough are your targets? Do you feel pushed by them? b) On average, how often would you say that you meet your targets? c) If there are centrally-set targets, do you feel that all departments are equally pushed in meeting their targets? Or do some groups get easier targets?

Scoring grid:
Score 1: Goals are either too easy or impossible to achieve, at least in part because they are set with little involvement of key staff, e.g., simply off historical performance.
Score 3: In most areas, senior staff push for aggressive goals based, e.g., on external benchmarks, but with little buy-in from staff. There are a few sacred cows that are not held to the same standard.
Score 5: Goals are genuinely demanding for all parts of the organisation and developed in consultation with senior staff, e.g., to adjust external benchmarks appropriately.

(11) Clarity and comparability of targets
Tests how easily understandable performance measures are and whether performance is openly communicated a) If I asked your academic staff directly whether they had been given individual performance targets, what would they tell me? b) Do people think about how their performance compares to the performance of other people? How would they be able to make any assessment of their relative performance? c) Do you compare or rank staff performance in any way?

Scoring grid:
Score 1: Performance measures are complex and not clearly understood, or only relate to government targets. Individual performance is not made public.
Score 3: Performance measures are well defined and communicated; performance is public at all levels but comparisons are discouraged.
Score 5: Performance measures are well defined, strongly communicated and reinforced at all reviews; performance and rankings are made public to induce competition.

TALENT MANAGEMENT (12) Rewarding high performers
Tests whether good performance is rewarded proportionately a) Do you have an appraisal system for your academic staff for deciding their pay and (financial/non-financial) rewards? Does this differ between junior and senior staff? b) How much flexibility is there to reward your best performers, financially and non-financially? What range of options do you have (reduced teaching, research money, accelerated promotion)? How much discretion is there in terms of pay (and promotion)? c) Overall, how does your reward system compare to that at other comparable organisations?

Scoring grid:
Score 1: Not much systematic appraisal and people are rewarded equally irrespective of performance level.
Score 3: There is an evaluation system for the awarding of performance related rewards at the individual level; these are mainly non-financial and rewards are always or never achieved.
Score 5: There is an evaluation system for the awarding of performance related rewards, including personal financial rewards.

(13) Removing poor performers
Tests whether organisation is able to deal with underperformers a) If you had a member of academic staff who was struggling or could not do his or her job, what would you do? Can you give me a recent example? b) How long would under-performance be tolerated? c) Are there some members of staff who seem to lead a charmed life? Do some individuals always just manage to avoid being fixed/fired?

Scoring grid:
Score 1: Poor performers are rarely removed from their positions.
Score 3: Suspected poor performers stay in a position for at least a year before action is taken.
Score 5: We move poor performers out of the agency or to less critical roles as soon as a weakness is identified.

(14) Promoting high performers
Tests whether promotion is performance based a) Can you tell me about career progression and the promotion system within your organisation, for both junior and senior academic staff? b) How would you identify and develop your star performers? c) What types of development opportunities are provided and how are these personalised to meet individual needs? d) Are better performers likely to be promoted faster or are promotions given on the basis of tenure/seniority?

Scoring grid:
Score 1: People are promoted primarily on the basis of tenure.
Score 3: People are promoted upon the basis of performance.
Score 5: We actively identify, develop and promote our top performers.

(15) Managing talent
Tests what emphasis is put on talent management