With nearly seven million hospital operations performed each year in England and Wales and an annual NHS budget of more than £1 billion, operating theatres are a significant expense [1]. As all stakeholders – patients, politicians, staff, managers – seek to maximise the benefits from them, proper measurement of processes and outcomes becomes necessary.

Previously, we concluded that surgical operating list ‘efficiency’ is the completion of all the scheduled operations (that is, with no cancellations) whilst properly utilising the time available (that is, with no under- or over-runs) [2, 3]. In response, Sanders et al. observed that this notion of ‘efficiency’ was possibly incomplete [4]. It is theoretically possible for two teams to be equally efficient, fully utilising their time without over-runs or cancellations, while one team consistently completes more work than the other. Thus, there is a need to define ‘productivity’ in addition to ‘efficiency’ to complete the picture of performance [5]. However, it would seem challenging to develop a measure encompassing all surgical teams, undertaking as they do different combinations of procedures.

The word ‘productivity’ has various connotations in economics, engineering, or administration. The Organisation for Economic Co-operation and Development (OECD) offered more than 20 meanings, recognising that ‘productivity’ and ‘efficiency’ are often confused and emphasising that each industry needs to develop its own measures [6, 7]. For surgical operating lists, we therefore use the engineering sense: ‘efficiency’ measures how well the service functions (that is, work output relative to costs or effort); ‘productivity’ measures the output itself. It is theoretically possible for a service to be highly efficient but have low total work output, and conversely to work at high output but extremely inefficiently [4, 5].

An objective measure of productivity would enable healthcare professionals to confirm that they work as hard and as effectively as is reasonable within the time available, and would help them achieve this aim. Other stakeholders, such as patients, managers or politicians, also need to be satisfied that operating theatres, in which they invest and on which they rely, are used as effectively as is reasonable.

Two measures of productivity that have been applied to hospital settings are the ‘balanced scorecard’ and ‘data envelopment analysis’ (DEA). The former consists of first listing all relevant, desirable or possible measures of performance, both quantitative and qualitative (such as utilisation, cancellation rates, complication rates, satisfaction, etc.), and then assigning a score to each. While this encompasses a wide range of performance criteria, the scoring systems are necessarily arbitrary and weighting factors need to be employed, which may not be the same across all hospitals or even across different surgical teams [8, 9].
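The sensitivity of a balanced scorecard to its weighting factors can be illustrated with a minimal sketch. The team names, indicator values and weights below are entirely hypothetical and are not drawn from any of the cited studies; the composite score is assumed to be a simple weighted mean, which is only one of several ways a scorecard might be aggregated.

```python
# Hypothetical indicator scores on a 0-1 scale (higher is better).
# All values are invented for illustration only.
teams = {
    "Team A": {"utilisation": 0.95, "no_cancellation": 0.70, "satisfaction": 0.80},
    "Team B": {"utilisation": 0.80, "no_cancellation": 0.95, "satisfaction": 0.75},
}


def scorecard(scores, weights):
    """Weighted mean of indicator scores; weights are normalised to sum to 1."""
    total = sum(weights.values())
    return sum(scores[name] * w for name, w in weights.items()) / total


def rank(teams, weights):
    """Return team names ordered from highest to lowest composite score."""
    return sorted(teams, key=lambda t: scorecard(teams[t], weights), reverse=True)


# Two weighting schemes that different hospitals might plausibly choose.
weights_1 = {"utilisation": 3, "no_cancellation": 1, "satisfaction": 1}
weights_2 = {"utilisation": 1, "no_cancellation": 3, "satisfaction": 1}

print(rank(teams, weights_1))  # utilisation weighted most heavily
print(rank(teams, weights_2))  # cancellations weighted most heavily
```

With these invented figures the ranking of the two teams reverses when the emphasis moves from utilisation to cancellations, although the underlying performance data are unchanged.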

DEA is a very detailed mathematical process that attempts to quantify the performance of teams or units across a range of chosen indices with reference to the ‘best performing units’. However, the analysis requires conceptualisation of abstract ‘efficient frontiers’ and complex linear programming. Although some authors have argued for the validity of DEA in the context of healthcare [10, 11], it does suffer from important limitations. First, DEA yields results only relative to the currently best-performing teams; that is, there is no absolute reference point or ideal. Thus a team knows its performance only if all other teams have themselves been analysed by DEA: it is not meaningful to conduct DEA for one team in isolation. Second, the mathematical complexity of DEA means that its results are often expressed in terminology that makes it difficult to strive for a simple goal. Applying DEA and solving the required mathematical equations is beyond the scope of most surgical-anaesthetic teams.
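The first limitation can be seen even in the simplest special case. With a single input and a single output under constant returns to scale, a DEA efficiency score reduces to each unit's output-to-input ratio divided by the best ratio in the set, so no linear programme is needed. The sketch below uses that special case only; a full DEA with several inputs and outputs would require solving a linear programme for every unit. The team names and figures are hypothetical, and taking theatre hours as the input and operations completed as the output is an assumption made purely for illustration.

```python
def relative_efficiency(units):
    """Single-input, single-output relative efficiency.

    `units` maps a name to an (input, output) pair. Each unit's
    output/input ratio is divided by the best ratio in the set,
    so the best-performing unit always scores 1.0.
    """
    ratios = {name: out / inp for name, (inp, out) in units.items()}
    best = max(ratios.values())
    return {name: ratio / best for name, ratio in ratios.items()}


# Hypothetical data: (theatre hours used, operations completed).
units = {"Team A": (8.0, 4), "Team B": (8.0, 3)}
before = relative_efficiency(units)  # Team A is the reference unit

# A third, better-performing team joins the comparison set.
units["Team C"] = (8.0, 5)
after = relative_efficiency(units)   # Team C is now the reference unit

for name in ("Team A", "Team B"):
    print(name, round(before[name], 2), "->", round(after[name], 2))
```

In this sketch neither Team A nor Team B has changed its own work, yet both scores fall once Team C is included, because each score is defined only relative to the current best performer.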

The broad aim of this paper is to develop a simpler, yet still rational, alternative to the balanced scorecard or DEA approaches for measuring surgical productivity. In a step-by-step approach, we (1) first define the core criteria that any ideal measure needs to satisfy. Then (2), we develop a theoretical measure that fulfils these criteria. Finally (3), we assess the utility of this theoretical measure by applying it to hypothetical and real data sets. The step from (1) to (2) is a logical process: that is, if the criteria are acceptable, then the measure is deducible from them. However, the step from (2) to (3) is a practical one: that is, a test of whether the measure can be applied in practice. Thus, ours is primarily a theoretical analysis, which we extend to apply meaningfully to real surgical lists.