Understanding waiting lists as the matching of surgical capacity to demand: are we wasting enough surgical time?

Authors


Correspondence to: Dr Jaideep J Pandit
E-mail: jaideep.pandit@dpag.ox.ac.uk

Summary

If surgical ‘capacity’ always matched or exceeded ‘demand’ then there should be no waiting lists for surgery. However, understanding what is meant by ‘demand’, ‘capacity’ and ‘matched’ requires some mathematical concepts that we outline in this paper. ‘Time’ is the relevant measure: ‘demand’ for a surgical team is best understood as the total min required for the surgery booked from outpatient clinics every week; and ‘capacity’ is the weekly operating time available. We explain how the variation in demand (not just the mean demand) influences the analysis of optimum capacity. However, any capacity chosen in this way is associated with only a likelihood (that is, a probability rather than certainty) of absorbing the prevailing demand. A capacity that suitably absorbs the demand most of the time (for example, > 80% of weeks) will inevitably also involve considerable waste (that is, many weeks in which there is spare, unused capacity). Conversely, a level of capacity chosen to minimise wasted time will inevitably cause an increase in size of the waiting list. Thus the question of how to balance demand and capacity is intimately related to the question of how to balance utilisation and waste. These mathematical considerations enable us to consider objectively how to manage the waiting list. They also enable us critically to analyse the extent to which philosophies adopted by the National Health Service (such as ‘Lean’ or ‘Six Sigma’) will be successful in matching surgical capacity to demand.

It is well known that waiting lists for surgery exist in the UK's National Health Service (NHS) but the problem is also a concern in many other countries [1–3]. A waiting list (or queue) arises if the ‘demand’ for surgical services overwhelms the ‘capacity’ of the system. If a simple ‘imbalance between capacity and demand’ explains the genesis of waiting lists then a solution would seem straightforward. The fact that this is not the case suggests that a clearer understanding is needed of what is really meant by ‘demand’ and ‘capacity’ and of how to match the two.

Martin et al. [4] noted that few studies had examined the determinants of prolonged waiting. Buhaug [1] observed that ‘the dynamics of waiting lists were not well understood’ and that the ‘productive capacity’ of hospitals was ‘hard to estimate’. Bellan [3] agreed that there was very little information available about waiting list changes over time. If these authors are correct, the NHS target to limit the wait of all patients for definitive surgery to just 18 weeks from point of referral seems ambitious, because the NHS will be trying to solve a problem without fully knowing its fundamental cause (or without establishing the necessary consensus around measures of demand and capacity). However, the NHS has also initiated a programme of research to run concurrently with the development of solutions (see: http://www.18weeks.nhs.uk/Content.aspx?path=/ and http://www.institute.nhs.uk/). Many of the ideas seem based on ‘Lean Thinking’ or ‘Six Sigma’ processes, themselves rooted in ideas used by the Toyota Motor Company [5] and the Motorola Corporation [6–8] to streamline production systems. Many anaesthetists and surgeons will by now have experienced an array of local initiatives concerning ‘efficiency’, ‘productivity’, ‘Lean’, ‘balanced scorecards’, etc. Unfortunately, the basic mathematical concepts underpinning these are less well disseminated. An analogy would be for a health service (quite reasonably) to prioritise the treatment of sepsis, but then (unreasonably) decline to define what ‘sepsis’ was or to promote teaching of the basic mathematics of oxygen supply and consumption.

The Royal College of Anaesthetists’ Academic Strategy (Pandit) Report [9] recommended that health services research (which addresses the issues alluded to above) should be one of the three main planks of the specialty’s academic focus (along with generic research in the basic sciences and translational clinical research). To meet this recommendation (and also recognising the important part played by anaesthetists in day-to-day NHS and theatre management), the Royal College is in the process of hosting an Institute of Health Services Research [10].

These developments make relevant the main purpose of this article, which is to discuss what is really meant by ‘demand’ and ‘capacity’ in the context of surgical services. Of necessity described mathematically, the analysis nonetheless provides rigorous insight into some of the practical, everyday problems encountered in organising surgical services that try to match the two. What follows logically from this is a discussion of the extent to which ‘Lean Thinking’ applies to healthcare services. Our review demonstrates how, when a process is analysed carefully in mathematical terms, important insights can be gained in the managerial (and political) aspects of that process. This paper is, thus, in many respects a critical appraisal of the philosophy underpinning some organisational changes taking place in the NHS.

At the outset, it is important to distinguish operational, strategic and tactical decision-making in relation to operating theatre management [11–13]. Operational decisions concern day-to-day, immediate and local problems (for example, one team invariably starts late, or elevators transporting patients do not work, or anaesthetic rooms are untidy, etc). Often the solutions are hospital-specific and, while instructive, may not be relevant to all hospitals. Strategic decisions concern the overall direction of the service as a whole (for example, whether healthcare should be funded by direct taxation, whether non-medically qualified practitioners should deliver anaesthetic care unsupervised, etc). These decisions affect all hospitals because they set the framework in which NHS hospitals function. Tactical decisions are short-to-medium term and concern the service planning needed to implement the strategic decisions (for example, optimum models for theatre scheduling, theatre allocations, etc). These decisions theoretically apply to all hospitals since they revolve around generic models. This paper primarily concerns tactical, and not operational or strategic, issues.

‘Time’: the proper measure of both demand and capacity

Throughout our analysis, we use time to measure both demand and capacity, in contrast with some other analyses that focus on numbers of patients (for general outline see: http://www.institute.nhs.uk/ or http://www.steyn.org.uk) [4, 14–16]. Thus we regard 10 patients whose operations require ∼30 min each as the same demand (taking into account gaps during a list) as two patients whose operations take ∼150 min each; that is, ∼300 min of surgery is the demand created by both scenarios. Thus, the specific ‘demand’ is the surgical work (the min or h of surgery) generated from the outpatient clinic.
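The equivalence described above is simple arithmetic, and can be sketched as follows. This is an illustration only: the function name is hypothetical, and gaps between cases are ignored for simplicity.

```python
def demand_minutes(operation_times_min):
    """Total surgical demand, in minutes, for a set of booked operations.

    Assumption: gaps between cases are ignored; only operating time is summed.
    """
    return sum(operation_times_min)


many_short = [30] * 10   # ten operations of about 30 min each
few_long = [150] * 2     # two operations of about 150 min each

# Both bookings represent the same demand when measured in time.
print(demand_minutes(many_short), demand_minutes(few_long))
```

Counting patients would score these two bookings as 10 and 2; measuring time scores both as 300 min.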

‘Capacity’ represents the surgical operating lists, which in the UK are time-sensitive – scheduled in ‘blocks’ generally of half-days (4 h) or full days (8 h) and occasionally ‘long days’ (of 10 h or 12 h). Employment contracts for many relevant staff, including consultants, are similarly time-sensitive, such that a 4-h aliquot of time makes up a contractual ‘session’ (termed ‘programmed activity’) [17, 18]. Therefore, the relevant ‘capacity’ of a surgical service can readily be described by the min or h of surgical time per week available to it.

These notions of demand and capacity can be extended to each step in the patient journey. Thus the capacity of the surgical outpatient clinic is the min or h of clinic time per week; the demand it faces is the time needed to assess referrals from general practitioners (GPs) [19]. The demand faced by a postoperative surgical ward might be the length of stays of postoperative surgical patients, and so on.

Many differences exist between this UK scenario and the US, where several detailed analyses of demand and capacity have been published [11–13, 20–22]. Staff contracts are not generally so time-sensitive in the US, especially for senior consultant staff. Theatre time is not always scheduled as block-time, but variable for each specialty or team depending upon how many referrals they receive. Thus, capacity is not a quantity fixed by contract but is instead more flexible, with potential to adjust it to create incentives. Furthermore, US hospitals can modify their capacity for certain surgical services as a means to compete for business [22]. In the UK, surgical capacity is largely regarded merely as a passive means to cope with an ever-present demand rather than as an active means to expand business profits. For all these reasons the conclusions of US work, although relevant, cannot be simply extrapolated to the UK.

To help understand the problems of balancing capacity and demand we now consider various hypothetical scenarios, initially discussing very simple scenarios (that some readers may find self-evident) in steps leading to more complex models (that other readers will find impenetrable in isolation, unless the simpler models are introduced first).

Determining optimum capacity

Optimum capacity for a simple scenario of constant demand

A hypothetical NHS general surgical team consistently books from its outpatient clinics patients for elective surgery whose operations always need a total of exactly 3000 min.week−1 of operating time. (For simplicity we ignore the impact of any surgical emergencies.) If the surgical capacity (list-time) of this team is exactly 3000 min.week−1, this always matches the demand precisely, thus no waiting list results (line 1, Fig. 1A). If the capacity is instead fixed at any value > 3000 min.week−1 then there is always spare capacity (line 2, Fig. 1A). If, on the other hand, capacity is for some reason fixed at any value < 3000 min.week−1 then there is a shortfall in capacity and a cumulative backlog of patients will develop (line 3, Fig. 1A). The rate at which this backlog develops is proportional to the disparity between the capacity and the demand: if the capacity is 2400 min.week−1, the backlog (waiting list) develops at a rate of 600 min.week−1 (Fig. 1B).

Figure 1.

 Elective demand from outpatient clinic (bars, min.week−1) or capacity (horizontal lines, min.week−1) over 40 weeks for a hypothetical surgical team. Example A: both demand and capacities are constant and three different capacities are: line 1, capacity = demand; line 2, capacity > demand; line 3, capacity < demand. Example B: weekly rise in backlog of min of surgery (‘waiting list’; solid circles) for line 3 in A, where capacity < demand.

We can describe the ability of capacity to meet demand in terms of the proportion of weeks in which a chosen capacity absorbs the demand. At any capacity less than the mean demand of 3000 min.week−1, demand is fully absorbed in precisely 0% of weeks; at any capacity equal to or exceeding mean demand, demand is fully absorbed in all (100%) weeks. These proportions can be viewed as probabilities, which for this scenario are ‘binary’: a capacity chosen randomly will either never fully absorb demand (0% probability) or always absorb it (100% probability) but no value in between (Fig. 1B). Matching capacity to demand appears very easy in this simple scenario.
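The constant-demand scenario can be sketched in a few lines, as a minimal illustration using the figures from the text (3000 min.week−1 of demand, 2400 min.week−1 of capacity, 40 weeks). The function names are hypothetical.

```python
def backlog_over_time(demand, capacity, weeks):
    """Cumulative backlog (min) at the end of each week, for constant demand and capacity."""
    weekly_shortfall = max(demand - capacity, 0)
    return [weekly_shortfall * week for week in range(1, weeks + 1)]


def proportion_of_weeks_absorbed(demand, capacity):
    """With constant demand the answer is 'binary': every week or no week."""
    return 1.0 if capacity >= demand else 0.0


backlog = backlog_over_time(demand=3000, capacity=2400, weeks=40)
print(backlog[0], backlog[-1])   # 600 min after week 1; 24000 min after week 40

for capacity in (2400, 3000, 3600):
    print(capacity, proportion_of_weeks_absorbed(3000, capacity))
```

A shortfall of 600 min.week−1 therefore accumulates to 24 000 min of outstanding surgery over the 40 weeks plotted in Figure 1.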

Optimum capacity when demand varies

Unfortunately, real life is not so straightforward. While true capacity might often be near-constant (for example, if the surgical list is always staffed by cross-cover arrangements), it is inconceivable that the demand will be constant from week to week, for several reasons: GPs have variable referral rates [23–25]; the nature of patients’ problems (casemix) varies; and even for the same operation the time required may vary due to patient co-morbidity, or the specific intervention needed [26–28]. Even after any learning curves, laparoscopic operations have inherently greater variation than other procedures [29]. Operations can show time distributions for their duration that are non-Gaussian [30].

Figure 2 shows a scenario where demand varies. The mean demand is 2400 min.week−1 and hypothetical capacity has been set to this mean level. However spare time (excess capacity) in one week cannot be carried forward to the next week, because ‘time’ cannot be stored for future sale or consumption as can manufactured goods such as cars or washing machines. Furthermore, it is not always apparent in advance for any week that time is spare, since this is only known after the event. Therefore the occasions when demand outstrips capacity always contribute to a backlog, but the occasions where capacity outstrips demand cannot always compensate, and capacity is wasted (cumulative line, Fig. 2A).

Figure 2.

 Plot of demand (bars, min.week−1) or capacity (horizontal lines, min.week−1) over 10 weeks for a hypothetical surgical team. Example A: demand varies while capacity is constant and set to mean demand of 2400 min.week−1. The solid circles show the weekly rise in backlog of min of surgery (‘waiting list’). Example B: the horizontal lines for three different levels of capacity are associated with three different proportions of weeks in which demand is absorbed: line 1 (demand never absorbed); line 2 (demand always absorbed); line 3 (demand sometimes absorbed).

In contrast to the binary probabilities for the simple scenario of Figure 1, the probabilities of various capacities meeting the demand for this scenario in Figure 2 are rather different. Assuming that this is representative of the team’s demand, then for all capacities set anywhere lower than the lowest possible level of demand, the proportion of weeks in which demand is fully absorbed is 0% (line 1, Fig. 2B). For all capacities set to levels exceeding the highest possible demand, the proportion of weeks in which demand is absorbed is 100% (line 2, Fig. 2B). A capacity in between these two limits absorbs demand in somewhere between 0% and 100% of weeks. If capacity is set very high (greater than the highest level of demand) then this will certainly absorb all demands encountered within the range experienced by the team, but at the same time there will be considerable waste in many weeks. Unless the healthcare system or hospital can afford this, a lesser level of capacity has to be selected, at which there will inevitably be some weeks in which demand is not matched, but at which there is less wasted (excess) capacity. From this we can conclude that it is important to know the variation in demand (not just the mean demand) to calculate the right capacity.

It is evident from Figure 2B that whenever the demand outstrips capacity, even for a week or two, there arises a backlog of surgery. We discuss some practical ways in which this backlog may be managed below (in the section ‘Interim summary: managing uncertainty’).
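The asymmetry between shortfall and spare time can be made concrete with a small sketch. The ten weekly demand values below are hypothetical (chosen only so that their mean is 2400 min.week−1), and the sketch makes the simplifying assumption that spare time is never used to clear earlier backlog, which is the strictest reading of the argument above.

```python
from itertools import accumulate


def weekly_shortfall_and_waste(weekly_demand, capacity):
    """Split each week into unmet demand (shortfall) and unused capacity (waste)."""
    shortfall = [max(d - capacity, 0) for d in weekly_demand]
    waste = [max(capacity - d, 0) for d in weekly_demand]
    return shortfall, waste


# Hypothetical bookings (min/week); the mean is 2400.
demand = [1800, 3000, 2400, 2700, 2100, 3300, 1500, 2400, 2900, 1900]
capacity = 2400  # capacity set equal to mean demand

shortfall, waste = weekly_shortfall_and_waste(demand, capacity)
backlog = list(accumulate(shortfall))  # assumption: spare weeks never clear the backlog

print("mean demand:", sum(demand) / len(demand))
print("backlog by week:", backlog)
print("total wasted capacity:", sum(waste))
```

Even though capacity equals mean demand, these data produce 2300 min of backlog and 2300 min of wasted capacity over the ten weeks: the two do not cancel because unused time cannot be stored.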

Estimating the likelihood of a chosen capacity absorbing demand

We can regard the proportion of weeks in which different capacities are able to meet the demand as a cumulative ‘probability density function’ (in strict statistical terms, a cumulative distribution function). Let us suppose that a gynaecology team has a mean (SD) demand of 2400 (600) min.week−1 with a Gaussian distribution. It might help make rational decisions about choosing the right capacity if we were able to say that a capacity of x min.week−1 will have a 25% chance of meeting this demand, a capacity of y min.week−1 a 70% chance, a capacity of z min.week−1 a 98% chance and so on.

Figure 3A shows a version of the type of graph in Figures 1 and 2, showing the min.week−1 of surgery booked from a hypothetical surgical clinic over a whole year. Horizontal lines show hypothetical capacities set at mean and mean + 1 SD min.week−1. The data can be plotted as a histogram, which in this case shows a characteristic Gaussian distribution (Figure 3B, now showing the hypothetical capacities as vertical lines). Mathematically, the cumulative function we seek is simply the integral (that is, the area under the curve) of the distribution in Figure 3B. This can be calculated in one step using integral calculus but can also be explained by the following process. First, take a capacity that might work (a guess of, say, 2400 min.week−1) and draw a vertical line at that point on the x-axis of the histogram in Figure 3B (this is done for the mean and mean + 1 SD values). Next, calculate the area under the curve to the left of that vertical line, as a proportion of the total area. Then make another guess (for example, 3000 min.week−1) and repeat the process, and so on. Finally, construct a plot of the areas under the curve for each of the guesses. Figure 3C shows the result and represents the proportion of weeks that a certain capacity will absorb the demand for the data shown in Figures 3A and 3B. For these data, a capacity of 2400 min.week−1 (capacity = mean demand) will absorb demand in 50% of weeks, and a capacity of 3000 min.week−1 (mean + 1 SD demand) will absorb demand in ∼80% of weeks.

Figure 3.

 A: Plot of demand (bars, min.week−1) or capacity (horizontal lines, min.week−1) over 50 weeks for a hypothetical gynaecology team. The two horizontal lines show capacities set at mean demand and mean + 1 SD demand. B: Histogram (Gaussian distribution) of the data in A, plotting the number of clinics in the 50-week period generating a certain demand. The vertical lines now show the hypothetical capacities set to mean and mean + 1 SD of demand. C: The probability density function for the Gaussian distribution in B (the proportion of weeks in which a certain capacity on the x-axis will absorb the demand). As examples, the vertical lines show the proportions for two hypothetical capacities, at mean and mean + 1 SD of demand (that is, 50% and 80%, respectively).
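The 'area to the left of the vertical line' procedure can be sketched for an idealised Gaussian demand of 2400 (600) min.week−1. This is an illustration only: it uses the exact Gaussian curve from Python's standard library rather than the sampled data of Figure 3, and the helper name `area_left_of` is hypothetical.

```python
from statistics import NormalDist

# Idealised version of the hypothetical gynaecology team: mean (SD) 2400 (600) min/week.
demand = NormalDist(mu=2400, sigma=600)


def area_left_of(capacity, dist, steps=20000):
    """Trapezoidal estimate of the area under the density to the left of `capacity`."""
    lower = dist.mean - 8 * dist.stdev   # far enough left that the omitted tail is negligible
    width = (capacity - lower) / steps
    heights = [dist.pdf(lower + i * width) for i in range(steps + 1)]
    return width * (sum(heights) - 0.5 * (heights[0] + heights[-1]))


for capacity in (1800, 2400, 3000, 3600):
    by_area = area_left_of(capacity, demand)   # the step-by-step 'guess and measure' route
    by_cdf = demand.cdf(capacity)              # the one-step integral-calculus route
    print(f"{capacity} min/week: area = {by_area:.3f}, cdf = {by_cdf:.3f}")
```

For the exact Gaussian curve the two routes agree, giving 0.50 at the mean and about 0.84 at mean + 1 SD, in line with the approximate 80% read from Figure 3C.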

Figure 4 shows the same process when the SD of demand is very much smaller (the mean demand is still 2400 min.week−1). Compared with Figure 3, as the variation in demand narrows (that is, the SD becomes smaller), the level of capacity that absorbs demand in a high proportion of weeks lies closer to the mean demand. Indeed, for the extreme scenario of zero SD (Fig. 1), the probability density function becomes ‘binary’, as we have discussed above.

Figure 4.

 A: plot of demand (bars, min.week−1) for a hypothetical team that exhibits little variation in its clinic bookings for elective surgery. The horizontal line = mean demand. B: Histogram (Gaussian distribution) of the data in A. The vertical line = hypothetical capacity set to mean demand. C: The probability density function for the Gaussian distribution in B. The vertical line = probability for a hypothetical capacity set at mean demand.

Figures 3 and 4 also remind us that the shape of the normal distribution curve is always the same, regardless of the actual mean or SD values composing it. Thus, a capacity set at mean demand will always absorb demand in 50% of weeks, a capacity set at mean + 1 SD demand will always absorb demand in ∼80% of weeks, and so on.
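The same reasoning can be run in reverse: given a target proportion of weeks, what capacity is needed? The sketch below assumes Gaussian demand; the SD of 100 min.week−1 used for the 'narrow' team is a hypothetical value (the text does not state the SD used in Figure 4), and `capacity_for_target` is a hypothetical helper.

```python
from statistics import NormalDist


def capacity_for_target(mean_demand, sd_demand, target_proportion):
    """Capacity (min/week) that absorbs Gaussian demand in the target proportion of weeks."""
    return NormalDist(mean_demand, sd_demand).inv_cdf(target_proportion)


wide = capacity_for_target(2400, 600, 0.80)     # variable demand, as in Figure 3
narrow = capacity_for_target(2400, 100, 0.80)   # hypothetical low-variation team

print(f"SD 600: {wide:.0f} min/week; SD 100: {narrow:.0f} min/week")

# The shape of the curve is the same: capacity at mean + 1 SD absorbs demand
# in the same proportion of weeks whatever the mean and SD happen to be.
print(NormalDist(2400, 600).cdf(3000), NormalDist(2400, 100).cdf(2500))
```

With the larger SD roughly 500 min.week−1 above the mean is needed to reach 80% of weeks; with the smaller SD less than 100 min.week−1 above the mean suffices.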

Estimating probabilities for non-Gaussian variations in demand

In real life the demand generated from clinics may follow any one of a number of non-Gaussian distributions [30]. Figure 5 demonstrates bimodal, skewed and random distributions. However, for all these, the associated probability density functions are always the integral of the original distribution and can be computed. What appears interesting, however, is that in each case – regardless of the pattern of distribution – setting a capacity of ∼70–80% of the maximum range of demand yields the probability that demand will be absorbed in ∼80% of weeks (very similar to the case for the Gaussian distribution in Fig. 3).

Figure 5.

 A, B, C: histograms for the number of clinics generating certain demands (arbitrary units) for bimodal, skewed and random distributions respectively. D, E, F: corresponding probability density functions for the histograms in A, B, C. Note that, regardless of the type of distribution, a capacity set at ∼80% of the range of variation absorbs demand in ∼80% of weeks (dotted lines).
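For non-Gaussian data no formula is needed: the proportion of weeks absorbed can be counted directly from the observations. The sketch below uses a small, hypothetical bimodal sample in arbitrary units, and interprets 'a capacity set at 80% of the range' as the minimum plus 80% of (maximum − minimum); that interpretation is an assumption.

```python
def proportion_of_weeks_absorbed(weekly_demand, capacity):
    """Empirical proportion of observed weeks whose demand is at or below `capacity`."""
    absorbed = sum(1 for d in weekly_demand if d <= capacity)
    return absorbed / len(weekly_demand)


def capacity_at_fraction_of_range(weekly_demand, fraction):
    """Capacity placed a given fraction of the way from lowest to highest observed demand."""
    low, high = min(weekly_demand), max(weekly_demand)
    return low + fraction * (high - low)


# Hypothetical bimodal demand (arbitrary units): a cluster of light weeks and a cluster of heavy weeks.
bimodal = [100, 120, 130, 150, 160, 400, 410, 450, 470, 500]

capacity = capacity_at_fraction_of_range(bimodal, 0.8)
print(capacity, proportion_of_weeks_absorbed(bimodal, capacity))
```

For this particular sample the result is 70% of weeks; the figure obtained will depend on the shape of the data, so the calculation is best repeated on a team's own bookings rather than assumed.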

Some teams may show cyclical or seasonal patterns of demand (Fig. 6). However, this does not change our fundamental understanding because a cyclical pattern may itself be viewed simply as a variation around a mean value. Even if the cyclical pattern were modelled (for example, using sine-wave functions) there may be additional variation (‘noise’) around the cyclical value that might need to be accounted for (Fig. 6B).

Figure 6.

 Cyclical or seasonal variation exhibited by a hypothetical list. A: the bars show the min of surgery booked from clinic each week; the smooth sine-wave indicates the overall trend of the data; the horizontal line indicates the mean for the sample. Thus, the data vary around the mean value. B: the same sine-wave trend occurs (smooth curve), with the same mean value (horizontal line) but the actual data from clinic shows greater variation around the fitted sine-wave curve.
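A cyclical pattern with superimposed noise can be simulated to show the two sources of variation. All numbers below are hypothetical (mean 2400, seasonal swing 600, noise SD 100 min.week−1, yearly cycle), and the sine-wave trend is taken as known rather than fitted to data.

```python
import math
import random
from statistics import mean, pstdev

rng = random.Random(0)   # fixed seed so the illustration is repeatable

MEAN_DEMAND = 2400       # hypothetical mean demand, min/week
AMPLITUDE = 600          # hypothetical seasonal swing, min/week
NOISE_SD = 100           # hypothetical week-to-week noise, min/week
WEEKS = 104              # two complete yearly cycles

# Smooth seasonal trend (the sine wave) and the noisy bookings around it.
trend = [MEAN_DEMAND + AMPLITUDE * math.sin(2 * math.pi * week / 52) for week in range(WEEKS)]
demand = [t + rng.gauss(0, NOISE_SD) for t in trend]
residuals = [d - t for d, t in zip(demand, trend)]

print("mean of trend:", round(mean(trend)))
print("SD of demand around the overall mean:", round(pstdev(demand)))
print("SD of demand around the sine-wave trend:", round(pstdev(residuals)))
```

Viewed around the overall mean, the seasonal swing simply appears as a large SD; modelling the cycle reduces the unexplained variation to the residual noise, which still has to be allowed for when choosing capacity.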

Estimating probabilities when capacity also varies

If surgical lists are occasionally cancelled for whatever reason (for example, staff absence, lack of equipment, etc) then this will add complexity to any mathematical model. Surgical capacity may also vary due to different utilisation of the scheduled time on each list from week to week (for example, due to gaps, cancellations, etc). In the previous sections, we have posed the general question: ‘In what proportion of weeks will a fixed surgical capacity of 1000 min.week−1 absorb a mean (SD) demand of 900 (150) min.week−1?’. A simple calculation based on the t-distribution yields an answer of ∼76%. Introducing a variable capacity changes this question to: ‘In what proportion of weeks will a surgical capacity of 1000 (120) min.week−1 absorb a demand of 900 (150) min.week−1?’. This makes any probability estimates less precise so that the t-distribution now yields an answer like: ‘We can be 95% confident that demand will be absorbed between 17% and 76% of weeks’.
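One simple way to see the effect of variable capacity is sketched below. It is not the t-distribution calculation referred to above: it assumes instead that demand and capacity are independent and Gaussian, in which case their difference is also Gaussian, and it returns a single probability rather than a confidence interval. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def probability_capacity_meets_demand(cap_mean, cap_sd, dem_mean, dem_sd):
    """P(capacity >= demand), assuming independent Gaussian capacity and demand."""
    margin_mean = cap_mean - dem_mean
    margin_sd = math.hypot(cap_sd, dem_sd)   # SDs of independent variables add in quadrature
    if margin_sd == 0:
        return 1.0 if margin_mean >= 0 else 0.0
    return 1.0 - NormalDist(margin_mean, margin_sd).cdf(0)


fixed = probability_capacity_meets_demand(1000, 0, 900, 150)       # fixed capacity
variable = probability_capacity_meets_demand(1000, 120, 900, 150)  # capacity SD of 120

print(f"fixed capacity: {fixed:.2f}; variable capacity: {variable:.2f}")
```

Under these assumptions the fixed capacity absorbs demand in about 75% of weeks (close to the ∼76% quoted from the t-distribution) and the variable capacity in about 70%: variation in capacity erodes the probability of meeting demand as well as making it less certain.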

More importantly, minimising variation in surgical capacity is something under the direct control of surgical teams and managers through various interventions. The surgical list we examine below has occurred every week as scheduled for 2 years, save just three occasions, yielding a mean (SD) capacity with very small variation of 457 (7) min.week−1. This is much less variable than the corresponding variation in demand for that list. Previous publications have discussed how efficiency and productivity within any given list can be maximised [27, 28].

Demand and capacity data from real surgical lists

It is rare to find hospitals analysing their results very quantitatively, especially in the UK, so real data describing how capacity is in practice tailored to demand are sparse. Bellan [3] investigated changes in surgeons’ booking pattern for eye surgery from clinic over time and found a skewed distribution (akin to Fig. 5B/E). Westbury et al. [31] reported a Gaussian distribution of demand generated by gynaecology clinics and also estimated the probability density function from their data. Some of the distributions represented at some specialist websites (for example, http://www.steyn.org.uk) appear random. So it would seem that several types of distribution can exist. Dexter has argued that many distributions related to theatre times are log normal; that is, the logarithm of the actual times, rather than the actual times themselves, follows a normal distribution [32] and, therefore, the analysis of these times does not really differ fundamentally from that of the normal distribution.

We used published data on mean operation times [26–28] to analyse the bookings for surgery from 56 consecutive clinics of one urology team (Fig. 7A). The histogram was rather skewed, with the majority of clinics generating < 300 min of need for surgical time, but a significant number generating > 400 min of need for time. The resulting probability density function was in fact quite similar to that for a Gaussian distribution, and indicated that ∼600 min.week−1 of operating time would be ∼90% likely to absorb the demand generated from clinics. This could be achieved, for example, by one full-day (480 min) list per week and one half-day (240 min) list on alternate weeks, giving a mean of 600 min.week−1: when provided with this capacity, the team’s waiting list did not increase.

Figure 7.

 Data from real urology list over 56 successive weeks. A: bookings from clinic. B: histogram of data in A, suggesting a skewed distribution. C: probability density function for the histogram in B.

There is, however, one concern. The data analysed represent only a sample for that real-life team; what sample size is appropriate, and would the sample size influence the pattern of distribution (and therefore the conclusions drawn)? The answer to the second question is yes: Figure 8 shows the same data replotted using quanta of between 25% and 100% of the data points in Fig. 7. It is clear that if only the first 14 data points are used, the peak would lie at ∼500 min.week−1. Epstein and Dexter [33] have argued that ∼30 data points are representative in this context, and this seems confirmed by Figure 8.

Figure 8.

 Three-dimensional plot showing the effect of sample size on the data in Fig. 7. Note the higher peak at ∼500 min.week−1 of surgery when only the first 25% of data points are used, declining when a larger sample is analysed.

Understanding probability density functions in practice: the problem of waste

For our hypothetical gynaecology list in Figure 3, let us suppose that a gynaecology manager considers it reasonable to set surgical capacity at mean + 1 SD of the known demand (∼3000 min.week−1), which reliably absorbs demand in ∼80% of weeks.

Let us now suppose that a new diktat requires greater emphasis on reducing waiting lists, so the manager quite rationally increases capacity to mean + 2 SD demand (∼4000 min.week−1), so that demand is absorbed in > 90% of weeks (Fig. 3C). Two problems arise. The first is that although extra investment is needed to achieve this greater capacity, the proportion of occasions in which demand is now absorbed has increased by only ∼10%. By comparison, if the original capacity had been 2000 min.week−1 then the same quantum increase in capacity to 3000 min.week−1 would have increased the proportion of weeks in which demand was met by 60% (from ∼20% to ∼80%; Fig. 3C). This is one example of the well-known ‘law of diminishing returns’. Clinicians will readily recognise parallels in the oxyhaemoglobin dissociation curve, where increasing partial pressure of oxygen above a certain level achieves less and less increase in saturation, as the prevailing saturation rises. Managers and politicians need also to recognise that the returns on investment diminish beyond a certain point, and that this point is determined by the mathematics and not by intuition.
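The diminishing return can be checked numerically. The sketch below assumes an exactly Gaussian demand of 2400 (600) min.week−1, so the percentages differ slightly from the approximate values read from Figure 3C; the function name is hypothetical.

```python
from statistics import NormalDist

demand = NormalDist(mu=2400, sigma=600)   # idealised Gaussian demand, min/week


def gain_in_weeks_absorbed(old_capacity, new_capacity, dist):
    """Increase in the proportion of weeks absorbed when capacity is raised."""
    return dist.cdf(new_capacity) - dist.cdf(old_capacity)


# The same 1000 min/week of extra capacity, added at two different starting points.
gain_from_low = gain_in_weeks_absorbed(2000, 3000, demand)
gain_from_high = gain_in_weeks_absorbed(3000, 4000, demand)

print(f"2000 -> 3000 min/week: +{gain_from_low:.0%} of weeks")
print(f"3000 -> 4000 min/week: +{gain_from_high:.0%} of weeks")
```

Under this assumption the first increment buys roughly 59 percentage points and the second roughly 15, for the same additional investment in theatre time.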

The second problem is that the proportion of excess capacity (waste) increases, which further impairs the returns on investment. ‘Eliminate all waste’ makes an attractive political slogan often popularly translated into policy statements [34]. For the various scenarios of demand we have discussed above, it is indeed possible to ‘eliminate all waste’ by the simple expedient of setting capacity to a very low level. However, and as we have seen, this will come at the price of greatly reducing the probability that prevailing demand will be absorbed.

This balance between ‘waste’ and ‘probability of meeting demand’ can be appreciated graphically. Using the same data as used in Figure 3A (see now Fig. 9) and re-calculating to show the proportion of time wasted, we see that this rises as capacity is increased (while the proportion of time utilised correspondingly declines). A very low capacity of ∼1000 min.week−1, associated with almost no wasted time and high utilisation, will absorb demand in very few weeks, causing a huge rise in the waiting list. On the other hand, capacities of ≥ 4000 min.week−1 that absorb demand every week are associated with ≥ 50% of time wasted (or, < 50% time utilised). There are few benefits of increasing capacity beyond 4000 min.week−1 as this does not further increase the proportion of weeks where demand is absorbed, but merely increases the proportion of time wasted.

Figure 9.

 Plotting waste. The same data as Fig. 3 re-plotted, showing for a very wide range of capacities (x-axis): the % of weeks when demand is absorbed (right axis); the capacity utilised expressed as a % of the min of capacity available (left axis); and the capacity wasted expressed as a % of the min of capacity available (left axis). The vertical dashed line at 2000 min.week−1 approximates the current aim for NHS theatre utilisation (at ∼85–90%) which is associated with little wasted capacity but a very low proportion of weeks in which demand is absorbed.
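The trade-off plotted in Figure 9 can be sketched as follows. This is an illustration under stated assumptions, not a reproduction of the figure: demand is an idealised Gaussian of 2400 (600) min.week−1 represented by 1000 evenly spaced quantiles, spare time is never carried over, and unmet demand is not added to later weeks. The exact percentages therefore need not match those read from Figure 9.

```python
from statistics import NormalDist

dist = NormalDist(mu=2400, sigma=600)
N = 1000
# Deterministic 'sample' of weekly demand: the mid-point quantiles of the distribution.
weekly_demand = [dist.inv_cdf((i + 0.5) / N) for i in range(N)]


def utilisation_and_waste(weekly_demand, capacity):
    """Fractions of available theatre time used and wasted at a given capacity."""
    used = sum(min(d, capacity) for d in weekly_demand)
    available = capacity * len(weekly_demand)
    utilised = used / available
    return utilised, 1.0 - utilised


def weeks_absorbed(weekly_demand, capacity):
    """Proportion of weeks in which demand does not exceed capacity."""
    return sum(1 for d in weekly_demand if d <= capacity) / len(weekly_demand)


for capacity in (1000, 2000, 3000, 4000, 5000):
    utilised, wasted = utilisation_and_waste(weekly_demand, capacity)
    absorbed = weeks_absorbed(weekly_demand, capacity)
    print(f"{capacity}: utilised {utilised:.0%}, wasted {wasted:.0%}, weeks absorbed {absorbed:.0%}")
```

The pattern is the one described in the text: as capacity rises, the proportion of weeks absorbed approaches 100% while the proportion of time wasted keeps climbing.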

Note that none of this argument has anything to do with how hard staff are working, whether or not they turn up to work on time, whether equipment is available, etc. We have made no assumptions about such imperfections and the analysis is actually predicated on smooth running of the system. Yet it appears that even in otherwise perfect systems, some waste is necessary to ensure that demand is met.

The NHS has set as one of many targets an operating theatre utilisation level of > 85% [14–16, 35]. Although this is less than 100%, it is at a level that implies a capacity insufficient for prevailing demand (corresponding to the vertical line in Fig. 9). Often, utilisation of individual theatres can exceed 90% [27, 28]. It would appear that we are not ‘wasting enough surgical time’ in order to manage the problem optimally.

Interim summary: managing uncertainty

We have described some initially simple mathematical concepts, then factored in some real-life elements to make the scenarios a little more complex. These notions (which we can call a ‘probability theory approach’) can be adopted by any surgical team to match their capacity to demand [31]. That is, the team can estimate the time needed for surgery from clinic bookings (over at least 30 clinics or weeks), then plot the resulting histograms and probability density functions to estimate the optimum capacity; then estimate the proportion of time wasted for any chosen capacity (Figs 3 and 7) [31].
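The steps listed above can be sketched as a small routine that a team might adapt. Everything here is illustrative: the function names are hypothetical, the bookings are placeholder values rather than real data, and the rule 'smallest observed demand level that covers the target proportion of weeks' is one reasonable choice among several.

```python
def choose_capacity(weekly_demand, target_proportion, min_weeks=30):
    """Smallest observed demand level that absorbs demand in the target proportion of weeks."""
    n = len(weekly_demand)
    if n < min_weeks:
        raise ValueError(f"need at least {min_weeks} weeks of bookings, got {n}")
    for rank, capacity in enumerate(sorted(weekly_demand), start=1):
        if rank / n >= target_proportion:
            return capacity
    return max(weekly_demand)


def fraction_wasted(weekly_demand, capacity):
    """Unused capacity as a fraction of capacity provided (no carry-over assumed)."""
    spare = sum(max(capacity - d, 0) for d in weekly_demand)
    return spare / (capacity * len(weekly_demand))


# Placeholder bookings: 40 weeks of demand from 1000 to 4900 min/week in steps of 100.
bookings = list(range(1000, 5000, 100))

capacity = choose_capacity(bookings, target_proportion=0.8)
print("capacity:", capacity, "min/week")
print("fraction of capacity wasted:", round(fraction_wasted(bookings, capacity), 2))
```

The guard on the number of weeks reflects the suggestion that about 30 data points are needed before the distribution can be trusted; with fewer, the routine refuses to give an answer rather than a misleading one.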

Mathematics is one thing, but management is often about achieving the right balance between investment and returns, or balancing the sometimes conflicting aims of a service. Decisions on capacity can influence the likelihood of generating a significant waiting list, yet these decisions have to be made on data that yield probabilities rather than certainties. Even when they are exact, probabilities do not imply exact outcomes. A politician may decide to take a gamble; that is, in order to minimise costs, he/she sets the capacity at a level where there is only a 10% chance of meeting demand. Yet, as with all gambles, this may pay off and the politician is lauded for insight. A more cautious politician may set capacity at a level that has a 99% chance of meeting demand, yet through bad luck this may not capture sufficient demand in a given time period; the politician’s judgement may then be criticised. So an important question for society as a whole regarding the capacity of its health service is whether politicians or managers should have complete flexibility in choosing surgical capacity (and, therefore, be allowed to take gambles), or whether capacity should instead be fixed to within relatively narrow probability limits determined by more quantitative analyses. For example, a legal or constitutional requirement that surgical capacity must be fixed at, say, 80% demand (assessed mathematically) would remove politicians’ room for manoeuvre but should make clearer to the public what they can expect from their healthcare system, and the probability of achieving those expectations.

The fact that waste is an inherent, inevitable by-product of meeting demand makes relevant the question: what is an acceptable level of waste?

Even when capacity is generally matched well to demand, an occasional mismatch will create a small backlog and, generally, the following measures are available to manage this. The size of the backlog (waiting list) is proportional to the disparity between capacity and demand, so if the disparity is small, these need to be employed only sparingly. If large, however, then the following measures can become core elements of policy: (i) increase capacity (the number of regular surgical lists per week); (ii) create ad-hoc lists temporarily (‘waiting list initiatives’) [36]; (iii) use a ‘standby’ list of waiting patients able and willing to be called at short notice on the day of surgery, as soon as it becomes clear that there is space on that day’s list [37]; (iv) use capacity of other hospitals, including the private sector [38, 39] or via the NHS ‘Choose and Book’ system [40]; or (v) use the waiting list itself as a buffer. We expand on this last point below.

Queuing theory approach to analysing a waiting list

Figure 10 shows pictorially how keeping a waiting list can itself help to convert fluctuations in arrival of patients from clinic into a smooth demand for surgery. All patients enter a common pool or waiting list, and from this pool are taken suitable patients to fill each week’s list optimally. Thus, perhaps in a rather perverse sense, a waiting list can actually help utilise capacity by ensuring each week’s list is full, despite the variations in demand week on week generated from clinic [41, 42]. For this reason it has been suggested that waiting lists are a useful form of rationing: they help improve theatre utilisation and reduce costs, and some patients get better, change their minds or die during their wait, thus reducing overall burdens on the system [43].

Figure 10.

 Pictograms showing varying demands generated from clinics in successive weeks (left column), with those patients booked for elective surgery joining a pre-existing pool or queue of patients (the waiting list, middle column). The size of this pool varies week on week. From this pool are selected combinations of patients to yield a constant demand upon theatre time in successive weeks (right column). Thus, the waiting list smoothes a potentially irregular flow of patients. Note that each pictogram represents a ‘unit of time’ to perform surgery rather than an individual patient.
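
The smoothing effect shown in Fig. 10 can be reproduced with a simple week-by-week tally. The following is a minimal sketch under stated assumptions: the weekly bookings, the capacity and the size of the pre-existing pool are all hypothetical, and `run_pool` is a name invented for the example.

```python
def run_pool(arrivals_min, capacity_min, initial_pool_min=0):
    """Track a waiting-list pool week by week.

    Each week the new bookings (min of surgery) join the pool, then up to
    capacity_min of work is taken from the pool to fill that week's lists.
    Returns (operated, pool_after): two lists with one entry per week.
    """
    pool = initial_pool_min
    operated, pool_after = [], []
    for booked in arrivals_min:
        pool += booked                  # clinic bookings join the waiting list
        done = min(pool, capacity_min)  # cannot operate on more than is waiting
        pool -= done
        operated.append(done)
        pool_after.append(pool)
    return operated, pool_after

# Hypothetical bookings (min/week); their mean equals the weekly capacity.
arrivals = [900, 1500, 1100, 1300, 800, 1600]
capacity = 1200

no_buffer = run_pool(arrivals, capacity)
with_buffer = run_pool(arrivals, capacity, initial_pool_min=1000)

print("operated, no pre-existing pool:", no_buffer[0])
print("operated, 1000 min pool:       ", with_buffer[0])
print("pool size, 1000 min pool:      ", with_buffer[1])
```

With no pre-existing pool, some weeks run under capacity (900 and 1100 min in this example) even though mean demand equals capacity; with a pool of 1000 min, every week's list is full while the pool itself fluctuates.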

There are several reasons for not pursuing this line of argument. First, we do not know how large (or small) the waiting list needs to be before the adverse effects of long waits outweigh any putative benefits. Second, we recall that the ultimate aim of effective management is to eliminate waiting; we cannot logically use the problem itself as the solution. Third, even if it could be shown mathematically that a certain size of waiting list aids in use of resources, then ethical questions arise in keeping certain patients waiting in the full knowledge that for some, this may harm their clinical condition.

The pictogram in Fig. 10 resembles a ‘queue’, with arrivals on the left of the picture joining a queue, while patients exit the queue into the ‘service’ on the right. This analogy is apposite as ‘queuing theory’ is a branch of mathematics with powerful applications for the demand-capacity problem.

In the early part of the 20th century, Erlang (working for the Copenhagen Telephone Company) and Engset (working for the Norwegian government’s Telegrafverket, now known as Telenor, one of the largest carriers in the world) addressed the problem of how many telephone exchanges/switches were needed for an expanding number of calls that threatened to saturate existing capacity [44–46]. The issue was not one of lazy or slow telephonists, rather a mathematical problem in which, despite perfect telephone operators, the mechanics of the system could be overwhelmed.

In suggesting solutions, Erlang and Engset created the formal branch of mathematics now known as ‘queuing theory’ [44–46], which perceives ‘demand’ as a queue for service. Queues are an everyday experience in banks, shops, public transport, etc. and the general system is one where a customer arrives to be served at a service point. Mathematically, the nature of the service (surgery or shop or bank) does not matter and the key elements to model are simply: (i) how frequently customers arrive; (ii) the rate at which they are served; and (iii) the number of servers available. The arrivals are not completely regular but instead often resemble a Poisson distribution [47]. Just as a Gaussian distribution has its characteristic features for continuous data (for example, time in min or weight in kg), so also does a Poisson distribution for discrete count data (for example, arrivals.h⁻¹), and this lends itself readily to mathematical modelling. The types of question that can be addressed include: ‘Given a certain mean arrival rate and a certain number of servers, what is the expected waiting time for customers?’ or ‘How many servers do we need to be x% confident of keeping our waiting time less than y minutes?’. As long as the average arrival rate is known and a Poisson distribution is assumed, robust conclusions can be drawn using the models.
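
To make the Poisson assumption concrete, the probability of observing exactly k arrivals in an hour, given a mean of m arrivals per hour, is m^k e^(−m)/k!. The sketch below evaluates this for an assumed mean of four arrivals per hour; the function name and the chosen mean are illustrative only.

```python
from math import exp, factorial

def poisson_pmf(k, mean_rate):
    """Probability of exactly k arrivals in an interval whose mean is mean_rate."""
    return mean_rate**k * exp(-mean_rate) / factorial(k)

mean_arrivals_per_hour = 4  # assumed value, for illustration

for k in range(9):
    print(f"P({k} arrivals in an hour) = {poisson_pmf(k, mean_arrivals_per_hour):.3f}")

# Chance that an hour brings more than 6 arrivals despite a mean of only 4.
p_more_than_6 = 1 - sum(poisson_pmf(k, mean_arrivals_per_hour) for k in range(7))
print(f"P(more than 6 arrivals) = {p_more_than_6:.3f}")
```

Even with a known mean, a non-trivial proportion of hours will bring markedly more (or fewer) arrivals than the average, which is what the queuing models must accommodate.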

The simplest queue is one where there is a single arrival point (line) and a single server; in reality many queues can co-exist and involve multiple servers (often themselves inter-related in a network) and demonstrate patterns of arrival that are much more complex than Poisson (for example, block arrivals). Increasingly sophisticated models of queues (see Appendix) can examine the optimum organisation of a queue (for example, each surgical team has its own queue vs a pooled waiting list), the best order of service (for example, ‘first in, first served’ vs ‘random service’), and issues such as booked vs sequential appointments [48, 49]. Discussion of these details is outside the scope of this review; unfortunately queuing analyses lose in simplicity what they gain in mathematical robustness. The Appendix illustrates some of the mathematics involved, which can seem very complicated, but the strength is that queuing theory can deal with many situations and address more sophisticated questions than can our ‘probability theory approach’, discussed in the earlier part of this paper [50, 51].

Queuing theory yields results which are sometimes counter-intuitive. Let us consider one surgeon in a clinic where on average four patients arrive per hour, with a mean consultation time of 10 min each. With an average of just 40 min each hour taken in consultations, intuition might suggest that there is unlikely to be any waiting time. Yet queuing theory predicts (accurately) that patients will wait in clinic for a mean of 20 min. Now, to reduce this wait, a second surgeon is employed. Having doubled the capacity, we might expect the waiting time to halve to 10 min. In fact, and as predicted by the theory, waiting time declines dramatically to just over 1 min. If, then, managers are tempted to increase bookings because of this second surgeon’s presence (to allow patients’ attendances to double from 4 to 8 h⁻¹), then intuitively we might hope that waiting time simply doubles to a modest (and acceptable) ∼2 min. Unfortunately, the increase is much more marked, to (a possibly unacceptable) ∼8 min. Sceptical readers who still believe their intuitions might wish to work through the mathematics of the Appendix using these data, or try more user-friendly applets such as those at http://www.egr.msu.edu/~ziad/qt/mms.html to be convinced by these examples.
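
These three figures can be checked with the standard Erlang C formula for a queue with Poisson arrivals, exponentially distributed service times and c identical servers. The sketch below assumes exactly that model, which the text does not spell out at this point; the function name is invented for the example.

```python
from math import factorial

def mean_wait_hours(arrival_rate, service_rate, servers):
    """Mean time waiting in the queue (h) for an M/M/c queue (Erlang C).

    arrival_rate: mean arrivals per hour
    service_rate: mean patients served per hour by one server
    servers:      number of identical servers (surgeons)
    """
    offered = arrival_rate / service_rate   # offered load
    rho = offered / servers                 # utilisation per server
    if rho >= 1:
        raise ValueError("arrivals outstrip capacity; the queue grows without limit")
    tail = offered**servers / (factorial(servers) * (1 - rho))
    head = sum(offered**k / factorial(k) for k in range(servers))
    p_wait = tail / (head + tail)           # probability an arrival must wait
    return p_wait / (servers * service_rate - arrival_rate)

service_rate = 6  # 10 min per consultation = 6 patients per hour

for arrivals, surgeons in [(4, 1), (4, 2), (8, 2)]:
    wait_min = 60 * mean_wait_hours(arrivals, service_rate, surgeons)
    print(f"{arrivals} arrivals/h, {surgeons} surgeon(s): mean wait {wait_min:.2f} min")
# prints mean waits of 20.00, 1.25 and 8.00 min respectively
```

Under these assumptions the calculation returns the 20 min, just over 1 min and ∼8 min quoted above.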

One important result from queuing theory is that the size of the queue and the wait time rise non-linearly with the utilisation of the system (Fig. 11), with the size of the queue or delay in service markedly increasing if utilisation exceeds ∼80% [51]. Thus, regardless of whether we use queuing theory or our earlier ‘probability theory approach’ for analysis, a ‘waste’ of ∼20% is likely to be needed for optimum management of the demand. When Taylor et al. [52] applied queuing theory to the organisation of anaesthetic emergency services they concluded that ‘the price of a better service is more waste of time’.

Figure 11.

 Generalised relationship between size of queue vs level of utilisation for a queuing process termed M/M/n (see Appendix for definitions).
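
The shape of this relationship can be illustrated with the simplest case, a single server with Poisson arrivals and exponential service times (M/M/1), for which the mean number waiting is ρ²/(1 − ρ) at utilisation ρ. This is a simplification of the M/M/n process in the figure, used here only as a sketch; the function name is invented for the example.

```python
def mean_queue_length(utilisation):
    """Mean number waiting in an M/M/1 queue at the given utilisation (0 to <1)."""
    if not 0 <= utilisation < 1:
        raise ValueError("utilisation must be at least 0 and below 1")
    return utilisation**2 / (1 - utilisation)

for rho in (0.5, 0.7, 0.8, 0.9, 0.95):
    print(f"utilisation {rho:.0%}: mean queue {mean_queue_length(rho):.2f}")
```

In this simplified model the mean queue is 0.5 at 50% utilisation and 3.2 at 80%, but 8.1 at 90% and 18.05 at 95%: small gains in utilisation beyond ∼80% are bought with disproportionately longer queues.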

Is it possible to reduce or eliminate variation in the system?

To summarise the logic so far: at the root of the waiting list problem seems to lie the fact that demand is variable. Therefore, if variability could be reduced (as in Fig. 2) or eliminated (as in Fig. 1) then the problem of capacity planning could be greatly simplified. The NHS Institute has therefore incorporated ‘Lean’ and ‘Six Sigma’ thinking into its drive to obtain the ideal reductions in variations within the system. This section considers whether this is achievable.

One difference between our emphasis and NHS policy is that we discuss how to understand just one demand-capacity step, namely the problem faced by operating theatres. The patient’s journey can in fact be regarded as a series of discrete steps, for example: from home to GP clinic [25]; from there to hospital clinic [23, 24]; from there to a waiting list [48–51]; from there to pre-operative tests [53], then to ward admission [54]; from there to operating theatre followed by admission to postoperative recovery [55]; and then finally discharge home (or to another institution). Each step has its own separate requirement for analysis, often with its own distinct measure of demand and capacity.

NHS planners, responsible as they are for the service as a whole rather than just one aspect of it, often view this pathway as a continuum, a cornerstone of policy being to ‘map the whole process’ [56, 57]. The preferred measure of ‘demand’ appears to be ‘the number of patients flowing into the system’. An Australian study (influenced by this philosophy) argued that healthcare providers should think about the patient’s journey from arrival to discharge as a complete care process, rather than as a discrete series of steps [58].

The difference between this ‘unitary’ philosophy and one where steps are considered discrete may explain some previous disagreements in the literature. Gallivan et al. [59] regarded ‘length of inpatient stay’ as the key measure of demand on their cardiac intensive care service and used queuing theory to conclude that, because of variability in this demand, they needed greater capacity to facilitate admissions. Gallivan’s approach was, however, criticised by several leads of the NHS modernisation programme [56, 57]. First, they did not view ‘length of stay’ as a proper measure of demand (but omitted to define what an alternative measure might be) [56]. Second, they stated explicitly that the variability in demand that Gallivan et al. [59] encountered was ‘not inevitable, but caused by the system’; reducing (or even eliminating) this variability was the cornerstone of their policy. Castille et al. [57] summarised this view: ‘poor standardisation of procedures [by specialists], poor co-ordination, etc’ could be effectively managed to reduce variability.

It is important critically to assess if this view is correct [60].

‘Lean’ and ‘Six Sigma’ thinking in healthcare

The broad concept of eliminating variation emanated from an influential analysis of how Toyota grew into a major multinational company [5]. The apparent novelty of Toyota’s approach was the introduction of a set of principles including: ‘eliminating waste’ in all its forms (both time and infrastructure); ‘do it right first time’ (no trial-and-error behaviour); emphasis on ‘value’ in both activity and product; ‘flexibility’ (in terms of roles of staff as well as production line activities); and the notion that the customer ‘pushes’ (directs) the system rather than being passively ‘pulled’ through it. ‘Six Sigma’ thinking, perhaps attributable to the Motorola Company [6, 7] (but also based on the work of Deming and Juran [8, 61]), is a philosophy aimed at reducing defects in manufacturing or systems processes, and can be coupled with ‘Lean Thinking’ to form a tool for ‘transformational change’ in an institution. Numerous companies, academics and management consultants now offer advice (for a fee) to introduce these ideas to the NHS.

Desirable as these motives are, there are several theoretical and practical concerns. ‘Lean’ can be anodyne and unobjectionable (how can one possibly object to minimising waste, or to doing things right first time?). Because many of these principles are rather vague and interpreted widely, they do not constitute a set of criteria by which a system can be sharply defined as ‘lean’ or ‘un-lean’. Indeed, one influential book has defined a ‘lean system’ as one that ‘requires less human effort, space, capital, time, to make products (services) with fewer defects.’ [61]. The problem with this is that the outcome (not the process) is used to provide the definition and so the self-fulfilling prophecy is that ‘Lean’ is always the system that works.

It is plausible for a manufacturing process to have little or no variability: the production of the same item (a given model of car) should take the same length of time, because all materials used and end-units of production are alike. In other words, there is process variation (related to production line activities) and, separately, inherent variation in the substrate of production (the materials from which cars are made). The latter is very low in most manufacturing industries (the quality of steel or aluminium is generally consistent). However, inherent variation in substrate is high in healthcare (each patient is different). Morton and Cornwell [62] use the term ‘irreducible variability’ for hospitals, which arises mainly from (i) the unpredictability of patients’ response to treatment and (ii) constant uncertainty in diagnosis, with provisional diagnoses always changing. Factories designed to make cars need not (and indeed cannot) adapt themselves in the middle of the manufacturing process to make instead a television set; hospitals need, by analogy, to be designed to do precisely this. Thus patients, not processes, contribute the greater part of variability for hospitals. The only way to eliminate variability in hospitals is to eliminate patients [62]. So while ‘Lean’ philosophy seems superficially reasonable for industries based on commodities (such as manufacturing, iron, copper, steel), its potential efficacy in services (such as law, accountancy, healthcare) requires more robust analysis.

When subjected to scientific enquiry, the evidence supporting ‘Lean’ is sparse. Notwithstanding the difficulty of clearly identifying a process as ‘Lean’, science requires that a beneficial effect should be detected on a measurable, pre-specified outcome (for example, the proportion of over-running lists, or last-minute cancellation rate, etc). Vest and Gamm [8] recently undertook a comprehensive, systematic review of peer-reviewed literature to assess which outcomes, if any, had been positively influenced by ‘Lean’/‘Six Sigma’. Despite retrieving 207 potentially relevant articles from the literature search, methodological problems in the original studies justified only nine for inclusion in their analysis. Although these nine claimed great benefits for ‘Lean’, in fact weak or non-existent statistical analyses in these studies meant their claims were unsubstantiated. Lack of control groups or randomisation also meant factors other than ‘Lean’/‘Six Sigma’ may have influenced the results.

Even outside healthcare and back within the realm of manufacturing, ‘Lean’ has been criticised. Coffey and others [63, 64] have argued that much of the putative ‘Lean Thinking’ did not actually exist in practice (at least not in sufficient form to drive Toyota forward). Instead, simple automation that reduced labour costs appeared sufficient to explain Toyota’s success, without need to invoke the philosophical dimension. Another important observation relates to alternative model specifications of cars. A given model of car can have a combination of, say, colour, seat trims, fittings etc, and these give rise to variation. By eliminating the combinations (or choices) it is possible to eliminate variation from the production process. Many Japanese cars offered customers less variety than competitors (many elements of the car considered ‘optional’ such as electric windows or air conditioning were incorporated as ‘standard’). Thus, in the early 1990s there were ∼20 000 combinations for General Motors’ Astra model, but only 568 choices in Toyota’s Corolla model. At once, this increased quality, lowered base costs and helped ensure less variation, and so led to fewer discrete problems in the manufacturing process [64].

If restriction of choice is an inherent part of a successful ‘Lean’ process, then this may be an unexpected revelation for healthcare services that aspire to be ‘lean’. Patients come with all sorts of co-morbidities or complications and they rightly expect and require a personalised service, with choice. Since surgeons do not confine themselves to just one procedure there is inherent variability due to their casemix, which will itself vary week on week. Therefore, a surgeon undertaking three cholecystectomies and two hernia repairs is by definition less ‘Lean’ than one undertaking six cholecystectomies. Some automation of surgery is possible with laparoscopy and robotics but, even after the learning curves, it prolongs rather than reduces procedure duration and variability [29].

Therefore, healthcare is perhaps better modelled on industries such as antique clock restoration, where each clock (like each patient) is very different and needs to be treated individually. There is little or no resemblance to car manufacturing, and unfortunately there are no successful multinational antique clock restorers and little analysis of how such small specialist firms might apply ‘Lean Thinking’ or ‘Six Sigma’ to their highly personalised craft [65]. Furthermore, as Winch and Henderson [65] argue, reducing the ‘richness of healthcare’ might exacerbate problems of misadventure and inefficiency.

Conclusions

We have shown mathematically that, where variation exists, waste is necessary to be certain of meeting demand. ‘Lean thinking’ is suggested as the solution that reduces or eliminates variation, but it has shortcomings. It is more likely that variation can be reduced in capacity (for example, through arrangements for cross-cover of staff leave) than in demand. ‘Lean’ may be more applicable to managing inventories and equipment, and perhaps it is useful as a motivational device, but there is less evidence to support the adoption of ‘Lean’/‘Six Sigma’ as a cornerstone of NHS policy.

The process variation (that is, the operational problems faced by all hospitals, such as equipment or staff shortages, or delays in transporting patients, etc) should – and can – be minimised. However, it would seem more difficult to tackle the inherent variation due to uncertainty of patient outcomes (such as difficulties in diagnosis or variations in response to therapy, or anatomical variations in surgery, etc). There is no parallel uncertainty about the fate of a hunk of metal that enters the Toyota process: the metal will always turn into a car, and not a ship or a train. Whether ‘Lean’ and ‘Six Sigma’ are effective will, therefore, logically depend upon which of these two sources of variation predominates in hospitals. Our prejudice is that it is inherent variation that supervenes, but the answer can only be obtained by further quantitative research.

The influence of variation on the need for capacity can only be understood in mathematical terms because these are fundamentally mathematical concepts. The mathematics can become quite complicated (for example, queuing theory), especially when used to address increasingly sophisticated questions. Readers who hope for non-mathematical solutions to their theatre management problems will be as properly disappointed as those who wish for non-pharmacological explanations of anaesthesia or a non-anatomical understanding of complex surgery.

It is unfortunate that in the UK, these important issues of demand and capacity management have for too long been regarded as outside the province of mainstream professional journals. All too often the answers to the questions are judged ‘obvious’ or simply amenable to ‘intuition’ or ‘common sense’ (but as we have seen, queuing theory shows our intuitions can be very wrong). Anaesthetists and surgeons, who are at the forefront of the delivery of operating services, are in a key position to provide hard data to inform the debates that we have highlighted. They should not take at face value various initiatives such as ‘Lean’ or ‘Six Sigma’, but instead apply their own intellect and training to criticise these, just as they would critically appraise a new drug, diagnostic investigation or treatment. We hope that some of the basic ideas we summarise will help them do that.

Acknowledgements

We thank Dr Keith Dorrington, Fellow of University College, Oxford and Dr Andrew Morley, Consultant Anaesthetist, St Thomas’ Hospital, London for their helpful comments on this paper.

Appendix

Illustration of a general approach using queuing theory

To offer a flavour of the mathematics involved: if the mean of completed operations is λ.h⁻¹, the mean length of stay in the system is 1/μ (h), and the steady-state distribution of the number of patients in the system follows a Poisson distribution, then the probability of having i patients in the system is [46]:

image(1)

If the capacity of the system is N patients then the length of the wait list (i_waitlist) is:

image(2)

By using equations (1) and (2), we can calculate the average number of waitlist cases E[i_waitlist] as follows:

image(3)

Formulae (1–3) represent, in abbreviated form, what is known as an M/M/n queuing model. This notation serves to describe the pattern of arrivals, the pattern of service, the number of servers and the queue ‘discipline’ followed by the arrivals. More complex models can be used.
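
Equations (1–3) appear above only as image placeholders. Purely as an aid to the reader, the following is one reconstruction that is consistent with the verbal definitions given (a Poisson-distributed number of patients in the system with mean λ/μ, a capacity of N patients, and a wait list made up of those patients in excess of N). It is an assumption inferred from the surrounding text, not the typeset equations of the original.

```latex
% Assumed reconstruction from the verbal definitions; not the original typeset equations.
P(i) = \frac{(\lambda/\mu)^{i}}{i!}\, e^{-\lambda/\mu}, \qquad i = 0, 1, 2, \ldots \tag{1}

i_{\text{waitlist}} = \max(i - N,\; 0) \tag{2}

E[i_{\text{waitlist}}] = \sum_{i=N+1}^{\infty} (i - N)\, P(i) \tag{3}
```

On this reading, equation (3) is simply the expectation of equation (2) taken over the distribution in equation (1).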

Ancillary