The best way to determine the best way to undertake a hysterectomy
Prof R Garry, 94 Westgate, Guisborough, TS14 6AP, UK
Hysterectomy remains one of the most commonly performed major surgical operations in the developed world. The introduction of laparoscopic techniques to undertake this important procedure has led to a reappraisal of the indications for conventional vaginal hysterectomy (VH) and abdominal hysterectomy (AH) as well as attempts to define the role of the new procedures. Most gynaecologists and their patients would agree that the optimum procedure is the one that can be performed with the greatest safety and produce the greatest relief of symptoms and improvement in quality of life in the most cost-effective manner.
How then can we best determine the risks and benefits associated with the various methods of hysterectomy? There are a number of research strategies available, and all of them have been used to compare the various methods of hysterectomy. The simplest and most economical is to perform a retrospective analysis of outcomes by retrieving data from the patients’ notes. The paper by Donnez et al.1 in this issue of BJOG is an example of this type of trial. A second approach is to collect the required data prospectively according to a predetermined protocol, recording each item at the moment the event occurs. The VALUE study2,3 is this type of trial, in which a cohort of more than 37 000 women who underwent hysterectomy in England, Wales and Northern Ireland between 1994 and 1995 were studied prospectively. This large, carefully designed study reported that laparoscopic hysterectomy (LH) was associated with double the risk of operative complications associated with AH. A third approach is to undertake a prospective randomised controlled trial (RCT) such as the eVALuate study.4–6 A further method of assessing the pros and cons of each method is to undertake a systematic review and meta-analysis of the relevant RCTs.7 This approach obviously requires others to have undertaken a number of suitable RCTs before such a meta-analysis can be undertaken.
Each of these methods has strengths and weaknesses.
Retrospective trial methodology
Retrospective studies are intended to look backwards and examine exposure to suspected risks. Typically, studies of this type would take a well-defined end-point, such as the development of breast or lung cancer, and retrospectively determine the relative risk of developing these diseases associated with, say, smoking. The retrospective nature of the data collection makes it difficult to ensure that the comparison groups are similar for all factors except those under investigation. Sources of error due to confounding and bias are more common in retrospective studies.
Prospective trial methodology
In most circumstances, prospective studies are to be preferred over retrospective studies, particularly when a precise estimate is required of either the incidence of an outcome or the relative risk of an outcome based on exposure, such as complication rates following an intervention. The choice of a retrospective study, particularly one conducted up to 16 years after the intervention, would seem to be an inappropriate method of investigation for this topic.
Randomised trial methodology
Randomisation is important as it helps reduce the possibility of bias. Without randomisation, there may be a tendency for researchers to recruit participants into the trial who are more likely to respond to the intervention or to select these participants for particular intervention groups they favour. The purpose of random allocation to different interventions is to ensure that known and unknown confounding factors are evenly distributed between treatment groups, thereby reducing bias in the trial to the minimum. This was certainly the intention in the eVALuate study, which was rigorously set up and monitored to ensure minimal bias. We were therefore more than a little surprised to read the suggestion by Donnez’s group that ‘the conclusion reached by Garry et al. is not admissible because of considerable bias’.8 We think that one of the trials reviewed in this comparison does indeed show considerable design and execution bias but will attempt to show that the biased trial is not the eVALuate study.
The data for the eVALuate study were collected prospectively according to strictly designed protocols, and participants were randomised with equal strictness using centralised, concealed, computer-generated techniques. The structured protocols included obtaining preoperative details in sections of a specific trial data book that were completed by both the patient and a research nurse. Intraoperative details were completed and placed in the same research data book by the surgeon conducting the operation immediately on completion of the surgery and while still in the operating theatre. A separate postoperative diary was completed by the patient twice a day, with sections that were also completed by doctor and nurse each day. The patient then completed a daily diary from the day of discharge, detailing pain, mobility and the amount of activity they were able to perform, until reviewed by the surgeon 6 weeks after surgery. The patient also completed detailed health questionnaires and resource use questionnaires preoperatively, at the time of discharge, and 4 and 12 months later. Validated instruments used to assess outcomes included pain assessment using Visual Analogue Scores and records of analgesia used. Health status was assessed using the Short Form 12 (SF-12) Health Survey Questionnaire and the European Quality of Life-5 Dimensions (EQ-5D) quality-of-life instrument, together with a Sexual Activity Questionnaire and a Body Image Score instrument. This type of pre-planned prospective data collection is designed to ensure as accurate and as complete a collection of data as possible.
Different trial methodologies impact on reported results, and this can be readily appreciated by comparing the current Donnez study with the eVALuate study. The headline major complication rate following LH in the AH arm of the eVALuate study was 11.1% compared with only 0.49% in Donnez’s study. Taken at face value, these figures suggest that it is 27 times more risky to have an LH performed in the UK than in Brussels. These alarming differences in complication rates are, of course, largely explained by marked differences in the trial methodology used in the two studies.
The rigour with which the complication data are collected will affect the reported incidence. Donnez’s paper is a report of a single-centre retrospective study collected over a 16-year period and analysed after periodic trawls of hospital records at the completion of various stages of the study period. It is accepted that retrospective studies in general may seriously underestimate outcomes.9 This certainly seems to be the case in this particular study. For example, the only complications recorded among 405 AHs were three bladder incisions and one rectal perforation. Surprisingly, there were no cases recorded of infection, fever, urinary problems, bleeding, wound or anaesthetic problems, deep venous thrombosis, pulmonary embolism or any other well-recognised complications of this procedure. This team may have an extraordinarily effective technique, or they may consider many adverse events occurring after surgery not to be complications, or the methodology they use may not allow many complications to be detected. In Donnez’s retrospective study, only five categories of complications were recognised. In contrast, the eVALuate study recognised 18 different complication groups, covering most of the adverse conditions that we considered to be a consequence of each surgical intervention. One of the problems of comparing complication rates is that there is no internationally agreed protocol defining what should be considered an operative complication.
What is defined as a complication will self-evidently affect the overall reported rate of complications. For example, in the eVALuate study, fever was defined as a temperature over 38°C on one occasion. In Donnez’s study, a fever was only recorded when all the following co-existed: a temperature >38.5°C for more than 2 days requiring intravenous antibiotics, associated with hypogastric tenderness, an inflammatory reaction (elevated C-reactive protein [CRP] and hyperleucocytosis) and slight induration of the vaginal vault. Similarly, in Donnez’s study, bladder lesions were classified as minor complications, while these lesions were considered major complications in the eVALuate study. Given these fundamental differences in definitions, it is inappropriate and quite misleading to compare headline complication rates.
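To illustrate how far the definition alone can move a reported rate, the following minimal Python sketch applies the two fever definitions described above to the same set of postoperative records. The records are entirely hypothetical, and the field names (`peak_temp_c`, `febrile_days` and so on) are assumptions made for illustration; they are not taken from either study's data collection forms.

```python
from dataclasses import dataclass


@dataclass
class PostOpRecord:
    """Hypothetical postoperative record; every field is illustrative."""
    peak_temp_c: float            # highest temperature recorded after surgery
    febrile_days: float           # days with temperature above 38.5 C
    iv_antibiotics: bool          # intravenous antibiotics were required
    hypogastric_tenderness: bool
    raised_crp_and_wcc: bool      # elevated CRP and hyperleucocytosis
    vault_induration: bool        # slight induration of the vaginal vault


def fever_single_reading(r: PostOpRecord) -> bool:
    """Fever as a single temperature reading over 38 C."""
    return r.peak_temp_c > 38.0


def fever_composite(r: PostOpRecord) -> bool:
    """Fever only when every criterion of the composite definition co-exists."""
    return (
        r.peak_temp_c > 38.5
        and r.febrile_days > 2
        and r.iv_antibiotics
        and r.hypogastric_tenderness
        and r.raised_crp_and_wcc
        and r.vault_induration
    )


# Five invented patients, not real trial data.
records = [
    PostOpRecord(38.2, 0, False, False, False, False),
    PostOpRecord(38.7, 1, False, False, True, False),
    PostOpRecord(38.9, 3, True, True, True, True),
    PostOpRecord(37.4, 0, False, False, False, False),
    PostOpRecord(38.6, 3, True, False, True, False),
]

single_count = sum(fever_single_reading(r) for r in records)
composite_count = sum(fever_composite(r) for r in records)
print(f"single-reading definition: {single_count} of {len(records)} patients")
print(f"composite definition:      {composite_count} of {len(records)} patients")
```

In this invented cohort the same five patients yield four fevers under one definition and one under the other, which is the sense in which headline rates built on different definitions cannot be compared directly.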
Donnez’s group assert that they ‘firmly believe that LH offers multiple advantages over AH and VH and that laparoscopy is clearly the most appropriate technique’. Their study does not present any data to support these claims and indeed is not designed to answer these questions. Valid comparisons of different interventions can only be made when the populations studied are similar in all respects save for the interventions under investigation. Although Donnez’s study gives no demographic details of the groups they compare, it can be deduced from the data they present that the AHs were performed primarily on women with an enlarged uterus (>16 weeks’ size) and that the VHs may have been undertaken primarily in those with prolapse. Such confounding variables may well influence complication rates and therefore preclude valid comparisons. Retrospective trials without strict admission criteria cannot be used to compare the efficacy of different interventions, and the study design of this trial makes the comparisons highlighted in its title inappropriate.
RCTs are thought to be the most rigorous way of determining whether a cause–effect relationship exists between treatment and outcome and for assessing the cost-effectiveness of a treatment.10 Correct randomisation techniques, even in the open design used in the eVALuate study, ensure that each therapy group is similar, so that the effects of the various interventions on a whole variety of outcome measures can be compared with validity. The eVALuate study was able to demonstrate that, when compared with AH, LH was less painful and was associated with a shorter recovery and better short-term quality of life. These observations can only be made when comparing similar defined populations that differ in the single variable of surgical operation. We were also able to show, using the same matched populations, that the laparoscopic approach was associated with more quality-adjusted life years (QALYs) and was a cost-effective procedure compared with AH. These positive findings in favour of LH were offset by some negative findings. These included the obvious fact that the laparoscopic approach took longer to perform, and the more contentious issue that LH was associated with a higher complication rate as defined by us. Donnez’s group speculate that the higher complication rates reported in the eVALuate study compared with their current study were surgeon related and due to the inexperience of the surgeons participating in the trial. This concept was tested and reported in the full account of the trial.5 A detailed statistical study concluded that variations between surgeons did not appear to have an impact on the outcome for patients.
There is, however, one particular complication that does hint at some problem with the laparoscopic approach in this study. The ureter was injured on six occasions (0.66%) in the laparoscopic arms, while there were no such injuries in either the abdominal or vaginal arms of the study. This was still an infrequent complication, and the difference was not statistically significant, but the trend was concerning. These differences led us to suggest that particular care must be taken to recognise this problem and to develop techniques and technology to minimise the risk. The comparable rate of ureteric injury in Donnez’s study was 0.32%.
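A difference of six events against none can fail to reach statistical significance when the event is rare, and a short calculation makes this concrete. The sketch below is a plain-Python, one-sided Fisher exact calculation. The group sizes are assumptions rather than figures quoted in this commentary: 909 is back-calculated from six injuries at 0.66%, and the comparator size of 450 is purely hypothetical.

```python
from math import comb


def hypergeom_pmf(k: int, group_a: int, group_b: int, events: int) -> float:
    """Probability that exactly k of `events` fall in group A if risk is equal."""
    return comb(group_a, k) * comb(group_b, events - k) / comb(group_a + group_b, events)


def one_sided_fisher_p(events_a: int, group_a: int, events_b: int, group_b: int) -> float:
    """P(at least `events_a` of all events fall in group A) under equal risk."""
    total_events = events_a + events_b
    upper = min(total_events, group_a)
    return sum(
        hypergeom_pmf(k, group_a, group_b, total_events)
        for k in range(events_a, upper + 1)
    )


LAPAROSCOPIC_N = 909   # assumption: back-calculated from 6 injuries at 0.66%
COMPARATOR_N = 450     # assumption: hypothetical size of the non-laparoscopic arms

p_value = one_sided_fisher_p(6, LAPAROSCOPIC_N, 0, COMPARATOR_N)
print(f"one-sided p = {p_value:.3f}")
```

Under these assumed denominators even the one-sided p-value stays above the conventional 0.05 threshold, which is consistent with describing the finding as a concerning trend rather than a demonstrated difference.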
The methodology of prospective RCTs is usually the most rigorous technique that can be applied to determine the effectiveness of any intervention. There are, however, difficulties associated with this approach, which can lead to confusion and even become frankly misleading. A feature unique to randomised trials and of particular relevance in the eVALuate study is that most RCTs, including this one, are designed to be analysed with intention-to-treat methodology. This is a strategy for the analysis of controlled trials that compares women in the groups to which they were originally randomly assigned.11 This is to ensure that the treatment groups remain similar. This similarity is the reason for randomisation, and it may be lost if analysis is not performed on the groups produced by the randomisation process. Intention-to-treat analysis also allows for possible non-compliance and deviation from policy by clinicians. Intention-to-treat analysis provides information about the treatment policy rather than the potential effects of specific treatments. In the eVALuate study, as a consequence of this methodological requirement, 32 (3.5%) of the laparoscopic procedures were converted to abdominal procedures, and in most cases, this was before the surgery was begun. As a result of the intention-to-treat requirement, these cases were classified as unintended laparotomies and therefore considered to be major complications. Similarly, nine (2.7%) of the planned VHs were converted to AHs and classified as complications. Many clinicians, including this author, believe that changing the designated surgical approach is not in itself a complication but rather prudent practice. Much discussion took place within the clinical trials group over this matter. It was, however, agreed that if these ‘difficult’ cases were removed from the analysis, it would suggest that both laparoscopic and vaginal approaches were safer and more generally applicable than they really are.
There are thus major pros and cons associated with this methodological decision. We eventually decided that the intention-to-treat methodology demanded that each of these cases be analysed within the group to which it was randomised, but we were fully aware that if we had excluded them from the analysis, it would have significantly reduced the reported complication rate associated with LH and VH. In fact, it would have altered the result so that there were no statistically significant differences between the complication rates of the groups. We discussed this problem in the original paper but feel that subsequent comments suggest that its implications have not been widely appreciated. These statistical devices may confuse rather than clarify end-points, but they do not represent bias. Indeed, they are present to minimise or remove bias.
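The arithmetic behind this decision can be shown with a minimal Python sketch. All counts are hypothetical and chosen only to make the effect visible; the function name `complication_rates` is invented for illustration, and the sketch assumes for simplicity that no converted case suffered any other major complication.

```python
def complication_rates(randomised: int, conversions: int, other_major: int):
    """Return (intention-to-treat rate, rate with converted cases excluded).

    Assumes, for simplicity, that converted cases had no other major
    complication, so the two counts do not overlap.
    """
    # Intention-to-treat: every randomised woman stays in her arm, and each
    # conversion is counted as a major complication.
    itt_rate = (other_major + conversions) / randomised
    # Alternative: converted cases are dropped from numerator and denominator.
    excluded_rate = other_major / (randomised - conversions)
    return itt_rate, excluded_rate


# Hypothetical arm: 500 women randomised, 20 conversions, 30 other major events.
itt, excluded = complication_rates(randomised=500, conversions=20, other_major=30)
print(f"intention-to-treat rate:     {itt:.2%}")
print(f"conversions excluded:        {excluded:.2%}")
```

With these invented numbers the headline rate falls from 10% to 6.25% once conversions are excluded. The lower figure describes only the cases in which the planned approach proved feasible, which is why it would flatter the technique.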
Donnez’s group also dissents, without providing supporting evidence, from our conclusion that VH is more cost-effective and should be considered the optimum approach when the method is technically suitable. We can only repeat our observations that VH was completed more rapidly than LH, with lower operative and postoperative costs, which were not offset by any increase in complication rates, and was associated with similar hospital stays, pain profiles and quality of life as assessed by our panel of validated instruments. In the formal cost-effectiveness analysis, VH dominated LH, being associated with both a better quality-of-life measurement and a lower cost. We therefore felt that these observations indicated that VH should be considered when technically suitable. We cannot qualify these observations, save to say that the VH arm of the study recruited only patients whom the surgeon considered technically suitable for that approach. Inappropriately extending VH to cases not suitable for that method may well be counter-productive.
This commentary has concentrated on the comparison between a single retrospective cohort study and a single prospective randomised trial. The advantages and disadvantages of these different methods of study are discussed. Donnez’s group posed the question ‘do we really need randomised trials to endorse the value of laparoscopic hysterectomy?’ I trust that this commentary has given a positive response to that question. This is not to dispute that other trial modalities have a place. Retrospective studies with methodical collection of data can give much information at a fraction of the time and cost associated with an RCT. Donnez’s study has shown, for example, that in their expert hands the rate of lower pelvic organ damage associated with LH is acceptably low. This conclusion has been reached with a simple and cheap trial. It is not valid, however, to use their methodology to compare the rates of complications between the three methods of hysterectomy studied in their trial, nor is it valid to compare their complication results with those of other trials that employ different definitions and methodologies.
The advantages of RCTs are apparent from our study but so are the difficulties and problems associated with this type of trial. Good RCTs are very time and labour intensive and are very costly to run. The eVALuate study took more than 6 years to complete and cost almost £1,000,000, with much intellectual trauma on the way. Confidence in the completeness of data collection can often only be obtained by prospective collection. As the effects of different surgeries are likely to be small, significant changes in outcomes can only be detected by comparing the responses of matched populations. This can best be achieved by randomisation of the study population. RCT methodologies can, however, also cause confusion, which may obscure the real outcomes. It is clearly important to be aware of the minutiae of the study details to obtain maximum benefit from the data so painstakingly collected. Despite these difficulties, RCTs can obtain types of data that can be obtained in no other way, and this trial design should ensure that many types of data are collected in the optimum manner. The details and the deficiencies of RCTs must be appreciated before all the potential benefits available with this powerful investigative tool are fully realised. So, in answer to the question posed in Donnez’s study, ‘do we really need randomised trials to endorse the value of laparoscopic hysterectomy vis-a-vis other techniques after encountering only 0.44% of major complications of 3190 laparoscopic procedures’, I believe that RCTs provide information that can only be obtained by that methodology and represent the most important and informative trial design to help clinicians and their patients determine the optimum role for LH, VH and AH in current gynaecological practice.
Disclosure of interests
None of the authors have any conflicts of interests concerning this paper.
EBM and laparoscopic hysterectomy—Commentary on ‘The best way to determine the best way to undertake a hysterectomy’
While Garry clearly considers randomised controlled trials (RCTs) to be the holy grail of evidence-based medicine (EBM), we believe that such studies are not appropriate in all circumstances. Indeed, we would like to draw his attention to Smith and Pell’s paper entitled ‘Parachute use to prevent death and major trauma related to gravitational challenge: systematic review of randomised controlled trials’ published in the BMJ (2003;327:1459–61). Smith and Pell conclude that ‘the effectiveness of parachutes has not been subjected to rigorous evaluation by using randomised controlled trials’, but observational data unquestionably point to parachutes reducing the risk of injury in case of free fall. They suggest that those who dispute the validity of such an observation might benefit from participating in a ‘double-blind, randomised, placebo-controlled, crossover trial of the parachute’ to prove their point.
Furthermore, we are of the firm opinion that all EBM must issue from a level playing field. In our view, it is not ethically responsible to conduct a prospective RCT with gynaecologists whose experience significantly differs, in some cases being borderline with respect to the required learning curve. Indeed, the mean number of laparoscopic hysterectomies per year per gynaecologist was only five in Garry’s study. This, in itself, is enough to bias the results. The primary end-point in the eVALuate study was the occurrence of major complications, which was as high as 11% compared with just 0.49% in ours.
Concerning the low complication rate in our study, Garry considers this to be the result of ‘marked differences in methodology’. However, if we look at the literature, our rate is comparable to most series involving experienced surgeons (Bojahr et al., J Minim Invasive Gynecol 2006;13:183–9 and Karaman et al., J Minim Invasive Gynecol 2007;14:78–84). A recent study in Finland investigating almost 14 000 laparoscopic hysterectomies showed a urinary tract injury rate of 1.4%, which is also significantly (!) lower than the rate in the eVALuate study (Brummer et al., Hum Reprod 2008;23:840–5).
According to Garry, laparoscopic hysterectomy is time-consuming and expensive. This is not so. We clearly demonstrated and published in the New Engl J Med that:
1. In experienced hands, laparoscopic hysterectomy takes less than 1 hour.
2. If reusable material is used, no additional expense is incurred.
3. Hospital stay, painkiller use and recovery (back to work) time are significantly reduced (Nisolle and Donnez, New Engl J Med 1997;336:291–2).
We understand Garry’s need to defend the eVALuate study, having taken more than 6 years to complete at a cost of almost £1 million, not to mention the ‘intellectual trauma’ involved. We also understand his frustration that this study is not validated by recent literature findings. However, while we agree that prospective RCTs may well be the optimal choice in most cases, certain criteria must be respected—and the level of experience of participating surgeons must be foremost among them.
Comparing the complication rate in the eVALuate study (11.1%) with that obtained in ours (0.49%), Garry states that ‘taken at face value, these figures suggest that it is 27 times more risky to have a laparoscopic hysterectomy performed in the UK than in Brussels’. We regret to say that he may be right, at least in the context of this particular study. This, unfortunately, is the downside of EBM…. Fortunately, we do not consider these results to be truly representative and firmly believe that in experienced hands, in the UK just as elsewhere in the world, laparoscopic hysterectomy is a swift, safe and highly effective technique.
Disclosure of interest
The authors confirm that they have no conflict of interest to declare.
J Donnez, J Squifflet, P Jadoul, O Donnez
Université Catholique de Louvain, Clin. Universitaires St Luc, Brussels, Belgium