J. L. Hall RN; J. C. Hall DS, FRACS.
Use of outcome events in surgical trials: a systematic review
Article first published online: 29 AUG 2012
© 2012 The Authors. ANZ Journal of Surgery © 2012 Royal Australasian College of Surgeons
ANZ Journal of Surgery
Volume 82, Issue 11, pages 771–774, November 2012
How to Cite
Hall, J. L. and Hall, J. C. (2012), Use of outcome events in surgical trials: a systematic review. ANZ Journal of Surgery, 82: 771–774. doi: 10.1111/j.1445-2197.2012.06246.x
- Issue published online: 4 NOV 2012
- Manuscript Accepted: 2 JUL 2012
- outcome event;
- surgical trial
Surgical trials sometimes fail to clearly identify the primary outcome events of interest. This results in trials that are diffuse and difficult to interpret.
The objective of this study was to systematically review the use of outcome events in surgical trials.
Surgical trials published between 1 January 2007 and 30 June 2010 in 26 peer-reviewed journals representing a wide range of specialty interests were used in this study.
Copies of all potentially relevant articles were scrutinized to identify the admissible surgical trials. Two investigators experienced in health research methods used a standardized form to extract discrete information (i.e. it was an ‘identifying and counting’ exercise that did not require subjective evaluations). All forms were double-checked.
Twenty-four per cent (130 out of 531) of the trials failed to declare the primary outcome events – 11% (56 out of 531) of the trials indicated the primary outcome events in the abstract, but not in the body of the article. The compliant trials used a median of three primary outcome events (interquartile range: 2–5, absolute range: 1–17), and a median of 19 statistical comparisons (interquartile range: 9–32, absolute range: 1–130). Only 2% (11 out of 531) of the trials made an adjustment for the multiple testing of statistical significance (9 of these trials declared a single primary outcome event). Composite outcome events appeared in 9% (48 out of 531) of the trials and these studies contained a median of 24 statistical comparisons.
Many surgical trials fail to clearly define the specific outcome events of interest, and this is often accompanied by an excessive number of statistical comparisons.
Outcome events, or endpoints as they are also called, are the measurements made by investigators to evaluate the results of clinical trials. The primary outcome measures should be directly related to the main aim of the study, and when planning clinical trials, they are used to estimate the number of patients required to obtain a reliable result. Most clinical trials include secondary outcome events that evaluate peripheral issues – clinical trials are onerous to perform and are expensive, so investigators use them to gain as much potentially useful information as possible. The CONSORT (CONsolidated Standards of Reporting Trials) statement requires ‘completely defined pre-specified primary and secondary outcome measures, including how and when they were assessed’. Compliance with these criteria allows the readers to separate the key issues from the fringe topics. Failure to comply with these requirements inevitably results in clinical trials that are diffuse and difficult to interpret.
The aim of this study was to systematically review the use of outcome events in surgical trials. We paid particular attention to the declaration of primary outcome events, the number and nature of the outcome events, the length of follow-up and the number of statistical comparisons.
A systematic review was conducted of surgical trials published between 1 January 2007 and 30 June 2010 in 26 peer-reviewed journals (American Journal of Surgery, Annals of Internal Medicine, Annals of Plastic Surgery, Annals of Surgery, Annals of Thoracic Surgery, ANZ Journal of Surgery, Archives of Surgery, British Journal of Surgery, British Medical Journal, Canadian Journal of Surgery, Diseases of the Colon & Rectum, European Journal of Surgical Oncology, Journal of the American College of Surgeons, Journal of the American Medical Association, Journal of Bone & Joint Surgery (USA), Journal of Paediatric Surgery, Journal of Trauma, Journal of Vascular Surgery, Lancet, Neurosurgery, New England Journal of Medicine, Otolaryngology Head & Neck Surgery, Surgery, Breast, Urology and World Journal of Surgery). These are readily available international journals that have a high impact factor for their specialty. We inspected all of the relevant issues of the journals: we did not rely on databases, such as Medline, to identify articles. Digital and paper copies were made of all potentially relevant articles. These candidate articles were scrutinized to identify the admissible surgical trials, which were defined as prospective studies that (i) evaluated the effect of an intervention on health outcomes after the patients were randomized into groups, and (ii) considered a topic directly relevant to the practice of surgery (i.e. they did not include interventions performed by physicians and radiologists).
Two investigators (JLH and JCH) experienced in health research methods used a standardized form to extract data. This form contained identifying material and information about the outcome events (the declaration of primary and secondary endpoints, the number and nature of the outcome events, the length of follow-up and the number of statistical comparisons). Words such as ‘main’ or ‘principal’ were accepted as being indicative of primary outcome events: words such as ‘other’ or ‘additional’ were accepted as being indicative of secondary outcome events. Non-clinical endpoints were only considered in this context (i.e. we did not include investigations that were used incidentally to monitor therapy). We diminished the risk of observer variation by only collecting discrete information (i.e. it was an identifying and counting exercise that diminished the need for subjective evaluations). We double-checked the data for each article and resolved discrepancies by discussion.
The scope and magnitude of the study make it representative of the recent surgical literature. Hence, this study is descriptive in nature and we have refrained from using comparative statistics. Data are described using the median, interquartile range and absolute range statistics.
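The descriptive statistics named above can be sketched in a few lines of Python. This is only an illustration: the counts are hypothetical values, not data extracted for the review, and the ‘inclusive’ quartile method is an assumption because the article does not state how the quartiles were calculated.

```python
import statistics

# Hypothetical numbers of primary outcome events in nine trials
# (illustrative values only, not data extracted for the review).
primary_outcome_counts = [1, 1, 2, 2, 3, 4, 5, 8, 17]


def describe(values):
    """Return the median, interquartile range and absolute range of a sample."""
    ordered = sorted(values)
    # Assumption: quartiles follow the 'inclusive' method of statistics.quantiles.
    q1, _, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {
        "median": statistics.median(ordered),
        "iqr": (q1, q3),
        "range": (ordered[0], ordered[-1]),
    }


summary = describe(primary_outcome_counts)
print(summary)
```

These three statistics describe skewed count data without assuming a normal distribution, which suits a purely descriptive report.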
There were 590 candidate trials. Close scrutiny led to the exclusion of 59 trials – 36 did not use clinical endpoints (11 concerned education/training and 7 were about devices or prostheses), 11 were not trials (7 used data from a published clinical trial), 7 were non-surgical (3 were about percutaneous coronary artery procedures and 2 evaluated Caesarian sections) and 5 were reviews. This left 531 admissible studies.
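The selection arithmetic in this paragraph can be checked with a short sketch. The counts are taken from the text; the category labels are shorthand chosen for this illustration.

```python
# Counts reported in the paragraph above; labels are illustrative shorthand.
candidates = 590
exclusions = {
    "no clinical endpoints": 36,
    "not a trial": 11,
    "non-surgical": 7,
    "review": 5,
}

# Total exclusions and the number of trials left for analysis.
excluded = sum(exclusions.values())
admissible = candidates - excluded

print(f"excluded: {excluded}, admissible: {admissible}")
```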
Nine journals accounted for 68% (359 out of 531) of the admissible trials (Table 1). Eleven specialty interests accounted for 70% (374 out of 531) of the admissible trials (Table 2). The orthopaedic trials were related to the knee (n = 22), spine (n = 13) and hip (n = 13). Gastroduodenal surgery included bariatric surgery (n = 8) and surgery for gastro-oesophageal reflux (n = 7).
Table 1
|Journal|Admissible trials|
|Annals of Surgery|14% (74)|
|British Journal of Surgery|12% (65)|
|Journal of Bone & Joint Surgery (USA)|9% (50)|
|The Lancet|6% (31)|
|Diseases of the Colon & Rectum|5% (29)|
|World Journal of Surgery|5% (27)|
|Annals of Thoracic Surgery|5% (24)|
|New England Journal of Medicine|5% (24)|
Table 2
|Specialty interest|Admissible trials|
|Groin hernia|3% (16)|
Sixty-three per cent (332 out of 531) of the admissible trials declared primary outcome events in the Methods section. Thirteen per cent (69 out of 531) declared primary outcome events in other parts of their article: Abstract alone (n = 56), Results and Abstract (n = 8), Introduction and Abstract (n = 4), and Results alone (n = 1). Twenty-four per cent (130 out of 531) failed to declare primary outcome events in any part of their article.
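The proportions in this paragraph can be reproduced from the raw counts. The sketch below assumes that percentages were rounded to the nearest whole number, which the article does not state explicitly.

```python
TOTAL_TRIALS = 531

# Where each trial declared its primary outcome events (counts from the text).
primary_declaration = {
    "Methods section": 332,
    "other parts of the article": 69,
    "not declared": 130,
}


def percentage(count, total=TOTAL_TRIALS):
    """Percentage of the total, rounded to the nearest whole number (assumed)."""
    return round(100 * count / total)


# The three categories are mutually exclusive and should cover every trial.
assert sum(primary_declaration.values()) == TOTAL_TRIALS

percentages = {place: percentage(n) for place, n in primary_declaration.items()}
for place, pct in percentages.items():
    print(f"{place}: {pct}%")
```

Checking that the categories sum to the denominator is a quick way to catch transcription errors in counts of this kind.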
Fifty per cent (266 out of 531) of the admissible trials declared secondary outcome events in the Methods section, while 22% (115 out of 531) declared secondary outcome events in other parts of their article. Hence, 28% (150 out of 531) of the trials failed to declare secondary outcome events in any part of the article.
The generic outcome events most often encountered in the 531 admissible trials were as follows: death 20% (n = 108), length of stay in hospital 17% (n = 90), functional status (including time until return to work) 15% (n = 78), quality of life 14% (n = 75), patient satisfaction 9% (n = 49), costs 6% (n = 33) and psychosocial status 3% (n = 15). Non-clinical endpoints were used at least once in 52% (278 out of 531) of the admissible studies – imaging (n = 152), biochemistry (n = 103) and pathology (n = 29).
Table 3 contains summary statistics for the numbers of primary outcome events and statistical comparisons. The trials reported a median of three primary outcome events (interquartile range: 2–5, absolute range: 1–17), and a median of 19 statistical comparisons (interquartile range: 9–32, absolute range: 1–130). Five trials declared more than 10 primary outcome events. Ten per cent (51 out of 531) of the trials reported more than 50 statistical comparisons, and 6 of the trials reported more than 100 statistical comparisons. Only 11 trials made an allowance for multiple testing of statistical significance and nine of these trials used one primary outcome event. Only 39% (205 out of 531) of the trials declared that their analyses were based on an ‘intention to treat’.
|Declared number of primary outcome events|
|1 (n = 244)||2–6 (n = 89)||Nil (n = 198)|
Only 2% (12 out of 531) of the trials failed to declare the time that they followed up patients. In 79% (422 out of 531) of the trials, the patients were followed up for a declared time after discharge from hospital: 18% (98 out of 531) of the trials only evaluated the patients while they were in hospital. There was no trend to report a standardized definition of operative mortality: 20% (107 out of 531) of the trials declared death as an outcome – only 30% (32 out of 107) of these trials declared a specific period of review, the commonest being 30 days, which was used in 16% (17 out of 107) of the relevant studies.
We found that 24% of the surgical trials under evaluation failed to declare the primary outcome events. This is similar to the non-compliance rate of 26% for similar criteria in a review of 490 surgical trials published in the ANZ Journal of Surgery and the British Journal of Surgery between 1969 and 2003. At first sight, this appears to be relatively good when compared with the report from Altman's group that one-half of clinical trials published in prestigious medical journals specify the primary outcome event.[4, 5] However, in those studies, the criteria were strict. They looked for the ‘explicit’ definition of a primary or main outcome. We were more lenient by avoiding decisions about what was, or was not, ‘explicit’, and rather than just concentrating on the Methods section, we looked for an indication in any part of the article. Eleven per cent of the surgical trials that we surveyed only declared the primary outcome events in the abstract. It is surprising that more than 1 in 10 surgical trials fail to mention such key information in the text of their article.
Laxity in declaring the primary outcome event is a major fault. The effectiveness of clinical trials depends on a tight articulation between the aims, outcome events and conclusions. Vagueness about the main outcome events indicates an unwillingness to reliably evaluate a precise hypothesis. In the absence of predefined primary outcomes, investigators can selectively report outcomes based on post hoc analyses. The International Committee of Medical Journal Editors' ‘Uniform Requirements for Manuscripts Submitted to Biomedical Journals: Writing and Editing for Biomedical Publication’ states that ‘Both the main and the secondary objectives should be clear, and any pre-specified subgroup analyses should be described’. This is consistent with the CONSORT statement's warning that having several primary outcomes ‘incurs the problems of interpretation associated with multiplicity of analyses and is not recommended’. Our study provides evidence that this concern is valid.
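The multiplicity problem can be illustrated with a minimal sketch. It rests on assumptions that do not come from the article: each comparison is tested at a significance level of 0.05, and the comparisons are independent. The Bonferroni correction is shown only as one familiar adjustment; the article does not say which methods the 11 adjusting trials used. The function names are hypothetical.

```python
ALPHA = 0.05  # assumed per-comparison significance level (not from the article)


def familywise_error_rate(n_comparisons, alpha=ALPHA):
    """Chance of at least one false-positive result among independent tests."""
    return 1 - (1 - alpha) ** n_comparisons


def bonferroni_threshold(n_comparisons, alpha=ALPHA):
    """Per-comparison threshold under the Bonferroni correction."""
    return alpha / n_comparisons


# 1 = a single comparison, 3 = median number of primary outcome events,
# 19 = median number of statistical comparisons, 130 = largest number reported.
for k in (1, 3, 19, 130):
    fwer = familywise_error_rate(k)
    threshold = bonferroni_threshold(k)
    print(f"{k:>3} comparisons: FWER = {fwer:.3f}, Bonferroni threshold = {threshold:.5f}")
```

Under these assumptions, a trial with the median of 19 comparisons has roughly a 62% chance of at least one spurious ‘significant’ finding, which is why an unadjusted list of comparisons is difficult to interpret.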
We observed a lack of uniformity in the presentation of generic outcome events such as operative death. Ideally, outcomes that are used during clinical trials, surgical audits and the management of surgical services within health-care systems would use common criteria so that reliable estimates could be made of the importance of variations between different services.
Nine per cent of the trials that we reviewed used a composite outcome event. Composite outcome events are usually based on a combination of traditional outcomes. They allow complex outcomes to be expressed as a single quantity.[1, 8] This should enhance the power of studies to reliably detect clinically important differences in outcome. However, when Ferreira-González et al. explored the use of composite endpoints in 114 cardiovascular trials, they came to the conclusion that: ‘Higher event rates and larger treatment effects associated with less important components may result in misleading impressions of the impact of treatment’. Our study indicates that surgical trials add to these concerns by just using composite endpoints as part of the mix of multiple outcome events. The use of composite endpoints was not associated with an anticipated reduction in the number of outcome events and a constrained approach to testing for statistical significance.
Only 6% of the trials that we reviewed used costs as an outcome event, and in many of these studies, the analysis appeared to be trivial. A recent report suggested that fewer than one-half of surgical trials that provided estimates of costs were formal studies (they only mentioned costs in the Methods and Results sections). It is exceedingly rare to find a surgical trial that is based on a comprehensive economic analysis, which is understandable because it requires considerable effort to determine the monetary value of benefits.
The design of our study has some potential weaknesses. In order to collect a large number of articles, the period under review was 3.5 years. The initial publications appeared in January 2007 and, given that methodological standards tend to improve with time, may not accurately reflect the standards of the contemporary literature for readers (i.e. a chronological bias). We were concerned about the risk of observer variation. This was offset by only collecting discrete information, double-checking the data for each article and resolving discrepancies by discussion. We were careful to adopt an impartial attitude to the data and tried to avoid a ‘group think’ mentality that actively sought ‘errors’. Finally, our study is descriptive in nature, which negates the ability to make analytical comparisons between the specialty interests, the impact factor of the journal or markers of study quality.
In conclusion, surgical trials tend to be too diffuse. Both readers and biostatisticians would appreciate surgical trials that link a relevant aim to a modest number of primary outcome events and carefully targeted statistical comparisons. We found that more than one-quarter of the surgical trials declared five or more primary outcome events and one-half of the trials contained more than 18 statistical comparisons. Furthermore, we found that multiple testing for statistical significance persisted despite the use of composite outcome events and the selective declaration of primary outcome events. Hence, in this instance, compliance with guidelines had a limited ability to indicate the quality of a highly relevant component of the published articles. The ability of well-informed authors to ‘play the game’ should not be confused with the structural integrity of a study.
- 6 The International Committee of Medical Journal Editors. Uniform Requirements for Manuscripts Submitted to Biomedical Journals: Writing and Editing for Biomedical Publication. [Cited 27 Mar 2012.] Available from URL: http://www.ICMJE.org
- 7 The CONSORT Group. The CONSORT Statement. [Cited 26 Mar 2012.] Available from URL: http://www.consort-statement.org