In his book Brain Rules, John Medina highlights that we see with our brains in ways that often fool us into experiencing what we expect to find, rather than what we actually encounter. He describes how a group of researchers dropped red dye into white wines to see how oenologists would assess them. Their noses were fooled, and every one of them used the vocabulary of red wines to describe the white wines. Similarly, Jonah Lehrer's book, The Decisive Moment, reported a functional magnetic resonance imaging experiment in which wines presented as expensive made parts of the prefrontal cortex more excited, even though the scientists had tricked the participants by swapping the price labels: the participants' brains became more excited by what they thought was the more expensive wine. This neuroscience research highlights that expectations powerfully influence what we believe.
So it is with the controversy over the use of tissue plasminogen activator (tPA) in acute ischaemic stroke. Supporters believe that one study (the National Institute of Neurological Disorders and Stroke rt-PA Stroke Study Group [NINDS] trial, New England Journal of Medicine, 1995) ‘proves’ that it should be given within 3 h, and see the evidence in that light. Those who consider the case for tPA still unproven prefer to look at the totality of the evidence.
In 2008, the European Cooperative Acute Stroke Study III (ECASS III) was published, suggesting that the time window for delivery of thrombolysis could be increased to 4.5 h. However, significant methodological and analytical flaws have dogged this study, just as criticisms of NINDS have persisted since 1995. One of the main criticisms of ECASS III concerns the primary endpoint chosen: a modified Rankin Scale (mRS) score of 0–1 versus 2–6. This groups an mRS of 2 (slight disability; unable to perform all previous activities but able to look after own affairs without assistance) together with an mRS of 6 (dead), which is not an appropriate endpoint stratification. Once the endpoint is reclassified as mRS 0–2 versus 3–6, all purported benefits of tPA disappear. It is also an inconvenient truth that the ATLANTIS trial, which had virtually the same design as ECASS III, showed harm and was stopped early.
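How sensitive a dichotomised outcome is to its cut-off can be illustrated with a minimal Python sketch. The counts below are invented purely for illustration and are not data from ECASS III or any other trial; the names `HYPOTHETICAL_TREATED`, `HYPOTHETICAL_CONTROL`, `proportion_good_outcome` and `absolute_difference` are likewise hypothetical. In this toy example both arms have the same number of patients at mRS 0–2 but distribute them differently within that range.

```python
# Toy illustration only: these counts are invented and are NOT trial data.
# Index i holds the number of patients whose mRS score is i (0 = no symptoms, 6 = dead).
HYPOTHETICAL_TREATED = [20, 25, 10, 15, 15, 8, 7]
HYPOTHETICAL_CONTROL = [18, 20, 17, 15, 15, 8, 7]


def proportion_good_outcome(mrs_counts, cutoff):
    """Return the fraction of patients with an mRS score <= cutoff."""
    if len(mrs_counts) != 7:
        raise ValueError("expected seven counts, one per mRS score 0..6")
    if not 0 <= cutoff <= 5:
        raise ValueError("cutoff must lie between 0 and 5")
    total = sum(mrs_counts)
    if total == 0:
        raise ValueError("no patients in this arm")
    return sum(mrs_counts[: cutoff + 1]) / total


def absolute_difference(treated, control, cutoff):
    """Treated minus control proportion of 'good' outcomes at a given cutoff."""
    return proportion_good_outcome(treated, cutoff) - proportion_good_outcome(control, cutoff)


if __name__ == "__main__":
    for cutoff in (1, 2):
        diff = absolute_difference(HYPOTHETICAL_TREATED, HYPOTHETICAL_CONTROL, cutoff)
        print(f"mRS 0-{cutoff} versus {cutoff + 1}-6: absolute difference = {diff:+.2f}")
```

With these invented counts the treated arm appears 7 percentage points better at the 0–1 cut-off and no different at the 0–2 cut-off. The sketch shows only that the choice of cut-off can by itself create or remove an apparent effect; it says nothing about the actual size of any effect in a real trial.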
A recent graphic reanalysis of the original NINDS data suggests that imbalance in baseline stroke severity was likely responsible for most, if not all, of the difference in outcome between treatment groups. Each group experienced a virtually identical change in National Institutes of Health Stroke Scale score (Δ-NIHSS). NIHSS improved slightly in almost all NINDS subjects, regardless of treatment, whereas a very few improved a lot, and a few (most of whom were extremely sick at baseline) died. Because NIHSS was the only metric recorded both before and after treatment, Δ-NIHSS allows the effect of treatment to be estimated independently of confounding by severity. Although more tPA patients did end up with only a small deficit, this paralleled the similarly greater number of tPA patients who started with a very mild stroke. Graphic depiction of the response to treatment of every single NINDS subject allows readers to judge for themselves whether there was even a hint that treatment modality, or time to treatment, had any independent effect on outcome above that of initial stroke severity.
Critics correctly note that Δ-NIHSS is not a perfect metric, because it is not truly linear: a change of X points at one end of the scale is not necessarily equivalent to the same change at the other end. However, this concern, although theoretically valid, does not appear to be relevant to NINDS, as Δ-NIHSS was the same in all treatment arms across the whole range of the scale: it changed as much with tPA (early or later) as with placebo for small, moderate and severe strokes.
Δ-NIHSS also measures only discrete elements of neurological function, rather than the more important overall function of the organism. Still, we might at the very least ask ‘just how did tPA lead to better overall outcomes … if it had no effect on any element of neurological function?’
ECASS III is widely touted for ‘corroborating’ the benefit reported by the original NINDS trialists, but the negative Desmoteplase in Acute Ischaemic Stroke (DIAS-2) randomised controlled trial of thrombolysis up to 9 h after the onset of symptoms, published around the same time, has rapidly slipped from memory. There is no theoretical basis, nor any clinical data from the trials of thrombolysis for acute myocardial infarction, to suggest that tPA is less likely to cause intracranial haemorrhage (ICH), or more likely to demonstrate benefit, than any other agent.
Proponents argue that it would be unethical to repeat the NINDS trial, because benefit within 3 h is already ‘proven’. This argument ignores the NINDS reanalysis and the multiple examples throughout medicine where follow-up studies have forced us to abandon promising claims from initial small trials. It also highlights the same double standard: where are the similar complaints about retesting tPA after 3 h in ECASS III, despite far more data ‘proving’ this is harmful?
A recently published summary of Australian participation in the SITS-ISTR registry, an audit of ‘real-world’ effectiveness partly funded by Boehringer-Ingelheim, defined good outcome using mRS. Scoring of mRS, even by a neurologist, is only moderately reliable at best when done face to face. We can only imagine how much misclassification occurred in SITS-ISTR, where scoring was based on telephone interviews or a ‘letter-reply form’. It is noteworthy that studies with larger numbers of patients and observers reported poorer reliability. In addition, how closely did the perceptions of non-blinded (and undefined) individuals, who knew they were grading ‘the ideal model of care’, mirror those of the wine lovers who thought they were drinking expensive wine?
The third International Stroke Trial (IST-3), describing the use of tPA within 6 h of acute ischaemic stroke, has recently been published in The Lancet. Although it is the largest trial of thrombolysis in stroke, it is still small compared with the trials of thrombolysis for acute myocardial infarction. Nevertheless, the primary outcome was completely negative: there was no improvement in the proportion of patients who were alive and independent at 6 months. This is despite the fact that it was an open-label trial, a design that should favour tPA, leading some to suggest that the result might actually conceal overall harm from such therapy. Even so, significant harms were identified: an absolute 4% increase in early deaths and an absolute 6% increase in ICH. Interestingly, tPA caused significantly more brain swelling, which goes completely against any physiological mechanism of benefit, and there was no effect of time. Despite the absence of benefit, the authors used a secondary exploratory analysis to promote a supposedly positive outcome. This runs contrary to decades of attempts to improve the quality of reporting and interpretation of clinical trials. If the IST-3 study were used to support regulatory approval for tPA, it is likely that the application would be rejected on the basis that the drug increases early deaths with no benefit at 6 months.
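The absolute increases in early death and ICH reported for IST-3 can be restated as numbers needed to harm (NNH), the reciprocal of the absolute risk increase. The Python sketch below is illustrative arithmetic only: it uses the rounded 4% and 6% figures quoted in the text, the function name `number_needed_to_harm` is hypothetical, and it ignores the confidence intervals that a formal analysis would report.

```python
def number_needed_to_harm(absolute_risk_increase):
    """Return 1 / ARI, the number treated per one additional harmful event.

    `absolute_risk_increase` is a proportion (for example 0.04 for 4%).
    """
    if not 0 < absolute_risk_increase <= 1:
        raise ValueError("absolute risk increase must be in the interval (0, 1]")
    return 1.0 / absolute_risk_increase


# Rounded figures as quoted in the text; not recomputed from trial data.
EARLY_DEATH_ARI = 0.04
ICH_ARI = 0.06

if __name__ == "__main__":
    for label, ari in (("early death", EARLY_DEATH_ARI), ("ICH", ICH_ARI)):
        nnh = number_needed_to_harm(ari)
        print(f"{label}: about {nnh:.0f} patients treated per additional event")
```

On these rounded figures, roughly 25 patients would be treated for each additional early death and roughly 17 for each additional ICH. Because the inputs are rounded, the results are approximations rather than estimates taken from the trial report.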
There are now 12 important trials on the use of thrombolysis in stroke: six show no benefit, four were stopped early because of harm, and two methodologically flawed studies are promoted as positive. Methodological flaws contaminate the literature on this subject, decreasing the value of systematic reviews. Remarkably, it seems likely that IST-3 will be used to promote increasing use of thrombolysis, despite not only the totality of the evidence but also its own overwhelmingly negative results.
As one teacher of critical appraisal frequently points out: ‘“All the evidence is positive” is always true … as long as we reject out of hand any evidence that fails to be positive’. Yet there are many examples in medicine of smaller trials being positive but larger, more definitive trials reaching conclusively different results, such as with percutaneous coronary intervention for stable coronary artery disease or the use of hormone therapy in postmenopausal women. Indeed, medical reversal, whereby a new superior trial contradicts current clinical practice, is seen in as many as 13–16% of trials reported in journals with high citation counts. These reversals imply that current medical practice includes unjustified cost and treatment risk without benefit, compounded by delays in changing practice and the undermining of trust in the medical profession. Reality checks should thus be encouraged for established practice, giving priority to testing practices that have limited or no prior randomised evidence for their use.
So what is the way forward? A larger trial that essentially reproduces the NINDS study must be performed. This would retest the single hypothesis regarding the possible benefit of tPA that is not clearly inconsistent with the available evidence – that treatment begun within 3 h of the onset of symptoms of acute ischaemic stroke might be beneficial. Although some might argue that it would be unethical to retest tPA within 3 h, most of the data indicate that it should not be given beyond 3 h, and yet this was retested in ECASS III. You cannot have it both ways!