Recently, the Journal of Internal Medicine published a consensus paper regarding chronic fatigue syndrome/myalgic encephalomyelitis (CFS/ME). Although trying to reach a consensus on improved diagnostic criteria for difficult medical disorders such as CFS/ME is commendable, this paper falls well short in many respects.
It cannot be denied that CFS/ME is a controversial condition. The controversy – sometimes deteriorating into overt dispute – is between those who believe that it is a nonexistent illness (‘maladie imaginaire’); those who feel it is a psychiatric disorder; and the activists (comprising patients, doctors and even some scientists) who are convinced of a somatic disease – all are unfortunately simplistic perspectives on a complex disorder. Separately, there are clinicians and scientists with an open mind, who recognize the disability associated with this enigmatic clinical illness, who seek to engage scientifically in the challenge of defining the pathophysiology, and who are therefore motivated to elucidate the biological basis of CFS in a systematic and unbiased fashion. This dispute between the various protagonists recently surfaced with the PACE trial published in the Lancet, which provided evidence for effectiveness of elements of cognitive-behavioural therapy (CBT) and graded exercise therapy (GET) for patients with CFS. This publication triggered unscientific and sometimes personal attacks on the researchers, both in the scientific literature [3–10] and via the Internet. Similarly, the recent controversy on the role of the retrovirus, XMRV, in CFS is a good example of how science and emotion (in this case mostly fear of contagion) commonly collide with regard to CFS [13–20]. Ultimately, only high-quality science will prevail.
Quite a bit of mediocre research has been carried out in relation to CFS – featuring poorly defined subject groups, inadequate sample sizes, nonvalidated assay systems, etc. Not surprisingly, much of that research proves irreproducible when repeated in a methodologically rigorous fashion. Overall, this implies that the database of solid scientific findings is rather small. Notably, no reproducible diagnostic test or even a consistently correlated biomarker has been unearthed.
Designing diagnostic criteria is very difficult without such a marker, i.e. a gold standard. In the absence of biomarkers, the resultant criteria are very prone to the individual biases of the group formulating them. One of the first questions in this process is: what is the purpose of the criteria? It makes quite a difference whether they are intended for research, for epidemiology or for clinical practice. If they are intended for research, specificity is paramount (that is, do the criteria reliably identify only subjects with the disorder?). By contrast, if the purpose is epidemiological, for instance to understand the burden of disease in the community, sensitivity becomes paramount. Finally, if the intent is clinical – that is, to facilitate the diagnosis in individual patients – a mix of these parameters is relevant. Regardless of the intent, once drafted, the criteria will only achieve validity by empirical testing in different health care and cultural settings [22–24]. In the consensus paper, these issues are discussed under ‘Application of criteria’, but the recommendations have no credible rationale – for example, the recommendation for applying the criteria in clinical settings is ‘Determine whether symptom cluster patterns are congruent with those expected from dysfunction of an underlying causal system’. Given that the underlying causal system(s) are unknown, this has no clinical utility. Similarly, in a research context, the authors suggest that the proposed criteria: ‘…focus on symptom patterns, which increase reliability. The International Symptom Scale ensures consistency in the way questions are asked and further increases the reliability of data collected in different locations’. These statements are unreferenced, and the nominated scale is unpublished. By contrast, this approach has already been proposed and applied in relation to the existing international criteria.
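The trade-off between sensitivity and specificity described above can be made concrete with a small calculation. The sketch below is illustrative only: it assumes a reference standard exists against which a criteria set can be scored (which, as noted, is not currently the case for CFS), and every count and name in it is hypothetical rather than taken from any study.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Proportion of subjects with the disorder whom the criteria identify."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Proportion of subjects without the disorder whom the criteria exclude."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts per 100 affected and 100 unaffected subjects,
# scored against an ASSUMED reference standard (none exists for CFS).
broad_criteria = {"true_pos": 90, "false_neg": 10, "true_neg": 70, "false_pos": 30}
narrow_criteria = {"true_pos": 60, "false_neg": 40, "true_neg": 98, "false_pos": 2}

for name, c in [("broad", broad_criteria), ("narrow", narrow_criteria)]:
    sens = sensitivity(c["true_pos"], c["false_neg"])
    spec = specificity(c["true_neg"], c["false_pos"])
    print(f"{name}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this made-up example the broad set misses few affected subjects and so would suit an epidemiological purpose, whereas the narrow set admits few unaffected subjects and so would suit research enrolment.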
If the authors can be forgiven for omitting this essential starting premise, a more serious concern is that the paper is neither balanced in content nor representative of a true consensus across the spectrum of credible scientific views. The group that gathered to write this report appears to have carefully avoided many mainstream clinicians and scientists who, like the authors, have many ‘years of both clinical and teaching experience, authored hundreds of peer-reviewed publications, diagnosed or treated.. thousands of patients…’ – but who hold divergent views. It is striking that the paper purports to represent 100% consensus in a field where there are so many different opinions.
Although it is a widely held view that ‘a better name is needed’, it is unfortunate that the authors propose to revert to the term ‘myalgic encephalomyelitis’ (ME), abolishing CFS. By doing this, they ignore the fact that inflammation (-itis) has not convincingly been demonstrated in any organ. This labelling decision risks mistakenly giving patients the impression that the condition is caused by inflammation and hence that anti-inflammatory medications may be curative, when clearly they are not [29–32]; it also risks biasing the research agenda at the potential expense of alternative biological hypotheses.
Next, in a kind of ‘Catch 22’ scenario, the authors suggest that the currently accepted international diagnostic criteria, originally formulated by an expert group from several countries under the auspices of the Centers for Disease Control (CDC) and subsequently modified by a further international group, are too broad (i.e. lack specificity) and allow patients with primary psychiatric disorders, notably major depression, to be labelled with CFS/ME. Thus, the authors seek to discard the findings in published studies that have applied the existing international criteria, if the results do not fit with their notions of causation. As an example, if a study shows that CBT is effective, these authors suggest that the study actually included patients with psychiatric disorders, and not ME; hence, the positive results can be ignored. The same ‘logic’ may hold if one cannot reproduce findings of alterations in immune or hormonal pathways, autonomic reflexes, etc. By contrast, if decreased grey matter in the brain is demonstrated (as we did), the data are arbitrarily included, even though the same diagnostic criteria were used for inclusion.
The literature cited in the paper is heavily biased towards positive findings, sadly omitting the numerous failed replication studies, and thereby creates a pseudo-science of the pathophysiology. A worrisome example is that XMRV is proposed without any mention of the host of scientific criticisms, failed replication studies and further investigations that have now identified contaminating mouse DNA containing retroviral sequences as the culprit [13–20]. In a 21st-century consensus document, accounting in a balanced fashion for the strength of the evidence is an essential element. The current document is just a list of findings, most of them unconfirmed.
Another major problem with the proposed criteria is the suggestive and misleading phrasing: when describing the criteria that assess fatigue, the label becomes ‘post exertional neuroimmune exhaustion’. Where is the Level 1 evidence for the ‘neuro’ and for the ‘immune’? The same holds for ‘neurological impairments’ – all the listed items are subjective, none validated by objective abnormalities – for example, ‘muscle weakness’ and ‘ataxia’, which imply a demonstrable neurological deficit when one is present. Also, the labels for ‘immune, gastrointestinal and genitourinary impairments’ and ‘energy production/transportation impairments’ and the elements within them are highly suggestive of a notional pathophysiology (although none is known, and many have been well refuted – see [11, 35, 36] for reviews). For example, a symptom of ‘dizziness’ warrants a label of ‘cardiovascular impairment of energy production’. Yet no such deficit in metabolism is evident.
We fear that, with the publication of this report, both the clinical and research agendas in relation to CFS will lose their credible scientific base, via the introduction of yet another diagnostic criteria set. Under the auspices of the CDC-convened international expert group, we previously demonstrated that chronic fatigue states – regardless of exactly how they are defined – share a common and relatively stereotyped set of symptom domains which can be readily identified in the community, at all levels of health care, and across cultures. We suggest that there is little to be gained by reshaping the diagnostic criteria. For the benefit of our patients, we should rather recognize the intrinsic heterogeneity in syndromal diagnoses, record and stratify by potentially relevant parameters in research studies, and continue to pursue evidence-based treatment interventions and high-quality studies examining pathophysiology.