Description of the condition
Patients with physical symptoms which cannot be explained by pathologically defined disease are common in primary care (Fink 1999a; Kroenke 1989; Peveler 1997; Toft 2000; Rosendal 2000; Steinbrecher 2011; Verhaak 2006). They represent a spectrum of conditions ranging from mild self-limiting symptoms to severe and disabling disorders (Katon 1991; Rosendal 2007). As the number and severity of symptoms increase, so do disability and the prevalence of psychological distress and dysfunctional illness cognitions (Kroenke 1994; Toft 2000); there is good epidemiological evidence that the physical and psychological processes are inter-related (Aggarwal 2006; Hotopf 1998).
For the purposes of this review, the term Functional Somatic Symptoms (FSS) is used to refer to the presence of physical symptoms that are not attributable to organ pathology or any conventionally defined disease and which meet additional research criteria such as the number of symptoms, other clinical characteristics or a doctor's assessment. This term is used in preference to the alternative term of Medically Unexplained Symptoms (MUS) in order to cover a more neutral and broad concept (Creed 2010). Patients with severe FSS have impaired health related quality of life (Gureje 1997; Smith 1986), disproportionate healthcare costs (Barsky 2001; Fink 1999a; Smith 1986) and lower satisfaction with their healthcare providers (Lin 1991; Toft 2000).
There is no universally agreed way of classifying FSS (Sharpe 2006). The concept includes functional somatic syndromes, clusters of related symptoms with diagnostic criteria specific to one organ system or medical speciality. Some of the best known are irritable bowel syndrome, fibromyalgia and chronic fatigue syndrome. At least 29 such syndromes have been described (Henningsen 2007); new syndrome definitions continue to appear and old ones to evolve. Importantly, there is substantial overlap between syndromes (Deary 1999; Wessely 1999) and most adult patients with FSS, regardless of which symptoms they present with, experience symptoms in a range of bodily systems (Fink 2007). Psychiatric classifications typically view individuals on a spectrum of severity from the most severe somatisation disorder to undifferentiated somatoform disorder (World Health Organisation 1993; Francis 1994). Some patients on the FSS spectrum will also demonstrate pathological health anxiety or hypochondriasis (Francis 1994; World Health Organisation 1993). While many patients with FSS meet criteria for co-morbid anxiety or depressive disorders, these are not invariably present and the concept that FSS might simply represent somatised psychiatric illness is not tenable.
The prevalence of FSS in primary care depends on the sampling strategy and the definitions used. Studies of the reason for consulting find that approximately 15% of patients seeing their general practitioner (GP) do so for a symptom not obviously explained by organic disease (Peveler 1997; Rosendal 2000), but the proportion consulting repeatedly for an unexplained symptom is considerably lower (Verhaak 2006). Some 20% to 30% of GP consulters aged 18 to 65 years meet criteria for somatoform disorders (de Waal 2004; Fink 1999a; Toft 2000) but not all will present unexplained symptoms at that consultation. Less than 10% of GP consulters have the most severe somatisation disorder (Fink 1999a; Toft 2000). The natural history of FSS varies between individuals, and symptom patterns frequently change over time. Overall approximately half of patients with FSS will improve spontaneously over a year while 10% to 30% deteriorate (Barsky 1998; Craig 1993; Gureje 1999; Lieb 2002; Olde Hartman 2009).
Description of the intervention
Methods of treating functional somatic symptoms
Historically, attempts to treat FSS have used a psychosomatic perspective, whereby physical symptoms are thought to arise from (hidden) mental distress. Mind-body interactions are now seen as more complex but can be usefully viewed within a cognitive-behavioural model. This perspective has led to effective treatment by specialists for somatoform disorders (Kroenke 2000) and some specific syndromes (Henningsen 2007). An evolving family of brief therapies, originally termed "reattribution" (Goldberg 1989), has been developed for use in primary care; such therapies have features in common with the more detailed cognitive behavioural models. The original reattribution model targeted psychiatric illness, that is presenting somatisation (Goldberg 1989), and was built on problem-solving therapy. The original model included three steps: 1) making the patient feel understood; 2) negotiating a change of the agenda; and 3) making the link between physical symptoms and mental health. Later the model was extended by Gask and Morriss to a wider range of somatisation processes (Morriss 2006). A recent extended reattribution model integrated elements from cognitive therapy, such as 'reframing', in a way which integrated mind and body using a descriptive approach to FSS rather than implying causation (Toft 2000).
Enhanced care in the primary care setting
Enhanced care for functional somatic symptoms is defined as the use of a structured treatment model. In the case of functional somatic symptoms, such structured models draw on explanations for symptoms in broad bio-psycho-social terms or encourage patients to develop additional strategies for dealing with their physical symptoms, or both. 'Enhanced care' includes techniques from “the Reattribution Model” (Goldberg 1989) or “reframing” (Fink 2002) from cognitive behavioural therapy, or both. Treatment is delivered by primary care clinicians to their own patients after the training of these physicians in the enhanced care model. Primary care clinicians include doctors and other healthcare professionals providing first contact care across a wide range of clinical domains (Boerma 1999; Starfield 1994; World Health Organisation 2001). Typically, training will involve experienced primary care clinicians taking time from their routine work and being taught both a theoretical framework and practical techniques to use within consultations with their own patients. Following the training, additional time for longer consultations may, or may not, be made available for the trained clinicians.
How the intervention might work
Early studies of reattribution training in primary care used a before and after evaluation with no randomisation. They showed that GPs' skills were improved (Kaaya 1992) in a way which improved patient well-being (Morriss 1999) and reduced healthcare costs (Morriss 1998); these findings were taken to support the notion that making the link between psychological distress and physical symptoms led to better outcomes.
Subsequent studies have suggested various additional mechanisms, including providing extra time for patients, allowing expression of emotions or building useful explanations (Brody 1990; Dowrick 2004; Salmon 2007a). Given that FSS is an umbrella term for a heterogeneous group of patients and disorders, it is likely that different psychological and physiological processes will be relevant for different individuals. It is also plausible that paradoxical effects of psychosocial intervention will be seen whereby some patients benefit while others become more distressed, at least in the short term. There is some epidemiological (Kirmayer 1991), experimental (Graugaard 2003) and interventional (Schweickhardt 2005) evidence to support this conjecture.
Why it is important to do this review
At the start of this review we were aware of several studies which had evaluated enhanced care models for patients with FSS. While these have shown positive effects on GPs' attitude and diagnostic awareness (Rosendal 2000), effects on patient outcomes have been modest and not statistically significant in individual trials. We wished to carry out a systematic review and meta-analysis of the effectiveness of these interventions in pragmatic trials.
This review fits alongside two other Cochrane reviews, 1) 'consultation letters for medically unexplained physical symptoms in primary care' (Hoedeman 2010), which involves specialist assessment of individual patients; and 2) 'psychosocial interventions by general practitioners', including studies in the broader review of psychosocial interventions delivered by general practitioners (Huibers 2009). Our focus was exclusively on FSS and interventions performed by front line primary care professionals.
We aimed to assess the clinical effectiveness of enhanced care interventions for adults with functional somatic symptoms in primary care. The intervention should be delivered by professionals providing first contact care and be compared to treatment as usual. The review focused on patient outcomes only.
Criteria for considering studies for this review
Types of studies
All randomised controlled trials (RCTs) of enhanced care.
Studies were included without regard to the unit of randomisation, that is whether individual clinicians or clusters of clinicians were randomised.
Crossover studies were not included because the nature of the intervention, a change in practice, permits crossing over in one direction only.
Types of participants
Studies were restricted to those in which treatment was delivered by generalists such as general practitioners, nurse practitioners or other healthcare professionals working within the primary care setting with first contact and ongoing care for patients regardless of their presenting problems. This excludes care provided by mental health professionals. No exclusions were made on the basis of age, years of practice, practice type, or previous psychological training.
Studies were limited to those involving adults (at least 18 years old).
Functional somatic symptoms, identified either by case finding or by the primary care clinician’s assessment of the presenting problem. In the absence of universally agreed criteria for FSS, studies in which participants were included based on their clinician's own assessment of the presenting problem as 'medically unexplained' were eligible. Case finding studies were required to use validated instruments for FSS such as rating scales and interviews (Table 1).
Studies of a single functional syndrome (for example irritable bowel syndrome) were eligible provided they met the other criteria.
Our definition of FSS excluded factitious disorders and psychiatric illness arising as a complication of an organic disease (for example depression in somebody with heart disease).
Studies where the primary entry criterion was a specific non-somatoform mental health disorder (for example depression, anxiety, post-traumatic stress disorder) were excluded. However the presence, or continuing treatment, of one or more common mental disorders such as depression and anxiety among those with FSS did not lead to automatic exclusion.
Studies were restricted to the primary care setting and to treatment models which are specific to that setting. Studies of specialist interventions hosted in primary care (for example prolonged contact with a psychotherapy specialist at the primary care clinic rather than hospital) were not included.
Types of interventions
Studies were required to include specific training of participating clinicians; in groups where practitioners work together this could be delivered either to individuals or to the whole group. Training focused on the implementation of a model of enhanced care that was based on reattribution or reframing of symptoms, or that could be characterised by psychosomatic explanations for physical symptoms and 'making the link' between symptoms and mental distress. The intervention had to be delivered by either a generalist after additional training or a generalist acting as an intermediate specialist, such as a general practitioner with a specific interest. Delivery could be provided either within routine consultations or during additional dedicated appointments which may be longer than usual or involve specific reimbursement.
We did not pre-specify criteria for the duration, content and degree of formalisation (for example written manual) of training as different methods are applicable in different professional cultures. Studies were restricted to treatment models which are specific to the primary care setting; specialist interventions hosted in primary care (for example treatment by a psychotherapy specialist at the primary care clinic) were not included.
The review does not include organisational changes involving shared care models with mental health professionals as this is dealt with separately in a different Cochrane review (Hoedeman 2010). Trials of pharmacological treatment were excluded but pharmaceutical treatment was permitted as part of the general treatment.
We included studies in which the intervention was compared against usual care in order to enable comparisons between studies.
Types of outcome measures
1. Patient health status as measured by a validated quality of life tool.
We chose quality of life as this includes both symptoms and function. For the Short Form-36 (SF-36) (Ware 1992) and related tools (SF-12 (Gandek 1998), SF-8 (Turner-Bowker 2003)), the primary outcome was specified as the physical and mental component summary scores or, if not available, the items physical functioning and mental health.
2. Measures of symptom load (number or severity, or both, of symptoms), for instance using a checklist such as Patient Health Questionnaire-15 (PHQ-15) (Kroenke 2002), Symptom Checklist (SCL) (Derogatis 1977), or Somatic Symptom Index (Escobar 1989). While symptoms are likely to be related to health related quality of life, the number and intrusiveness of symptoms appear to be a central characteristic of FSS.
4. Depression and anxiety measured by questionnaire (e.g. Hospital Anxiety & Depression Scale (Zigmond 1983)).
5. Functional status measured as sick leave.
6. Patient satisfaction with care.
7. Health care utilisation: the review will analyse medical consumption if stated outcomes can be compared either as number of visits and days in care or as healthcare costs.
8. Discontinuation of follow-up: we examined attrition as a proxy for disengagement with treatment and thus as an indirect proxy for adverse response to the intervention.
We analysed the primary and secondary outcomes as continuous measures except for discontinuation of follow-up.
Timing of outcome assessment
Outcomes were categorised as short term (0 to 5 months) and longer term (6 to 12 months). Where more than one longer term outcome was available we used the one closest to 12 months. The primary focus was on longer term follow-up data where available.
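The timing rule above can be sketched in a few lines of Python. This is only an illustration: the function names and the example follow-up times are hypothetical and are not taken from the review, and follow-up is assumed to be reported in whole months.

```python
def categorise_followup(months):
    """Classify a follow-up time (whole months) into the review's two windows."""
    if 0 <= months <= 5:
        return "short term"
    if 6 <= months <= 12:
        return "longer term"
    return None  # outside both windows, e.g. a 24-month assessment


def select_longer_term(followups):
    """Return the longer-term assessment closest to 12 months, or None."""
    candidates = [m for m in followups if categorise_followup(m) == "longer term"]
    if not candidates:
        return None
    return min(candidates, key=lambda m: abs(12 - m))


# Hypothetical trial reporting outcomes at 3, 6, 12 and 24 months
print(select_longer_term([3, 6, 12, 24]))
```

For a trial with assessments at only 3, 6 and 9 months, the same rule would select the 9-month assessment as the longer-term outcome.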
Search methods for identification of studies
We used a broad list of search terms for functional somatic symptoms (FSS) but limited the search to RCTs in primary care.
Electronic searches were conducted on:
- the Cochrane Depression, Anxiety and Neurosis Review Group Specialised Register (CCDANCTR-Studies and CCDANCTR-References) (all years to 13 August 2012), see Appendix 1;
- Ovid MEDLINE (1950 - ), EMBASE (1980 - ) and PsycINFO (1806 - ) (to 13 September 2012), see Appendix 2;
- the Database of Abstracts of Reviews of Effectiveness (DARE), CINAHL (1982 to April 2010), PSYNDEX, SIGLE, and LILACS (to April 2010);
- the Cochrane Central Register of Controlled Trials (CENTRAL) (October 2009).
A supplementary search of international trials registries was conducted to identify unpublished and ongoing studies: ClinicalTrials.gov, and the World Health Organization International Clinical Trials Registry Platform portal (ICTRP) (September 2012).
An author name search was conducted in PubMed (July 2011) for the authors of included studies and a citation search was conducted on the Web of Science (September 2012).
Searching other resources
The electronic searches were supplemented by handsearches in:
- conference proceedings (2004 to 2012): European Association of Consultation Liaison Psychiatry and Psychology (includes European Psychosomatic Research Conference) and American Psychosomatic Society (September 2012);
- reference lists of retrieved and potentially relevant papers, as well as relevant systematic reviews, dissertations, theses and literature reviews (July 2011).
Furthermore, we contacted the first authors of included studies and other experts in the field for information about published or unpublished studies (August to November 2011).
Data collection and analysis
Selection of studies
MR and CB independently screened all study abstracts identified by the search strategy. Disagreements about the selection of a trial were resolved by a third review author (MS) who assessed the trial and the evaluations by MR and CB. In view of the relatively small field of interest and the involvement in the review of several trial investigators, blinding of review authors was not possible.
Data extraction and management
Data from selected publications were independently extracted by two of three review authors (MR, CB, AHB) thereby avoiding investigators reviewing their own studies. Disagreements between the authors were resolved by discussion between CB and MR after further checking of published and unpublished results. The data extraction form is presented in Table 2 and was pilot tested on two studies by MR and CB before implementation.
Assessment of risk of bias in included studies
We assessed the risk of bias using an adapted version of the Cochrane Collaboration tool. This asks specific questions about trial quality in several domains. The tool was adapted by adding a specific item about recruitment bias. Two of three review authors (MR, CB, AHB) each independently answered all the questions in this tool in one of three ways: 'High', 'Low' or 'Unclear' risk of bias. Domains for bias and related questions were the following.
- Random sequence generation: was the allocation sequence adequately generated?
- Allocation concealment: was allocation adequately concealed?
- Recruitment bias: in cluster randomised trials, was clinician allocation known before recruiting participants?
- Blinding of participants, personnel and outcome assessors for each main outcome or class of outcomes: was knowledge of the allocated intervention adequately prevented during the study?
- Incomplete outcome data for each main outcome or class of outcomes: were incomplete outcome data adequately addressed?
- Selective reporting: are reports of the study free of suggestion of selective outcome reporting?
- Other sources of bias: was the study apparently free of other problems that could put it at a high risk of bias?
Where there was disagreement between review authors, CB and MR resolved this by discussion following further checking of published results and, if necessary, contact with study authors.
Measures of treatment effect
Where possible we extracted measures of treatment effect using continuous outcomes. As the measures chosen for specified outcomes varied between studies, we converted all results of continuous outcomes to standardised mean difference (SMD). Standard rules of thumb in interpretation of effect sizes are that SMD < 0.4 represents a small effect, 0.4 to 0.7 a moderate effect, and > 0.7 a large effect (Higgins 2011).
After inspecting the data from eligible publications we found that all but one (Blankenstein 2001) reported most outcomes as between group differences in change from baseline, with 95% confidence intervals, based on multilevel regression adjusted for GP or practice clustering, or both. Three studies included supplementary covariates in their regression models (age and gender (Morriss 2006; Rosendal 2000) and age, gender and chronic disease (Toft 2000)). In order to include as much data as possible, we obtained raw data from the parallel group design subsets of two studies (Blankenstein 2001; Rief 2005) and calculated the adjusted difference using a random-effects multilevel model with outcome measures adjusted for baseline value and clustering at the practice level (using the 'lmer' function from the lme4 package in R 2.14). Because some publications only reported the adjusted difference between groups with no absolute value for the change in the control group, we adopted a consistent approach to reporting in the tables such that the change in the control group was centred to zero. For the calculation of the SMD we used standard errors or confidence intervals for the adjusted differences. Standard deviations at baseline were not used due to high attrition rates in several studies and limited availability of standard deviations at follow-up.
We planned to analyse dichotomous outcomes by calculating a pooled risk ratio (RR) for each comparison, with 95% confidence interval. This was used to analyse discontinuation. Where different studies had a mix of continuous and dichotomous outcomes (for instance depression caseness) we converted odds ratios (OR) to SMD using the formula SMD = ln(OR)/1.81 as outlined in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2011). Adjusted confidence intervals were used to calculate the standard deviations (SDs).
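As a worked illustration of the conversion SMD = ln(OR)/1.81, the following Python sketch converts an odds ratio and its 95% confidence interval to an SMD with a standard error. The function name and the example odds ratio are hypothetical and are not data from the included trials. The sketch assumes that the standard error of ln(OR) is recovered from the confidence interval on the log scale and rescaled by the same constant.

```python
import math

DIVISOR = 1.81  # approximately pi / sqrt(3), as in the formula above


def or_to_smd(odds_ratio, ci_lower, ci_upper, z=1.96):
    """Convert an odds ratio and its 95% CI to an SMD and its standard error."""
    log_or = math.log(odds_ratio)
    # Standard error of ln(OR), recovered from the CI width on the log scale
    se_log_or = (math.log(ci_upper) - math.log(ci_lower)) / (2 * z)
    # The point estimate and its standard error are rescaled by the same constant
    return log_or / DIVISOR, se_log_or / DIVISOR


# Hypothetical example: OR = 2.0 (95% CI 1.0 to 4.0)
smd, se = or_to_smd(2.0, 1.0, 4.0)
print(round(smd, 2), round(se, 2))
```

In this hypothetical case an odds ratio of 2.0 corresponds to an SMD of about 0.38, which would count as a small effect under the rule of thumb used in this review.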
Unit of analysis issues
Cluster randomised trials
The included studies were all interventions at the GP level, but with outcomes at the patient level. Hence we expected, and found, that most studies were conducted as cluster randomised trials. By using published results of multilevel regression, or re-analysing raw data from studies, we were able to make comparisons adjusted for clustering.
Studies with different clinical entry criteria for the inclusion of participants were included in the same analysis, as planned, rather than separated into different patient groups for analyses.
Trials with multiple treatment groups
Trials with multiple treatment groups could only be included if they also contained a group receiving treatment as usual. In order to avoid multiple, correlated comparisons we would preferably choose to include one intervention only. This intervention would be the one assessed to fulfil the inclusion criteria for enhanced care best with regard to: 1) reframing, 2) explanations for physical symptoms linking symptoms and mental distress, or 3) reattribution. Alternatively, if group sizes allow for it, we would split the control group into two, allowing for two comparisons to be included in the analyses.
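Splitting a shared control group can be sketched as follows. This is a minimal illustration in Python with a hypothetical function name and sample size; it assumes the common convention that only the control group's sample size is divided between the comparisons, while its mean and standard deviation are left unchanged.

```python
def split_control_group(n_control, n_comparisons=2):
    """Divide a shared control group's sample size across several comparisons."""
    base, remainder = divmod(n_control, n_comparisons)
    # Spread any remainder so that the parts add up to the original sample size
    return [base + (1 if i < remainder else 0) for i in range(n_comparisons)]


# Hypothetical control group of 101 participants shared by two intervention arms
print(split_control_group(101))
```

Dividing the sample size in this way keeps each control participant from being counted twice when both comparisons enter the same meta-analysis.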
Dealing with missing data
We approached the original authors to obtain missing data. When missing data were not available for a dichotomous outcome we planned to do intention-to-treat (ITT) analysis in which it would be assumed that participants who dropped out after randomisation had a negative outcome. Furthermore, we planned to calculate best-worst case scenarios for the clinical response outcome assuming that dropouts in the active treatment group had positive outcomes and those in the control group had negative outcomes (best case scenario), and that dropouts in the active treatment group had negative outcomes and those in the control group had positive outcomes (worst case scenario), thus providing boundaries for the observed treatment effect. However, as any observed treatment effects were small and the number of dropouts was high in some studies, we did not conduct this analysis.
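The planned scenarios can be illustrated with a short Python sketch. All counts and function names below are hypothetical; the sketch assumes a single two-arm trial and expresses the treatment effect as a risk ratio for a positive clinical response.

```python
def risk_ratio(events_t, n_t, events_c, n_c):
    """Risk ratio of a positive response, treatment versus control."""
    return (events_t / n_t) / (events_c / n_c)


def scenario_bounds(resp_t, drop_t, n_t, resp_c, drop_c, n_c):
    """n_t and n_c are numbers randomised; resp_* are observed responders."""
    return {
        # ITT assumption: every dropout counted as a negative outcome
        "itt": risk_ratio(resp_t, n_t, resp_c, n_c),
        # Best case: treatment dropouts respond, control dropouts do not
        "best": risk_ratio(resp_t + drop_t, n_t, resp_c, n_c),
        # Worst case: treatment dropouts do not respond, control dropouts do
        "worst": risk_ratio(resp_t, n_t, resp_c + drop_c, n_c),
    }


# Hypothetical trial: 100 randomised per arm; 40 responders and 20 dropouts
# under treatment, 30 responders and 10 dropouts under control
bounds = scenario_bounds(40, 20, 100, 30, 10, 100)
print(bounds)
```

With these hypothetical counts the risk ratio ranges from 1.0 (worst case) to 2.0 (best case), which shows how wide the boundaries become when dropout is high relative to the observed effect.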
Missing continuous data were analysed on an endpoint basis, including only participants with a final assessment, as individual patient data were not available to conduct imputations. Where SDs were missing, attempts were made to obtain these data through contacting trial authors. Where SDs were not available from trial authors, they were calculated from t-values, confidence intervals or standard errors where reported in articles (Deeks 2007a; Deeks 2007b). When these additional figures were not available or obtainable, the study data would not be included in the comparison of interest.
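The standard conversions referred to above can be sketched in Python. The function names and numbers are hypothetical. The sketch assumes a difference in means between two independent groups and large samples (a normal approximation with z = 1.96); with small samples a t-distribution value would be used instead, and for adjusted differences from multilevel models the result is only approximate.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Standard error of a difference in means from its 95% confidence interval."""
    return (upper - lower) / (2 * z)


def se_from_t(difference, t_value):
    """Standard error of a difference in means from its t-value."""
    return abs(difference / t_value)


def sd_from_se(se, n1, n2):
    """Within-group SD (assumed common to both groups) from the SE of the difference."""
    return se / math.sqrt(1 / n1 + 1 / n2)


# Hypothetical example: difference 2.0 (95% CI 0.04 to 3.96), 50 participants per group
se = se_from_ci(0.04, 3.96)
sd = sd_from_se(se, 50, 50)
print(round(se, 2), round(sd, 2))
```

In this hypothetical example the confidence interval implies a standard error of 1.0 and a within-group standard deviation of 5.0.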
Assessment of heterogeneity
Statistical heterogeneity was assessed to examine whether the variation in treatment effects between studies was greater than expected by sampling variation alone. Assessment included the Chi
Assessment of reporting biases
Possible bias due to selective reporting of results within publications was addressed using the risk of bias tool. Publication bias was not formally investigated (for instance by a funnel plot) because of the small number and heterogeneity of studies.
The protocol specified that a meta-analysis be conducted if sufficient comparable studies with low risk of bias were available. We considered at length the question of whether this was the case; in particular we observed that studies varied in the intensity of the planned intervention and in the illness severity of eligible patients. Given this observed clinical heterogeneity between studies, we took a conservative approach to statistical heterogeneity such that where the I
The statistical analysis included all studies eligible for quantitative analyses. A secondary sensitivity analysis was planned to exclude studies rated as having an overall high risk of bias.
Subgroup analysis and investigation of heterogeneity
As the objective of the review was to compare broadly similar treatments in broadly similar individuals we proposed to carry out only one subgroup analysis comparing two equal groups representing studies with more training of practitioners and those with less. In subsequent reviews it would be appropriate to specify illness severity (either by an agreed score or by the proportion of the practice population eligible for inclusion) and treatment intensity (in terms of number, duration and content of sessions).
We planned to carry out sensitivity analyses by limiting the analysis to studies with low or moderate risk of bias as determined by risk of bias domains, including:
- blinding of outcome assessors;
- allocation concealment; and
- dropout rate lower than 20%.
Description of studies
Results of the search
After removal of duplicates, 2288 references were identified by the searches and assessment led to the checking of 48 full text articles. Seven studies (11 publications) were included in the review and six were eligible for quantitative analysis (see flow diagram in Figure 1).
Figure 1. Study flow diagram.
We identified seven studies (Blankenstein 2001; Larisch 2004; Morriss 2006; Rief 2005; Rosendal 2000; Toft 2000; Whitehead 2002) in which practitioners delivered enhanced care to participants with functional somatic symptoms. These varied considerably in terms of design, sample size, setting, participants, interventions and outcomes. The studies and differences between them are summarised in Table 3 and in the Characteristics of included studies table.
All authors were approached to provide clarification of various aspects of their studies. Furthermore, Blankenstein, Rief and Toft were asked for additional data in order to allow their data to be included in analyses. From Blankenstein (Blankenstein 2001) we received mean values for self-rated physical health and symptom load as only medians were published. From Rief (Rief 2005) we received a dataset on the group which could be interpreted as an RCT (cohort 2). Finally, from Toft (Toft 2000) we received data on patient satisfaction for the group of patients with somatoform disorders as only the outcome on all patients in primary care was published.
All included studies were cluster randomised trials with participants seeing their usual doctor. All but one (Rief 2005) of these randomised practices to intervention or control. The remaining study (Rief 2005) used a stepped wedge approach (Brown 2006) in which GPs received training in two phases; this design includes features of a before-and-after study as well as an RCT. We included only data from the randomised cohort in the review. Most studies had a primary outcome (which represented an effect of treatment) and this was appropriately recorded at baseline before intervention and at subsequent follow-up. One study was primarily a feasibility study designed to test the use of reattribution skills by practitioners after training (Morriss 2006).
One study was not included in the analyses as it contained too little information on design and results (Whitehead 2002).
Most studies randomised GPs or practices, with between 50 and 100 participating patients per group; two had around 150 per group (Rief 2005; Toft 2000) and one was larger with approximately 450 per group (Rosendal 2000). This large trial had the least restrictive inclusion criteria and GPs thought that fewer than one in five of eligible patients had any form of somatisation. For each study eligible for quantitative analyses we calculated the proportion of the total practice population eligible to take part in the study (Table 3). We did this either by using data from the published papers, for instance when studies stated the proportion of GP practices taking part and the total population of the study area, or by applying the average list size per GP (Germany) or per practice (England) to the participating practices. Using these approximations, we found a 10-fold difference in the proportion of patients eligible to take part in studies from the estimated populations, from 1.3 to 14 per 1000.
All studies were performed in Europe (two in Denmark; two in Germany; two in the United Kingdom; one in the Netherlands) and carried out in primary care, with the intervention delivered solely by a primary care physician who was usually involved in the participant's care.
Participating GPs (a total of 233) and GP practices included both single-handed and group practices. Most practices were small (one to three doctors). All studies with group practices were randomised by practice.
Participants were enrolled in studies only if they were already registered with their GP practice. Studies varied markedly in terms of participant selection both in terms of identification procedures and in whether they were identified before or after GP or practice randomisation and intervention. Identification procedures for some studies involved an independent researcher and for others the GP. Four studies identified participants by the researchers: in three this was by the use of screening questionnaires or standardised interviews (Larisch 2004; Rosendal 2000; Toft 2000) and in the fourth by an independent GP judging the content of the recorded consultation (Morriss 2006). Three studies used GP judgement to identify participants: in one study this was carried out in advance of GP randomisation (Blankenstein 2001), in one study it was after randomisation (Whitehead 2002), and for one study this was not clear (Rief 2005).
Participant inclusion criteria varied considerably between studies such that some included only small numbers of relatively severe cases (Blankenstein 2001; Larisch 2004) while others included a mixture of mildly, moderately and severely affected patients (Rosendal 2000). Most studies only included adult participants within the age range of 16 to 65 years although two effectively had no upper age limit (Morriss 2006; Rief 2005) and one was restricted to adults aged 20 to 45 years (Blankenstein 2001). All had a predominance of female participants (65% to 97% female). A total of 1787 participants were included.
Terms used for the participants included in the studies that were reviewed were: chronic fatigue syndrome (Whitehead 2002), frequent attenders (Blankenstein 2001), functional somatic symptoms (Toft 2000), hypochondriasis (Blankenstein 2001), medically unexplained (physical) symptoms (MUS/MUPS) (Morriss 2006; Rief 2005), somatising patients or somatisation (Blankenstein 2001; Larisch 2004; Rosendal 2000), and somatoform disorders (Toft 2000).
One study used reattribution as originally specified (Goldberg 1989), trained GPs for six hours, and tested fidelity to the model (Morriss 2006). Two studies used The Extended Reattribution Model (TERM) (Toft 2000) taught over 25 hours (Rosendal 2000; Toft 2000). A further two studies described the intervention as 'modified reattribution', taught over 20 hours (Blankenstein 2001) and 12 hours (Larisch 2004). The remainder of experimental interventions were described as modified cognitive behavioural therapy (CBT) (Whitehead 2002) and GP management of MUS (Rief 2005).
One study also included a specification of the number of consultations offered to participants (Larisch 2004), whereas the patient interventions delivered in the remaining studies were arranged individually with their physician during normal consultation hours. In two studies GPs followed a manual when treating included participants (Blankenstein 2001; Larisch 2004).
Three studies included some measure of adherence, for example the number of participants attending the predefined number of consultations (Larisch 2004), but only two studies included a measure of achieved reattribution (Blankenstein 2001; Morriss 2006).
In six studies the comparator group was described as usual care (Blankenstein 2001; Morriss 2006; Rief 2005; Rosendal 2000; Toft 2000; Whitehead 2002), and in one it was described as routine psychosocial care delivered by the GP (Larisch 2004).
Outcome measures at 12 months were the primary outcome for two studies (Larisch 2004; Rosendal 2000); and a secondary outcome for two (Blankenstein 2001; Toft 2000) which set the primary outcome at 24 months. Three of these studies also reported secondary outcomes at three months (Larisch 2004; Rosendal 2000; Toft 2000). Of the remaining studies, one reported outcome measures only at three months (Morriss 2006) and the other at four weeks and six months (Rief 2005).
Trials varied in rates of attrition over time and only one trial (Morriss 2006) imputed values for missing data. Thus, the data used in the review were the scores of participants who successfully completed follow-up, except for Morriss where only imputed values could be extracted from the trial.
One study provided insufficient information on the data and had such poor patient recruitment (Whitehead 2002) that it was not included in the analyses. For each of the remaining six studies we made forest plots and carried out assessment of heterogeneity for outcome measures for which results were available. Supplementary data or analyses were received from three studies (Blankenstein 2001; Rief 2005; Toft 2000) in order to enable the inclusion of relevant patient samples for stated outcomes. The six studies included a total of 1787 participants but numbers varied in the analyses depending on the chosen outcomes.
Thirty-seven publications representing 28 studies were excluded: details are stated in the Characteristics of excluded studies table. We mainly excluded trials (n = 14) where specialist practitioners delivered treatment, including specially trained GPs treating patients after referral (Allen 2006; Barsky 2004; Bernal 1995; Escobar 2004; Huibers 2004; Kennedy 2001; Lidbeck 1997; Lofvander 2002; Magallon 2008; Margalit 2008; Martin 2007; McLeod 1997; Ridsdale 2001; Schade 2011; Schaefert 2011; Schilte 2001; Smith 2009). In these studies the intervention was not delivered by front line primary healthcare professionals who treated their own patients. Another three studies involved specially trained healthcare workers seeing patients in the primary care setting (Kennedy 2001; Smith 2006) or specialist advice to GPs during the trial (Pols 2008). Other reasons for exclusion were: specified GP treatment which did not involve reattribution or reframing of physical symptoms (Alamo 2002; Jellema 2005; van Bokhoven 2009; van der Horst 1997) or included participants who did not fulfil criteria for MUS (Bakker 2007; Klapow 2001). Finally two studies were excluded because they did not include a trial arm with usual treatment (Aiarzaguena 2007; Sumathipala 2008).
Studies on specific functional somatic syndromes, such as irritable bowel syndrome (IBS), were not specifically excluded but eventually we found none that met the inclusion criteria for the review.
We are not aware of any ongoing studies.
Risk of bias in included studies
The assessment of bias is described for each study in the Characteristics of included studies table and summarised graphically in Figure 2 and Figure 3 using the Cochrane Collaboration’s tool for assessing risk of bias (Higgins 2011), amended to include recruitment bias. The criteria used to assess bias are listed in Table 4. One of the seven studies (Whitehead 2002) had greater risk of bias (all items rated as 'High' or 'Unclear') than the others.
Figure 2. Risk of bias summary: review authors' judgements about each risk of bias item for each included study.
Figure 3. Risk of bias graph: review authors' judgements about each risk of bias item presented as percentages across all included studies.
Most studies had low risk of bias during sequence generation and allocation concealment, although one study provided incomplete information about the randomisation procedure (Whitehead 2002). The risk of recruitment bias was more of a concern: one study in which the physician selected participants after allocation was judged to be at high risk (Rief 2005). Three studies in which participant identification followed physician allocation but was carried out independently by a researcher or using a systematic questionnaire were judged to be at uncertain risk (Larisch 2004; Rosendal 2000; Toft 2000). Two studies were judged to be at low risk of recruitment bias either because participants were identified before physician allocation (Blankenstein 2001) or participants were identified blind to the physician allocation (Morriss 2006). The three studies with identification of participants after practice randomisation showed an imbalance in baseline values for patient health, with poorer mental health or higher degrees of disability in the intervention group compared to the control group (Larisch 2004; Rosendal 2000; Toft 2000); it was not clear if this represented systematic bias or simple chance.
As studies involved teaching in order to change physician behaviour, the participating physicians were clearly not blinded and thus might have changed their behaviour in ways other than intended in the intervention. Hence, all included studies were judged as having high risk of performance bias.
Two studies specified extra consultations as part of the intervention, which could have led to unblinding of the participants. One study compared the intervention with psychosocial primary care without additional physician training but also including extra consultations (Larisch 2004); the other only included additional consultations for trial participants, possibly unmasking the intervention to participants (Blankenstein 2001). In the other five studies trained physicians were not required to recall participants for extra appointments, although they were not prevented from doing so.
Four studies included interviews and clearly described outcome assessments as carried out by researchers blinded to allocation (Larisch 2004; Morriss 2006; Rief 2005; Toft 2000). In the remaining three studies, outcome was assessed using patient questionnaires (Blankenstein 2001; Rosendal 2000; Whitehead 2002).
Incomplete outcome data
Two studies were judged to be at low risk of bias due to attrition (Morriss 2006; Toft 2000); these had less than 15% loss for short term or 20% loss for long term (one year) outcomes. Three studies were judged to have unclear risk of attrition bias: reporting higher longer term (30% to 40% at one year) attrition although the loss to follow-up was balanced between allocation arms (Blankenstein 2001; Larisch 2004; Rief 2005). Two were regarded as at high risk of attrition due to unbalanced and high attrition at one year (Rosendal 2000; Whitehead 2002).
All studies except one only published pre-specified primary and secondary outcomes and showed low risk of reporting bias. However, for one study the findings were only documented in a peer reviewed PhD thesis and not in international publications (Blankenstein 2001). Two studies were judged as unclear: one was only a pilot trial with primary outcomes on communication, yet gave high priority to secondary outcomes (Morriss 2006), and one did not specify a primary outcome (Whitehead 2002).
Other potential sources of bias
Only one study specifically addressed treatment fidelity with the recording of all consultations (Morriss 2006). In this study 35% of eligible consultations met the pre-specified criteria of delivering the intervention (reattribution). One other study obtained recordings on 2 of 10 participants from 7 of 10 participating GPs but also based adherence on GP registration forms about content of consultations with included participants (Blankenstein 2001). In this study the intervention (modified reattribution) was applied in 68% of included participants.
Four studies specifically addressed the cluster randomisation of physician or practice in their analysis (Larisch 2004; Morriss 2006; Rosendal 2000; Toft 2000), but three did not (Blankenstein 2001; Rief 2005; Whitehead 2002). One study used a stepped wedge design in which practices were randomised to receive training at one of two timepoints. We used only data from the stage when only one group had been trained as this most closely resembled a conventional cluster randomised trial. However, physicians were trained (at four months) before the end of the follow-up period (at six months) leading to potential contamination of the results in the control group.
Effects of interventions
Comparison 1: enhanced care versus treatment as usual
1.1 Health related quality of life
Five studies reported a measure of health related quality of life but these varied in type and time intervals between studies. Two studies used the SF-36 (Rosendal 2000; Toft 2000) and one the SF-12 (Larisch 2004) for short and long term follow-up. Data on the physical and mental component summaries were available from all of these studies. Two studies reported a global (that is combined physical and mental) quality of life measure: the EQ5D at three months (Morriss 2006) and a seven-item scale reported as the average score at 24 months (Blankenstein 2001).
With regard to physical functioning, data were eligible for quantitative comparison from four studies (Blankenstein 2001; Larisch 2004; Rosendal 2000; Toft 2000) at long term follow-up (12 to 24 months), and four studies (Larisch 2004; Morriss 2006; Rosendal 2000; Toft 2000) at three months. The two studies reporting global quality of life were included with those reporting physical quality of life. One study used the EQ5D and found different effects between the EQ5D score and the health thermometer component; we included the health thermometer as this was comparable to the visual analogue scale used in another study.
Comparison at long term follow-up ( Analysis 1.1) showed substantial heterogeneity (I
Three studies reported a measure of mental quality of life; two used the mental component summary of the SF-36 (Rosendal 2000; Toft 2000) and one of the SF-12 (Larisch 2004) ( Analysis 1.3; Analysis 1.4). Heterogeneity of outcomes was low at 12 and three months (I
1.2 Measures of symptom load
Five studies reported a measure of symptom load. Three studies used the Hopkins Symptoms Checklist for Somatoform Disorders (SCL-SOM) 12-item measure (Blankenstein 2001; Rosendal 2000; Toft 2000); two used the Screening for Somatoform Disorders in the last seven days (SOMS) measure (Larisch 2004; Rief 2005). Meta-analysis of the five studies showed substantial heterogeneity at 6 to 24 and 1 to 3 months follow-up (I
1.3 Patient illness worry
Four studies used an objective measure of illness worry: one used the 14-item Whiteley Index (Morriss 2006) and three used the shorter Whiteley-7 Index (Rief 2005; Rosendal 2000; Toft 2000). Meta-analysis of these studies (Analysis 3.1; Analysis 3.2) showed little heterogeneity (I
1.4 Depression and anxiety
Five studies reported a measure of depression. Instruments varied between studies: two used the Hospital Anxiety & Depression Scale (Larisch 2004; Morriss 2006); two the eight-item Hopkins Symptom Checklist (SCL-8) (Rosendal 2000; Toft 2000), and one the Beck Depression Inventory (Rief 2005). Quantitative comparison of studies included four studies for depression at 6 to 12 months follow-up (Larisch 2004; Rief 2005; Rosendal 2000; Toft 2000) and a fifth (Morriss 2006) at one to three months follow-up. There was low to moderate heterogeneity for both follow-up periods: I
Three studies reported a measure of anxiety: in two the Hospital Anxiety & Depression Scale (Larisch 2004; Morriss 2006), and in one the Beck Anxiety Inventory (Rief 2005). The two studies which reported outcomes at 6 to 12 months found no effect of intervention: SMD -0.07 (95% CI -0.38 to 0.25) (Analysis 4.3). The three studies reporting at one to three months showed small but inconsistent effects and as heterogeneity was high (74%) no overall effect was estimated (Analysis 4.4).
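The heterogeneity percentages quoted in these comparisons are conventionally I² values derived from Cochran's Q. The following is a minimal sketch of how a pooled estimate and I² can be computed from study-level effects and standard errors. It assumes a simple inverse-variance fixed-effect model, which may differ from the model used in the review, and the input values are hypothetical and are not data from the included trials.

```python
import math


def pooled_fixed_effect(effects, std_errors):
    """Inverse-variance fixed-effect pooling with Cochran's Q and I^2 (in %)."""
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    # Cochran's Q: weighted squared deviations of study effects from the pooled effect
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I^2: share of total variation attributed to between-study heterogeneity
    i_squared = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, pooled_se, q, i_squared


# Hypothetical SMDs and standard errors for three studies (illustration only)
smds = [-0.30, 0.10, -0.05]
ses = [0.10, 0.10, 0.10]
pooled, pooled_se, q, i_squared = pooled_fixed_effect(smds, ses)
ci_low = pooled - 1.96 * pooled_se
ci_high = pooled + 1.96 * pooled_se
print(f"SMD {pooled:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f}), I2 = {i_squared:.0f}%")
```

With inputs as inconsistent as these, I² is high and a single pooled estimate is hard to interpret, which is the reasoning behind not estimating an overall effect when heterogeneity is high.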
1.5 Time off work
Only two studies reported illness related absence from work during follow-up (Blankenstein 2001; Larisch 2004). In the Dutch study participants reported the number of weeks with absence from work or household duties due to illness in the preceding six months at 12 and 24 months follow-up. In the German study GPs reported on sick leave for participants at six months follow-up. Both studies were small: in the Dutch study only 65% of 149 participants answered the question, and in the German study sick leave was reported for completers only (66 in intervention group and 52 in control group). One study found a significant reduction in the intervention group (median 5 to 0 and 0) and no change in the control group (median 4 to 3 and 4) at 12 (P = 0.002) and 24 months (P < 0.001) respectively (Blankenstein 2001). The German study showed a general reduction in the number of participants on sick leave and the related cost during six months of follow-up but no difference between the intervention and control groups (Larisch 2004).
1.6 Patient satisfaction with care
Three studies reported satisfaction with care using an ordinal scale (Morriss 2006; Rosendal 2000; Toft 2000) but with highly different focus and time intervals such that we did not attempt meta-analysis. One study focused on patients' satisfaction with specified aspects of GP communication and whether they received the help they wanted (Morriss 2006) using a questionnaire developed for this purpose (Morriss 2002), at baseline and three months follow-up (only changes stated in publication). Another study (Toft 2000) applied a different patient satisfaction questionnaire (PSCQ-7, developed for this study) immediately after the GP consultation. Finally, one study used an international measure of overall satisfaction with doctor-patient relationship, medical-technical care, and information and support (Grol 1999) at 12 months follow-up (Rosendal 2000). Although all studies showed a difference between the randomised groups in the direction of intervention effect, none of these were statistically significant (Morriss 2006; Rosendal 2000; Toft 2000).
1.7 Healthcare utilisation
Overall healthcare costs could not be compared as each study included different aspects.
Two studies reported changes in GP consultation rates for participants from index consultation to three and 12 months follow-up (Larisch 2004; Rief 2005), one study reported GP consultation rates at three months (Morriss 2006), and one study reported changes in total healthcare use after two years follow-up (Blankenstein 2001). Furthermore, two studies reported change in healthcare cost after three to six months and two years follow-up (Larisch 2004; Toft 2000). In view of this heterogeneity of outcome timepoints and measures, we did not carry out meta-analysis. Results were inconsistent, showing a larger decrease in frequency of primary healthcare use in the trial group of the two comparable studies (Larisch 2004; Rief 2005), whereas the remaining two studies (Morriss 2006; Toft 2000) showed a non-significantly higher frequency and a larger increase in primary healthcare cost, respectively, in the trial groups. The two studies exploring long term total healthcare use indicated a possible cost reduction due to decreased cost in secondary care (Larisch 2004; Toft 2000).
1.8 Discontinuation from follow-up
Data were available from all six included studies, five at long term and five at short term follow-up. In all studies except for one (Toft 2000) discontinuation rates in the intervention arm were higher than in the control arm at both time periods. Pooled effects were similar: RR 1.25 (95% CI 1.08 to 1.46) at 6 to 24 months, and RR 1.28 (95% CI 1.06 to 1.54) at 1 to 3 months.
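Confidence intervals for a risk ratio are symmetric on the log scale, so the standard error and an approximate z statistic can be recovered from a reported RR and its 95% CI. The sketch below applies this to the long term discontinuation estimate quoted above. It assumes a normal approximation with a critical value of 1.96, and the function name is illustrative only.

```python
import math


def log_rr_summary(rr, ci_low, ci_high, z_crit=1.96):
    """Recover log(RR), its standard error and a z statistic from a reported CI."""
    log_rr = math.log(rr)
    # The CI spans 2 * z_crit standard errors on the log scale
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_rr / se
    return log_rr, se, z


# Reported pooled discontinuation at 6 to 24 months: RR 1.25 (95% CI 1.08 to 1.46)
log_rr, se, z = log_rr_summary(1.25, 1.08, 1.46)
print(f"log RR = {log_rr:.3f}, SE = {se:.3f}, z = {z:.2f}")
```

A z statistic close to 3 is consistent with a confidence interval that excludes 1, that is, modestly but consistently higher discontinuation in the intervention arms.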
Qualitative data from included studies
During the review process we identified qualitative analysis carried out as part of two studies to explore patient perspectives in relation to reattribution. As these were designed as explanatory features of included studies we used them as sources of contextual information. One study (Morriss 2006) used a grounded theory-based analysis of interview data with a subset of participants (Peters 2009). The other used conversation analysis of the process of reassurance in a study with relatively intensive intervention (Burbaum 2010; Larisch 2004). Both qualitative reports found resistance among patients with functional symptoms to psychological attribution.
The protocol specified a subgroup analysis comparing groups of studies on the basis of time spent training practitioners. We chose not to conduct this analysis in view of the small number of eligible trials and the difficulty in finding meaningful cut-offs between training models. Instead we made post hoc analyses of outcome on the studies with high intensity intervention including a treatment manual compared to low intensity intervention, as mentioned above in sections 1.1 and 1.2.
We did not carry out a sensitivity analysis based on risk of bias because few studies were eligible for inclusion.
Summary of main results
Patients with functional symptoms and disorders are common in primary care but we found few trials of interventions to improve treatment by front line healthcare professionals in this setting. The trials were only of moderate quality and most were of small size. Trials aimed to teach GPs a variety of interventions ranging from an opportunistic approach to individual consultations to more structured management, for example with patient diaries and planned follow-up. Statistical heterogeneity precluded meta-analysis in some comparisons but it was feasible in others. However, effect sizes on both physical and mental health were small and without clinical significance at both short and long term follow-up. Patient satisfaction with care appeared to be greater in the intervention group in all three studies evaluating this aspect, though the differences were not statistically significant. Effects on healthcare use were inconclusive with regard to GP visits, and effects on overall healthcare costs could not be estimated. Finally, attrition was slightly higher in the intervention groups.
Overall completeness and applicability of evidence
The studies included were effectiveness studies from several primary care settings, but all were from Western Europe. They included front line healthcare workers and adult primary care patients. Follow-up was performed at various time intervals, including long term follow-up (12 to 24 months in most studies). The interventions appeared feasible in routine care. Hence, the interventions and the results may be generalised to the clinical primary care setting, at least in developed countries and for adult patients.
With regard to patient inclusion, functional symptoms and disorders are not clearly defined diagnostic entities and the search strategy could have missed relevant patient groups. We tried to prevent this by applying a broad range of terms for symptoms, disorders and syndromes based on previous reviews on this topic (Henningsen 2007) supplemented by associated terms (van der Feltz-Cornelis 2011).
The varying definitions and inclusion criteria for studies made for a heterogeneous patient population. Mild to moderate functional disorders of recent onset may need a different approach to long term or disabling conditions; in a Cochrane review of consultation letters Hoedeman et al suggest caution in comparing patients with the most severe somatisation disorder with those meeting criteria for milder disorders such as abridged somatisation (Hoedeman 2010). A stratification approach was effective in a recent study of back pain, which triaged intervention group patients to one of three treatments depending on severity (Hill 2011). Table 3 demonstrates the large variation in patient sampling from the primary care populations that was achieved in our included studies. Although it is important to address the whole spectrum of severity seen in primary care, this variation may hamper the comparison of effects on different aspects of patient outcomes of enhanced care.
Our focus was on the effect of training front line primary care workers who deal with all types of patient contacts, that is front line GPs and nurse practitioners. Hence, we did not include studies of specially trained clinicians, for example GPs, nurses or psychologists working within the primary care clinic. A number of such studies are listed in Characteristics of excluded studies and the implementation of interventions by specially trained healthcare professionals in primary care may give rise to different results.
All interventions were directed at primary healthcare professionals but outcome measures were at the patient level. Hence the intervention was implemented in two steps, from trial intervention at the health professional level to the intervention delivered by the healthcare professional to his or her patients. Only half of the analysed studies had manuals guiding this transfer of the intervention and few studies measured the actual adherence during consultations. Hence, the results refer to the effect of applying interventions to primary care and not to the effect of the actual content of interventions on the patients, which would require efficacy studies. Furthermore, interventions were multifaceted and analyses did not allow for the evaluation of specific parts of these interventions, for example the reattribution model alone or the effect of giving a diagnosis to the patient.
Most results were reported, or made available by the authors, as regression coefficients adjusted for baseline value and clustering. This enabled accurate and sensitive assessments of individual study effects and facilitated comparison. The review included several different outcome measures, as specific measures in relation to functional disorders are poorly defined. General measures for quality of life are usually applied together with mental health related characteristics (illness worry, depression, anxiety) rather than specific measures of functional disorders. According to a recent study, SF-36 component summaries may not be valid when applied to patients with severe functional disorders (Schroder 2012). Research related to functional syndromes often applies syndrome specific questionnaires, for example Rome III Diagnostic Questionnaires for irritable bowel syndrome (Whitehead 2010) or the Chalder fatigue scale (Chalder 1993) for chronic fatigue syndrome. Symptom specific measures have not been applied in studies unifying functional symptoms and disorders. Hence, the chosen outcome measures are not specific to functional disorders or the content of interventions and more sensitive and specific measures might reach different results. In line with this, meta-analyses of effectiveness of treatment for specific physical symptoms, for example non-cardiac chest pain, obtain higher effect sizes in contrast to therapy for multiple unexplained symptoms (reduction in chest pain RR 0.68, 95% CI 0.57 to 0.81) (Kisely 2005).
The overall effects found in this review were all very small (SMD < 0.2) and health related outcomes were without clinical significance. There was no evidence that the effect demonstrated was caused by a few patients with large effects. However, even small effects may be of importance in the primary care setting. Many patients are treated repeatedly in primary care and the prevalence of functional disorders is high. Hence, even small reductions in, for example, healthcare use for a few patients in each practice may have a large impact on the healthcare system.
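To make the SMD threshold concrete, the following is a minimal sketch of how a standardised mean difference is obtained from group summary statistics. It uses the pooled standard deviation with Hedges' small-sample correction, which is an assumption about the estimator and not a statement of the exact method used in the review. The group means, standard deviations and sample sizes are hypothetical.

```python
import math


def standardised_mean_difference(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """SMD (Hedges' g): mean difference divided by the pooled SD, bias-corrected."""
    pooled_sd = math.sqrt(
        ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2)
    )
    d = (mean_t - mean_c) / pooled_sd
    # Hedges' correction factor: 1 - 3 / (4 * df - 1), with df = n_t + n_c - 2
    correction = 1 - 3 / (4 * (n_t + n_c) - 9)
    return d * correction


# Hypothetical example: a 1.5 point difference on a scale with SD 10
g = standardised_mean_difference(46.5, 10.0, 100, 45.0, 10.0, 100)
print(f"SMD = {g:.2f}")
```

In this illustration a difference of 1.5 points on a scale with a standard deviation of 10 gives an SMD of about 0.15, below the 0.2 level conventionally labelled a small effect.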
Quality of the evidence
The studies included in the review were effectiveness studies carried out in 'real world' settings. As such they suffered from problems with blinding of GPs, recruitment bias with baseline imbalance, and relatively high levels of patient attrition (Figure 3). Other aspects of study design such as allocation concealment, blinding of patients and assessors, and reporting were generally of high quality.
In order to enable comparisons between studies, we only included those with a control group providing usual care, that is a group of GPs who were not trained. As GPs were not blinded it is possible that they showed patients greater attention; however, most studies were carried out in routine clinics with no additional time provided for GPs to manage patients differently. In several studies GPs were not aware in advance of the consultations which patients were included. Two studies encouraged GPs to conduct structured reviews and follow-up of patients and these two appeared to have the largest benefits (Blankenstein 2001; Larisch 2004).
We excluded two studies which blinded participants including GPs by the comparison of two different training programmes; they did not include a group providing usual care (Aiarzaguena 2007; Sumathipala 2008). Although these studies were of high methodological quality, they faced three main problems: 1) the training programmes had to be fairly similar in order to avoid transparency to the participants; 2) the interpretation of results lacked the possibility of a comparison to usual care in order to estimate effects of implementation in routine care; and 3) the high resource expenditures of multiple interventions.
Three of the studies analysed for the primary outcome, physical health, suffered from possible recruitment bias. Furthermore, they had baseline imbalance on parameters showing that patients in the trial groups had poorer self-rated health than control patients at the time of inclusion. The likely impact of this is unknown: it could mean a larger potential for improvement in trial patients or alternatively indicate patients with more severe and chronic disorders that are resistant to treatment.
Studies included for analysis were subject to attrition and the high attrition rate observed in some studies makes it difficult to interpret the effect and generalisability of the results. The observed attrition may be partly explained by study design (high in questionnaire studies, lower when non-responders were contacted by phone). However, attrition was slightly unbalanced between trial groups. This may be interpreted as a sign of side effects of the intervention, but could also be caused by factors such as the recruitment bias mentioned above. The effect was mainly caused by one study (Rosendal 2000).
Finally, primary care studies often include several patients per doctor, resulting in dependent measures within providers or practices and the need for statistical adjustment for this clustering effect. The included studies made appropriate adjustments of values included in analyses but the clustering generally reduces the statistical power of the analyses.
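The loss of power from clustering can be illustrated with the standard design effect, 1 + (m - 1) × ICC, where m is the average number of patients per cluster and ICC is the intracluster correlation coefficient. The sketch below is a generic illustration. The cluster size, ICC and sample size are hypothetical and are not taken from the included trials.

```python
def design_effect(mean_cluster_size, icc):
    """Variance inflation due to clustering: 1 + (m - 1) * ICC."""
    return 1 + (mean_cluster_size - 1) * icc


def effective_sample_size(n_participants, mean_cluster_size, icc):
    """Number of independent observations the clustered sample is worth."""
    return n_participants / design_effect(mean_cluster_size, icc)


# Hypothetical trial: 400 patients, 11 patients per GP on average, ICC of 0.05
deff = design_effect(11, 0.05)
n_eff = effective_sample_size(400, 11, 0.05)
print(f"design effect = {deff:.2f}, effective sample size = {n_eff:.0f}")
```

Under these assumed values, 400 clustered patients carry roughly the information of 267 independent patients, which shows how even a modest intracluster correlation reduces statistical power.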
Potential biases in the review process
We believe our review to be exhaustive as we included in our search conference proceedings and a review of protocols, and made contact with all known authors in this research field. We were able to collect unpublished data from authors from several of the included studies.
Our specification that interventions had to be delivered by front line healthcare workers was a limitation that narrowed the results of the search to very few studies. Hence, on one hand our results have a high specificity for primary care interventions and interventions unifying functional disorders, but on the other hand a large number of potentially effective interventions for patients with functional symptoms and syndromes were not evaluated. Furthermore, most of the included studies involved a rather small number of healthcare professionals. We specified analyses a priori and the low number of included studies did not allow for supplementary subgroup analyses. We did identify one post hoc observation, that the studies with more intensive and structured interventions had greater effects; however, these were smaller studies and their intensity may have predisposed to non-specific participation effects.
Agreements and disagreements with other studies or reviews
Research into the treatment of functional symptoms and disorders has increased over the last decade in general and specifically in primary care (Kleinstauber 2011; Sumathipala 2007). The latter is important as effects found in specialised care cannot be transferred directly to primary care (Raine 2002; Wearden 2010). However, in spite of the increasing interest, evidence is still sparse.
Two recent reviews focus on functional symptoms and disorders. One review examined 27 studies of short term psychotherapy for chronic multiple MUS (Kleinstauber 2011); this did not include several of the primary care studies included in our analyses (Blankenstein 2001; Rief 2005; Rosendal 2000; Toft 2000). The meta-analyses showed significant but small effects of short term psychotherapy for chronic multiple MUS with regard to symptom load, depression, functional impairment and healthcare use. A narrative review (Gask 2011) of reattribution found good evidence from six studies that the skills of reattribution can be learned by GPs; and other studies have also demonstrated positive effects on GPs' attitudes and awareness of MUS (Rosendal 2000). Hence, reattribution techniques seem to be learned, but their efficacy on patient outcome in a controlled situation has not been tested and evidence from the effectiveness studies in this review suggests that the benefits of more formal psychotherapy may not be seen from models of reattribution delivered by front line healthcare professionals.
The small effects or lack of effects found in these and our review could be caused by a number of barriers at several levels, explored in qualitative studies. For GPs barriers such as lack of skill, negative expectations and concern about dependence may hamper their treatment. Patients view doctors taking a psychological approach to their symptoms as making them uncomfortable for a variety of reasons (Chew-Graham 2011; Peters 2009) and use a range of conversational strategies to resist reattribution (Burbaum 2010). These factors may be particularly important in the context of (a) ongoing care for both physical and mental disorders, and (b) limited time for negotiation, which characterises primary health care. Finally, the diagnostic barrier is substantial as diagnoses in this field are not agreed upon and differential diagnoses towards physical disease are important in primary care (Morriss 2006; Salmon 2007b). Enhanced care has focused on reattribution and other psychosocial interventions without incorporating the uncertainty of the diagnosis in early stages. Functional symptoms have to be followed over time and an approach like watchful waiting should be incorporated into the therapeutic models for primary care (Gask 2011). Future treatment models may need to incorporate a multidimensional rather than a sequential approach (Rask 2012; Rosendal 2012).
Much research has been performed on single syndromes, mostly in secondary care. However, a review of psychosocial interventions for syndromes in specialised care showed methodological shortcomings and lack of evidence of effect on long term outcome. Psychosocial treatments were not shown to have lasting and clinically significant effects on physical functioning. Effect sizes for short term follow-up for 11 studies were moderate: 0.68 (range 0.2 to 4.01), but only published studies were included and publication bias may have been present. Clinically significant changes were found in 44% to 80% of intervention compared with 25% to 55% in pseudo-treatment patients and 0% to 32% in inactive control patients (9 of 11 studies addressed irritable bowel syndrome). Only 26% of trials reported long term follow-up (Allen 2002).
Within the past 10 years, several reviews, each with a slightly different focus, have been published. Cognitive behavioural therapy provided to patients given a diagnosis by specially trained professionals has shown an effect in several trials (Kroenke 2007; Sumathipala 2007), whereas the evidence is limited with regard to the effectiveness of CBT and psychosocial interventions provided by front line GPs seeing and caring for any patient (Huibers 2009; this Cochrane review also includes Blankenstein 2001 and Larisch 2004). A few studies exploring the effect of such interventions provided by specially trained professionals, including GPs and nurses, within the primary care setting have emerged and indicate positive effects (Escobar 2004; Huibers 2004; Lidbeck 1997; Magallon 2008; Margalit 2008; Martin 2007; McLeod 1997; Ridsdale 2001; Schilte 2001; Smith 2006; Smith 2009).
These results agree with exploratory analyses from the recent meta-analysis. In a regression analysis, the authors found that the profession of the therapist was a significant moderator of the effects. Mental health professionals seemed to be more successful than non-professionals at treating cognitions, emotions, behaviours and depressive symptoms (Kleinstauber 2011). However, a recent Cochrane review on counselling in primary care after referral from GPs found evidence for clinical effectiveness of counselling compared to usual care on mental health (primarily depression) only in the short term (one to six months) (SMD -0.28, 95% CI -0.43 to -0.13, n = 772, 6 trials) and not after long-term follow-up (Bower 2011).
There is increasing evidence, although still limited, that consultation letters involving cooperation between primary care providers and specialists result in improved physical functioning and reduced overall cost for patients with functional symptoms and disorders in primary care. The strongest effects were found on patients with somatisation disorder, whereas intervention on the more frequent subthreshold somatoform disorders showed less improvement (Hoedeman 2010).
These results point towards collaborative care as a possible way forward (Henningsen 2007). Collaborative care has recently been shown to have some effect when treating depression or anxiety (Archer 2012) but needs further development and testing in systematic trials focusing on functional somatic symptoms, and with detailed descriptions of content. So far very few trials besides those focusing on consultation letters have been conducted on functional disorders (Pols 2008; van der Feltz-Cornelis 2006).
Implications for practice
Included studies were few and they were very heterogeneous with regard to selection of patient populations and the intensity of interventions. Furthermore, analysed studies were all effectiveness studies without blinding of GPs, and several had possible recruitment bias and high patient attrition. Current evidence does not answer the question whether enhanced care delivered by front line primary care professionals has an effect or not on functional symptoms and disorders. Enhanced care may have an effect when delivered per protocol to well-defined groups of patients with functional disorders, but this needs further investigation. Attention should be paid to difficulties including limited consultation time, lack of skills, the need for a degree of diagnostic openness, and patient resistance towards psychosomatic attributions. There is some indication from this and other reviews that more intensive interventions are more successful in changing patient outcomes.
Implications for research
There is a lack of research on this topic, in particular with regard to 1) development of new interventions, for instance based on theoretical models of constructive explanations (Dowrick 2004); 2) different formats for interventions which blur the boundaries of primary and specialist care, such as GPs with special skills (Burton 2012); 3) a uniform classification of functional somatic symptoms; 4) better and more consistent outcome measures for functional somatic symptoms and disorders as a whole; 5) research on children and on older adults (all reviewed studies were on adults and only two with age limits above 65 years); and 6) use of health care and social benefits with regard to patients with FSS.
Future trials of GP training would be improved if they a) recruit patients before the randomisation of GPs or practices to prevent recruitment bias; b) achieve better blinding of patient and GP by comparing active interventions; c) take steps to improve and measure the implementation of the intervention that the clinician has been trained in, for example using treatment protocols and measuring adherence in the consultations.
We would like to thank Kurt Kroenke for his contributions to the protocol and feedback on the final review. Furthermore, we thank Morten Frydenberg for providing statistical advice on meta-analyses and Winfried Rief, Lisbeth Frostholm and Eva Oernboel for providing additional data from the original studies in Marburg and Aarhus.
CRG funding acknowledgement
The UK National Institute for Health Research (NIHR) is the largest single funder of the Cochrane Depression, Anxiety and Neurosis Group.
The views and opinions expressed herein are those of the authors and do not necessarily reflect those of the NIHR, NHS or the Department of Health.
Appendix 1. CCDAN Registers search strategy
The Cochrane Depression, Anxiety and Neurosis Review Group Specialised Register (CCDANCTR)
The Cochrane Depression, Anxiety and Neurosis Review Group (CCDAN) maintains two clinical trials registers at its editorial base in Bristol, UK: a references register and a studies-based register. The CCDANCTR-References Register contains over 31,500 reports of randomized controlled trials in depression, anxiety and neurosis. Approximately 65% of these references have been tagged to individual, coded trials. The coded trials are held in the CCDANCTR-Studies Register and records are linked between the two registers through the use of unique Study ID tags. Coding of trials is based on the EU-Psi coding manual. Please contact the CCDAN Trials Search Coordinator for further details. Reports of trials for inclusion in the Group's registers are collated from routine (weekly), generic searches of MEDLINE (1950-), EMBASE (1974-) and PsycINFO (1967-); quarterly searches of the Cochrane Central Register of Controlled Trials (CENTRAL); and review-specific searches of additional databases. Reports of trials are also sourced from international trials registers via the World Health Organisation’s trials portal (ICTRP), ClinicalTrials.gov, drug companies, the hand-searching of key journals, conference proceedings and other (non-Cochrane) systematic reviews and meta-analyses.
Details of CCDAN's generic search strategies can be found on the Group's website.
The CCDANCTR-Studies Register was searched using the following terms:
Treatment setting = “general practice” or “family practice” or “primary care”
Diagnosis = “medically unexplained” or “frequent attend*” or “high util*” or somat* or neurasthen* or hypochondria* or hysteri* or pain or “chronic fatigue”
The CCDANCTR-References Register was searched to find additional untagged/uncoded references using the following terms:
Free-text = “general practi*” or “family practi*” or “primary care” or “primary health*” or (physician* and family) or “primary medical care” or “health practitioner*” or “doctor patient relation*” or Title/Abstract = GP*
Free-text = “medically unexplained” or “frequent attend*” or “high util*” or somat* or neurasthen* or hypochondria* or hysteri* or “chronic fatigue” or “unexplained physical symptoms”
The CCDANCTR was searched on 13 August 2012.
Appendix 2. Search strategies
A native database search was done in Ovid Medline, Embase and PsycINFO in 2009 and updated on 13 September 2012 as stated below.
Supplementary searches (using similar, but translated terms) were conducted in CINAHL, PSYNDEX, SIGLE and LILACS (April 2010).
These searches yielded no supplementary studies to those identified by the Cochrane databases.
- SOMATOFORM DISORDER/ or NEURASTHENIA/ or HYPOCHONDRIASIS/
- NEUROCIRCULATORY ASTHENIA/
- (somatoform or somati#ation or somati#ing or somati#ed or somatic symptom$ or somatic syndrome$ or symptom syndrome$ or multisomat$ or neurastheni$ or hypochondria$).ti,ab.
- ((medic$ adj3 (unexplain$ or inexplic$)) or unexplained symptom$).ti,ab.
- (((frequent or high) adj1 attend$) or high utili#er$ or repeat$ present$).ti,ab.
- functional symptoms.ti,ab.
- exp ABDOMINAL PAIN/
- stomach ache$.ti,ab.
- exp BACK PAIN/
- COLONIC DISEASES, FUNCTIONAL/
- CYSTITIS, INTERSTITIAL/
- painful bladder syndrome.ti,ab.
- urethral syndrome.ti,ab.
- cardiac neuros$.ti,ab.
- ((non cardiac or noncardiac or non-cardiac) adj chest pain).ti,ab.
- ((nonorganic or non organic or non-organic) adj pain).ti,ab.
- effort syndrome.ti,ab.
- FATIGUE SYNDROME, CHRONIC/
- myalgic encephalomyel$.ti,ab.
- ((post viral or postviral or post-viral) adj (fatigue or syndrome)).ti,ab.
- exp HEADACHE/
- exp HEADACHE DISORDERS/
- exp HYPERVENTILATION/
- exp HYSTERIA/
- Briquet's syndrome.ti,ab.
- IRRITABLE BOWEL SYNDROME/
- MULTIPLE CHEMICAL SENSITIVITY/
- exp PELVIC PAIN/
- exp PREMENSTRUAL SYNDROME/
- PSYCHOPHYSIOLOGIC DISORDERS/
- (psychalgia or psychogenic or psychoseizure$ or psychosomatic).ti,ab.
- TEMPOROMANDIBULAR JOINT DYSFUNCTION SYNDROME/
- exp PRIMARY HEALTHCARE/
- PHYSICIANS, FAMILY/
- FAMILY PRACTICE/
- FAMILY HEALTHCARE/
- NURSE PRACTITIONERS/
- ((family or community) adj (medic$ or doctor$ or physician$ or nurs$ or health)).ti,ab.
- ((general or family or nurs$) adj1 (practice$ or practitioner$)).ti,ab.
- (primary care or primary healthcare or primary health care or primary health service$ or homecare or care in the community).ti,ab.
- (GP$ or generalist$).ti,ab.
- randomized controlled trial.pt.
- controlled clinical trial.pt.
- exp Clinical Trials as Topic/
- (animals not (humans and animals)).sh.
- 54 not 55
- 36 and 46 and 56
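The final two lines of the strategy combine earlier result sets by number. As a minimal sketch of that Boolean logic, the Python below treats each result set as a set of record identifiers. The function name, the set names and the record identifiers are hypothetical, and the reading of set 36 as the condition terms, set 46 as the primary care terms and sets 54 and 55 as the trial terms and the animal-only filter is an assumption based on the order of the terms listed above.

```python
def combine_search_sets(condition, setting, trials, animal_only):
    """Mirror '54 not 55' followed by '36 and 46 and 56' using set operations."""
    # "54 not 55": keep trial records that are not animal-only studies
    human_trials = trials - animal_only
    # "36 and 46 and 56": a record must appear in all three sets
    return condition & setting & human_trials


# Hypothetical record identifiers, for illustration only
condition = {"r1", "r2", "r3", "r5"}    # assumed set 36: condition terms
setting = {"r2", "r3", "r4", "r5"}      # assumed set 46: primary care terms
trials = {"r1", "r2", "r3", "r5"}       # assumed set 54: trial terms
animal_only = {"r3"}                    # assumed set 55: animal-only records

hits = combine_search_sets(condition, setting, trials, animal_only)
print(sorted(hits))  # ['r2', 'r5']
```

A record is retrieved only if it matches a condition term and a primary care term and is indexed as a trial that is not restricted to animals.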
International Clinical Trials Registries were searched (Clinicaltrials.gov and the ICTRP (apps.who.int/trialsearch)) using the following terms:

Somatisation OR somatization
Contributions of authors
Marianne Rosendal and Chris Burton wrote the protocol draft; the other authors contributed critical feedback and discussion of methods. All authors accepted the final version of the protocol.
Chris Burton and Marianne Rosendal conducted the literature search, evaluated papers for inclusion and quality of studies, extracted and entered data, performed analyses and wrote the review draft. Annette H Blankenstein provided additional data from the original study in Amsterdam and extracted data. Michael Sharpe evaluated studies for inclusion when disagreements occurred. All authors contributed critical feedback on the review and accepted the final version of the review.
Declarations of interest
Marianne Rosendal has been actively participating in the evaluation of RCTs cited in this review, and involved in the development of treatment guidelines for Danish primary care. The working hours spent on this Cochrane review have been part of her standard employment at the Research Unit for General Practice. No other conflicts of interest known.
Annette H Blankenstein has been the primary researcher in one of the RCTs on reattribution cited in this review. She has contributed to multidisciplinary and GP guidelines on MUS and somatoform disorders and she developed a training course for GPs on cognitive behavioural treatment for MUS. No other conflicts of interest known.
Richard Morriss has been the chief investigator of one of the RCTs cited in this review as well as a previous non-randomised treatment trial. No other conflicts of interest known.
Per Fink has been involved in RCTs on treatment of medically unexplained symptoms in primary care cited in this review, and in the development of treatment guidelines for Danish primary care. No other conflicts of interest known.
Michael Sharpe: no conflicts of interest known.
Chris Burton: no conflicts of interest known.
Sources of support
- Research Unit for General Practice, Århus, Denmark.
- Community Health Sciences, General Practice Section, University of Edinburgh, UK.
- Department of General Practice and Elderly Care Medicine, VU University Medical Center, Amsterdam, Netherlands.
- The Research Clinic for Functional Disorders and Psychosomatics, Århus University Hospital, Denmark.
- Department of Medicine and Regenstrief Institute, Indiana University, USA.
- School of Molecular & Clinical Medicine, University of Edinburgh, UK.
- Department of Biostatistics, Institute of Public Health, University of Århus, Denmark.
- Department of Psychiatry, University of Nottingham, UK.
- No sources of support supplied
Differences between protocol and review
The review has been performed in accordance with the protocol but with limitations, as only a few studies were included and original data could not be retrieved. Clarifications with regard to inclusion criteria and outcome measures have been made. Sensitivity analysis and subgroup analyses were planned but not performed, as very few studies could be analysed. We originally planned to categorise outcomes as short term (0 to 5 months), medium term (6 to 11 months) and longer term (12 or more months), but because of the relatively small number of studies we merged the latter two. The protocol stated that meta-analysis would use a fixed-effect model and did not specify the circumstances in which meta-analysis would be appropriate. In view of the variation in intervention intensity and assumed illness severity between studies, we chose to use a random-effects model meta-analysis with an exclusion threshold of heterogeneity of I
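To illustrate the difference between the fixed-effect model named in the protocol and a random-effects model, the following is a minimal Python sketch of inverse-variance pooling with the DerSimonian-Laird estimate of between-study variance, together with Cochran's Q and the I² statistic. The DerSimonian-Laird estimator is assumed here for illustration and is not confirmed as the exact method used in the review; the function name and the effect sizes and variances are hypothetical and are not data from the included studies.

```python
import math


def random_effects_meta(effects, variances):
    """Pool study effects with a DerSimonian-Laird random-effects model.

    effects:   per-study effect estimates (for example, SMDs)
    variances: per-study sampling variances of those estimates
    """
    k = len(effects)
    w = [1.0 / v for v in variances]                 # fixed-effect weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed-effect estimate
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1

    # DerSimonian-Laird between-study variance, truncated at zero
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau2 to each study's variance
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))

    # I-squared: percentage of variability attributed to heterogeneity
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    return {
        "fixed": fixed,
        "pooled": pooled,
        "se": se,
        "ci95": (pooled - 1.96 * se, pooled + 1.96 * se),
        "tau2": tau2,
        "Q": q,
        "I2": i2,
    }


# Hypothetical effect sizes and variances, for illustration only
result = random_effects_meta([0.0, -0.5, -1.0], [0.01, 0.01, 0.01])
print(result["pooled"], result["tau2"], result["I2"])
```

When the studies disagree more than their sampling error would predict, the between-study variance is positive, the weights become more equal across studies and the confidence interval widens relative to the fixed-effect result. A heterogeneity threshold on I² could then be applied to decide whether pooling is appropriate.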