Criteria for considering studies for this review
Types of studies
The International Headache Society (IHS) has provided a useful document setting out guidelines for the conduct of clinical trials in migraine, to which current investigators are encouraged to adhere (Tfelt-Hansen 2012). This document was not used as the sole basis for considering studies in this review, as too many potentially informative past studies would likely have been excluded on methodological grounds. However, many of its recommendations have been used as a basis for what follows.
Included studies were required to be prospective, controlled trials of self-administered valproate (valproic acid or sodium valproate or a combination of the two) taken regularly to prevent the occurrence of migraine attacks, to improve migraine-related quality of life, or both. We included trials only if allocation to treatment groups was randomised or pseudo-randomised (based on some non-random process unrelated to the treatment selection or expected response). Blinding was not required. We excluded concurrent cohort comparisons and other non-experimental designs.
Types of participants
Study participants were required to be adults (at least 16 years of age) and to meet reasonable criteria designed to distinguish migraine from tension-type headache. If patients with both types of headache were included in a trial, results were required to be stratified by headache diagnosis. We did not require the use of a specific set of diagnostic criteria (eg, Ad Hoc Cttee 1962; IHS Cttee 1988; ICHD-II 2004), but migraine diagnoses had to be based on at least some of the distinctive features of migraine, eg, nausea/vomiting, severe head pain, throbbing character, unilateral location, phono/photophobia, or aura. Secondary headache disorders had to be excluded using reasonable criteria.
We anticipated that some of the trials identified would include patients described as having mixed migraine and tension-type headaches or combination headaches, and the protocol for this review described detailed procedures for dealing with such trials. In the end, no such precautions were necessary. We excluded studies evaluating treatments for chronic daily headache, chronic migraine, and transformed migraine. The reasons for this are: (a) the definition of chronic migraine is still heavily debated, and a revision of the 2004 IHS criteria for this condition has been proposed (Olesen 2006); (b) transformed migraine and chronic daily headache, although commonly used terms, are insufficiently validated diagnoses; (c) the separation of these conditions from headache due to medication overuse is unclear in many studies; and (d) there is some evidence that chronic migraine may be more refractory to standard prophylactic treatment than episodic migraine. We explicitly excluded trials and treatment groups including only patients with tension-type headache.
Types of interventions
Included studies were required to have at least one arm in which valproate (valproic acid or sodium valproate or a combination of the two, without concomitant use of other migraine prophylactic treatment) was given regularly during headache-free intervals with the aim of preventing the occurrence of migraine attacks, improving migraine-related quality of life, or both. Acceptable comparator groups included placebo, no intervention, active drug treatment (ie, with proven efficacy, not experimental), the same drug treatment with a clinically relevant different dose, and non-pharmacological therapies with proven efficacy in migraine. The analysis included only drugs and dosages that are commercially available.
We recorded any data reported on treatment compliance in the Characteristics of included studies table. After examination of these data, it did not seem necessary to stratify the analysis by compliance.
We anticipated that most trials would permit the use of medication for acute migraine attacks experienced during the trial period. We therefore recorded descriptions of trial rules concerning the use of acute medication in the Characteristics of included studies table whenever such information was provided. We did not otherwise model or adjust for this factor in our analysis.
Types of outcome measures
We collected and analysed trial data on headache frequency, responders (patients with ≥ 50% reduction in headache frequency), quality of life, and adverse events.
Search methods for identification of studies
Search strategies used in our earlier review (Chronicle 2004; Mulleners 2008) are detailed in Appendix 1 (last search date 31 December 2005). For the present update, trained information specialists developed detailed search strategies for each database searched (Appendix 2). The new searches overlapped the old searches by a full year to ensure complete coverage. The last search date for all updated searches was 15 January 2013.
Databases searched for this update were:
Cochrane Central Register of Controlled Trials (CENTRAL; The Cochrane Library 2012, Issue 12; years searched = 2005 to 2012);
MEDLINE (via OVID), 2005 to 15 January 2013;
MEDLINE In-Process (via OVID), current week, 15 January 2013;
EMBASE (via OVID), 2005 to 15 January 2013.
Additional strategies for identifying trials included searching the reference lists of review articles and included studies, searching books related to headache, and consulting experts in the field. We attempted to identify all relevant published trials, irrespective of language. We handsearched two journals, Headache and Cephalalgia, in their entirety through January 2013.
Data collection and analysis
Selection of studies
Two of us independently screened titles and abstracts of studies identified by the literature search for eligibility. Papers that could not be excluded with certainty on the basis of information contained in the title and/or abstract were retrieved in full for screening. Disagreements were resolved through discussion. We retrieved papers passing this initial screening process, and two of us independently reviewed the full texts. Disagreements at the full-text stage were resolved through internal discussion and, in a few cases, through correspondence with members of the editorial staff of the Cochrane Pain, Palliative and Supportive Care Review Group. We were not blinded to study investigators' names and institutions, journal of publication, or study results at any stage of the review.
The search strategy described above identified a large number of short conference and journal abstracts. The majority of these either (a) reported partial results of ongoing trials; (b) provided insufficient information on trial design or results; (c) were early reports of included studies; or (d) were reproductions of abstracts of papers published in full (for example, the journal Headache reproduces abstracts of interest to readers, and these are found by PubMed). We agreed that short abstracts of this kind would be excluded from consideration.
Data extraction and management
Two of us independently abstracted information on patients, methods, interventions, efficacy outcomes, and adverse events from the original reports onto specially designed, pre-tested paper forms. Disagreements were again resolved through discussion.
We anticipated that trials would vary in length, that outcomes would be measured over various units of time (eg, number of attacks per two weeks versus number of attacks per four weeks), and that results would be reported for numerous different time points (eg, four-week headache frequency at two months versus at four months). We attempted to standardise the unit of time over which headache frequency was measured at 28 days (four weeks) wherever possible. We recorded outcomes beginning four weeks after the start of treatment and continued through all later assessment periods. We made decisions about which time points to include in the final analysis once the data had been collected.
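The rescaling to a 28-day period can be illustrated with a minimal sketch. It assumes simple proportional scaling of a reported mean count, which is an assumption made here for illustration and not a procedure stated in the review; the function name `per_28_days` is hypothetical.

```python
def per_28_days(attacks: float, period_days: float) -> float:
    """Rescale a headache count observed over `period_days` to a 28-day period.

    Assumes attacks occur at a constant rate, so the count scales
    proportionally with the length of the observation window.
    """
    if period_days <= 0:
        raise ValueError("period_days must be positive")
    return attacks * 28.0 / period_days


# Hypothetical example: 3 attacks per two weeks and 6 attacks per four weeks
# describe the same underlying rate once both are expressed per 28 days.
two_week_rate = per_28_days(3, 14)
four_week_rate = per_28_days(6, 28)
print(two_week_rate, four_week_rate)
```

Proportional scaling is only reasonable for means; how the associated variances should be rescaled depends on distributional assumptions that this sketch does not address.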
We anticipated that outcomes measured on a continuous scale (eg, headache frequency) would be reported in a variety of ways, eg, as mean pre-treatment, post-treatment, and/or change scores. Among change scores, we preferred the mean of within-patient changes (from baseline to on-treatment in a parallel-group trial) over the change in group means because the former both results in a lower variance (taking into account the correlation between baseline and post-treatment scores in each patient) and adjusts for imbalances in baseline headache frequencies, while the latter has only the second advantage. When neither type of change score was reported, we compared post-treatment means between groups, assuming that baseline data would be balanced due to randomisation. We anticipated that many trials would report group means, without reporting data on the variance associated with these means. In such cases, we attempted to calculate or estimate variances based on primary data, test statistics, and/or error bars in graphs.
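The variance argument above, and the recovery of a standard deviation (SD) from other reported statistics, can be sketched with standard textbook formulas. This is an illustration only: the review does not state which formulas were applied in any given trial, the function names are hypothetical, and the baseline/post-treatment correlation `r` is seldom reported and would normally have to be assumed.

```python
import math


def sd_of_change(sd_baseline: float, sd_post: float, r: float) -> float:
    """SD of within-patient change scores, given the correlation r between
    baseline and post-treatment values. A larger r gives a smaller SD."""
    return math.sqrt(sd_baseline**2 + sd_post**2 - 2 * r * sd_baseline * sd_post)


def sd_from_se(se: float, n: int) -> float:
    """Recover the SD of a group from the standard error of its mean."""
    return se * math.sqrt(n)


def sd_from_ci95(lower: float, upper: float, n: int) -> float:
    """Recover the SD from a 95% CI (or error bars) around a group mean,
    using the large-sample normal approximation (CI width = 2 * 1.96 * SE)."""
    return math.sqrt(n) * (upper - lower) / (2 * 1.96)


# Hypothetical values: with SDs of 2 at both time points, a correlation of 0.5
# gives a change-score SD of 2, against about 2.83 if r were 0.
print(sd_of_change(2.0, 2.0, 0.5), sd_of_change(2.0, 2.0, 0.0))
```

For small groups, a t-distribution multiplier would be more appropriate than 1.96 when working back from a confidence interval.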
When efficacy outcomes were reported in dichotomous form (success/failure), we required that the threshold for distinguishing between treatment success and failure be clinically significant; for example, we interpreted a ≥ 50% reduction in headache frequency as meeting this criterion. In such cases, we recorded, for each treatment arm, the number of patients included in the analysis and the number with each outcome.
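As an illustration of the ≥ 50% threshold, the following sketch classifies individual patients and tallies one treatment arm. It assumes patient-level baseline and on-treatment frequencies are available, which was generally not the case for the included trials (we extracted the counts as reported); the function names and data are hypothetical.

```python
def is_responder(baseline_freq: float, treatment_freq: float,
                 threshold: float = 0.5) -> bool:
    """True if headache frequency fell by at least `threshold` (a proportion)
    relative to baseline."""
    if baseline_freq <= 0:
        raise ValueError("baseline frequency must be positive")
    reduction = (baseline_freq - treatment_freq) / baseline_freq
    return reduction >= threshold


def tally_arm(patients):
    """Return (number analysed, number of responders) for one treatment arm.

    `patients` is an iterable of (baseline, on_treatment) frequency pairs.
    """
    patients = list(patients)
    responders = sum(is_responder(b, t) for b, t in patients)
    return len(patients), responders


# Hypothetical arm of four patients (attacks per 28 days: baseline, on treatment)
arm = [(6, 3), (6, 4), (4, 1), (5, 5)]
print(tally_arm(arm))
```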
The protocol for this review specified rules for dealing with outcome data reported on an ordinal scale (eg, for reduction in headache frequency: 0%, 1% to 24%, 25% to 49%, 50% to 74%, 75% to 99%, 100%) but, in fact, none of the included trials reported ordinal data for outcomes of interest.
We envisaged that the preferred methods of collecting and presenting data on quality of life would most likely be the Migraine-Specific Questionnaire (MSQ) and the Medical Outcomes Study 36-item Short-Form Health Survey (SF-36). However, other instruments and other types of outcomes related to quality of life (eg, work absenteeism) were not excluded a priori, and these data were kept under review before specifying rules for analysing outcome data in this domain.
We recorded the proportion of patients reporting adverse events for each treatment arm wherever possible. The identity and rates of specific adverse events were also recorded. We anticipated that reporting of adverse events would vary greatly across trials with regard to the terminology used, method of ascertainment, and classification of adverse events as drug-related or not and as severe or not.
Assessment of risk of bias in included studies
We completed a 'Risk of bias' table for each study, using assessments of random sequence generation (selection bias), allocation concealment (selection bias), blinding of participants and personnel (performance bias), blinding of outcome assessment (detection bias), incomplete outcome data (attrition bias), and selective reporting (reporting bias). For new studies identified in the present update, two of us completed this assessment independently; for older studies, one of us performed the assessment and a second author reviewed and commented on it. Disagreements were resolved through discussion.
We also assessed the methodological quality of individual trials using the scale devised by Jadad and colleagues (Jadad 1996), operationalised as follows:
Was the study described as randomised? (1 = yes; 0 = no)
Was the method of randomisation well described and adequate? (0 = not described; 1 = described and adequate; -1 = described, but not adequate)
Was the study described as double-blind? (1 = yes; 0 = no)
Was the method of double-blinding well described and adequate? (0 = not described; 1 = described and adequate; -1 = described, but not adequate)
Was there a description of withdrawals and dropouts sufficient to determine the number of patients in each treatment group entering and completing the trial? (1 = yes; 0 = no)
Each trial thus received a score of 0 to 5 points, with higher scores indicating higher quality in the conduct or reporting of the trial. Two review authors scored the studies independently, and a consensus score was then arrived at through discussion. The consensus score is reported for each study in the Characteristics of included studies table; it was not used as a weighting in statistical analyses.
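The scoring rules above can be written out as a short sketch. One assumption is made explicit here: the points for the method of randomisation or blinding are applied only when the study was described as randomised or double-blind, respectively, which keeps the total within the stated range of 0 to 5. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class JadadAnswers:
    described_as_randomised: bool
    # None = method not described; True = described and adequate;
    # False = described, but not adequate
    randomisation_adequate: Optional[bool]
    described_as_double_blind: bool
    blinding_adequate: Optional[bool]
    withdrawals_described: bool


def _method_points(described: bool, adequate: Optional[bool]) -> int:
    """+1 / 0 / -1 for the method item; assumed to count only when the
    parent item (randomised or double-blind) was answered yes."""
    if not described or adequate is None:
        return 0
    return 1 if adequate else -1


def jadad_score(a: JadadAnswers) -> int:
    """Sum the five items listed above into a 0 to 5 score."""
    return (
        int(a.described_as_randomised)
        + _method_points(a.described_as_randomised, a.randomisation_adequate)
        + int(a.described_as_double_blind)
        + _method_points(a.described_as_double_blind, a.blinding_adequate)
        + int(a.withdrawals_described)
    )


# Hypothetical trial: randomised (method not described), double-blind with an
# adequate method, withdrawals described.
print(jadad_score(JadadAnswers(True, None, True, True, True)))
```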
Measures of treatment effect
The primary outcome considered for the efficacy analysis was headache frequency. Among headache frequency measures, we preferred number of migraine attacks to number of days with migraine. The latter measure confusingly incorporates attack duration into the measure of headache frequency. Moreover, attack duration is affected by the use of symptomatic medication, which is permitted in most trials. We also analysed headache frequency in terms of a responder rate, or the proportion of patients with a ≥ 50% reduction in headache frequency from pre- to post-treatment.
As noted above (Data extraction and management), we kept patient-reported quality of life data under review as studies were selected. There were no quality of life data available for rigorous analysis, but one study (Afshari 2012) reported Migraine Disability Assessment (MIDAS) scores.
The analysis considered only outcome data obtained directly from the patient and not those judged by the treating physician or study personnel. Efficacy data based on contemporaneous and timed (usually daily) recording of headache symptoms were preferred to those based on global or retrospective assessments.
In addition, we tabulated adverse events for each included study.
Unit of analysis issues
In the case of cross-over trial designs, we anticipated that the data reported would normally not permit analysis of paired within-patient data. We therefore analysed cross-over trials as if they were parallel-group trials, combining data from all treatment periods. If a carry-over effect was found and data were reported by period, then the analysis was restricted to period-one data only. In no trial were complete within-patient data reported, so within-patient improvement scores were not calculated.
Dealing with missing data
Where data were missing or inadequate, we attempted to obtain these data by correspondence with study authors.
Assessment of heterogeneity
We tested estimates of efficacy (both mean differences (MDs) and odds ratios (ORs)) for homogeneity. When significant heterogeneity was present, we made an attempt to explain the differences based on the clinical characteristics of the included studies. We did not statistically combine studies that were clinically dissimilar. However, when a group of studies with statistically heterogeneous results appeared to be clinically similar, we did combine study estimates. We performed all pooled analyses using a random-effects model.
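For readers unfamiliar with these calculations, the sketch below shows one common way of testing homogeneity and pooling under a random-effects model. It assumes the DerSimonian-Laird estimator of between-study variance, which is an assumption made for illustration, as the text above does not name the estimator. Inputs are per-study effect estimates (MDs, or ORs on the log scale) and their variances; all names and numbers are hypothetical.

```python
import math


def random_effects_pool(estimates, variances):
    """Inverse-variance pooling with a DerSimonian-Laird tau^2.

    Returns Cochran's Q, its degrees of freedom, I^2 (%), tau^2, the pooled
    estimate and its standard error.
    """
    k = len(estimates)
    w = [1.0 / v for v in variances]              # fixed-effect weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    df = k - 1
    c = sum_w - sum(wi**2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, estimates)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {"Q": q, "df": df, "I2": i2, "tau2": tau2, "pooled": pooled, "se": se}


# Hypothetical example: two studies with equal variances and estimates 0 and 2
result = random_effects_pool([0.0, 2.0], [1.0, 1.0])
print(result)
```

When the studies agree closely, tau² is estimated as zero and the random-effects result coincides with the fixed-effect result.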
As a sensitivity analysis, we also planned to calculate a pooled effect estimate using a fixed-effect model for major outcomes (headache frequency, responder rate, and any adverse event) when the random-effects result was near-significant (0.05 ≤ P ≤ 0.15) and the pooled studies were homogeneous (heterogeneity statistics: P > 0.15/I² < 30%). Such a sensitivity analysis would evaluate whether conclusions might differ based on the statistical model used for pooling in situations where a fixed-effect model might reasonably be considered instead of a random-effects model. In fact, however, no such sensitivity analyses were warranted in the present review.
We anticipated that continuous outcome measures of headache frequency would be reported on different and often incompatible scales. Although we attempted to standardise the extraction of headache frequency data to a 28-day (four-week) period, this was not possible in every case. In our previous review (Chronicle 2004; Mulleners 2008), we therefore analysed these data using the standardised mean difference (SMD, with 95% confidence intervals (CIs)) rather than the mean difference (MD). The introduction of change scores in the newly included studies for some of the reviews in this series necessitated a change in the analysis plan from SMDs to MDs. The latter also has the advantage of giving a result in clinically meaningful units (ie, x fewer migraines per 28 days).
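The practical difference between the two measures can be seen in a small sketch for a single two-arm comparison. The formulas are the standard large-sample ones; the SMD shown is the uncorrected version (no small-sample adjustment), and the function names and numbers are hypothetical.

```python
import math


def mean_difference(m_t, sd_t, n_t, m_c, sd_c, n_c):
    """MD (treatment minus control) with a normal-approximation 95% CI.
    The result is in the outcome's own units, eg, attacks per 28 days."""
    md = m_t - m_c
    se = math.sqrt(sd_t**2 / n_t + sd_c**2 / n_c)
    return md, (md - 1.96 * se, md + 1.96 * se)


def standardised_mean_difference(m_t, sd_t, n_t, m_c, sd_c, n_c):
    """SMD: the same difference expressed in pooled-SD units, which is
    unit-free and therefore harder to interpret clinically."""
    pooled_sd = math.sqrt(
        ((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / (n_t + n_c - 2)
    )
    return (m_t - m_c) / pooled_sd


# Hypothetical arms: 3 vs 4 attacks per 28 days, SD 2, 50 patients each
md, ci = mean_difference(3.0, 2.0, 50, 4.0, 2.0, 50)
smd = standardised_mean_difference(3.0, 2.0, 50, 4.0, 2.0, 50)
print(md, ci, smd)
```

In this hypothetical case the MD reads directly as one fewer attack per 28 days, whereas the SMD of half a standard deviation has no direct clinical unit.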
We used dichotomous data meeting our definition of a clinically significant threshold to calculate odds ratios (ORs), with 95% CIs. Although we prefer ORs because of their statistical properties, some readers may find it simpler to interpret the clinical significance of our findings using risk ratios (RRs); we have therefore calculated RRs where appropriate. We additionally computed numbers needed to treat (NNTs), with 95% CIs, as the reciprocal of the risk difference (RD) versus placebo (McQuay 1998).
In the same way, we used data on the proportion of patients reporting adverse events to calculate RDs and numbers needed to harm (NNHs).
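The effect measures described above for dichotomous outcomes can be sketched as follows. The confidence-interval formulas are the usual large-sample approximations and are given for illustration; they are not necessarily identical to those implemented in the software used for the review. The function names and counts are hypothetical, and the sketch does not handle zero cells.

```python
import math

Z95 = 1.96  # normal quantile for a two-sided 95% CI


def effect_measures(events_t, n_t, events_c, n_c):
    """OR, RR and RD (treatment versus control), each with a 95% CI."""
    a, b = events_t, n_t - events_t
    c, d = events_c, n_c - events_c
    if min(a, b, c, d) <= 0:
        raise ValueError("this sketch does not handle zero cells")
    p_t, p_c = a / n_t, c / n_c

    log_or = math.log((a * d) / (b * c))
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_rr = math.log(p_t / p_c)
    se_log_rr = math.sqrt(1 / a - 1 / n_t + 1 / c - 1 / n_c)
    rd = p_t - p_c
    se_rd = math.sqrt(p_t * (1 - p_t) / n_t + p_c * (1 - p_c) / n_c)

    def exp_ci(log_est, se):
        return math.exp(log_est - Z95 * se), math.exp(log_est + Z95 * se)

    return {
        "OR": math.exp(log_or), "OR_ci": exp_ci(log_or, se_log_or),
        "RR": math.exp(log_rr), "RR_ci": exp_ci(log_rr, se_log_rr),
        "RD": rd, "RD_ci": (rd - Z95 * se_rd, rd + Z95 * se_rd),
    }


def number_needed(rd: float) -> float:
    """NNT (efficacy) or NNH (adverse events): reciprocal of the absolute RD."""
    if rd == 0:
        raise ValueError("RD is zero; NNT/NNH is undefined")
    return 1.0 / abs(rd)


# Hypothetical responder counts: 40/100 on active treatment, 20/100 on placebo
res = effect_measures(40, 100, 20, 100)
print(res["OR"], res["RR"], res["RD"], number_needed(res["RD"]))
```

Confidence limits for an NNT or NNH are obtained by taking reciprocals of the RD limits, which is meaningful only when the RD interval excludes zero.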
Subgroup analysis and investigation of heterogeneity
We undertook subgroup analyses by dose where possible. We considered further subgroup analyses by method of randomisation and by completeness of blinding, but did not undertake them because of insufficient data.