A review and proposal for a core set of factors for prospective cohorts in low back pain: A consensus statement



The Multinational Musculoskeletal Inception Cohort Study (MMICS) Statement is a consensus statement aimed at improving the quality of prospective investigations into the transition from early stages of low back pain (LBP) to persistent problems. The statement aims to help improve the quality of such studies by recommending an agreed minimal list of measures for inclusion in baseline data collection. The MMICS Statement is primarily aimed at researchers who want to investigate prognosis in LBP, and this will allow data from cohorts to be pooled and will facilitate comparisons between different health care systems.

One approach to preventing acute new episodes of LBP (up to 3 weeks from onset) from developing into persistent disabling pain is to identify those individuals with LBP who are most likely to progress to chronic disability. Targeting interventions at those at highest risk could reduce the population burden of chronic LBP. A wide range of baseline parameters have been associated with poor outcome in inception cohort studies (1–6). Few existing studies have been of sufficient size and methodologic rigor to produce conclusive findings. Even in methodologically robust studies, baseline factors only account for a small proportion of the variance in outcome (7), typically around 30%. Systematic reviews of the literature could not pool data because studies used different measurements (1–3). Despite some information on physical, psychosocial, and work-related risk factors, it has not been possible to adequately estimate the comparative impact of individual psychosocial and societal factors on the transition from acute to persistent disabling LBP.

How to address this problem was one focus of the VI International Forum for Primary Care Research in Low Back Pain held in April 2003. A steering group for the collaboration was appointed, and 8 national team leaders volunteered to recruit teams of national experts. Three additional team leaders were recruited by the steering group (Australia, France, and Germany). Independent experts were invited to advise on the quality of the MMICS process. The MMICS steering group, located in the UK, met regularly and included expert researchers in clinical and outcome factors in back pain (AB), work-related issues in back pain (AKB), psychosocial aspects of back pain (TP and RS), and general practice aspects of back pain (MU).

The MMICS Statement set out to include 1) a minimal but comprehensive number of predictor factors based on current evidence and theory; 2) appropriate measurement instruments for agreed predictor factors based on their clinimetric properties, availability, and practical characteristics; and 3) a minimum set of followup measures, including recommendations about measurement and timing.

Materials and Methods

The development of the consensus protocol included 4 iterative stages.

Generation of factors for consideration.

First, the original checklists for factors at baseline and at outcome were generated by the steering group, through discussion of current evidence, with an emphasis on published systematic reviews (3, 8), and were informed by the list generated in a dedicated workshop at the VI International Forum for Primary Care Research in Low Back Pain. Second, team leaders were contacted and asked to select their team members, with a recommendation for the inclusion of clinical researchers, health services experts, and epidemiologists. Third, national teams coded items in the checklists as include, exclude, or do not know. Teams were asked to generate new items if they thought the list was incomplete. Teams were asked to use 3 considerations: evidence, theory, and practicality (Figure 1).

Figure 1. The process of selecting factors for the core statement. MMICS = Multinational Musculoskeletal Inception Cohort Study.

Consensus for inclusion and exclusion.

First, the cut point for agreement was set at two-thirds of the responding teams agreeing (Table 1). If a factor received at least 3 nominations for both inclusion and exclusion, it was placed in the controversial/disagreement list (Table 2). Second, teams produced responses to new items and to controversial factors, or to factors with high numbers of “do not know” responses. Third, the steering group synthesized the factors into 4 lists, based on agreement of two-thirds of responders. These lists comprised baseline factors, followup factors, controversial factors for which no consensus was achieved, and rejected factors. Finally, the factor lists and descriptions of process were sent to independent expert advisors for comments.
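The decision rules described above can be illustrated with a short sketch. This is not the procedure the steering group actually ran: the function name `classify_factor`, the choice to count "do not know" answers as responses in the denominator, the order in which the rules are checked, and the `unresolved` label are all assumptions made for the example.

```python
def classify_factor(include: int, exclude: int, dont_know: int = 0) -> str:
    """Classify one candidate factor from national team votes (illustrative only)."""
    responded = include + exclude + dont_know  # assumption: every answer counts as a response
    if responded == 0:
        raise ValueError("at least one team must respond")
    # Two-thirds cut point, written in integer arithmetic to avoid rounding issues.
    if 3 * include >= 2 * responded:
        return "include"
    if 3 * exclude >= 2 * responded:
        return "exclude"
    # At least 3 nominations on each side: controversial/disagreement list.
    if include >= 3 and exclude >= 3:
        return "controversial"
    return "unresolved"  # assumption: such factors would return to the teams


print(classify_factor(8, 1, 2))  # 8 of 11 responding teams agree
print(classify_factor(6, 4))     # 6 for inclusion, 4 for exclusion
```

Under these assumptions, a factor with 6 nominations for inclusion and 4 for exclusion (the pattern reported for drug consumption in Table 2) lands in the controversial list, because neither side reaches two-thirds.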

Table 1. Core predictors for baseline measurement*

| Factors | No. of teams agreed/teams responded | Information on measurement | Measurement selected | Description |
| --- | --- | --- | --- | --- |
| Disability | 11/11 | A, B, C | RDQ | 24 items |
| Leg pain below knee | 9/10 | C | Yes/no | |
| Pain intensity | 10/11 | A, B, C | NPS | In the past week, how bothersome have the following symptoms been? a) low back pain; b) leg pain (0–10, where 0 = no pain, 10 = worst pain imaginable) |
| Pain duration of current episode | 11/11 | A, B, C | | Weeks since current onset (including <1 week) where current onset is defined as a new episode of pain, having been pain free for at least 6 months |
| Duration on current benefits | 11/11 | C | Days | Days |
| Other health-related insurance | 8/8 | C | Yes/no | |
| Pending compensation | 10/11 | C | Yes/no | |
| Sickness benefit | 10/10 | C | Yes/no | |
| Alcohol consumption | 8/11 | B, C | AUDIT-C | 3 questions |
| Exercise | 10/11 | B, C | IPAQ-SF | 7 items, categorized as low/medium/high |
| Smoking | 10/11 | B, C | Pack-years | Average number of cigarettes per day × years smoking |
| Catastrophizing | 7/9 | B | PCS | 13 items, 0–4 Likert scale response |
| Depression/distress | 10/11 | B, C | CES-D | 20 items, measuring feelings over past week; 6 subscales, with a total between 0–60, and a standard cut point of 16 |
| Fear avoidance | 9/10 | B | FABQ | 16 items, 5 related to physical harm, and 11 related to fears about work; responses on a 0–6 Likert scale |
| Social, demographic, and work | | | | |
| Education | 10/11 | B, C | WHO Health Survey (26) | What is the highest level of education that you have completed? 1 = no formal schooling, 2 = less than primary school, 3 = primary school completed, 4 = secondary school completed, 5 = high school (or equivalent) completed, 6 = college/pre-university/university completed, 7 = postgraduate degree completed |
| Employment status | 10/11 | B | LBOS | At present, are you working: 8 response options (61) |
| Job-related factors: 1. job satisfaction, 2. social support, 3. sense of control | 10/11 | A, B, C | 1. Job satisfaction, 2. JCQ, 3. JCQ | How satisfied are you with your work in general? (7 responses from extremely dissatisfied to extremely satisfied); 1. ref. 70, 2. ref. 65, 3. ref. 65 |
| Marital/living with | 7/9 | B, C | WHO Health Survey (26) | What is your current marital status? (1 = never married, 2 = currently married, 3 = separated, 4 = divorced, 5 = widowed, 6 = cohabiting) |
| No. sick days over previous year | 8/10 | B, C | Self-report | Days (where possible confirm in work records) |
| Reasons for not working | | B, C | WHO Health Survey (26) | What is the main reason you are not working for pay? 1 = homemaker/caring for family, 2 = looked but can't find a job, 3 = doing unpaid work/voluntary activities, 4 = studies/training, 5 = retired/too old to work, 6 = ill health, 7 = other |
| Type of work | 9/10 | B, C | WHO Health Survey (26) | During the last 12 months, what has been your main occupation? 1 = legislator/senior official/manager, 2 = professional (engineer/doctor/teacher/clergy/etc.), 3 = technician/associate professional (inspector/finance/dealer/etc.), 4 = clerk (secretary/cashier/etc.), 5 = service/sales worker (cook/travel guide/shop salesperson/etc.), 6 = agricultural or fishery worker (vegetable grower/livestock producer/etc.), 7 = craft or trades worker (carpenter/painter/jewelry worker/butcher/etc.), 8 = plant/machine operator or assembler (equipment assembler, sewing-machine operator, driver, etc.), 9 = elementary worker (street food vendor, shoe cleaner, etc.), 10 = armed forces (government military) |

* BMI = body mass index; A = published systematic review; B = published consensus or narrative review; C = expert advice; RDQ = Roland Disability Questionnaire; NPS = Numeric Pain Score; AUDIT-C = Alcohol Use Disorders Identification Test; IPAQ-SF = International Physical Activity Questionnaire Short Form; PCS = Pain Catastrophizing Scale; CES-D = Center for Epidemiologic Studies Depression Scale; FABQ = Fear Avoidance Beliefs Questionnaire; WHO = World Health Organization; LBOS = Low Back Outcome Score; JCQ = Job Content Questionnaire.

Table 2. Controversial factors*

| Factors | Yes (out of 10 responding teams) | No (out of 10 responding teams) | No in iteration 2 (No./total answers) |
| --- | --- | --- | --- |
| Drug consumption | 6 | 4 | 6/9 |
| Social demographic and work | | | |
| Impending “downsizing” | 5 | 3 | 5/9 |
| Social deprivation markers | 3 | 4 | 4/9 |
| Unequal leg length | 4 | 4 | 7/9 |

* Values are the number.

Selection of measurements.

The selection of appropriate measurements was based on the criteria outlined by Deyo et al (9), and included breadth of coverage, demonstration of reliability and validity, practicality (brevity and low cost), wide use, and available translations. The process of selection is shown in Figure 2. Factors selected by the MMICS teams for which a suitable measurement was not found were removed from the core set.

Figure 2. The process of selecting measurements for the core statement. MMICS = Multinational Musculoskeletal Inception Cohort Study.

The protocol for selection of a measurement was as follows. A literature search was carried out by 2 members of the steering team (TP and RS) using PubMed/Medline, PsycINFO, PsycARTICLES, Social Science Citation Index, and Science Citation Index for each factor. The search combined the variations of the named factor and types of report (e.g., reviews, prospective cohorts, clinimetric properties), and was carried out first for LBP, and then (if appropriate publications were not identified) for musculoskeletal pain and chronic pain. The search had no language or date restriction. In addition, the related articles in PubMed and the reference lists of selected publications were checked. Abstracts were read by a member of the steering group (TP), who consulted with other group members according to their areas of expertise. Full articles were read by 2 members of the team (TP and RS) who used preset criteria (10) in reference to clinimetric properties. The steering group developed the following hierarchical criteria for selection of measurements. First, good evidence was considered to comprise published systematic reviews of measurements based on clinimetric properties, published evidence-based consensus statements, or descriptive reviews of measurements with recommendations. Where recommendations were not based on clinimetric properties (e.g., in some narrative reviews) a search was carried out to identify studies of reliability and validity for use in LBP populations or related pain populations.

Second, moderate evidence was considered to comprise at least 2 individual empirical studies that had used a measure in a prospective cohort of patients with LBP, and had reported some information on the reliability and validity of the measure in question or where information about reliability and validity of the measure was available from other sources. Preference was given to studies carried out in populations with LBP, but evidence was also considered from other related (musculoskeletal pain) populations.

Third, weak evidence was considered to comprise single studies that used a measure but did not report information on reliability and validity, and where such information was not available elsewhere. Such candidate measures were presented to independent experts (identified through publications in the area of interest, and preferably publications of a good quality prospective cohort study) who were not affiliated with MMICS, with the criteria for recommendations as follows: clinimetric properties (if known), use in LBP populations, length, translations available, and experience of use. Factors were excluded if only weak or no evidence was found on their measurement, when advice from experts was contradictory and could not be reconciled, and when experts could not make a recommendation due to the current state of evidence. For the final stage, the final selected lists and justification were circulated to the original participating team leaders for approval.
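The three-level hierarchy above can be written as a small decision function. The sketch below is one possible reading and not the steering group's own tool: the function name `grade_evidence`, its parameters, and the extra labels `none` and `unclassified` (for cases the text does not describe) are hypothetical, and "single studies" is interpreted as one or more studies that lack reliability and validity information.

```python
def grade_evidence(has_review: bool, n_cohort_studies: int,
                   psychometrics_available: bool) -> str:
    """Grade the evidence for one candidate measure (illustrative only).

    has_review: a systematic review, evidence-based consensus statement,
        or descriptive review with recommendations exists.
    n_cohort_studies: empirical studies that used the measure in a
        prospective cohort.
    psychometrics_available: reliability and validity information is
        reported in those studies or available from other sources.
    """
    if has_review:
        return "good"
    if n_cohort_studies == 0:
        return "none"  # assumption: no evidence at all
    if psychometrics_available:
        # The text requires at least 2 studies for moderate evidence;
        # a single study with psychometric data is not covered by it.
        return "moderate" if n_cohort_studies >= 2 else "unclassified"
    return "weak"  # candidate for referral to independent experts


print(grade_evidence(False, 2, True))
```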


Selection of baseline measurement.

The original list, generated by the steering group, and presented to each national team for the first iteration, included 130 factors. The final version comprised 63 factors that were endorsed by at least two-thirds of the respondent teams. The core set of baseline and outcome factors is presented in Tables 1 and 3. The excluded and controversial factors are presented in Tables 2 and 4. The recommended core set is described below.

Table 3. Core outcome measures and their proposed timing*

| Factors | No. agreed/responded | How often during followup | Selected measurement |
| --- | --- | --- | --- |
| Disability and well-being | | | |
| Back pain disability | 10/10 | Quarterly | RMD |
| Health care utilization, including referral and treatment | | | |
| Consultation with behavioral therapist: counselor/psychologist/pain management | 6/9 | At least every 2 weeks | Number of consultations |
| Consultation with manual therapists: physiotherapist, osteopaths, or chiropractors | 8/9 | | |
| Consultation with specialists: neurologist/rheumatologist/other | 6/8 | | |
| GP appointments | 5/6 | | |
| Multidisciplinary team | 9/9 | | |
| Occupation health | 7/9 | | |
| Consumption of over-the-counter pain medication | 8/9 | At least every 2 weeks | Use of analgesics (e.g., acetaminophen, co-proxamol, co-codamol, dihydrocodeine), NSAIDs (e.g., ibuprofen, naproxen, aspirin, diclofenac), strong analgesics (e.g., tramadol) |
| Pain intensity | 8/9 | Quarterly | NPS (see Table 1) |
| Reduction in normal activities due to LBP | 7/9 | Quarterly | During the past 3 months, about how many days did you cut down on things you usually do for more than half the day because of back pain or leg pain? (9) |
| Patient satisfaction | | | |
| With care | 11/11 | Quarterly | Over the course of treatment for your back pain or leg pain, how satisfied were you with your overall medical care? (9); 5-point response |
| With condition | 11/11 | Quarterly | If you had to spend the rest of your life with the symptoms you have right now, how would you feel about it? (9); 5-point response |
| Test and examination | | | |
| Return to work | 11/11 | Quarterly | If you were off work due to back pain or leg pain, are you now 1) back to the same job; 2) back, but job modified to accommodate pain; 3) in a new job, more suited to accommodate pain; 4) not working because of pain; 5) not working for other reasons (use list from baseline). |
| Sick leave (days off due to back pain) | 11/11 | Quarterly | During the past 3 months, how many days did low back pain or leg pain keep you from going to work or school? (9) |
| For studies with QOL as primary outcome only | | | |
| Patient cognitions | | | |
| Catastrophizing | 5/6 | Quarterly | PCS (see baseline) |
| Fear avoidance | 9/9 | Quarterly | FABQ (see baseline) |
| Patient distress | | | |
| Depression/distress | 11/11 | Quarterly | CES-D (see baseline) |

* SF-36 = Short Form 36; RMD = Roland Morris Disability; GP = general practitioner; NSAIDs = nonsteroidal antiinflammatory drugs; LBP = low back pain; FMRI = functional magnetic resonance imaging; QOL = quality of life; see Table 1 for additional definitions.

Table 4. Excluded factors

| Factors | No. of teams rejecting/no. of teams responding | Stage of rejection (main reason) |
| --- | --- | --- |
| Blood pressure | 8/10 | I |
| Blood tests | 7/10 | I |
| Genetic markers | 7/10 | I |
| Heart rate | 9/10 | I |
| First appointment | | |
| Birth weight | 10/10 | I |
| Segmental spinal mobility | 6/7 | I |
| Beliefs about cause and recovery | Coordinating team: lack of valid and reliable measure | Final |
| General and health anxiety | Coordinating team: overlap with fear and catastrophizing | Final |
| Life events | 6/8 | I |

Clinical status.

Body mass index.

The traditional calculation of body mass index is weight (in kg) divided by the square of height (in m). Some research has classified obesity as a score >30 kg/m² (11).
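As a worked illustration of this calculation, the following minimal sketch computes body mass index and applies the >30 kg/m² classification mentioned above. The function names and the input checks are choices made for the example and are not part of the MMICS Statement.

```python
def body_mass_index(weight_kg: float, height_m: float) -> float:
    """Return body mass index in kg/m^2: weight divided by height squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def is_obese(bmi: float, threshold: float = 30.0) -> bool:
    """Apply the >30 kg/m^2 classification cited in the text."""
    return bmi > threshold


bmi = body_mass_index(100.0, 1.75)
print(round(bmi, 1), is_obese(bmi))
```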


Disability.

There are numerous systematic reviews, consensus reports, and narrative reviews of measurements of disability in LBP populations (12–14). The 2 primary contenders are specific measures, as opposed to generic measures: the Roland Disability Questionnaire (RDQ) (15) and the Oswestry Disability Index (16). Both questionnaires have been extensively translated, have good reliability and validity, and have been shown to predict outcome in LBP. Although there is little information to choose between them, a recent systematic review of back-related outcomes concluded that the original 24-item RDQ is “the tool of choice if combined with a general health assessment and used in mild to moderately affected LBP” (14).

Duration of current episode.

The pattern of LBP for most people tends to be intermittent over long periods (17, 18). The MMICS Statement recognizes this and is aimed at studies researching the progression from early stages of pain to persisting disadvantageous outcomes. Previous prospective cohorts have used definitions of “early” ranging from 3 weeks from onset of current episode to 12 weeks (19), and some have included criteria such as no LBP for 6 months before current episode (19). Accordingly, the MMICS Statement recommends measurement of current episode in weeks from current onset, where current onset is defined as a new episode of pain, having been pain free for at least 6 months.

Leg pain below the knee.

There are no studies comparing measurements of pain radiating below the knee. The majority of prospective cohorts that have measured this factor have used a simple yes/no categorization. Studies have been criticized for failure to analyze subgroups accordingly (3).

Pain intensity.

There are several comparisons of measurements of pain intensity, and the emerging consensus is that a numerical rating scale is at least as reliable and sensitive as the visual analog scale and more complicated measures (20–24). Specific recommendations about wording have been proposed (9), which explain the inclusion of a 1-week time frame (as opposed to “today”) and the use of the term bothersome. There is evidence for the validity of such a single question as a measure of LBP severity (25).

Demographic factors.

Education.


We selected the categories developed and used by the World Health Organization (WHO) Health Survey (26).

Marital status.

We selected the categories developed and used by the WHO Health Survey (26).


A systematic review of alcohol consumption as a risk factor for LBP concluded that although an association has not been found, well-designed studies of alcohol and LBP are needed (27). There are no systematic reviews comparing measurements, but there is moderate evidence that the Alcohol Use Disorders Identification Test (28) shows excellent sensitivity and validity both for alcohol use disorders and risk drinking (29–32).

For exercise, the most commonly used measurement with good reliability and validity for adult working populations is the International Physical Activity Questionnaire (33), which has a short-form version. A respondent can be classified either continuously or more simply into 1 of 3 activity categories of low, moderate, or high levels of physical activity. This selection was endorsed by an independent sports/exercise research expert.

A systematic review of smoking as a risk for LBP concluded that although the evidence suggests a link, prospective cohorts are needed to explore this link (34). Although measurement of smoking behavior is a complex process, and many narrative reviews advocate diary methods (35), self-report questionnaires show excellent reliability when measured against biochemical verification (36). However, classification into never smoked, current smoker, or former smoker is inadequate, as this fails to measure duration, intensity, and frequency of smoking. There is moderate evidence to suggest that pack-years incorporate these aspects, and can be used in LBP cohorts (9).
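A short sketch may help show how pack-years combine intensity and duration into a single number. Table 1 describes the measure as average cigarettes per day multiplied by years of smoking; the conventional unit expresses daily consumption in packs, and the value of 20 cigarettes per pack used below is an assumption of this example, not something specified in the statement. The function name is likewise hypothetical.

```python
CIGARETTES_PER_PACK = 20  # assumption: conventional pack size


def pack_years(cigarettes_per_day: float, years_smoked: float,
               cigarettes_per_pack: int = CIGARETTES_PER_PACK) -> float:
    """Return cumulative exposure as packs per day multiplied by years smoked."""
    if cigarettes_per_day < 0 or years_smoked < 0:
        raise ValueError("inputs must be non-negative")
    return (cigarettes_per_day / cigarettes_per_pack) * years_smoked


# Ten cigarettes a day for 30 years and one pack a day for 15 years
# give the same cumulative exposure.
print(pack_years(10, 30), pack_years(20, 15))
```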



Catastrophizing.

There are no systematic reviews, consensus reports, or narrative reviews comparing different measures of catastrophizing. However, 2 instruments have been used in populations with LBP: the catastrophizing subscale from the Coping Strategies Questionnaire (37) and the Pain Catastrophizing Scale (PCS) (38). Of these, the PCS has a better-developed theoretical model (39), shows excellent validity (40), and has acceptable psychometric properties (factor structure, reliability, and validity) (41), and is therefore recommended.

Depression.


We could find no systematic review comparing measurement of depression in the context of chronic pain. Two narrative reviews (42, 43) have stressed that commonly used instruments such as the Beck Depression Inventory (44) include somatic items that may contaminate results (45). Neither of the reviews recommends a specific instrument, but both suggest that measurements developed in psychiatric populations are not appropriate. A review of self-report instruments used in prospective LBP cohorts suggests that 4 instruments could potentially be used: the Center for Epidemiologic Studies Depression Scale (CES-D) (46, 47), the Hospital Anxiety and Depression Scale (48), the Zung Depression Inventory (49), and the General Health Questionnaire (50). All have demonstrated reasonable psychometric properties, but none has been tested against the exacting criteria currently required (10). Two independent experts selected the CES-D, based on its relative brevity, application to healthy populations, wide international use, and evidence for predictive quality in cohorts with LBP (51). The CES-D is not considered a diagnostic tool, although a standard cut point of 16 has been established to indicate depressive symptomatology (52). However, we caution researchers to explore the relative weighting of the items related to loss of appetite and sleep disturbance, which could be elevated due to pain.
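To make the scoring concrete, the sketch below totals a 20-item CES-D response set and applies the cut point of 16. Several details are assumptions that should be checked against the instrument itself: each item is taken to be scored 0–3 (implied by 20 items and a 0–60 total), the positively worded items are taken to be items 4, 8, 12, and 16 and are reverse scored, and a total at or above the cut point is treated as indicating depressive symptomatology. The function names are hypothetical.

```python
REVERSE_SCORED_ITEMS = (4, 8, 12, 16)  # assumption: positively worded items
CUT_POINT = 16                         # standard cut point cited in the text


def cesd_total(responses, reverse_items=REVERSE_SCORED_ITEMS) -> int:
    """Sum 20 item responses (each 0-3), reverse scoring the listed items."""
    if len(responses) != 20:
        raise ValueError("the CES-D has 20 items")
    total = 0
    for number, value in enumerate(responses, start=1):
        if value not in (0, 1, 2, 3):
            raise ValueError(f"item {number}: response must be 0, 1, 2, or 3")
        total += (3 - value) if number in reverse_items else value
    return total


def at_or_above_cut_point(total: int, cut_point: int = CUT_POINT) -> bool:
    """Flag totals at or above the cut point (assumed to be inclusive)."""
    return total >= cut_point


answers = [1] * 20
score = cesd_total(answers)
print(score, at_or_above_cut_point(score))
```

Keeping item-level responses rather than only the total makes it possible to examine the appetite and sleep items separately, as cautioned above.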

Fear avoidance.

Relatively little work has been carried out comparing existing measures of fear avoidance. A narrative review (53) points out that self-report measures do not assess avoidance itself (the behavior), but rather cognitions about avoidance. The most commonly used measures in LBP populations are the Fear Avoidance Beliefs Questionnaire (FABQ) (54) and the Tampa Scale of Kinesiophobia (55), and there is a plethora of data on their psychometric properties. Both instruments have problems associated with the wording and the focus of some items (19). However, the FABQ includes a subscale that focuses on fear related to work, which appears to be predictive of work-related outcome (56–58) and to have better clinimetric properties; it was therefore selected.

Work-related factors.

Employment status.

A measurement was adopted from a recent narrative review (59) based on established criteria for quality (60, 61).

Type of work.

We selected the categories used by the WHO Health Survey (26), which provides a list of work categories.

Reasons for not working.

We selected the categories used by the WHO Health Survey (26), which provides a list of reasons for unemployment. In addition, the consensus recommended including a measure of the number of sick days over the previous year.

Financial factors (pending compensation, sickness benefit, insurance, and duration on current benefits).

Sickness benefit and compensation systems differ across health systems. The complex issue of how to standardize factors that differ between nations, across time, and within health systems formed a separate study involving the MMICS steering group. The information from that study informed the MMICS steering group's decision to simplify the measurement of these factors to yes/no responses and duration in days.

Work-related factors (job satisfaction, social support at work, a sense of control at work).

A systematic review of studies in LBP that included work-related factors (62) concluded that there is good evidence for the influence of job satisfaction, workplace social support, and a combination of job content and job control on outcome. Another systematic review provides support for the role of social support at work as a predictor of poor outcome (4). We identified several instruments that had been used in cohorts of populations with LBP. We rejected instruments that were lengthy (e.g., the Job Descriptive Index [63], Instrument zur stressbezogenen Tätigkeitsanalyse [64]). Shorter questionnaires included the Job Content Questionnaire (65, 66), Psychosocial Aspects of Work questionnaire (67), social support at work (68), and, for satisfaction only, the Adaptation, Partnership, Growth, Affection, and Resolve questionnaires (69), and the single-item Kunin (70), all of which have been used extensively in LBP populations. We received recommendations from 4 independent experts. The Kunin measure (70) has been used extensively, presents the information both in words and in pictures (faces), and has demonstrated excellent psychometric properties (71, 72), and was therefore selected for job satisfaction. Social support at work and work-related control are best measured by the Job Content Questionnaire (65), which has the advantage of psychometric testing across large samples, and with national comparisons (66).

Selected factors for which further research on measurement is required.

Our selection criteria for recommending measurement were not met by several factors that were strongly endorsed by the national teams. For many of these, research is in progress, and they are listed below with the intention of inclusion in an updated version should evidence be forthcoming for their support.

Comorbidity.


In the context of back pain, comorbidity is defined as the identification of comorbid conditions through self-report (questionnaire) that could impact on functional outcome (rather than mortality). Comorbidity in populations with LBP has been studied in several large samples (73–75), but the selection of measurement differs considerably between studies, and clinimetric properties are missing or inadequate. We note that progress is being made toward developing questionnaires about comorbidity that are focused on physical function as the primary outcome (11).

First appointment.

We did not find any systematic or narrative reviews on collecting data about first appointments in LBP. The national teams provided the items for the data they believed were required. These include time to first appointment and consultation length, provision of educational material, medication prescribed, and sick leave certification.

History of pain.

Because of the intermittent nature of LBP (17, 18), accurate measurement of previous episodes of LBP is clearly important, and this has been recognized by the MMICS teams. However, measures of LBP history (76, 77) have demonstrated poor reliability and validity.

Multi-site pain.

Several studies have reported that comorbidity of pain in other sites is very common in patients with LBP, and that those with multi-site pain might do worse than those with single-site pain. The most accepted measurement remains the pain drawing (78, 79), which reliably identifies widespread pain (80). However, coding such pain drawings can be problematic and time consuming. Another approach is to use the Graded Chronic Pain Scale (81), which specifies 6 sites (other than back pain). We sought advice from 3 experts in the field, but consensus could not be reached.



The measurement of cultural groupings is unique and specific to geographic populations. We approached 2 expert epidemiologists to recommend an internationally accepted measure, but they agreed that this was neither available nor practical.


Beliefs about cause and recovery.

We did not find any systematic, consensus, or narrative reviews of instruments measuring beliefs about cause and recovery in LBP. We identified several potential measurements used in back pain populations (e.g., The Back Beliefs Questionnaire [67], The Health Care Providers' Pain and Impairment Relationship Scale [82]), but information on clinimetrics was poor. We canvassed 3 independent experts but no agreement could be reached.

Exposure to chronic illness in the family/exposure to back pain in the family.

We could find no evidence for reliable and valid measurement of these factors.

Work related.

Expectations about return to work.

We identified only one measure of perceived functional capacity for work (83). Information on clinimetric properties was not available.

Factors associated with work place.

Modified work offered, occupation health available, employment policy about recovery-related return, and size of organization were selected by the teams, but no information was available on their measurement.

Followup measures.

Recommendations about measures of outcome for LBP have been previously published by a group of international experts (9). The recommendations by Deyo et al (9) include measurement of pain severity, disability (back-related function), generic well-being, days of work, cut-down activities, satisfaction with care, and condition at followup. In addition to these recommendations, the MMICS teams recommended including outcome measures on psychological factors and diaries measuring utilization of care and medication consumption. We found no systematic reviews comparing methods of assessing health care utilization and medication consumption in populations with either pain in general or back pain specifically. However, there are several empirical studies that have either compared measurements or tested the reliability of specific measurements. We have synthesized the findings from these studies for our recommendations. Health care utilization (e.g., number of consultations with various clinicians) appears to be reliably reported in populations with pain (84–86) via diaries, and there is some evidence to suggest that recall of utilization is reliable (84), although the maximum period for accurate recall is not known. Due to the risk of recall bias, daily diaries are recommended for increasing the validity and reliability of measurement of medication consumption (87, 88). However, recent work (89) found excellent correlations between twice-daily diaries of medication consumption and a twice-weekly questionnaire. To reduce cost and improve data completeness, and for practical considerations, the MMICS Statement recommends that the medication consumption and health care utilization be measured at least once every 2 weeks. In addition, if possible, we recommend the use of electronic diaries (90).
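The recommended timing can be turned into a data collection calendar. The sketch below is only an illustration: it assumes that "at least once every 2 weeks" is implemented as exactly every 14 days, that "quarterly" (the frequency given in Table 3) is approximated as every 91 days, and that both are counted from the baseline date. The function name and the returned dictionary keys are hypothetical.

```python
from datetime import date, timedelta


def followup_schedule(baseline: date, followup_days: int,
                      diary_interval_days: int = 14,
                      outcome_interval_days: int = 91) -> dict:
    """Return due dates for utilization/medication diaries and outcome measures."""
    def due_dates(interval: int) -> list:
        # First assessment falls one interval after baseline.
        return [baseline + timedelta(days=d)
                for d in range(interval, followup_days + 1, interval)]

    return {
        "diary": due_dates(diary_interval_days),       # health care use, medication
        "outcomes": due_dates(outcome_interval_days),  # quarterly questionnaires
    }


schedule = followup_schedule(date(2024, 1, 1), followup_days=182)
print(len(schedule["diary"]), len(schedule["outcomes"]))
```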


Efforts to prevent the development of chronic LBP require prediction models that can reliably and accurately identify patients at increased risk for focused interventions. This statement will allow identification of such prediction models by standardizing a comprehensive list of predictors.

The selection of factors for the MMICS Statement reflects the opinions and experience of a wide range of internationally acknowledged back pain experts. The recommendations for factor inclusion were therefore influenced by subjective, albeit consensus, opinion. Experts who did not agree to be involved also may not have agreed on aspects of the statement. We attempted to accommodate this by including an independent panel of advisors who did not wish to participate, but who were consulted about the process at each stage. The selection of measurements, in contrast, was evidence based, and factors for which an adequate measurement was not found were excluded from the core list.

It is not intended that these recommendations should be prescriptive. Rather, this is the starting point that prospective studies should include, and is the basis on which the potential pooling of data can start.

Although the consensus achieved on items for inclusion was high, it was seldom unanimous. This probably reflects participants' orientations: clinical and philosophical-scientific. However, it may be seen as remarkable that the degree of consensus was so substantial, given the range and complexity of the subject matter.

The selection of instruments should be updated in a few years, or as better information becomes available. There are several areas the MMICS Statement does not cover, notably measurement of comorbidity and history of LBP. We recognize that these factors must be included in studies of the progression from early stages of LBP to long-term problems. However, in the absence of adequate measures fitting the MMICS criteria, we leave the selection of such measures to independent researchers, while urging more research into the development of good measures.

A distinct limitation of the current MMICS Statement is the length and complexity of the battery of questionnaires. This may deter researchers from using the statement and might reduce patients' willingness to take part in studies. The need for succinct measurement is recognized by the steering group and the participating teams, and we encourage researchers to systematically examine data reduction procedures using their own data. However, the practice of arbitrarily extracting single items for use without statistical evidence compromises reliability, validity, and sensitivity.

In conclusion, we present a comprehensive statement of recommended baseline factors, outcome factors, and measures for use in studies of prospective cohorts of individuals with LBP. These have particular relevance to followup prediction and are intended to guide researchers as well as facilitate the pooling of data in future reviews. The main intention was to help investigators construct their data collection batteries with greater confidence, at least for the medium-term future. These factors should not, however, be considered immutable, especially where we have excluded items for lack of evidence or consensus. We recommend that a revision of the current statement be considered within 5 years.


Dr. Pincus had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

Study design. Pincus, Santos, Breen, Burton, Underwood.

Acquisition of data. Pincus, Santos.

Analysis and interpretation of data. Pincus, Santos, Breen, Burton, Underwood.

Manuscript preparation. Pincus, Santos, Breen, Burton, Underwood.

Statistical analysis. Pincus, Santos.


We gratefully acknowledge the contribution from the MMICS collaboration teams: Australia: Jennifer Keating (team coordinator), Rachelle Buchbinder, Peter Kent; France: Sylvie Rozenberg (team coordinator), B. Duquesnoy, J. Marty; Germany: Gerd Mueller (team coordinator), Grietje Freudenberg, Kerstin Luedtke; Ireland: Deirdre Hurley (team coordinator), Gerard Bury, Mary Cassel, Clement Leech, Anthony Staines; Israel: Amnon Lahad (team coordinator), Shmuel Reis; The Netherlands: Bart Koes (team coordinator), Marielle Goossens, Maurits van Tulder; New Zealand: Mark Laslett (team coordinator), J. Haxby Abbott, Susan Mercer, Maynard Williams; Spain: Francisco Kovacs (team coordinator), Victor Abraira, Pedro Berjano, Carmen Fernandez, Maria Teresa Gil del Real, Pablo Lazaro; Switzerland: Christine Cedraschi (team coordinator), F. Balague, E. Bodmer, Y. Robert, E. Roux; UK: Tamar Pincus (team coordinator), Alan Breen, Kim Burton, Rita Santos, Martin Underwood; US: Sherri Weiser (team coordinator), Marco Campello, Manny Halpern, Rudi Hieber, Angela Lis, Margareta Nordin. We also thank the independent experts who contributed to the project: Jeff Borkan, Dawn Carnes, Jo Dear, Clermont Dionne, Ron Donaldson, Achim Elfering, Nadine Foster, Umesh Kadam, Nick Kendall, Peter Kent, Chris Main, Stephen Morley, Glen Pransky, Philip Sawney, Paul Shekelle, Steven Vogel, Michael Von Korff, Gordon Waddell, and Robert West. We are grateful to Jill Hayden and Gary Macfarlane for commenting on earlier drafts.