Dr. Maksymowych is a Senior Scholar of the Alberta Heritage Foundation for Medical Research.
Development and validation of the Edmonton Ankylosing Spondylitis Metrology Index
Version of Record online: 27 JUL 2006
Copyright © 2006 by the American College of Rheumatology
Arthritis Care & Research
Volume 55, Issue 4, pages 575–582, 15 August 2006
How to Cite
Maksymowych, W. P., Mallon, C., Richardson, R., Conner-Spady, B., Jauregui, E., Chung, C., Zappala, L., Pile, K. and Russell, A. S. (2006), Development and validation of the Edmonton Ankylosing Spondylitis Metrology Index. Arthritis & Rheumatism, 55: 575–582. doi: 10.1002/art.22103
- Issue online: 27 JUL 2006
- Manuscript Accepted: 22 DEC 2005
- Manuscript Received: 30 JUL 2005
- Spinal and hip mobility;
- Ankylosing spondylitis;
Assessment of spinal and hip mobility has been recommended by the Assessments in Ankylosing Spondylitis (AS) Working Group for clinical trials and record keeping, although suggested measures primarily reflect structural damage. Our objective was to validate a simple, 4-item composite measure of spinal and hip mobility, the Edmonton AS Metrology Index (EDASMI).
We assessed the EDASMI and the Bath AS Metrology Index (BASMI) using a total of 263 patients from 3 countries: Canada (n = 205), Australia (n = 29), and Colombia (n = 29). Intra- and interobserver reliability were assessed in a subset of 44 patients. Construct validity with respect to disease activity (Bath AS Disease Activity Index [BASDAI]), function (Bath AS Functional Index [BASFI]), and structural damage (modified Stoke AS Spinal Score [mSASSS]) was analyzed using correlation and hierarchical regression. Responsiveness was assessed in a subset of 33 patients who received either anti–tumor necrosis factor α therapy (n = 26) or pamidronate (n = 7) over 24 weeks.
In contrast to the EDASMI, BASMI scores covered a limited range, with 70% of patients demonstrating a score ≤3 (range 0–10) and 4 of 5 individual measures demonstrating substantial floor effects. Both measures were highly reliable (intraclass correlation coefficient >0.90) and demonstrated similar construct validity (EDASMI correlated with disease duration [0.52], BASDAI [0.24], BASFI [0.61], Bath Ankylosing Spondylitis Radiology Index [0.79], mSASSS [0.75]; P < 0.001 for all). The change in EDASMI score was significant after 24 weeks of therapy (standardized response mean 0.40; P = 0.03), but change in the BASMI was not significant.
The EDASMI is a simple, rapid, and reliable tool for the assessment of spinal mobility in AS that is responsive to therapeutic intervention.
The Assessments in Ankylosing Spondylitis (ASAS) Working Group has recommended that spinal mobility constitute one of the outcome domains assessed both for clinical record keeping and in the assessment of disease-controlling antirheumatic therapies (1). ASAS has specifically recommended the measurement of occiput-to-wall distance, the modified Schober's test, and chest expansion (CE) as the measures that should be used to assess spinal mobility. Although widely used, the occiput-to-wall distance and modified Schober's test primarily reflect irreversible structural damage and demonstrate limited sensitivity to change in studies of intensive physiotherapeutic interventions (2–4). CE is traditionally measured at the fourth intercostal space and often demonstrates poor reproducibility due to factors such as lack of standardization in the assessment technique and difficulties obtaining accurate estimates in some women (5). However, several studies have documented its responsiveness with both physiotherapeutic and pharmacologic interventions (2, 6–9).
The Bath Ankylosing Spondylitis Metrology Index (BASMI) is a validated composite index of 4 spinal measures (cervical rotation [CR], tragus-to-wall distance, modified Schober's test, and lateral lumbar flexion [LLF]) and 1 hip mobility measure (intermalleolar distance [IMD]) that requires 3 tools: a tape measure, a gravity-action goniometer (for CR), and a ruler mounted on a floor stand (for LLF) (10). Each measure is assigned a score of 0–2, with the higher score signifying greater impairment in mobility, so that the maximum aggregate score is 10. The rationale for selection of these specific measures was based primarily on their simplicity as well as the opinion that these measures most accurately reflected axial status. The rationale was not based on the measures' responsiveness or ability to discriminate between disease states, and although the BASMI has since been shown to be responsive, this is largely accounted for by one measure: LLF (11–13). Moreover, only 33% of ASAS Working Group members considered assessment of CR with a gravity-action goniometer to be clinically relevant. A potential limitation to tape-based methods is the measurement error arising from varying degrees of flexion and extension in the neck (14). Nevertheless, measurement of CR is desirable for the following reasons: minimal age-dependent deterioration (15), its impact on functional impairment (16), and its responsiveness (7, 12, 17–19). Assessment of IMD has proven to be rather uncomfortable for patients in our clinics and is often awkward for female patients.
In this report, we describe the development and validation of a new composite measure of spinal and hip mobility, the Edmonton Ankylosing Spondylitis Metrology Index (EDASMI). After conducting a literature review, we selected 4 mobility measures largely based on their clinimetric properties, especially responsiveness. The index also incorporates 2 novel approaches to the assessment of cervical and hip mobility that only require a tape measure, are less susceptible to flexion/extension movements of the neck, and can be largely performed with the patient comfortably seated on the examination table.
PATIENTS AND METHODS
Development of the EDASMI.
The development of the EDASMI was preceded by a review of the literature on spinal mobility assessment that was available up to 2003. We focused on the key elements of a filter proposed by the Outcome Measures in Rheumatology Clinical Trials (OMERACT) to evaluate outcomes: feasibility, reliability, and discrimination (20).
We first selected those measures for which an intraclass correlation coefficient (ICC) score ≥0.80 for interobserver reliability had been documented at ≥2 sites (4–6, 14, 17, 21, 22). A total of 30 measures met this criterion. In the second step, we selected those measures that have also demonstrated at least moderate responsiveness, i.e., an effect size ≥0.5, following a short course (3 weeks) of intensive inpatient physiotherapy (2, 6, 7, 10, 17–19, 22). Five measures demonstrated an effect size ≥0.5 (CR, finger-to-floor distance, cervical forward flexion, CE, and thoracolumbar rotation). We also selected LLF because 2 controlled trials of pharmaceutical interventions, infliximab and pamidronate, consistently demonstrated that this measure discriminated between treatment groups and was the most responsive of the measures comprising the BASMI (11, 13). In the third step, we addressed issues of clinical feasibility. Measures were excluded because of 1) the presence of significant confounders (e.g., hamstring tightness for finger-to-floor distance) and 2) the requirement for specialized equipment (cervical forward flexion, thoracolumbar rotation). Although CR has also typically been measured using specialized equipment, this measure was selected because of its importance to function (23). We developed a new method of assessment that required the use of a tape measure only and that was not susceptible to the confounding effects of cervical flexion/extension movement. CE was selected, although this was assessed at the xiphisternum rather than the fourth intercostal space due to improved feasibility in females. LLF was selected, although we chose a method that minimizes opportunities for confounding forward flexion movements (24).
Our selection criteria did not result in the selection of a hip mobility measure. In view of the functional and prognostic implications of hip involvement in AS (16), we elected to evaluate a measure of hip mobility, although methodologic approaches described to date lack clinical feasibility. Internal rotation of the hip (IRH) is more responsive to physiotherapeutic intervention and demonstrates lower correlation with age than hip abduction (2, 15). We therefore developed a simple tape-based approach to the assessment of IRH. The resultant index therefore comprises 4 measures (CR, CE, LLF, and IRH), 3 of which can be readily performed with the patient comfortably seated on the examination table.
The sample consisted of 263 individuals. Of these, 205 were consecutive outpatients followed by rheumatologists in the city of Edmonton, Canada, at both tertiary (University of Alberta Hospital) and community-based sites; 29 were outpatients at a community-based site in Townsville, Australia; and 29 attended a tertiary-based facility in Bogota, Colombia. All patients met the modified New York criteria for AS (25) and reflected a broad spectrum of patients with axial and peripheral disease. The study was approved by the ethics committee at the University of Alberta, the ethics committee of Townsville Hospital, and the ethics committee of the CAYRE Arthritis and Rehabilitation Clinic of Bogota. All patients provided written informed consent.
Mobility assessments: the EDASMI.
The 4 mobility measures comprising the EDASMI were measured in a standardized fashion by trained clinician nurses using only a measuring tape according to the instructions described below.
Cervical rotation.
The patient is in a sitting position on the examination table. A pen mark is made in the suprasternal notch. The patient is asked to rotate the head as far as possible towards the right shoulder while looking directly ahead. The distance (in cm) between the pen mark in the suprasternal notch and the tragus of the right ear is recorded using a tape measure. The patient is then asked to rotate the head as far as possible towards the left shoulder. The distance (in cm) between the pen mark in the suprasternal notch and the tragus of the right ear is again recorded with the tape measure. Total CR is recorded as the difference (in cm) between the 2 measurements.
Chest expansion.
The patient is in a sitting position on the examination table with the hands on the hips. A pen mark is made at the xiphisternum and a tape measure is placed around the circumference of the patient's chest at this level. The patient is asked to take a deep breath and to exhale as completely as possible while looking directly ahead. The measurement (in cm) is noted. The patient is then asked to inhale as deeply as possible and the measurement (in cm) is noted. The difference between the 2 measurements (in cm) constitutes the value for CE.
Lateral lumbar flexion.
We used a method that has been described previously (24). The patient is asked to stand straight with the back, heels, buttocks, and shoulders to the wall; hands by the side; and all fingers straight against the thigh while looking directly forward. A pen mark is made on the thigh adjacent to the tip of the middle finger. The patient is then asked to slide the hand down the right thigh as far as possible without lifting the left foot/heel or flexing the right knee. A second pen mark is made adjacent to the tip of the middle finger. The distance between the 2 marks is measured using a tape measure. The procedure is repeated on the left side, and the mean of the right and left scores is recorded as the final score.
Internal rotation of the hip.
The patient is seated on an examining table with the knees and hips flexed at 90 degrees and the knees together against the table clasping a piece of card paper. The patient is asked to move the ankles apart as far as possible without releasing the card paper from between the knees. The distance between the medial malleoli is recorded using a tape measure (Figure 1).
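The four measurement procedures above each reduce to a small piece of arithmetic on tape readings. The following is a minimal sketch of that arithmetic, not part of the original study materials; the class and field names are hypothetical, and centimetres are assumed for the hip measurement, for which the text does not state a unit.

```python
from dataclasses import dataclass


@dataclass
class TapeReadings:
    """Raw tape readings for one patient (hypothetical field names, all in cm)."""
    notch_to_tragus_right_cm: float  # head rotated fully to the right
    notch_to_tragus_left_cm: float   # head rotated fully to the left
    chest_exhaled_cm: float          # circumference at xiphisternum, full expiration
    chest_inhaled_cm: float          # circumference at xiphisternum, full inspiration
    llf_right_cm: float              # distance between the 2 thigh marks, right side
    llf_left_cm: float               # distance between the 2 thigh marks, left side
    intermalleolar_cm: float         # distance between medial malleoli, knees together


def raw_measures(r: TapeReadings) -> dict:
    """Convert tape readings into the 4 EDASMI mobility values."""
    return {
        # CR: difference between the 2 notch-to-tragus distances
        "cervical_rotation": abs(r.notch_to_tragus_left_cm - r.notch_to_tragus_right_cm),
        # CE: inspiration minus expiration
        "chest_expansion": r.chest_inhaled_cm - r.chest_exhaled_cm,
        # LLF: mean of right and left
        "lateral_lumbar_flexion": (r.llf_right_cm + r.llf_left_cm) / 2,
        # IRH: recorded directly
        "internal_rotation_hip": r.intermalleolar_cm,
    }
```

Taking the absolute value for CR is a design choice of this sketch so that the result does not depend on which rotation is measured first.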
Scoring of the EDASMI.
The approach to the scoring of the EDASMI was based on percentiles of the cumulative distribution for each mobility measure. We selected cutoffs representing the 20th, 40th, 60th, and 80th percentiles and assigned a score of 0 for values at or above the 80th percentile, 1 for values from the 60th to below the 80th percentile, 2 for values from the 40th to below the 60th percentile, 3 for values from the 20th to below the 40th percentile, and 4 for values below the 20th percentile. This approach resulted in a scoring range of 0–16 for the 4 measures comprising the EDASMI.
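The percentile-based scoring rule can be sketched in a few lines. The example below is illustrative only: the cutoff values are those reported for this cohort in Table 2, while the dictionary keys and function names are hypothetical and centimetres are assumed throughout.

```python
from bisect import bisect_right

# 20th, 40th, 60th, and 80th percentile cutoffs reported in Table 2 (cm assumed).
# Key names are hypothetical labels for this sketch.
CUTOFFS = {
    "cervical_rotation": (1.4, 2.0, 2.8, 4.0),
    "chest_expansion": (2.3, 3.0, 4.5, 6.0),
    "lateral_lumbar_flexion": (5.7, 10.2, 13.8, 18.0),
    "internal_rotation_hip": (27.0, 36.0, 42.0, 47.0),
}


def item_score(measure: str, value_cm: float) -> int:
    """Return 0 (at or above the 80th percentile) to 4 (below the 20th)."""
    # bisect_right counts the cutoffs that are <= value, so a value equal to a
    # cutoff lands in the higher-mobility band, matching the ">=" bands in Table 2.
    return 4 - bisect_right(CUTOFFS[measure], value_cm)


def edasmi_total(measurements: dict) -> int:
    """Sum the 4 item scores into the 0-16 composite."""
    return sum(item_score(name, measurements[name]) for name in CUTOFFS)
```

Because the cutoffs are derived from this sample, applying them to another population is an assumption that would need checking.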
The first 44 consecutive patients with AS (33 men, mean age 42.7 years [range 22–68 years], mean disease duration 14.5 years [range 2–44 years]) studied in Edmonton underwent repeat assessments of the EDASMI and BASMI by a trained clinician nurse and a rheumatologist to assess intra- and interobserver reproducibility. Patients underwent all measurements without warmup starting from mid-morning to allow for resolution of morning stiffness. Each composite index was assessed in its entirety before the alternate index was assessed, although both the order of assessment of each index and the sequence of evaluation by the 2 assessors were randomized for each patient. Any pen marks were removed with alcohol once assessment of the entire composite index had been completed. All subsequent assessments were performed by a clinician nurse at the Canadian site, by a rheumatologist at the Colombian site, and by a trained medical student at the Australian site.
All patients have been recruited to a prospective, longitudinal cohort of patients with AS in which data are systematically recorded on disease-specific health status (Bath Ankylosing Spondylitis Disease Activity Index [BASDAI] and Bath Ankylosing Spondylitis Functional Index [BASFI]) and structural damage as recorded on plain radiography of the spine (modified Stoke Ankylosing Spondylitis Spinal Score [mSASSS] and Bath Ankylosing Spondylitis Radiology Index [BASRI]). The mSASSS scores lesions in the anterior corners of lumbar and cervical vertebrae on lateral radiographs of the spine (score range 0–72). The BASRI-spine score grades radiographic changes in the sacroiliac joint (0–4 New York method) and lumbar and cervical spines (0–4 for each segment; total score 2–12). These data were used to examine construct validity. Radiographs of 56 individuals were scored independently by 2 observers after a period of training and a scoring exercise in which interobserver reproducibility was demonstrated to be >0.90.
Of the 205 patients studied in Edmonton, 33 were subsequently recruited to clinical trials of open-label treatment with infliximab (n = 4) or pamidronate (n = 7) and randomized, double-blind, placebo-controlled trials of infliximab (n = 11; randomization 3:8 for placebo:infliximab) and adalimumab (n = 11; randomization 1:1 for placebo:adalimumab). Mobility assessments were conducted by the same assessor and were repeated 24 weeks after the start of treatment. These data were used to analyze responsiveness.
Descriptive statistics (mean, median, SD) were used to describe the overall distribution of scores. The frequency distribution of scores for each mobility measure was analyzed by calculating its skewness (30). A skewness value of more than twice its standard error is considered to indicate a departure from normality. Cronbach's alpha was used to measure the multidimensional construct of the EDASMI.
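The skewness criterion can be illustrated as follows. This is a sketch under stated assumptions: the text does not give the formulas used, so the adjusted sample skewness and its conventional standard error (as reported by common statistics packages) are assumed here, and the function names are hypothetical.

```python
import math
import statistics


def skewness_with_se(values):
    """Return (adjusted sample skewness, standard error of skewness)."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 observations")
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 denominator)
    # Assumed formula: adjusted Fisher-Pearson skewness coefficient.
    g1 = n / ((n - 1) * (n - 2)) * sum(((x - mean) / sd) ** 3 for x in values)
    # Assumed formula: conventional standard error of skewness, depends on n only.
    se = math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    return g1, se


def departs_from_normality(values):
    """Apply the rule in the text: |skewness| greater than twice its standard error."""
    g1, se = skewness_with_se(values)
    return abs(g1) > 2 * se
```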
The intra- and interobserver reproducibility were calculated using analysis of variance to provide an ICC. A 2-way mixed effects model with observer as a fixed factor was used. A value >0.6 was designated as representing good reproducibility, a value >0.8 represented very good reproducibility, and a value >0.9 represented excellent reproducibility. Reproducibility was also examined using Bland-Altman plots and 95% limits of agreement. The interrater variance was used to calculate the smallest detectable difference (SDD) between 2 readings by 2 raters for a single patient. The SDD was calculated by multiplying the SD of the differences by 1.96.
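The limits of agreement and the SDD described above both follow from the paired differences between 2 raters. A minimal sketch, with hypothetical function and key names, is:

```python
import statistics


def agreement_summary(rater_a, rater_b):
    """Bland-Altman bias, 95% limits of agreement, and SDD for paired scores."""
    if len(rater_a) != len(rater_b):
        raise ValueError("both raters must score the same patients")
    diffs = [a - b for a, b in zip(rater_a, rater_b)]
    bias = statistics.fmean(diffs)
    sd_diff = statistics.stdev(diffs)  # SD of the paired differences
    half_width = 1.96 * sd_diff        # SDD as defined in the text
    return {
        "bias": bias,
        "limits_of_agreement": (bias - half_width, bias + half_width),
        "sdd": half_width,
    }
```

For example, `agreement_summary([8, 10, 5, 12, 7], [7, 11, 5, 13, 6])` uses made-up scores whose differences have a mean of 0 and an SD of 1, so the SDD is 1.96 points.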
Construct validity was assessed by analyzing correlations (Pearson's correlation for normally distributed data, Spearman's rho for nonparametric data, 2-tailed test) between mobility scores and age, disease duration, disease activity (BASDAI), function (BASFI), and structural damage scores on plain radiograph (mSASSS, BASRI). Hierarchical (sequential) linear regression was used to assess the contributions of the EDASMI and BASMI composite scores to the variance in the BASFI, adjusted for age, disease duration, and the BASDAI. The independent variables for the models were entered in 2 sets in the following order: 1) age, disease duration, and the BASDAI and 2) the EDASMI/BASMI composite score. Two separate hierarchical regression analyses were used to assess the contributions of structural damage on radiograph (mSASSS) and the BASDAI to the variance in the EDASMI and BASMI, after adjusting for disease duration. The independent variables for the models were entered in the following order: disease duration, mSASSS, and BASDAI. In all analyses, a P value less than 0.05 was considered statistically significant.
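The hierarchical regression amounts to fitting nested ordinary least squares models and comparing their R² values. The sketch below shows only that comparison; it is not the software used in the study, the function names are hypothetical, and it omits the F test that would give the P value for the R² change.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def r_squared(predictors, y):
    """R^2 of an OLS fit of y on the predictor columns plus an intercept."""
    rows = [[1.0] + list(obs) for obs in zip(*predictors)]
    k = len(rows[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * x for b, x in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def r2_change(block1, block2, y):
    """Return (R^2 for block 1, R^2 for blocks 1 + 2, R^2 change)."""
    base = r_squared(block1, y)
    full = r_squared(block1 + block2, y)
    return base, full, full - base
```

In the terms used above, `block1` would hold the columns for age, disease duration, and the BASDAI, `block2` the composite mobility score, and `y` the BASFI.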
Two statistical methods were used to assess responsiveness: the effect size and the standardized response mean (SRM). Values of 0.20, 0.50, and ≥0.80 were considered to represent small, moderate, and large degrees of responsiveness, respectively. Differences between pretreatment and posttreatment scores were assessed by paired t-test. Discrimination was not assessed because the open-label phase of the clinical trials is still ongoing and treatment codes remain unbroken at this time.
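The text does not spell out the 2 responsiveness statistics, so the sketch below assumes their conventional definitions: effect size as mean change divided by the SD of baseline scores, and SRM as mean change divided by the SD of the change scores. Change is taken as baseline minus follow-up, which is also an assumption, so that a fall in score (improved mobility) is positive. The function name is hypothetical.

```python
import statistics


def responsiveness(baseline, followup):
    """Return (effect size, standardized response mean) for paired scores."""
    if len(baseline) != len(followup):
        raise ValueError("baseline and follow-up must be paired")
    # Assumed sign convention: positive change means a lower (better) follow-up score.
    change = [b - f for b, f in zip(baseline, followup)]
    mean_change = statistics.fmean(change)
    effect_size = mean_change / statistics.stdev(baseline)  # SD of baseline scores
    srm = mean_change / statistics.stdev(change)            # SD of change scores
    return effect_size, srm
```

The SRM definition is consistent with the summary values in Table 5: for the EDASMI total, 0.94 / 2.33 ≈ 0.40, and for the EDASMI minus hip, 0.85 / 1.79 ≈ 0.47.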
Descriptive data and scoring of the EDASMI.
Compared with patients in the Edmonton cohort, those in the Colombia cohort had shorter disease duration and a higher prevalence of peripheral synovitis, whereas those in the Australia cohort had a higher prevalence of hip involvement (Table 1). The assessment of both the EDASMI and the BASMI required no more than 5–10 minutes per patient.
| |Edmonton (n = 205)|Colombia (n = 29)|Australia (n = 29)|
|---|---|---|---|
|Age, years|41.5 ± 12|40.6 ± 10|44 ± 15.1|
|Height, cm|172.4 ± 9.7|160 ± 9.3|171.8 ± 8.6|
|Disease duration, years†|17.1 ± 12.4|6.7 ± 6.2|18.9 ± 12.2|
|Peripheral synovitis, %|10.7|37.9|17.2|
|Hip disease, %‡|18.1|27.6|41.4|
|BASDAI|4.7 ± 2.4|4.8 ± 2.1|4.3 ± 2.1|
|BASFI|3.8 ± 2.7|3.6 ± 2.4|3.9 ± 2.7|
|Total back pain|5.2 ± 2.8|4.9 ± 2.6|3.6 ± 2.4|
|BASMI|2.9 ± 2.5|3.1 ± 2.1|3.0 ± 2.7|
|EDASMI|8.2 ± 3.8|8.0 ± 3.3|8.7 ± 3.7|
With the exception of LLF, the frequency distribution of scores for each mobility measure as well as the BASMI composite score was skewed (Table 2). The distribution of scores for the EDASMI composite score covered the entire range, with a mean ± SD score of 8.23 ± 3.75 and a median score of 8.0 (interquartile range 5.0–11.0). The majority of patients had low BASMI scores, with the median score (2.0) being close to the 25th percentile score (1.0). The floor effect observed with the BASMI was substantial, with 14.8% of patients scoring 0 and 70.1% of patients scoring ≤3.
|Measure|Mean ± SD|Median|Skewness†|Range|Percentiles: 20th|40th|60th|80th|Scores: 4|3|2|1|0|Highest score, %|Score of 0, %|
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|EDASMI|8.22 ± 3.8|8.00|0.15|1.00–16.00|5.0|7.0|9.0|12.0|–|–|–|–|–|2.7|0.0|
|Cervical rotation (EDASMI)|2.65 ± 1.5|2.50|0.67|0.00–8.50|1.4|2.0|2.8|4.0|<1.4|≥1.4–<2.0|≥2.0–<2.8|≥2.8–<4.0|≥4.0|20.2|15.6|
|Lumbar lateral flexion (EDASMI)|11.99 ± 6.08|12.00|0.10|0.75–27.25|5.7|10.2|13.8|18.0|<5.7|≥5.7–<10.2|≥10.2–<13.8|≥13.8–<18.0|≥18.0|19.8|19.4|
|Chest expansion|4.14 ± 2.23|4.00|0.48|0.00–10.50|2.3|3.0|4.5|6.0|<2.3|≥2.3–<3.0|≥3.0–<4.5|≥4.5–<6.0|≥6.0|20.5|16.3|
|Hip internal rotation|37.59 ± 12.15|39.50|−0.41|0.00–74.00|27|36|42|47|<27.0|≥27–<36|≥36–<42|≥42–<47|≥47|20.2|19.8|
|BASMI|2.91 ± 2.46|2.00|0.78|0.00–10.00|1.00|2.00|3.00|6.00|–|–|–|–|–|0.8|14.8|
|Tragus-to-wall|14.61 ± 5.42|12.25|1.65|9.00–37.50|10.96|11.62|13.00|17.88|–|–|>30|15–30|<15|1.9|68.1|
|Cervical rotation (BASMI)|56.44 ± 19.85|60.00|−0.75|0.00–94.00|41.00|55.00|65.00|74.00|–|–|<20|20–70|>70|6.7|26.3|
|Lumbar side flexion (BASMI)|11.92 ± 6.27|12.00|0.06|0.75–25.00|5.95|10.00|13.75|17.75|–|–|<5|5–10|>10|15.6|59.3|
|Modified Schober's|4.66 ± 2.24|5.00|−0.29|0.00–9.00|2.38|4.46|5.50|6.50|–|–|<2|2–4|>4|17.2|62.2|
|Intermalleolar distance|95.85 ± 25.56|101.00|−0.77|13.00–160.00|76.60|92.10|106.0|117.0|–|–|<70|70–100|>100|16.0|50.2|
There was relatively little redundancy between mobility measures comprising the EDASMI, with the strongest correlation observed between CE and LLF (Pearson's correlation coefficient = 0.50) followed by IRH and LLF (Pearson's correlation coefficient = 0.43). Cronbach's alpha was 0.62, indicating that the different measures comprising the EDASMI record different aspects of axial mobility.
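Cronbach's alpha is computed from the item variances and the variance of the summed score. The sketch below uses the standard formula, which is assumed to be the one applied here; the function name is hypothetical.

```python
import statistics


def cronbach_alpha(items):
    """Cronbach's alpha for k items, each given as a list of per-patient scores."""
    k = len(items)
    if k < 2:
        raise ValueError("need at least 2 items")
    # Per-patient total across the k items (for the EDASMI, the 0-16 composite).
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(statistics.variance(col) for col in items)
    return k / (k - 1) * (1 - item_variance_sum / statistics.variance(totals))
```

Items that move together push alpha toward 1, whereas items that capture partly distinct aspects of mobility give lower values.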
Both intra- and interobserver reproducibility were very good to excellent for all mobility measures and were comparable between the 2 composite measures (Table 3). Reproducibility was somewhat lower for the 2 new mobility measures included in the EDASMI (CR and IRH). The 95% Bland-Altman limits of agreement were −2.6, 2.7 for the EDASMI and −1.8, 1.6 for the BASMI. Bland-Altman plots showed that measurement error was randomly distributed across the range of mean scores for both composite measures (data not shown).
|Measures|Intraobserver ICC, observer 1|Intraobserver ICC, observer 2|Interobserver ICC|
|---|---|---|---|
|Lumbar lateral flexion|0.99|0.98|0.98|
|Hip internal rotation|0.98|0.88|0.88|
|Modified Schober's test|0.97|0.99|0.97|
|Lumbar lateral flexion|0.92|0.97|0.92|
The EDASMI and BASMI scores demonstrated similar high degrees of correlation with age, disease duration, function (BASFI), and structural damage recorded on plain imaging (BASRI, mSASSS) (Table 4). Correlations were less evident with disease activity (BASDAI) and spinal stiffness (mean of items 5 and 6 of the BASDAI [data not shown]). EDASMI and BASMI scores for CR were significantly correlated (r = 0.34; 95% confidence interval 0.17, 0.45; P < 0.001) and demonstrated significant correlations with scores for item 8 of the BASFI, which primarily reflects the patient's self-reported ability to shoulder check (such as when driving), and with scores for the cervical component of the mSASSS (data not shown). EDASMI and BASMI scores for hip mobility correlated significantly with the BASFI (data not shown).
By hierarchical regression analysis with the EDASMI or BASMI as the dependent variable, the mSASSS added significantly to the variance in each composite after adjusting for disease duration (R² change = 0.11 [P < 0.01] and 0.26 [P < 0.001] for the EDASMI and BASMI, respectively). Adjusting for disease duration and the mSASSS, the BASDAI added significantly to the variance in the BASMI (R² change = 0.03 [P = 0.04]) but not the EDASMI (R² change = 0.03 [P = 0.06]).
With the EDASMI or the BASMI as independent variables, both composite indices added significantly to the variance in the BASFI after adjusting for age, disease duration, and the BASDAI (R² change = 0.13 and 0.17, respectively; P < 0.0001). When the individual items comprising the EDASMI and BASMI were entered into the regression model as independent variables, their contribution to the variance in the BASFI was 0.14 and 0.18, respectively.
Significant change was demonstrable in the EDASMI after 24 weeks of treatment (Table 5). LLF was the most responsive measure. Only 20% of the sample population had clinical evidence of hip disease, and therefore the responsiveness of the EDASMI and the BASMI was also calculated with the hip score excluded. Responsiveness was increased for the EDASMI (SRM 0.47, P = 0.01) but was unchanged for the BASMI. The differences noted for the BASMI, the BASMI score minus the intermalleolar distance, and the individual components of the BASMI were not significant.
|Parameter|Mean ± SD difference|95% CI|ES|SRM|P†|
|---|---|---|---|---|---|
|EDASMI| | | | | |
|Total|0.94 ± 2.33|0.11, 1.77|0.27|0.40|0.03|
|Cervical rotation|0.30 ± 0.98|−0.05, 0.65|0.19|0.31|NS|
|Chest expansion|0.21 ± 0.86|−0.09, 0.52|0.17|0.25|NS|
|Lateral lumbar flexion|0.33 ± 0.89|0.02, 0.65|0.23|0.37|0.04|
|Internal rotation of hip|0.09 ± 1.01|−0.27, 0.45|0.06|0.09|NS|
|EDASMI (minus hip)‡|0.85 ± 1.79|0.21, 1.48|0.29|0.47|0.01|
|BASMI| | | | | |
|Total|0.42 ± 1.28|−0.03, 0.88|0.18|0.33|NS|
|Tragus-to-wall distance|0.06 ± 0.35|−0.06, 0.18|0.12|0.17|NS|
|Cervical rotation|0.09 ± 0.38|−0.05, 0.23|0.21|0.24|NS|
|Modified Schober's|0.12 ± 0.55|−0.07, 0.31|0.15|0.22|NS|
|Lateral lumbar flexion|0.09 ± 0.46|−0.07, 0.25|0.11|0.20|NS|
|Intermalleolar distance|0.09 ± 0.52|−0.09, 0.28|0.15|0.17|NS|
|BASMI (minus IMD)‡|0.36 ± 1.06|−0.01, 0.74|0.18|0.34|NS|
We have developed an outcome tool for the assessment of spinal and hip mobility that appears to meet the standards of feasibility, truth, and discrimination, which have been used as the key elements of a filter to evaluate outcomes by OMERACT (20). A key attribute of this tool is its simplicity; all measures require a measuring tape only, and 3 of the 4 measures are conducted with the patient seated on the examining table. This may promote its acceptance in routine clinical care.
The wide distribution of scores for the EDASMI contrasts with the data for the BASMI in this study as well as an earlier report from the University of Alberta demonstrating substantial floor effects for the composite index as well as the individual measures (13). Between 60% and 83% of patients scored 0 for tragus-to-wall distance, modified Schober's test, and IMD using the 0–2 scoring scheme in the BASMI (13). Although the developers of the BASMI have also proposed a 0–10 scoring range for each measure in the BASMI (31), substantial floor effects are still observed with 3 measures: tragus-to-wall distance, modified Schober's test, and IMD (13).
The reliability of spinal mobility measures has varied considerably among studies at single sites that have mostly assessed reliability using 2 observers. However, studies where several observers have been used to assess reliability have provided evidence to support the reliability of CR, CE, and LLF (3, 4, 21). A systematic review of the literature came to the same conclusion (32). Although highly reproducible in our study and somewhat more reproducible than the tape-based method, a potential limitation of the BASMI approach to scoring CR is the marked skewing of scores, with only 13% of patients receiving a score of 2 on the 0–2 scale (data not shown). Evidence supporting the responsiveness of CR in patients undergoing short-term intensive physiotherapy (2), in patients reporting a transition in health status (22), and in a phase 3 trial of infliximab in patients with AS (12) indicates that CR should be included in a mobility index.
Although previous reports have raised concerns regarding the reliability of measurement of CE (4, 6), the scores in these studies were largely recorded by placing the tape across the fourth intercostal space or nipples. In contrast, reproducibility was excellent in our study and in a multicenter study (3). CE measurement is minimally affected by age or warmup (15, 32, 33); its responsiveness has been demonstrated in patients undergoing short-term intensive physiotherapy (2); and it has been shown to discriminate between treatment groups in patients receiving nonsteroidal antiinflammatory drugs (34), sulfasalazine (8), infliximab (12), and etanercept (9, 35, 36).
A specific approach to measurement of LLF for the EDASMI (24) was chosen to minimize the measurement error that can occur with flexion and rotational movements of the spine when the approach described for the BASMI is used. However, both methods were equally reliable in our study. Assessment of LLF responsiveness has been limited, although we have previously shown that it was the most responsive measure among the BASMI items in patients receiving pamidronate in a randomized controlled trial (13). It was also responsive to short-term intensive physiotherapy (2) and was shown to discriminate between patients receiving infliximab versus those receiving placebo (11, 12).
Assessment of IMD indicated excellent reliability comparable with that reported previously (10). Reliability of our approach to IRH evaluation was very good and was better than that reported with the use of a goniometer (21). Few studies have examined responsiveness. IRH demonstrated responsiveness comparable with CE, finger-to-floor distance, and CR in one study of short-term intensive physiotherapy (2).
Although the responsiveness of the EDASMI was somewhat better than that of the BASMI, it could be argued that this reflects the more limited scoring range (0–2) of the BASMI. However, we have previously shown that responsiveness is not improved by using a 0–10 scoring range for each item of the BASMI (13), which may be explained by the substantial floor effects noted for all the measures comprising the BASMI. Both composite indices should be further evaluated for responsiveness in clinical trials of highly effective therapies such as anti–tumor necrosis factor agents.
A potential limitation of our study is the fact that assessments were conducted in a sequential manner. This could serve to either increase measurement error through the phenomenon of warmup or, conversely, minimize the degree of intraobserver error. Two previous reports that used sequential methods of assessment with multiple observers have shown, however, that the major portion of measurement error is due to observer variability in assessment of mobility rather than the chronology of assessment (4, 21).
In conclusion, the EDASMI is a simple and reliable 4-item tool for the assessment of spinal and hip mobility in patients with AS that requires the use of a tape measure only and assesses CR, CE, LLF, and IRH. It covers the entire range of possible scores. Construct validity is comparable with the BASMI, whereas responsiveness is more evident using the EDASMI. The clinimetric properties of both instruments should be further examined in settings that include multiple observers.
- 12. Ankylosing Spondylitis Study for the Evaluation of Recombinant Infliximab Therapy Study Group. Efficacy and safety of infliximab in patients with ankylosing spondylitis: results of a randomized, placebo-controlled trial (ASSERT). Arthritis Rheum 2005; 52: 582–91.
- 18. Intensive in-patient physiotherapy courses improve movement and posture in ankylosing spondylitis. Physiotherapy 1986; 5: 238–40.
- 28. A radiographic scoring system and identification of variables measuring structural damage in ankylosing spondylitis [thesis]. Nijmegen, The Netherlands: University of Nijmegen; 1994.
- 30. Statistical methods in education and psychology. 3rd ed. Needham Heights (MA): Allyn and Bacon; 1996.
- 32. Health outcomes in ankylosing spondylitis: an evaluation of patient-based and anthropometric measures [DPhil thesis]. York, UK: University of York; 2000.
- 35. Enbrel Ankylosing Spondylitis Study Group. Recombinant human tumor necrosis factor receptor (etanercept) for treating ankylosing spondylitis: a randomized, controlled trial. Arthritis Rheum 2003; 48: 3230–6.
- 36. Responsiveness and discriminative capacity of the assessments in ankylosing spondylitis disease-controlling antirheumatic therapy core set and other outcome measures in a trial of etanercept in ankylosing spondylitis. Arthritis Rheum 2004; 51: 1–8.