Stability of caregiver-reported manual ability and gross motor function classifications of cerebral palsy


Acknowledgments

We gratefully acknowledge the contributions of the families who took part in the study and the generous assistance of Sue Reid, manager of the Victorian Cerebral Palsy Register, and the Royal Children’s Hospital orthopaedic department. This study involves secondary analyses of data gathered in a project that was supported by a grant from the Murdoch Childrens Research Institute and the first author’s doctoral scholarships from La Trobe University (2005–2006) and the Australian National Health and Medical Research Council/Cerebral Palsy Foundation (2007–2008). The authors acknowledge the contributions to the larger project by Professors Sheena Reilly and Karen Dodd.

Dr Christine Imms, School of Occupational Therapy, La Trobe University, Melbourne, VIC 3086, Australia. E-mail:


Aim  To examine the stability of caregiver-reported classifications of function of children with cerebral palsy (CP) measured 12 months apart.

Method  Participants were 86 children (50 males, 36 females) with CP of all motor types and severities who were recruited into a population-based longitudinal study. Children were aged 11 years 8 months (SD 6mo) on the first assessment and 12 years 8 months (SD 6mo) on the second assessment. Data were gathered through a postal survey. Caregivers reported on the Manual Ability Classification System (MACS), the Gross Motor Function Classification System (GMFCS), and other demographic characteristics. The percentage absolute agreement and the intraclass correlation coefficient (ICC) equivalent of the weighted kappa were calculated to assess consistency between assessments for the MACS and GMFCS. We also examined associations between changes in classification and background variables.

Results  Fifty-eight caregivers (67%) classified their child at the same MACS level on both assessments (ICC 0.92; 95% confidence interval [CI] 0.87–0.95), whereas 79% did so with the GMFCS (ICC 0.95; 95% CI 0.92–0.96). The evidence suggests that caregivers who were not born in Australia or who spoke a language other than English in the home were more likely to classify their child differently on the MACS at the second assessment, although this was not evident for the GMFCS.

Interpretation  Caregiver-reported MACS and GMFCS levels were generally stable over 12 months.


MACS  Manual Ability Classification System
SATI  School-Aged Temperament Inventory
SEIFA  Australian Bureau of Statistics Socio-Economic Index for Areas

The Manual Ability Classification System (MACS) is a recently developed tool to assess children’s ability to handle objects in daily life.1 The MACS was designed for children with cerebral palsy (CP) aged 4 to 18 years. It is commonly used together with the Gross Motor Function Classification System (GMFCS), which classifies children’s ability in relation to walking and sitting.2,3 The classifications add valuable functional information to the CP diagnosis.4 Both classifications group individuals into one of five levels, which reflect differences within the child’s daily life that are meaningful for families (see Appendix SI, published online, for brief descriptors of each level for both tools). Gross motor function and manual ability in CP are not equivalent entities. Only around 50% of children are classified at the same MACS and GMFCS level, confirming the need to use both classifications.1,5 Both classifications aim to enhance communication between families and professionals when describing a child’s motor function, setting goals, and making management decisions.4 Therefore, it is important to investigate the validity, reliability, and stability of the classifications between and within users to make sure that understanding and interpretation of the classifications are similar and consistent over time.

The GMFCS is well established, with evidence of validity, reliability, clinical utility, and stability of scoring for professionals.2,3,6 There is also high interrater reliability between caregivers and professionals,7,8 but so far there is no evidence of stability of the scoring for caregivers. The MACS is relatively new but has already gained much international attention and has been translated into 15 languages. There is some evidence for validity and reliability of clinician and caregiver ratings.1,9 To our knowledge, no one has yet investigated the stability of MACS ratings over time for either professionals or caregivers.

Stability of measurement in the absence of change in the child is very important. We do not expect children to change levels within the GMFCS or the MACS, and there is evidence supporting this expectation for the GMFCS: 73% of children remained in the same level over 6 months when scored by professionals.6 We aimed to test our expectation that the MACS level would remain the same over 1 year and to examine whether it was possible to identify children who were more likely to be rated differently at the second assessment.

Because the MACS is new it has been applied in different ways. Data have been gathered following verbal introduction of the tool1 and by postal survey, chart review, and observation.9–11 Emergent results need to be examined to build a body of evidence regarding the measurement properties of the tool. We need to understand whether there are variables that make it more difficult to classify children consistently or whether any variables influence raters’ ability to classify consistently.

The purpose of the present study was to examine whether caregiver-reported classification of function of children with CP was stable between two assessments 12 months apart. In addition, we aimed to identify which variables associated with the child or the caregiver appeared to be important when caregivers reported a different classification level at the second assessment.


Participants in this study were enrolled in a longitudinal study of the participation of children with CP in social and recreational activities. All eligible children were born in Victoria, Australia, in 1994 or 1995 and were identified using the Victorian Cerebral Palsy Register. Figure 1 shows the flow of participants through the longitudinal study. Eighty-six children provided data for the present analysis, accounting for 40% of the known living population of children born in Victoria in those 2 years who had CP. Ethical approvals were obtained from the Royal Children’s Hospital and La Trobe University, Melbourne; children and families gave informed consent for their participation and the publication of results.

Figure 1. Flow of participants through the study. VCPR, Victorian Cerebral Palsy Register.

As participants were geographically dispersed, data were gathered through a postal survey, with three reminders, in September and October of 2006 and 2007. Respondents for this study were parents or caregivers of the children. For consistency of reporting, the term caregiver is used throughout the paper.

Caregivers were provided with a description of the MACS and instructions for completing the scale, including distinctions between the levels,12 and the GMFCS family report questionnaire for 6- to 12-year-olds.13 Independent variables selected for inclusion in this study were those that we judged may make a child more difficult to classify consistently, as well as variables that may influence a caregiver’s ability to classify. Age was not included as the study was restricted to two birth years. The variables related to the child were sex, communication ability, presence of epilepsy, specific learning disability, or intellectual disability, type of school attended by the child, and whether the child’s health had been stable in the previous 4 months. Child’s health status included major surgical events and was requested for the previous 4 months as this was the time-frame for the primary outcome of the longitudinal study. In addition, the School-Aged Temperament Inventory (SATI)14 was completed, which provides a measure of temperament in the following four domains: negative reactivity, task persistence, approach/withdrawal, and activity. Scores for each domain range from 1 to 5, with a high score indicating that a child is highly reactive, lacks persistence, is shy, or is highly active respectively. The SATI is a valid and reliable measure for Australian children.15,16 In the present study, the SATI was slightly modified for use with children with CP (modifications available on request) and was completed only by caregivers of children at GMFCS levels I to IV.

The variables related to the caregivers were whether they were born in Australia and whether they spoke a language other than English in the home. In addition, a socio-economic index was obtained for the local area of residence of each family, using the Australian Bureau of Statistics Socio-Economic Index for Areas (SEIFA) advantage/disadvantage score. This score was extracted at the level of the national census collector district (including around 200 households) and has a national mean of 1000 (SD 100).17 High SEIFA scores indicate relatively high proportions of people with high incomes or in skilled work.


Participant characteristics (sex, year born, GMFCS level, CP type and distribution, and SEIFA score) were compared with those of non-participants using data obtained from the Victorian Cerebral Palsy Register (de-identified for non-participants) to examine the extent to which participants were representative of the eligible population.

Absolute agreement between caregiver ratings on the MACS and GMFCS at each assessment is reported as a percentage. Stability of caregiver ratings was evaluated using the intraclass correlation coefficient (ICC), estimated using the standard method of moments from the one-way analysis of variance. McNemar’s test for paired proportions was used to evaluate the evidence for a systematic shift to higher or lower classifications between assessments. Children were categorized according to whether their classification changed between assessments, and the association between this indicator variable and background characteristics was examined, using the χ2 test for dichotomous characteristics and the independent-samples t-test for continuous variables. Data were analyzed using SPSS, version 14 (SPSS Inc., Chicago, IL, USA). p values were interpreted according to Sterne and Davey-Smith18 as indicating the strength of the evidence against a null hypothesis (i.e. the smaller the p value, the stronger the evidence), rather than using arbitrary thresholds to declare significance or otherwise.
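As an illustration of the ICC estimate described above, the following minimal Python sketch computes a one-way random-effects ICC by the method of moments from paired ratings. The function name `icc_oneway` and the six rating pairs are hypothetical and are not study data; the study analyses themselves were run in SPSS.

```python
from statistics import mean


def icc_oneway(ratings):
    """One-way random-effects ICC by the method of moments.

    ratings: list of per-child tuples, each holding the same number k
    of ratings (here k = 2, one rating per assessment).
    """
    n = len(ratings)          # number of children
    k = len(ratings[0])       # ratings per child
    grand = mean(x for row in ratings for x in row)
    row_means = [mean(row) for row in ratings]

    # Between-children and within-child mean squares from one-way ANOVA
    msb = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    msw = sum(
        (x - m) ** 2 for row, m in zip(ratings, row_means) for x in row
    ) / (n * (k - 1))

    return (msb - msw) / (msb + (k - 1) * msw)


# Hypothetical level pairs (assessment 1, assessment 2); NOT study data
example_pairs = [(1, 1), (2, 2), (3, 2), (4, 4), (5, 5), (2, 3)]
print(f"ICC = {icc_oneway(example_pairs):.2f}")
```

With only two ratings per child, the estimate reduces to the difference between the two mean squares divided by their sum, so any disagreement between assessments lowers the ICC through the within-child term.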



Data were available at both assessments for 86 children, who had a mean age of 11 years 8 months (SD 6mo) at the first assessment and 12 years 8 months (SD 6mo) at the second assessment. Only minor differences were found in the comparison between participants and non-participants using data from the Victorian Cerebral Palsy Register (Table I). Table II provides caregiver-reported participant characteristics and data obtained at the first assessment for each variable considered in the study.

Table I. Comparison between participants and non-participants using Victorian Cerebral Palsy Register data, n (%)

| Characteristic | Participants (n=86) | Non-participants (n=133) | p |
|---|---|---|---|
| Sex | | | |
| Male | 50 (58) | 74 (56) | 0.797 |
| Female | 36 (42) | 59 (44) | |
| Year born | | | |
| 1994 | 36 (42) | 66 (50) | 0.151 |
| 1995 | 50 (58) | 67 (50) | |
| Gross Motor Function Classification System level | | | |
| I | 21 (24) | 43 (32) | 0.336 |
| II | 24 (28) | 27 (20) | |
| III | 9 (11) | 15 (11) | |
| IV | 12 (14) | 16 (12) | |
| V | 13 (15) | 22 (17) | |
| Missing | 7 (8) | 10 (8) | |
| Cerebral palsy distribution | | | |
| Monoplegia | 0 | 1 (1) | 0.529 |
| Hemiplegia | 23 (27) | 47 (35) | |
| Diplegia | 27 (31) | 30 (23) | |
| Triplegia | 2 (2) | 3 (2) | |
| Quadriplegia | 31 (36) | 46 (34) | |
| Missing | 3 (3) | 6 (5) | |
| Cerebral palsy motor type | | | |
| Spasticity | 74 (86) | 120 (90) | 0.316 |
| Dystonia/athetosis | 8 (10) | 5 (4) | |
| Ataxia | 2 (2) | 6 (5) | |
| Hypotonic | 1 (1) | 2 (1) | |
| Missing | 1 (1) | 0 | |
| SEIFA score, mean (SD) | 999.5 (88.7) | 999.1 (87.6) | 0.987 |

All data de-identified for non-participants; p value from χ² analyses except for SEIFA score, where it is from a t-test. SEIFA, Socio-Economic Index for Areas.

Table II. Caregiver-reported characteristics for the 86 participants at the first assessment

| Child characteristics | n (%) |
|---|---|
| Sex | |
| Male | 50 (58) |
| Female | 36 (42) |
| Year born | |
| 1994 | 36 (42) |
| 1995 | 50 (58) |
| Age, y, mean (SD) | 11.69 (0.54) |
| GMFCS level | |
| I | 17 (20) |
| II | 32 (37) |
| III | 9 (11) |
| IV | 8 (9) |
| V | 20 (23) |
| MACS level | |
| I | 17 (20) |
| II | 29 (34) |
| III | 14 (16) |
| IV | 8 (9) |
| V | 18 (21) |
| Cerebral palsy distribution | |
| Hemiplegia | 33 (39) |
| Diplegia or triplegia | 20 (23) |
| Quadriplegia | 32 (37) |
| Missing | 1 (1) |
| Cerebral palsy motor type | |
| Spasticity | 59 (69) |
| Dystonia | 5 (6) |
| Ataxia | 3 (3) |
| Mixed | 10 (12) |
| Hypotonic | 5 (6) |
| Missing | 4 (4) |
| Difficulty communicating | 39 (45) |
| Epilepsy | 24 (28) |
| Specific learning disability | 40 (47) |
| Intellectual disability | 33 (38) |
| School type | |
| Mainstream | 56 (65) |
| Special | 25 (29) |
| Mixed | 5 (6) |
| Stable health | 73 (85) |
| Temperament score, mean (SD)ᵃ | |
| Negative reactivity (n=74) | 3.05 (0.87) |
| Task persistence (n=67) | 3.06 (0.77) |
| Approach/withdrawal (n=76) | 2.60 (0.82) |
| Activity (n=71) | 2.60 (0.90) |
| Caregiver characteristics | |
| Born in Australia | 64 (74) |
| Language other than English | 10 (12) |
| SEIFA score, mean (SD) | 999.5 (88.7) |

ᵃTemperament scores are from the School-Aged Temperament Inventory; range for each domain 1–5. MACS, Manual Ability Classification System; GMFCS, Gross Motor Function Classification System; SEIFA, Australian Bureau of Statistics Socio-Economic Index for Areas (population mean score 1000, SD 100).

Stability of caregiver MACS and GMFCS ratings

Overall, 58 caregivers rated their child at the same MACS level at both assessments, demonstrating a 67% absolute agreement in ratings (Table III). Nine caregivers (10%) rated their child one level higher (indicating less ability) at the second assessment, and 19 caregivers (22%) rated their child one level lower, but the evidence for a change towards lower ratings was inconclusive (exact McNemar p=0.09). No caregiver changed the MACS rating by more than one level. Stability was high, with an ICC of 0.92 (95% confidence interval [CI] 0.87–0.95).

Table III. Cross-tabulation of MACS ratings for assessments 1 and 2. Rows: MACS assessment 1 rating, n; columns: MACS assessment 2 rating, n, and total. Absolute agreement counts are highlighted. MACS, Manual Ability Classification System.

For the GMFCS, 68 caregivers rated their child at the same level at both assessments, providing an absolute agreement of 79% (Table IV). Eight caregivers (9%) rated their child one level higher (indicating less ability) at the second assessment, nine rated the child one level lower, and one caregiver rated the child two levels lower (exact McNemar for systematic change p=0.8; ICC 0.95; 95% CI 0.92–0.96).
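The agreement percentages and exact McNemar p values reported above can be checked from the counts given in the text. The Python sketch below assumes that the test was applied to the direction of change (rated higher versus rated lower) and treats it as an exact two-sided binomial test on the discordant pairs. The function names are illustrative, and this is not the authors' SPSS procedure.

```python
from math import comb


def percent_agreement(same, total):
    """Percentage of children given the same level at both assessments."""
    return 100 * same / total


def exact_mcnemar(b, c):
    """Two-sided exact McNemar p value from the two discordant counts.

    Under the null hypothesis of no systematic shift, each discordant
    pair is equally likely to fall in either direction, so the smaller
    count follows a Binomial(b + c, 0.5) distribution.
    """
    n = b + c
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Counts taken from the text (86 children in total)
macs = {"same": 58, "higher": 9, "lower": 19}
# GMFCS "lower" assumes 9 rated one level lower plus 1 rated two levels lower
gmfcs = {"same": 68, "higher": 8, "lower": 10}

for name, counts in (("MACS", macs), ("GMFCS", gmfcs)):
    agreement = percent_agreement(counts["same"], 86)
    p_value = exact_mcnemar(counts["higher"], counts["lower"])
    print(f"{name}: agreement {agreement:.0f}%, exact McNemar p = {p_value:.2f}")
```

Under that assumption the sketch gives agreement of about 67% and 79% and p values of about 0.09 and 0.8, in line with the figures reported for the MACS and GMFCS respectively.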

Table IV. Cross-tabulation of GMFCS classifications for assessments 1 and 2. Rows: GMFCS assessment 1 rating, n; columns: GMFCS assessment 2 rating, n, and total. Absolute agreement counts are highlighted. GMFCS, Gross Motor Function Classification System.

Although there was a clear relationship between MACS and GMFCS ratings (Spearman’s correlation 0.74 at the first assessment and 0.72 at the second assessment), the absolute agreement between MACS and GMFCS levels was only 55% across both assessments. There was no evidence that caregivers who rated the MACS level differently at the second assessment were more likely to rate the GMFCS level differently (p=0.52).

Variables associated with a change in MACS rating

There was no evidence that the child-level variables (see Table V for categorical data), including health status at time of scoring, were associated with an unstable MACS rating. There was slight evidence that children who had higher negative reactivity scores on the SATI were more likely to be rated differently on the MACS at the second assessment (p=0.08), but no evidence that the other three SATI domains were important (approach/withdrawal p=0.21; task persistence p=0.55; activity p=0.15). Caregivers who were not born in Australia (p=0.04) and those who spoke a language other than English in the home (p=0.05) appeared to be more likely to classify their child at a different MACS level at the second assessment (see Table V). The family’s SEIFA score did not appear to be related to the stability of MACS ratings (p=0.22).

Table V. Characteristics of participants classified at a different level on the MACS and GMFCS at the second assessment, n (%)

| Characteristic | MACS rating changed at assessment 2 | p value (χ²) | GMFCS rating changed at assessment 2 | p value (χ²) |
|---|---|---|---|---|
| Sex | | | | |
| Male (n=50) | 17 (34) | 0.74 | 8 (16) | 0.19 |
| Female (n=36) | 11 (31) | | 10 (28) | |
| MACS level | | | | |
| I (n=17) | 5 (29) | 0.36 | | |
| II (n=29) | 8 (28) | | | |
| III (n=14) | 7 (50) | | | |
| IV (n=8) | 4 (50) | | | |
| V (n=18) | 4 (22) | | | |
| GMFCS level | | | | |
| I (n=17) | | | 5 (29) | 0.10 |
| II (n=32) | | | 8 (25) | |
| III (n=9) | | | 2 (22) | |
| IV (n=8) | | | 3 (38) | |
| V (n=20) | | | 0 | |
| Difficulty communicatingᵃ,ᵇ | | | | |
| Yes (n=39) | 14 (34) | 0.95 | 5 (13) | 0.05 |
| No (n=41) | 12 (31) | | 13 (32) | |
| Epilepsy | | | | |
| Yes (n=24) | 6 (25) | 0.35 | 3 (12) | 0.23 |
| No (n=62) | 22 (35) | | 15 (24) | |
| Specific learning disability | | | | |
| Yes (n=40) | 12 (30) | 0.64 | 5 (12) | 0.07 |
| No (n=46) | 16 (35) | | 13 (28) | |
| Intellectual disability | | | | |
| Yes (n=33) | 9 (27) | 0.41 | 4 (12) | 0.11 |
| No (n=53) | 19 (36) | | 14 (26) | |
| School type | | | | |
| Mainstream (n=56) | 21 (37) | 0.18 | 16 (29) | 0.02 |
| Special (n=30) | 7 (23) | | 2 (7) | |
| Stable health at assessment 1ᵃ | | | | |
| Yes (n=73) | 25 (34) | 0.64 | 17 (23) | 0.44 |
| No (n=12) | 3 (25) | | 1 (8) | |
| Stable health at assessment 2ᵃ | | | | |
| Yes (n=72) | 24 (33) | 0.77 | 16 (22) | 0.75 |
| No (n=13) | 4 (31) | | 2 (15) | |
| Caregiver born in Australiaᵇ | | | | |
| Yes (n=64) | 17 (27) | 0.04 | 14 (22) | 0.75 |
| No (n=20) | 9 (45) | | 4 (20) | |
| Caregiver’s language other than Englishᵇ | | | | |
| Yes (n=10) | 5 (50) | 0.05 | 1 (10) | 0.49 |
| No (n=74) | 21 (28) | | 17 (23) | |

ᵃMACS ratings are missing from four participants for communication difficulty and one participant for health stability. ᵇGMFCS ratings are missing from two participants for communication difficulty, two participants for caregiver’s place of birth, and two participants for caregiver’s language at home. MACS, Manual Ability Classification System; GMFCS, Gross Motor Function Classification System.

Variables associated with a change in GMFCS rating

Children with more severe problems with communication (p=0.05) or learning (p=0.07) and those who attended special schools (p=0.02) tended to have more stable ratings on the GMFCS (see Table V). Scores on the SATI were not associated with changes in GMFCS rating (negative reactivity p=0.56; approach/withdrawal p=0.86; task persistence p=0.30; activity p=0.74). There was no evidence that caregivers who were born in Australia (p=0.75) or who spoke a language other than English (p=0.49) rated their child at a different level at the second assessment, nor that the family’s SEIFA score (p=0.25) was related to stability of GMFCS ratings over time (see Table V).


Caregivers provided a stable rating for both the MACS and the GMFCS over time, with ICCs >0.9. This suggests that the caregivers’ rating is sufficient for clinical use with individuals and more than adequate for population-based research.19 The stability of caregiver reporting of the GMFCS in this study is almost the same as that reported for professionals: Palisano et al.6 reported an 86.7% agreement at a second assessment 12 months after the first.

GMFCS ratings were a little more stable than MACS ratings in the present study. This may be because the GMFCS is built on a simpler concept than the MACS, but the GMFCS also showed much lower stability in an earlier study20 than in a more recent study.6 This suggests that increasing familiarity with the GMFCS may result in increased stability of scoring. The MACS is a newly developed classification, and caregivers in the present study may have never seen it before, whereas they might previously have discussed GMFCS ratings with their child’s therapists. In addition, interrater reliability between caregivers and professionals was previously shown to be lower when the MACS leaflet was sent out by mail, as in the current study,9 than when families and professionals were provided with verbal information to ensure that they had understood the construct behind the MACS.1 Our results suggest that this additional information about how to rate the MACS might be more important if the tool is going to be provided in a language that is not the native tongue of the rater.

When investigating the stability of ratings over time there are always some differences between assessments.21 It is important to understand whether this variability depends on variation within the participants themselves or within the raters’ understanding of the classifications or assessments. Therefore, we were interested in using additional information about both the children and the caregivers for further analysis in this study. The most important finding for rater stability was the need for the MACS to be in the respondent’s own language. Twenty-six percent of the caregivers were born outside Australia, and 12% spoke a language other than English at home; these groups were more likely to change their ratings. However, only the stability of the MACS ratings, and not of the GMFCS ratings, was associated with language. One explanation might be that the MACS information sent out to the families was more detailed than the GMFCS parent form.

The MACS has already been translated into 15 languages, demonstrating that clinicians have understood the importance of language to the scoring process. It also highlights the need for cultural validation when the tool is used in cultures other than those typical of developed countries. So far, cultural validation has been carried out for the MACS only in Turkey (Akpinar P, personal communication, 2008), where it was demonstrated that after careful translation the MACS can be used with high inter- and intrarater reliability. Although respondent language was an important consideration for the MACS, the stability of ratings was not associated with the SEIFA score, which provides a proxy measure of parental education, for either the MACS or the GMFCS.

The caregivers who assigned a different MACS rating were not always the same as those who changed the GMFCS rating at the second assessment, which suggests different reasons for variation in ratings for the two tools. When the children were rated differently at the second assessment, it may be because they had changed their performance, becoming more or less able, or that the children’s performance sat on the cusp between two levels. It may also be that caregivers rated them incorrectly at the first or second assessment (or both), or that the caregivers changed their perspective of the child’s performance between assessments.

Few of the child-related variables that we considered might have played a role in influencing stability of scoring appeared important, although the statistical power available for these comparisons was limited. For the MACS, children with more difficult temperaments, who may not perform consistently, were more likely to be scored differently, which is plausible. For the GMFCS, most changes in rating occurred in those with higher cognitive and communication ability who were at mainstream schools. This result is harder to understand. Further examination of this outcome requires a larger sample at each level of the GMFCS.

The children’s general health over the previous 4 months and the presence of epilepsy did not appear to influence the stability of ratings on either tool. This suggests that caregivers were able to distinguish between usual performance and health-affected performance in the children.

The study findings are limited by the inclusion of children within a restricted age range (11–12-year-olds) and would be enhanced by replication that included a cohort of participants across the age range of the measures and more than two assessments. In addition, the study sample included only 40% of the known living population of children with CP in Victoria, which may result in selection bias. However, there was little evidence that participants were different from non-participants on variables such as sex, GMFCS level, CP type and distribution, or SEIFA score.

In summary, caregivers of children with CP were consistent in their reporting of both MACS and GMFCS levels for their children across two assessments 12 months apart. The findings of the current study contribute to the growing body of evidence supporting the use of both the MACS and the GMFCS for clinical practice and research.