With increased knowledge from neonatal neurological examination and the development of neuroimaging, prediction of cerebral palsy (CP) has for the last three decades been the focus of a large amount of research. Neonatal neurological examination is a complex process and several groups have proposed standardized approaches towards the goal of producing valid and consistent results between examiners. However, there is a lack of a criterion standard instrument for identifying neurological abnormality in neonates. The Prechtl Method, with the assessment of general movements, is one of the methods of neurological examination. It has been extensively evaluated and is considered to have good sensitivity and specificity. However, the interrater reliability is low unless intensive training is available. Consequently, the search for an automatic assessment of general movements continues.
Philippi et al., using a magnetic tracking system, first tried to identify the paradigms that appeared to differentiate between infants who did and those who did not develop CP. Secondly, they compared the predictive value of the different kinematic assessments with that of clinical Gestalt perception with respect to abnormal neurodevelopmental outcome at the age of 2 years. Kinematic assessment was expressed as a continuous variable, while clinical assessment was a qualitative variable. The population studied was highly selected, with 49 infants out of 66 at high risk of CP and a prevalence of CP of 14.9%. The results showed that the sensitivity and specificity of the two assessments were very similar, but that the positive predictive value was higher for the kinematic assessment. This might suggest that kinematic assessment could be more relevant than clinical evaluation for the prediction of CP, but the results should be interpreted with caution.
Sensitivity is the proportion of infants with CP who are correctly identified by the test, while specificity is the proportion of infants free of CP who are correctly identified by the test. Sensitivity and specificity are prevalence-independent measures, whereas positive and negative predictive values are highly dependent on the prevalence of the disease. The prevalence of CP in the population studied does not represent the true prevalence of CP, which is around 2 to 3 per thousand live births in term infants and 5% to 10% of births in preterm infants. Consequently, extrapolation of these results to the general population is difficult. The observed differences in the positive and negative predictive values, despite very similar sensitivity and specificity, could be related to the fact that clinical assessment is a qualitative variable while kinematic assessment is a continuous variable. This, however, deserves further evaluation.
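The dependence of predictive values on prevalence can be illustrated with a short calculation. The sketch below applies Bayes' rule to compute positive and negative predictive values at the prevalences quoted above. The sensitivity and specificity of 0.90 are hypothetical round numbers chosen purely for illustration, not values reported by Philippi et al.; the midpoints used for the term and preterm prevalences, and the function name `predictive_values`, are likewise our own assumptions.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    # Expected fractions of the whole population in each cell of the 2x2 table.
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical test characteristics (assumption, not taken from the study).
SENSITIVITY = 0.90
SPECIFICITY = 0.90

# Prevalences from the text; midpoints of the quoted ranges are assumptions.
prevalences = {
    "study population (14.9%)": 0.149,
    "preterm infants (midpoint of 5% to 10%)": 0.075,
    "term infants (midpoint of 2 to 3 per 1000)": 0.0025,
}

results = {
    label: predictive_values(SENSITIVITY, SPECIFICITY, p)
    for label, p in prevalences.items()
}

for label, (ppv, npv) in results.items():
    print(f"{label}: PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

Under these assumed values, the positive predictive value falls from roughly 61% at the study prevalence to about 2% at the term-infant prevalence, even though sensitivity and specificity are unchanged. This is why predictive values obtained in a highly selected sample cannot be carried over directly to the general population.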
Nevertheless, the stereotypy score defined with kinematic assessment appears promising in predicting CP. The next step could be to define the population for which the kinematic evaluation is of interest and the respective place of clinical and kinematic assessment.
The goal of diagnostic tools during the last few decades has been to enhance the human eye with technology in order to have a more objective measure of a phenomenon. Neonatal behaviour has been the subject of such research as it is very difficult to interpret. However, we now increasingly realize that arriving at a prognosis is a complex process, and prediction of CP could be a paradigm of this complexity. When predicting CP in a neonate, we often encounter vulnerable families, with a history of a stressful neonatal period. These families need evidence-based information but also a relationship-based approach which requires time to develop. Observing the behaviour and the motoric organization of an infant is a powerful tool to build trust with families and to support them in the demands of parenting an infant at risk of developmental anomalies.
In the study by Philippi et al., the interrater reliability of general movements observation was moderate overall: agreement appeared excellent among three raters, while one rated differently. Do we conclude that the interrater reliability is low, or that one of the examiners needs further training? One of our challenges for the next generation could be to give practitioners time to learn how to interpret the neurobehaviour of an infant while integrating the strengths and limitations of technical tools.