Prof D Ayres-de-Campos, Departamento de Ginecologia e Obstetrícia, Faculdade de Medicina do Porto, 4200-309 Porto, Portugal. Email email@example.com
Please cite this paper as: Ayres-de-Campos D, Arteiro D, Costa-Santos C, Bernardes J. Knowledge of adverse neonatal outcome alters clinicians’ interpretation of the intrapartum cardiotocograph. BJOG 2011;118:978–984.
Objective To evaluate the impact of knowledge of neonatal outcome on clinicians’ interpretation of the intrapartum cardiotocograph (CTG).
Design Prospective evaluation of pre-recorded cases.
Setting Five maternity hospitals.
Population From a database of intrapartum CTGs acquired with a scalp electrode in singleton near-term fetuses, 20 tracings were sequentially selected from cases with newborn umbilical artery pH < 7.05 and 20 from cases with pH > 7.20.
Methods Five experienced obstetricians practising in different maternity hospitals were asked to analyse the 40 tracings individually, according to the International Federation of Gynaecology and Obstetrics guidelines. In a first round, clinicians were given no information on neonatal outcome. In a second round, carried out 2 months later, clinicians were asked to analyse the same tracings, but the order was randomly altered and information on the newborn’s arterial pH was provided. Clinicians were not informed of the purpose of the study or whether the tracings were the same.
Main outcome measures The incidence of individual fetal heart rate feature identification and tracing classification, before and after neonatal outcome was made available.
Results In the group with pH < 7.05, repetitive decelerations and reduced variability were more common in the second round (P < 0.001 and P = 0.001, respectively), as was a pathological classification (P = 0.002); variable decelerations were less common (P = 0.008). In the group with normal pH, fewer tracings in the second round had prolonged decelerations (P = 0.013) and no accelerations (P = 0.013), but more had pronounced decelerations (P = 0.031) and reduced variability (P = 0.007); there was a reduction in pathological classifications, but this difference failed to reach statistical significance (P = 0.051).
Conclusions A knowledge of adverse neonatal outcome leads to a more severe classification of the intrapartum CTG, which derives mainly from the evaluation of decelerations and variability.
Introduction
Intrapartum asphyxia is responsible for a substantial number of perinatal deaths and long-term sequelae,1 and reviews carried out in the context of confidential enquiries have shown that problems related to cardiotocograph (CTG) interpretation are frequently reported.2 Retrospective analysis of CTG tracings also plays an important role in situations involving insurance payments or malpractice claims, where adverse outcome is usually the rule. Case reviews within hospital departments or at regional medical boards, or even in a research context,3–5 also frequently focus on cases with poor perinatal outcome. In all these settings, a failure or delay in reacting to CTG changes judged to be suggestive of hypoxia is a common finding.2–6
However, it is possible that the a posteriori analysis is influenced by the existing knowledge of adverse outcome. In a technology for which poor intra- and inter-observer agreement on interpretation has been consistently demonstrated,7–9 a knowledge of adverse perinatal outcome may have a unidirectional effect, leading to a consistently more severe interpretation. In other areas of medicine, a knowledge of adverse outcome has been shown to have a negative influence on reviewers’ opinions on the appropriateness of care.10
The implications of this phenomenon in medical–legal cases, as well as in clinical and research settings, are extremely relevant. A negative a posteriori evaluation of clinical judgements has consequences that can go beyond the loss of personal self-esteem and a decrease in motivation, extending into the economic and legal realms.
The aim of this study was to evaluate the impact of prior knowledge of neonatal outcome on clinicians’ interpretation and classification of the intrapartum CTG. The International Federation of Gynaecology and Obstetrics (FIGO) guidelines for CTG interpretation11 were used for this purpose, as they represent the largest international consensus effort in this area, and are still widely used throughout the world.
Population and methods
A database of intrapartum CTG tracings, acquired during a previously reported trial,12 was sequentially searched in order to select cases with a duration of at least 90 minutes and with an interval between tracing end and birth of less than 20 minutes. Participants in this study were women with singleton fetuses, in cephalic presentation, in active labour at more than 36 completed gestational weeks and in whom a clinical decision was made to apply a fetal scalp electrode and external tocodynamometer for continuous fetal monitoring, using the STAN® 21 monitor (Neoventa Medical, Gothenburg, Sweden). Monitoring with STAN® 21 was the preferred method of surveillance in high-risk pregnancies, women with suspicious or abnormal external cardiotocography, induced or augmented labour, meconium-stained amniotic fluid or epidural analgesia. Local research ethics committee approval was obtained for the trial, and all participating women signed a written informed consent allowing their data to be used for research purposes. Only anonymised data and CTG tracings were used in the present study; no ST information was made available.
As no previous data were available to estimate the sample size needed for this study, a pragmatic approach was chosen, selecting the maximum number of cases that observers were thought to be capable of analysing in a single session.
All selected cases had valid paired umbilical blood gas results,13 performed within 30 minutes of birth. Twenty tracings were sequentially selected from cases with newborn umbilical artery pH < 7.05 and 20 from cases with arterial pH > 7.20. The order in which tracings were presented to clinicians was determined by computer-generated randomisation.
The last 90 minutes of the 40 CTG tracings were sent by email to five obstetricians who had between 9 and 27 years (9, 18, 20, 22 and 27 years) of experience in labour ward management, and were practising in five different maternity hospitals. Two had previously been expert witnesses in court cases. They were requested to fill in a questionnaire (Figure 1) with their evaluation of CTG features and general classification of tracings according to FIGO guidelines.11 These guidelines were regularly used in their centres, and a copy was also supplied together with the tracings. Clinicians were informed that tracings had been acquired in term singleton pregnancies, but no explanation was given with regard to the study’s purpose. The tracings covered the last 90 minutes of recording before birth, and were printed at a paper speed of 1 cm/minute (Figure 1).
In the first round, clinicians were asked to evaluate the 40 tracings individually, with no information on neonatal outcome. Two months later, the order of the tracings was altered, using computer-generated randomisation, and the clinicians were asked to evaluate the same 40 tracings, but this time each case contained information on the newborn’s umbilical artery pH (pH < 7.05 or pH > 7.20). Clinicians were again not informed of the purpose of the study; nor were they told whether or not the second set of tracings was the same as the first. The FIGO guidelines were once more supplied.
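The reordering of tracings between rounds can be illustrated with a minimal Python sketch. The tracing identifiers, the seeds and the helper name `presentation_order` are hypothetical; the text states only that computer-generated randomisation was used, not which software or procedure.

```python
import random


def presentation_order(tracing_ids, seed):
    """Return a randomly shuffled copy of the tracing identifiers."""
    rng = random.Random(seed)  # seeded only so that the sketch is reproducible
    order = list(tracing_ids)  # copy, so the original list is left untouched
    rng.shuffle(order)
    return order


# Hypothetical identifiers for the 40 selected tracings
tracings = [f"T{i:02d}" for i in range(1, 41)]

# A different seed per round gives an independent order for each round
round_one = presentation_order(tracings, seed=1)
round_two = presentation_order(tracings, seed=2)
```

Each round presents the same 40 tracings, so any difference in interpretation between rounds cannot be attributed to a different case mix.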
To compare the differences between the two rounds with regard to the identification of fetal heart rate baseline, accelerations, decelerations, variability and overall tracing classification, the McNemar and McNemar–Bowker tests were used. Significance was set at P < 0.05.
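As an illustration of the paired comparisons described above, the following Python sketch computes an exact two-sided McNemar P value for a binary feature, and the McNemar–Bowker statistic for a classification with more than two categories. It is a sketch under stated assumptions: the text does not specify whether exact or chi-square versions of the tests were used, and all counts shown are hypothetical, not study data.

```python
from math import comb


def mcnemar_exact(b, c):
    """Exact two-sided McNemar P value.

    b: pairs where the feature was identified in round 1 only
    c: pairs where the feature was identified in round 2 only
    Concordant pairs do not enter the test.
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    # Lower binomial tail P(X <= k) for X ~ Binomial(n, 0.5), doubled
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def bowker_statistic(table):
    """McNemar-Bowker symmetry statistic for a square paired table.

    table[i][j] counts cases rated category i in round 1 and j in round 2.
    Returns (statistic, degrees of freedom); the statistic is referred to
    a chi-square distribution with that many degrees of freedom.
    """
    size = len(table)
    stat, df = 0.0, 0
    for i in range(size):
        for j in range(i + 1, size):
            total = table[i][j] + table[j][i]
            if total > 0:  # empty off-diagonal pairs carry no information
                stat += (table[i][j] - table[j][i]) ** 2 / total
                df += 1
    return stat, df


# Hypothetical discordant counts for one CTG feature
p_value = mcnemar_exact(b=1, c=9)  # about 0.021

# Hypothetical 3 x 3 table: normal / suspicious / pathological
stat, df = bowker_statistic([[10, 6, 1], [2, 12, 8], [0, 3, 8]])
```

Both tests use only the discordant evaluations, which is why they suit a design in which the same clinician assesses the same tracing twice.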
Results
All tracings were evaluated by all five clinicians in both rounds, giving a total of 400 analyses. Table 1 displays the identification of individual CTG features (baseline, accelerations, decelerations, variability), and Table 2 displays tracing classification by the clinicians in both rounds.
Table 1. Identification of fetal heart rate features by the five obstetricians in the two rounds. The first and second round columns display the number of tracings in which the feature was identified by the observers. The statistical significance of the differences between the two rounds was evaluated using McNemar* or McNemar–Bowker** tests. P values reaching statistical significance are highlighted in bold.
Table 2. Overall cardiotocograph (CTG) classification by the five obstetricians in the two rounds. The first and second round columns display the number of tracings in which the classification was attributed by the observers. The statistical significance of the differences between the two rounds was evaluated using the McNemar–Bowker test. P values reaching statistical significance are highlighted in bold.
In the group with umbilical artery pH <7.05, a significantly larger number of repetitive decelerations (P < 0.001) and reduced variability (P = 0.001) were identified in the second round, together with a decreased identification of variable decelerations (P = 0.008). Clinicians changed their overall tracing classification 42 times, 33 (79%) times to a more severe class (including one change from normal to pathological). A significantly larger number of pathological classifications were observed when clinicians were informed that the newborn umbilical artery pH was <7.05 (P = 0.002).
In the group with normal pH, fewer tracings in the second round were identified as having no accelerations (P = 0.013) and prolonged decelerations (P = 0.013), but more tracings were identified as having pronounced decelerations (P = 0.031) and reduced variability (P = 0.007). Clinicians changed their overall classifications 46 times, 24 (52%) of them to a less severe class (including two changes from pathological to normal). Although there was a reduction in the classification of tracings as pathological, this difference failed to reach statistical significance (P = 0.051).
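The totals and percentages reported in this section can be checked with simple arithmetic. The sketch below uses only figures stated in the text; the helper name `percent` and the variable names are introduced for illustration.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


# 40 tracings, 5 clinicians, 2 rounds
total_analyses = 40 * 5 * 2

# pH < 7.05 group: 33 of 42 classification changes were to a more severe class
low_ph_more_severe = percent(33, 42)

# Normal pH group: 24 of 46 classification changes were to a less severe class
normal_ph_less_severe = percent(24, 46)

print(total_analyses, low_ph_more_severe, normal_ph_less_severe)  # 400 79 52
```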
Discussion
The knowledge of a low umbilical artery pH led to a significantly increased identification of abnormal CTG features, such as repetitive decelerations and reduced variability, as well as to a significantly larger number of tracings being classified as pathological. We did not establish an opposite effect in cases with a normal pH.
Repeated evaluation of the same tracings by the same clinicians ensures that the differences observed between rounds are a result of different interpretations, most probably conditioned by the new information provided. The availability of real neonatal outcomes also ensures that results can be generalised to real clinical situations. The relatively large number of cases analysed in each round, the 2-month interval between rounds and the different sequence of tracings presented to clinicians seem to have been sufficient to avoid recall bias, as all clinicians stated that they were unaware of having evaluated the same tracings twice.
As far as we are aware, this is the first study to demonstrate an impact of the knowledge of neonatal outcome on the interpretation of the intrapartum CTG. Figueras et al.14 evaluated the effect of a knowledge of a fictitious perinatal outcome on the interpretation of 100 CTG tracings obtained before labour, in a mixed low- and high-risk population, by an experienced midwife and a junior resident. A significantly different evaluation of the baseline, variability, accelerations and decelerations was found between cases with an alleged normal and an alleged adverse neonatal outcome. Zain et al.15 selected ten intrapartum CTGs of good signal quality, which were considered to involve an important judgement by the managing obstetrician with regard to the route and timing of delivery. Thirty-six obstetricians reviewed these cases, blind to the study’s objective and, 1 month later, re-evaluated the same tracings, with identical clinical information, but with a sham opposite neonatal outcome. No significant differences were found in the identification of reduced variability or late decelerations (using no predefined interpretation criteria) when the alleged neonatal outcome was poor, but there was a significantly larger number of opinions that there was evidence of fetal hypoxia (P = 0.007) and that obstetricians had made an incorrect decision (P < 0.001). Almström et al.16 demonstrated that the knowledge of normal umbilical artery Doppler velocimetry in small-for-gestational-age fetuses influenced the way in which clinicians interpreted intrapartum CTGs, and subsequently managed labour. When a knowledge of a normal Doppler velocimetry was available, only eight of 121 CTG tracings were considered to be abnormal, whereas this occurred in 18 of the same tracings when Doppler data were unavailable.
In our study, there was a high percentage of decelerations identified in cases with both low and normal umbilical artery pH, and a small number of tracings were classified as normal; this can perhaps be explained by the high-risk population included, the use of combined CTG + ST monitoring during the second stage and the study’s low caesarean section rate (under 5%). However, it also suggests that the FIGO classification has a limited capacity to discriminate between normal and acidaemic fetuses,17 possibly related to the subjective definitions of many CTG features, such as the various types of deceleration,18 which have a direct impact on overall tracing classification.
The lack of definition of the several types of decelerations considered in the FIGO guidelines may also have contributed to the differences between rounds in the identification of this parameter, even in cases with normal umbilical artery pH. It is possible that clinicians identified a larger number of pronounced decelerations in the second round, because they attributed less clinical significance to this entity. Less clear is the reason why they also identified more normal pH cases as having decreased variability in the second round. One possible explanation for this is that observers, when presented with neonatal outcome, suspected that the study would be evaluating their predictive capacity, and therefore analysed tracing features more strictly and with greater care.
Cases were selected for this study in order to illustrate the point of observer bias, and the resulting prevalence of low umbilical artery pH is much higher than that observed in everyday clinical practice. Results cannot therefore be extrapolated to this setting. However, similar prevalences may be found in confidential enquiries, case reviews and court cases.
The findings of the present study have important implications in medical–legal, clinical and research settings. CTG interpretation continues to play a major role in labour ward care, and intrapartum hypoxia remains a leading cause of obstetrical litigation, where tracing review is frequently the key element.19–21 A possible way to overcome the effect of a priori knowledge of adverse neonatal outcome would be to introduce a blind evaluation of several tracings, among which the index case is included. In clinical and research settings, case reviews involving CTG analysis should avoid the disclosure of neonatal outcome at the start, and observations should not be limited to cases with an adverse outcome.
The current study leaves many related questions unanswered. The impact of using other widely disseminated CTG interpretation guidelines may not be similar, and needs to be evaluated further. Adverse neonatal outcome was defined in this study as umbilical artery acidaemia, but would a knowledge of the more severe outcomes usually occurring in litigation cases provide similar results? Do different levels of observer experience influence the observed changes in classification? Finally, other areas of obstetrics and gynaecology that have poorly reproducible tests and frequent medical litigation may warrant similar evaluations to assess whether analogous phenomena occur.
Disclosure of interest
All authors and their relations have no financial connections with companies that may have an interest in the submitted work, and no nonfinancial interests that may be relevant to the article.
Contribution to authorship
Diogo Ayres-de-Campos had the original idea for the study, proposed the initial version of the study design, selected the cases, checked the results and wrote the first version of the manuscript. Diana Arteiro contributed to the study design, collected and transcribed clinicians’ analysis, evaluated the results and contributed to the manuscript review. Cristina Costa-Santos contributed to the study design, performed statistical analysis and contributed to the manuscript review. João Bernardes provided several suggestions for study design and reviewed the final version of the manuscript.
Details of ethics approval
Approval by the research ethics committees of the University Hospital Lund, University Hospital Malmo and Sahlgrenska University Hospital, Gothenburg, was obtained for the trial from which the tracings were taken,12 and all participating women gave informed consent for use of their data for research purposes. No person-identifiable information was disclosed to clinicians in the present study.
Funding
No external funding was obtained for this study. The authors’ institutions sponsored the authors’ time dedicated to the research.
Acknowledgements
The authors would like to thank Professor Sousa Barros (Hospitais da Universidade de Coimbra), Dr Nuno Clode (Hospital de Sta Maria, Lisbon), Dr Filomena Nunes (Hospital de Cascais), Dr Agostinho Carvalho (Hospital de Viana do Castelo) and Dr Susana Pereira (Hospital de Viseu) for analysing the tracings, and Dr Isis Amer-Wåhlin, MD, PhD, and her co-authors from the Swedish randomised trial for allowing the use of their tracings.
1 Background: Describe and compare National Institute for Health and Clinical Excellence (NICE),1 International Federation of Gynaecology and Obstetrics (FIGO)2 and American College of Obstetrics and Gynecology (ACOG)3 guidelines for the interpretation of electronic fetal heart tracings.
2 Methods: Describe the study methods. Refer to the slides that accompany this BJOG article (Supporting Information available online via bjog.org). The tracings used in this study were taken with a STAN monitor in the second stage of labour. Discuss the difference from usual practice in your unit. How could the results or their interpretation have been different if the cardiotocographs (CTGs) had been obtained from the first stage of labour? Pay particular attention to Table 1. Debate the use of umbilical artery pH < 7.05 as proxy for adverse outcome with reference to the essential and nonspecific criteria of the cerebral palsy template.4 Reflect on the predictive value of pathological1,2 (abnormal3) CTG tracings and low umbilical artery pH for cerebral palsy.
3 Results and implications: Discuss ways to improve the standardised interpretation of CTGs in practice5 or the analysis of perinatal adverse events.6 Brainstorm possible techniques for the use of CTGs for educational or medicolegal purposes, avoiding a biased interpretation. What would you do differently in your next CTG discussion meeting in the light of the findings of this study? (Data S1)
D Siassakos University of Bristol and Southmead Hospital, Bristol, UK Email firstname.lastname@example.org
1 National Collaborating Centre for Women’s and Children’s Health. Intrapartum Care of Healthy Women and Their Babies During Childbirth. Clinical Guideline 55. Commissioned by the National Institute for Health and Clinical Excellence, editor. London: RCOG Press, 2007.
2 FIGO News: Rooth G, Huch A, Huch R. Guidelines for the use of fetal monitoring. Int J Gynecol Obstet 1987;25:159–67.
3 ACOG Practice Bulletin No. 106. Intrapartum fetal heart rate monitoring: nomenclature, interpretation, and general management principles. Obstet Gynecol 2009;114:192–202.