Commentary: Does ‘cry it out’ really have no adverse effects on attachment? Reflections on Bilgin and Wolke (2020)

In their recent paper published in JCPP, Bilgin and Wolke (2020a) argue that leaving an infant to 'cry it out', rather than responding to the child's cries, has no adverse effects on mother-infant attachment at 18 months. This finding runs counter to evidence from a wide range of scientific fields. Here, we outline several concerns with the article and argue against some of the authors' strong claims, which have already gained media attention, including a report on the NHS website. We suggest that the authors' conclusions should be considered one piece of a larger scientific whole, where 'cry it out' seems, overall, to be detrimental to both attachment and development. Crucially, we are concerned that this study has issues regarding power and other analytical decisions. More generally, we fear that the authors have overstated their findings and we hope that members of the public do not alter their parenting behaviours in line with such claims without further research into this controversial topic.

Deciding on the best way to raise your child is a veritable minefield, to say the least. One of the many controversial topics in developmental psychology is how parents deal with their crying infant. Is it better to leave your infant to cry it out, or should you respond immediately in order to soothe them? Drawing on evidence from a wide variety of fields, the answer seems to be that 'cry it out' should be avoided. Research has shown that the practice may negatively impact children in a number of ways. A lack of maternal response to crying (a) may be associated with elevated infant cortisol (Middlemiss, Granger, Goldberg, & Nathans, 2012), which can have long-term effects on brain development and cognitive function; (b) causes issues with the development of self-regulation (social contact is necessary for self-regulation skills to develop); and (c) is negatively associated with independence and confidence throughout childhood (for a summary of these arguments, see Narvaez, 2011). Indeed, in a recent paper that serves as the focus of this commentary, the authors acknowledged that a lack of detrimental effects due to 'cry it out' found in their work was surprising in the context of both attachment theory and learning theory (Bilgin & Wolke, 2020a).
Perhaps crucially, choosing not to respond to an infant's crying goes against a fundamental evolutionary drive. Species survival depends on the wellbeing of the next generation, and children do not have the necessary language skills to communicate their pain, hunger, or discomfort. Instead, they cry. To ignore infant crying is to ignore potential danger to the infant, and attachment theorists have argued for decades that this is why caregivers are driven to respond to such cues. From the child's perspective, by learning that parents will respond to their needs in a loving and attentive way, infants typically enjoy better social, emotional, and educational outcomes (Richter, 2004).
Problematically, however, few studies have directly addressed 'cry it out' as a specific facet of maternal (un)responsiveness (Bell & Ainsworth, 1972; Van Ijzendoorn & Hubbard, 2000), and results about this parenting behaviour remain inconsistent. Bilgin and Wolke's (2020a) recent article has provided new fuel for this debate. In their study, the authors collected information regarding maternal use of 'cry it out' from 178 mothers at term, 3, 6, and 18 months, while attachment type was assessed at 18 months with the strange situation procedure. The authors' analyses showed no association between leaving infants to cry it out and infant-mother attachment at 18 months, and this conclusion received widespread media coverage. As already noted, stating that 'cry it out' has 'no adverse effects on attachment' goes against theoretical and intuitive expectations, and could have serious consequences if taken as fact by parents, which is why the evidence must be compelling before such conclusions are drawn. Here, we argue that the evidence may not be compelling enough.

Analytical concerns
The lack of a power analysis reported by the authors is concerning in the current climate of results that fail to replicate. Although they state that combining very preterm/very low birthweight with full-term infants provided 'sufficient statistical power' for their analyses, no power calculations are presented. This also raises concerns about whether the supporting analyses excluding preterm children were sufficiently powered. A sample that is too small may not be able to detect a 'cry it out' effect on attachment type. To illustrate, let us simplify the design and imagine two groups ('cry it out': Y/N) and two outcomes (insecure attachment: Y/N). Next, suppose that half of the infants were not left to cry it out while half were, and that 30% of the former group were insecure (plausible) while 50% of the latter group were insecure (a very large effect). In order to achieve 80% power at an alpha level of .05, our calculations in G*Power suggest a required sample of 190 participants. If only 30% of infants were left to cry it out (similar to the value reported by the authors) then 222 participants would be required. For a smaller effect caused by 'cry it out' (e.g. only 40% of those infants were insecure), we would require 716 participants if the groups were equal in size and 841 participants if unbalanced (30% left to cry it out, rather than 50%). In all cases, power would be lower still to detect effects on lower-frequency outcomes, such as disorganized attachment (see later).
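The scenarios above can be approximated with a short calculation. The following Python sketch assumes a two-sided z-test for two independent proportions with the usual normal-approximation sample size formula. The text does not name the specific G*Power test used, so this is an illustrative stand-in, and its totals may differ by a few participants from the figures quoted. The function name `required_total_n` and its parameter names are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def required_total_n(p_cio, p_no_cio, frac_cio=0.5, alpha=0.05, power=0.80):
    """Approximate total N for a two-sided test of two independent proportions.

    p_cio, p_no_cio: assumed proportion of insecure infants in each group.
    frac_cio: assumed share of infants who were left to cry it out.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    k = (1 - frac_cio) / frac_cio  # allocation ratio n_no_cio / n_cio
    p_bar = (p_cio + k * p_no_cio) / (1 + k)  # pooled proportion under H0
    sd_null = sqrt((1 + 1 / k) * p_bar * (1 - p_bar))
    sd_alt = sqrt(p_cio * (1 - p_cio) + p_no_cio * (1 - p_no_cio) / k)
    n_cio = ((z_alpha * sd_null + z_beta * sd_alt) / (p_cio - p_no_cio)) ** 2
    # Round each group up separately so neither falls short of its target size.
    return ceil(n_cio) + ceil(k * n_cio)


# The four scenarios described in the text: (p_cio, p_no_cio, frac_cio).
scenarios = {
    "very large effect, balanced groups": (0.5, 0.3, 0.5),
    "very large effect, 30% cry it out": (0.5, 0.3, 0.3),
    "smaller effect, balanced groups": (0.4, 0.3, 0.5),
    "smaller effect, 30% cry it out": (0.4, 0.3, 0.3),
}
for label, (p_cio, p_no_cio, frac_cio) in scenarios.items():
    print(label, required_total_n(p_cio, p_no_cio, frac_cio))
```

Under these assumptions, every scenario calls for more than the 178 mother-infant pairs available, and the totals land close to, though not identically on, the G*Power figures quoted above.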
As such, the study is clearly underpowered for realistic and important effect sizes. Thus, all one can legitimately conclude is that the authors failed to reject the null hypothesis. To claim that 'cry it out' has no effect requires a demonstration of sufficient power to detect such an effect if it were present. Indeed, the extremely wide confidence intervals around the reported odds ratios are evidence that these measurements were likely not precise enough to draw strong conclusions. (This issue is further exacerbated in the supporting analyses of only the full-term subsample of infants.) Finally, we might predict the presence of important interactions, whereby the frequency of leaving one's child to 'cry it out' may influence attachment differently depending on whether infants were preterm or not, for example. As such, information about the presence or absence of these interactions would be both important and welcome, although again, power severely limits the ability to detect any such interactions.
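To make the point about power concrete, the following sketch estimates the approximate power available with 178 participants when 30% of infants are left to cry it out. It assumes the simplified two-group, two-outcome design described earlier and a two-sided z-test under the normal approximation. It does not reproduce the authors' logistic regression models, and the function name `approx_power` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(p_cio, p_no_cio, n_total=178, frac_cio=0.3, alpha=0.05):
    """Approximate power of a two-sided two-proportion z-test."""
    n_cio = n_total * frac_cio  # expected size of the 'cry it out' group
    n_no = n_total - n_cio
    p_bar = (n_cio * p_cio + n_no * p_no_cio) / n_total  # pooled proportion
    se_null = sqrt(p_bar * (1 - p_bar) * (1 / n_cio + 1 / n_no))
    se_alt = sqrt(p_cio * (1 - p_cio) / n_cio + p_no_cio * (1 - p_no_cio) / n_no)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    # Probability of rejecting H0 in the direction of the true effect; the
    # negligible contribution of the opposite tail is ignored.
    return NormalDist().cdf((abs(p_cio - p_no_cio) - z_alpha * se_null) / se_alt)


print("50% vs 30% insecure:", round(approx_power(0.5, 0.3), 2))
print("40% vs 30% insecure:", round(approx_power(0.4, 0.3), 2))
```

With these inputs, estimated power stays below the conventional 80% threshold even for the very large effect, and falls well below 50% for the smaller one.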
Bilgin and Wolke used the strange situation procedure to measure infant attachment, which allows for the categorization of an infant into one of four attachment styles: (1) secure, (2) anxious-avoidant insecure, (3) anxious-ambivalent/resistant insecure and (4) disorganized/disoriented. Often, behaviours denoting (4) occur only briefly and so these infants are also given a secondary classification (1-3). In Bilgin and Wolke's study, at least two groups (infants categorized into the two 'insecure' attachment styles) were combined under a single label in order to compare 'secure' with 'insecure' styles. Crucially, although both attachment names (2) and (3) include the word 'insecure', these represent very different types of attachment. Therefore, a combined group may obscure potentially important subgroup differences. Further, some infants received a primary classification of 'disorganized', but it is unclear whether such infants were included as 'insecure' for the purposes of the authors' 'secure versus insecure' analysis or were excluded entirely (resulting in a decrease in statistical power). If they were indeed included in this way, the following issue arises:

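Analysis 1: secure [type (1)] versus insecure [types (2-4)] attachment
Analysis 2: disorganized [type (4)] versus organized [types (1-3)] attachment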
In reality, these two analyses are the complement of each other, with the only difference being where types (2-3) appear. Importantly, in the current work, we have reason to believe that types 2 and 3 combined represent only 10% of the total sample (see below), meaning the two analyses are virtually identical.
As noted above, attachment types were categorized as secure versus insecure prior to analysis. Although the authors do not state the number of infants placed into each attachment group for the whole sample (full-term and preterm), they report elsewhere that 28% of infants were classified as not secure in their full-term subsample (Bilgin & Wolke, 2020b). In addition, mother-child attachments were also classified as organized versus disorganized, resulting from a low versus high score on a continuous scale of attachment disorganization. Although frequencies were not provided, reporting of the full-term subsample suggests around 17% of attachment types may have been classified as disorganized (Bilgin & Wolke, 2020b). Our concern regarding the latter binary classification in particular, and its subsequent appearance as an outcome in a logistic regression model, is that it is notably unbalanced. In this case, disorganized attachment represents only a small proportion of cases (termed 'rare events'), which can lead to biased outcomes in logistic regression models (Salas-Eljatib, Fuentes-Ramirez, Gregoire, Altamirano, & Yaitul, 2018). The overall model performances were not reported, and it would be beneficial for readers to learn how the authors addressed the unbalanced nature of these data in their analyses.
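One consequence of this imbalance is easy to illustrate: if roughly 17% of 178 infants are disorganized, only about 30 events are available to estimate any association. The sketch below computes an odds ratio with a Wald 95% confidence interval from a hypothetical 2x2 table built to match those approximate proportions with no true association. The counts are invented for illustration and are not the authors' data, and the Wald interval is a simple stand-in for whatever interval the authors' models produced. The sketch illustrates imprecision with few events, not the rare-events bias itself.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def odds_ratio_wald_ci(a, b, c, d, level=0.95):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a, b: disorganized / organized among infants left to cry it out.
    c, d: disorganized / organized among infants not left to cry it out.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE on the log-odds scale
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts (not the authors' data): 178 infants in total, about 30%
# left to cry it out, about 17% disorganized, and essentially no association.
estimate, lower, upper = odds_ratio_wald_ci(a=9, b=44, c=21, d=104)
print(f"OR = {estimate:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```

Even with an odds ratio close to 1, the interval in this hypothetical table spans from well below 1 to well above 2, so a non-significant result of this kind would be compatible with a substantial protective effect, a substantial harmful effect, or no effect at all.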

Lack of clarity
Measuring 'cry it out' behaviour relied on asking mothers, 'Have you ever tried leaving your baby to cry it out during this time?' with the following response options: never, once, a few times, and often. It is not clear whether the authors explained what 'cry it out' meant to mothers. Were they given a specific definition? As a key variable, readers would appreciate further clarification. Further, if the meaning of 'cry it out' was not specified then although the most common usage relates to bedtime crying, mothers may have formed their own, differing interpretations of the term. These interpretations may involve the context in which a child has been left to 'cry it out'. Were children left indefinitely or for a fixed period? Some parents may include leaving their child in a car while tending to a second child, for instance. Or perhaps leaving a child alone and crying may not be deemed by parents as 'cry it out' unless it was deliberate?
Indeed, this issue of context is acknowledged in a paper cited by the authors. In Van Ijzendoorn and Hubbard's (2000) replication of Ainsworth and Bell's classic attachment study, they fail to show a relationship between secure attachment and maternal responsiveness to infant crying. However, Van Ijzendoorn and Hubbard note that the type of infant crying is important when considering these findings. They state, 'We hypothesize that only those forms of crying indicating severe distress (e.g. crying with a long expiration pause) can be considered evolutionarily biased (pre-) attachment behaviour. From the perspective of attachment theory, the only adequate response in the case of severe distress vocalizations is closer proximity between protective adult and infant' (Van Ijzendoorn & Hubbard, 2000, p. 388). Van Ijzendoorn and Hubbard go on to acknowledge that leaving this type of crying unanswered is likely more harmful than leaving other types of infant crying unanswered, such as everyday fussing or brief distress. In other words, the context in which a child is left to 'cry it out' may alter attachment outcomes significantly.
Following on from this point, Bilgin and Wolke (2020a) conclude that leaving infants to 'cry it out' has no harmful impact 'while a parent is present... and they monitor the infant's crying' (p. 1192). This appears as a 'key point', highlighted in the article, but the authors do not state that parents were asked to report whether they were present or absent during the crying episodes. Therefore, readers would benefit from additional clarification regarding the instructions given to mothers.

Our recommendations
Bilgin and Wolke (2020a) make strong claims based on their findings, including there being 'no adverse effects on attachment' (p. 1184) and 'no harmful impact of leaving their infants to cry it out' (p. 1192). In turn, these have been featured on the NHS website and have received media coverage, which will likely impact parenting behaviours as a result. We believe that their results contradict a substantial body of psychological literature and that the methodological and statistical issues described above cause such unambiguous conclusions to be unwarranted. We acknowledge the impressive amount of time and effort involved with collecting data from 178 infants and their mothers over 18 months, including the coding of behaviours observed during the strange situation. However, the finding of a null result from analyses which appear to lack sufficient statistical power prevents the authors from ruling out effects of 'cry it out' on attachment. On this basis, the confidence with which the authors present their conclusions may be misplaced. We therefore recommend that these findings be considered in the context of a large body of literature, where evidence generally supports the argument that 'cry it out' could lead to, or be related to, nonsecure attachment styles. We hope that in time, several more, and ideally larger, studies on this topic can be conducted and combined using meta-analytical techniques, providing more conclusive evidence regarding this controversial question. For now, our hope is that interested parents who come across such claims do not decide to alter their parenting practices as a result.