Narrative coherence in multiple forensic interviews with child witnesses alleging physical and sexual abuse

Summary This study investigated the narrative coherence of children's accounts elicited in multiple forensic interviews. Transcriptions of 56 police interviews with 28 children aged 3–14 years alleging physical and sexual abuse were coded for markers of completeness, consistency and connectedness. We found that multiple interviews increased the completeness of children's testimony, containing on average almost twice as much new information as single interviews, including crucial location, time and abuse-related details. When both contradictions within the same interview and across interviews were considered, contradictions were not more frequent in multiple interviews. The frequency of linguistic markers of connectedness remained stable across interviews. Multiple interviews increase the narrative coherence of children's testimony through increasing their completeness without necessarily introducing contradictions or decreasing causal-temporal connections between details. However, as 'ground truth' is not known in field studies, further investigation of the relationship between the narrative coherence and accuracy of testimonies is required.


| INTRODUCTION
The story-telling model of legal decision making (Pennington & Hastie, 1992) suggests that the narrative coherence of witnesses' accounts plays a crucial role in forensic investigations, as coherent narratives, defined in legal terms as stories which are complete, consistent and causally and temporally connected, are more credible than stories lacking in coherence (McAdams, 2006;Pennington & Hastie, 1988;Rideout, 2008). Although the story-telling model was initially conceived to describe the decision-making processes of jurors during a trial, the model has implications for credibility judgments during police investigations too, as cases judged non-credible at this stage never reach a courtroom. Bennett and Feldman (1981) suggest that the narrative coherence of witnesses' stories is especially critical in situations where credibility judgements need to be made in the absence of corroborating evidence, which is often the case for child abuse investigations.
Children's ability to provide coherent narratives in forensic interviews may be limited by several factors. Firstly, Reese et al.'s (2011) research on the narrative coherence of children's autobiographical accounts suggests that although young children can give truthful and detailed accounts of events that happened to them, the ability to provide chronologically, contextually and thematically coherent narratives does not fully develop until late childhood or adolescence. Constructing a consistent timeline of events may be particularly difficult for young children due to their difficulty with understanding and using temporal concepts (Graffam Walker, Kenniston, Inada, & Caldwell, 2013), especially in cases where they need to give accounts of repeated experiences (Brubacher, Powell, & Roberts, 2014). Despite the central role of storytelling in forensic investigations (Pennington & Hastie, 1992) and children's difficulty with recounting their past experiences in a coherent manner (Reese et al., 2011), few studies applied a story-telling framework to evaluate the narrative quality of children's accounts elicited in forensic interviews.

| Models of narrative coherence in developmental psychology
In contrast to the story-telling model's definition of narrative coherence as an umbrella term comprising the completeness, consistency and connectedness of arguments (McAdams, 2006; Pennington & Hastie, 1992; Rideout, 2008), psychology studies generally use a narrower concept of narrative coherence, broadly defined as a framework for organising event details (Feltis, Powell, & Roberts, 2011). In this form, narrative coherence is described as independent from other measures of the quality of children's stories, including accuracy, consistency and descriptive detail (Brown, Brown, Lewis, & Lamb, 2018; Reese et al., 2011).
Recent laboratory and field studies analysing the narrative coherence of children's accounts of real-life experiences fall into three broad categories. Story-grammar approaches categorise the information provided by the speaker into six grammar elements: the setting of the story, the initiating event, the protagonist's internal responses, the attempt or action, the consequence of the action and the protagonist's reaction (Feltis et al., 2011; Feltis, Powell, Snow, & Hughes-Scholes, 2010; Westcott & Kynan, 2004). In contrast, multidimensional models rely on the broader categories of context, chronology, theme and evaluation to measure the narrative coherence of autobiographical memories (Habermas & Bluck, 2000; Habermas & de Silveira, 2008; Peterson, Morris, Baker-Ward, & Flynn, 2014; Reese et al., 2011; Morris, Baker-Ward, & Bauer, 2010; Nelson & Fivush, 2004). Finally, the narrative cohesion approach measures the 'degree to which event details are presented in a connected form through temporal and causal relations' (Kulkofsky, Wang, & Ceci, 2008, p. 23).
Whilst multidimensional models, story schema approaches and the narrative cohesion approach are distinct from one another, elements of their conceptualisation of narrative coherence overlap (Brown et al., 2018). For instance, Brown et al. (2018) mapped the multidimensional model's dimensions of chronology, context and evaluation onto the temporal element of setting, the physical and social elements of setting and the reaction and internal response of protagonist, respectively, in the story schema approach. Similarly, temporal markers in the narrative cohesion approach measure the same aspect of recall as the element of temporal setting in the story schema model and the dimension of chronology in multidimensional models.

| Narrative coherence in forensic interviews with children
Initially, the story grammar approach was used to measure the narrative coherence of forensic interviews with child witnesses (Feltis et al., 2010;Westcott & Kynan, 2004). Analysing Memorandum of Good Practice interviews with children alleging sexual abuse in the United Kingdom, Westcott and Kynan (2004) found that although children's accounts did conform to a basic story structure, the scarcity of details and the lack of a causal-temporal structure in young children's testimonies limited the coherence of their narratives. A third of the cases examined was rated as 'disordered' in terms of causal and temporal relations and the extent of disorder was especially high for pre-schoolers (Westcott & Kynan, 2004). Another aspect of narrative coherence which children of all ages struggled with was describing 'subjective details'; their personal reactions to the abuse. Only 10% of children spontaneously described their physical perceptions and 20% provided descriptions of their emotional state. However, this proportion was increased to 33 and 46.6%, respectively, when interviewers asked questions focused on subjective details. Noting the low proportion of open-ended questions in their sample, Westcott and Kynan (2004) suggested that the use of overly specific questions compromised narrative coherence, especially the temporal and causal organisation of children's accounts.
In addition to the impact of question type on the frequency of story grammar elements in children's testimony (Westcott & Kynan, 2004), laboratory research suggests that children's age and the characteristics of the event they are questioned about also affect the narrative coherence of recall (Feltis et al., 2011;Reese et al., 2011). Relying on the multidimensional model of narrative coherence, Reese et al. (2011) found that even pre-schoolers were able to maintain a topic of conversation, but they found it difficult to establish events in time and place and organise them according to chronology. Similarly, Habermas and de Silveira (2008) found that temporal coherence increased dramatically between the ages of 8 and 12, causal coherence increased most between 12 and 16 years, and thematic coherence was still in development between 16 and 20 years. Using a story grammar approach, Feltis et al. (2011) reported that the frequency of schema elements in children's accounts correlated positively with age. In addition, the children in Feltis et al.'s (2011) sample also recalled more story grammar elements when the delay between the interview and the events children were asked to recall was shorter, and when children experienced repeated occurrences of the same event rather than a single event.
Overall, research on children's ability to provide coherent narratives in forensic interviews suggests that children struggle with providing a causal-temporal framework and describing their subjective reactions to the events they are questioned about (Westcott & Kynan, 2004). Furthermore, children's age and the type of abuse they experienced may limit their ability to form coherent narratives (Feltis et al., 2011; Reese et al., 2011).

| The risks and benefits of multiple forensic interviews with child witnesses
Many children interviewed for legal purposes in the United Kingdom are questioned more than once. Multiple interviews with child witnesses are associated with both risks and benefits with regard to their impact on the quality of the testimonies provided. The existing literature on the impact of multiple interviews on the quality of children's testimonies was summarised and evaluated by two separate reviews published in 2008 and 2009 (Goodman & Quas, 2008; La Rooy, Lamb, & Pipe, 2009). Both reviews concluded that children's recall can remain highly accurate in multiple interviews when the interviews are conducted using open-ended questions. However, the use of suggestive questions and long delays between interviews were associated with a decrease in the accuracy of children's accounts (Goodman & Quas, 2008; La Rooy et al., 2009).
Recent research supports the conclusion of these reviews in terms of the risks associated with multiple interviews, showing that all kinds of suggestive interviewing methods, including cross-examination techniques (Fogliati & Bussey, 2014; Jack & Zajac, 2014; Righarts, Jack, Zajac, & Hayne, 2015; O'Neill & Zajac, 2013), false memory paradigms (Otgaar, Verschuere, Meijer, & van Oorsouw, 2012), misleading information (London, Bruck, & Melnyk, 2009) and leading questions (Melinder et al., 2010), reduce the accuracy of children's testimony when used in combination with multiple interviews. However, recent studies have also provided further evidence of the potential benefits of multiple interviews, demonstrating that children are able to recall stressful events even after several interviews separated by yearlong delays (Peterson, 2011, 2015) and that under some conditions, providing children with an opportunity to recall events in response to open-ended questions can ameliorate the effects of previous suggestive interviews (Melinder et al., 2010; Righarts et al., 2015).
The benefits of multiple interviews are also supported by a cost-effectiveness analysis showing that the likelihood of convicting a child sexual abuse offender is increased by an estimated 6.1 percentage points as a result of multiple interviews, raising the percentage of offenders who are convicted from 22.8 to 28.9% (Block, Foster, Pierce, Berkoff, & Runyan, 2013). The costs associated with multiple interviews in this analysis involved the additional resources required by law enforcement, whilst the key benefit was the prevention of further victimisation through identifying, convicting and incarcerating offenders. However, as Block et al. (2013) note, the risks and benefits associated with multiple interviews are contingent upon the quality of the interviews, with additional suggestive or otherwise substandard interviews having few advantages and many risks, including a decrease in children's credibility due to inconsistencies in their reports across interviews, and a potential increase in false conviction rates. Thus, in order to establish the impact of multiple interviews on the outcome of child sexual abuse cases, it is essential to consider the quality of the testimonies elicited, including their completeness, consistency and connectedness.

| The impact of multiple interviews on the narrative coherence of testimonies
Whilst the accuracy of children's testimonies provided over the course of multiple interviews has been studied extensively, less is known about the impact of multiple interviews on the narrative coherence of testimonies.
Multiple interviews generally increase the completeness of children's testimony as repeated recall occasions allow witnesses to recall new information (La Rooy et al., 2009). In laboratory studies, the completeness of children's recall in a second interview has sometimes exceeded the number of details reported in the first interview (e.g., Knutsson et al., 2011;La Rooy, Pipe, & Murray, 2005). However, even when the number of details reported in each successive interview declines, the overall amount of information reported in the interviews may increase through reminiscence (Erdelyi, 1996). Consistent with laboratory findings indicating that reminiscence leads to an increase in completeness over multiple interviews, several field studies found that a significant amount of new forensically relevant information was recalled in subsequent interviews (Hershkowitz & Terner, 2007;Katz & Hershkowitz, 2012;Leander, 2010;Waterhouse et al., 2016).
As multiple interviews lead to the recall of new information, they inherently decrease the consistency of children's accounts through the addition of new details and the omission of previously mentioned details from later interviews. Examining cross-examination techniques in Scottish courts, Szojka, Andrews, Lamb, Stolzenberg, and Lyon (2017) found that defence lawyers often challenged the credibility of children's statements on the basis of inconsistencies in their accounts, despite laboratory research showing a lack of correlation between the consistency and overall accuracy of witness statements elicited through multiple interviews (Baugerud, Magnussen, & Melinder, 2014;Gilbert & Fisher, 2006). Defence lawyers most commonly referred to contradictions between details mentioned by the witness, rather than additions of new details or omissions of previously mentioned details (Szojka et al., 2017). Previous research on the frequency of contradictions in children's accounts elicited over multiple interviews has been inconclusive; while Katz and Hershkowitz (2012) Katz and Hershkowitz (2012).
Although the connectedness of children's accounts elicited over the course of multiple forensic interviews has not yet been addressed, Habermas and de Silveira's (2008) study analysed the impact of repeated recall on the extent to which events in children's and adults' life narratives were causally and temporally connected. Comparing the thematic, causal and temporal coherence of 8, 12, 16 and 20 year olds in two 15-minute interviews conducted 2 weeks apart, Habermas and de Silveira (2008) found that multiple interviews did not affect the frequency of temporal and causal elements of narrative coherence.
However, the extent to which Habermas and de Silveira's results can be interpreted in a forensic context is limited. First, forensic interviews are based on a question-answer format, and lengthy uninterrupted narratives are not frequent in interviews with children. The question-answer format might modify the effect of multiple interviews on narrative coherence: it can either serve as a scaffold, as suggested by Bennett and Feldman (1981), or have a disjointing effect on children's accounts when the questions asked are overly specific (Westcott & Kynan, 2004). Second, children are asked to recall events in much more detail during forensic interviews than when questioned about their life stories, which may allow a larger role for reminiscence in later interviews. Third, the youngest children in Habermas and de Silveira's (2008) sample were 8 years old, but the testimonies of younger children may be particularly lacking in causal and temporal connections due to their limited ability to organise their recall according to context and chronology (Reese et al., 2011).

| The present study
Whilst the impact of multiple interviews on the quality of children's recall has been the focus of a wide range of research, only a few studies have analysed real-life forensic testimonies provided over the course of multiple interviews due to the challenges of accessing and analysing such data (Hershkowitz & Terner, 2007; Katz & Hershkowitz, 2012; Leander, 2010; Waterhouse et al., 2016). Unfortunately, the principal measures used in laboratory research to evaluate the quality of children's testimony, such as the number of correct target details or the proportion of accurate details, are not applicable in field research because researchers do not know the 'ground truth'. This study seeks to address the gap in the literature by analysing multiple interviews using a narrative coherence framework to evaluate the quality of children's testimony with forensically relevant measures designed specifically for field research, including the content of children's testimony, the consistency of their recall and the extent of causal-temporal connections in their accounts. To the authors' knowledge, no previous research has examined the narrative coherence of children's accounts across multiple forensic interviews.
The purpose of this study was to investigate the extent to which children construct coherent narratives in multiple forensic interviews and explore the impact of multiple interviews on each of the three components of narrative coherence, as defined by the story-telling model (McAdams, 2006;Pennington & Hastie, 1992;Rideout, 2008); completeness, consistency and connectedness. When conducting multiple interviews, investigators can use the cumulative amount of information recalled in the two interviews rather than only the contents of the second interview, therefore, the overall amount of information available to the interviewers after the two interviews was compared with the amount of information available after the first interview.
Although a new coding scheme was developed to conform more closely to the legal definition of narrative coherence and to allow for the analysis of very long narratives elicited over the course of multiple interviews, the components of narrative coherence in this study can be mapped onto multidimensional models of coherence (Reese et al., 2011). Specifically, the content types of time and location indicate context, the proportion of sensitive information conveys a measure of theme, subjective details measure evaluative content and chronology is indexed by the frequency of temporal markers.

| Hypotheses
Previous field research strongly suggests that multiple interviews increase the completeness of children's testimonies (Hershkowitz & Terner, 2007; Katz & Hershkowitz, 2012; Leander, 2010; Waterhouse et al., 2016). Therefore, it was expected that (a) multiple interviews will increase the overall number of details, (b) multiple interviews will increase the number of time and location details, (c) multiple interviews will increase the number of subjective details and (d) multiple interviews will increase the number of sensitive details. Based on the results of previous studies investigating the impact of repeated recall on the consistency of witnesses' accounts (Baugerud et al., 2014; Gilbert & Fisher, 2006; Krix et al., 2015), it was expected that children's testimonies will become less consistent as a result of multiple interviews. Due to the lack of previous research on the connectedness of children's testimonies elicited over multiple forensic interviews, analyses regarding this aspect of narrative coherence were exploratory and no specific hypotheses were established. Scaffolding provided by interviewers and the 'practice effect' associated with repeated recall may facilitate children's recall, leading to a higher frequency of markers of causal and temporal connectedness in multiple interviews. However, it is also possible that children have difficulties with integrating details mentioned over different recall occasions on the same causal-temporal scale. Alternatively, Habermas and de Silveira's (2008) finding that the frequency of causal-temporal connections in children's accounts is not affected by multiple recall occasions may be replicated in forensic interviews.

Due to the challenges associated with accessing data from real-life police investigations, the sample size was small, but comparable to other field studies involving multiple interviews (Leander, 2010: N = 27; Waterhouse et al., 2016: N = 21).
Interviews were conducted and transcribed for the purpose of police investigations. Permission to conduct the study was granted by the ethics committee of the authors' institution ahead of the start of data collection. Children were interviewed between 2 and 11 times; however, only the first and second interviews with each child were examined. The delay between interviews ranged from less than an hour to over a year and a half.

| Coding
Only the substantive part of the interviews was analysed, defined as questions asked after the interviewer initially transitioned to probes related to the context of the allegations. The rapport, narrative practice and closure phases of the interview were not analysed. To provide a measure of interview quality, interviewers' utterances were coded according to question type. Codes for children's responses were divided into three main categories: completeness, consistency and connectedness codes. Completeness codes included the number of details in each utterance, their content, their subjectivity and their relevance to the investigations. To measure consistency, each detail was coded according to its novelty and new details were coded as either consistent or inconsistent with previously reported details. Connectedness was coded by identifying local markers of linguistic coherence.

| Question type
The type of question eliciting each detail reported by children was coded according to a modified version of the question type coding guide developed by Lamb, Orbach, Hershkowitz, Esplin, and Horowitz (2007). Examples of question types are presented in Table 1.

1. Invitations. Broad open-ended questions encouraging free recall.
2. Summaries. Statements summarising details previously mentioned by the child, either verbatim or paraphrased.
3. Directives. Open-ended questions encouraging cued recall focused on a topic previously mentioned by the child.
4. Option-posing questions. Closed-ended yes/no or forced choice questions.
5. Suggestions. Leading questions and statements referring to details that the child has not mentioned previously.

| Number of details
The number of details, approximately corresponding to the number of clauses, was counted in each of children's utterances. Both independent clauses and subordinate clauses were coded as separate details.

| Content
Children's responses were coded according to the type of information they referred to. Content codes were not exclusive; however, multiple mentions of the same detail type in the same utterance were only coded once. Examples of content codes are presented in Table 2.

| Subjectivity
Details referring to subjective, personal descriptions of the event from the point of view of the victim or other persons were identified and categorised into one of three categories.

| Novelty
Each detail mentioned by children was coded according to whether it was new or repeated. Repeated details were also categorised according to whether they were repeated within the same interview or repeated across interviews. Details could be coded as repeated both within the same interview and across interviews.

| Consistency
Each repeated detail was coded either as consistent or inconsistent with previous mentions of the same detail. Inconsistent details were also coded on the basis of whether they contradicted details mentioned within the same interview or across interviews. Details could be coded as contradictory both with details within the same interview and across interviews. If a detail was once mentioned inconsistently, all future mentions were coded as inconsistent, unless an explanation was provided to resolve the contradiction.
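The 'sticky' inconsistency rule described above can be illustrated with a minimal sketch. It is not the coding procedure used in the study, which was carried out by human coders: the `Mention` record, the `code_consistency` function and the example mentions are hypothetical, and the choices to treat an explained mention as consistent and to use the explained value as the reference for later mentions are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Mention:
    detail: str              # hypothetical label for what is described, e.g. "location"
    value: str               # what the child said about it
    explained: bool = False  # True if the child resolved an earlier contradiction


def code_consistency(mentions):
    """Return one code per mention: 'new', 'consistent' or 'inconsistent'."""
    reference = {}   # detail -> value currently treated as the reference
    flagged = set()  # details with an unresolved contradiction
    codes = []
    for m in mentions:
        if m.detail not in reference:
            # First mention of this detail: nothing to compare against.
            reference[m.detail] = m.value
            codes.append("new")
        elif m.explained:
            # Assumption: an explanation clears the flag and the explained
            # value becomes the reference for later mentions.
            flagged.discard(m.detail)
            reference[m.detail] = m.value
            codes.append("consistent")
        elif m.detail in flagged or m.value != reference[m.detail]:
            # Once contradicted, later mentions stay inconsistent.
            flagged.add(m.detail)
            codes.append("inconsistent")
        else:
            codes.append("consistent")
    return codes


mentions = [
    Mention("location", "kitchen"),
    Mention("location", "kitchen"),
    Mention("location", "bedroom"),
    Mention("location", "kitchen"),
    Mention("location", "bedroom", explained=True),
    Mention("location", "bedroom"),
]
codes = code_consistency(mentions)
```

In this example the fourth mention repeats the original value but is still coded as inconsistent, because the contradiction introduced by the third mention has not yet been explained.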

| Connectedness
Local markers of linguistic coherence were identified using a modified version of the coding scheme developed by Kulkofsky et al. (2008). Marker types were not exclusive; however, multiple mentions of the same marker type in the same utterance were only coded once. Examples of marker types are presented in Table 3.

| RESULTS

| Preliminary analyses and excluded variables
Initial analyses using mixed ANOVAs showed that measures of the completeness, consistency and coherence of children's testimony (the number of details reported, the consistency of details, the relevance of details for the investigation, the content of details and the frequency of markers of connectedness) were not affected by the gender of the witness, the identity of the interviewer, the delay between interviews, the frequency of the abuse or the type of abuse reported. Therefore, these variables were excluded from further analyses.
TABLE 3
Examples of linguistic markers of connectedness
Type of marker                Examples
Simple temporal markers       First, next, then, before, after, etc.
Complex temporal markers      When, until, while, etc.
Markers of causal relations   Because, so, in order to, as, etc.
Markers of optional states    Sometimes, usually, always, probably, etc.

Children's age was normally distributed (Kolmogorov-Smirnov: D[28] = 0.13, p = .20) and included in the analyses as a continuous variable. Children's age was added as a covariate to analyses where previous research suggested a relationship between age and the dependent variable, and the scatterplot also suggested a linear relationship. Although ANCOVAs are often used by researchers to statistically control for the effects of a confounding variable on the dependent variable (Schneider, Avivi-Reich, & Mozuraitis, 2015), in the current study, the aim was to assess potential interactions between the effects of the covariate and the dependent variable. Therefore, an interaction term between age and the dependent variables was included in each analysis. In within-subjects designs, the covariate needs to be centered by subtracting the mean of the covariate from each covariate value to avoid an increase in Type 1 error rates or a loss of power (Schneider et al., 2015). Thus, children's age was centered when used as a covariate in within-subjects analyses. As the probability of Type 1 errors is elevated in within-subject designs involving an interaction between the covariate and the dependent variable (Schneider et al., 2015), alternative statistical tests were conducted following each ANCOVA, without the covariate. These confirmed the main effect in each analysis (Appendix).
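Centering the covariate amounts to subtracting the sample mean from every value, so that the centered covariate has a mean of zero. A minimal sketch follows; the `center` helper and the ages are hypothetical and are not taken from the study's data.

```python
from statistics import mean


def center(values):
    """Subtract the sample mean from each value (mean-centering)."""
    m = mean(values)
    return [v - m for v in values]


# Hypothetical ages in years, chosen only to illustrate the transformation.
ages = [4, 6, 8, 10, 12]
centered_ages = center(ages)
```

After centering, a value of zero corresponds to a child of average age in the sample, and the centered values sum to zero.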

| Interview quality
Although interviewers' questions were not the focus of the present study, the type of question eliciting each detail was coded to provide a measure of interview quality. Most details were elicited using directives, followed by option-posing questions, invitations, suggestions and summaries (Table 4). In contrast to best practice guidelines, 7.6% (SD = 12.8%) of details in the first interview and 10.6% (SD = 14.7%) in the second interview were provided in response to suggestions.
Pairwise comparisons using the Bonferroni adjustment (adjusted alpha levels p < .005) for multiple comparisons revealed that overall, significantly more details were elicited by directives and option-posing questions than by invitations, summaries or suggestions. More details were elicited using invitations than summaries. All pairwise comparisons are presented in Table 5.
The two-way interaction between question type and interview number was followed up with 25 paired samples t tests comparing the proportion of details elicited by each question type in the first interview and the second interview (adjusted alpha level p < .002). In the first interview, a higher proportion of details were elicited using directives and option-posing questions than summaries and suggestions.
Invitations elicited a lower proportion of details than directives, but a higher proportion than summaries. In the second interview, directives and option-posing questions elicited more details than invitations, summaries and suggestions. There was no difference between the first interview and second interview in the proportion of details elicited by any of the question types. No other comparisons were significant.
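The adjusted alpha levels reported for these comparisons are consistent with a Bonferroni correction, in which the family-wise alpha is divided by the number of tests. The sketch below assumes a family-wise alpha of .05, which the text does not state explicitly; the `bonferroni_alpha` helper is hypothetical.

```python
from math import comb


def bonferroni_alpha(n_tests, family_alpha=0.05):
    """Per-test alpha under a Bonferroni correction (family_alpha is assumed)."""
    return family_alpha / n_tests


# Five question types give comb(5, 2) = 10 pairwise comparisons.
n_pairwise = comb(5, 2)
alpha_pairwise = bonferroni_alpha(n_pairwise)

# The interaction was followed up with 25 paired samples t tests.
alpha_followup = bonferroni_alpha(25)
```

Under that assumption, the ten pairwise comparisons between the five question types give a per-test level of .005, and the 25 follow-up tests give .002.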

| Completeness
On average, witnesses reported 121.75 (SD = 147.49) new details in the first interview and 99.29 (SD = 123.14) new details in the second. Table 6 contains the number of details children reported in each content category and the proportion of those details from all new details (the same details could belong to multiple content categories). The number of details reported correlated positively with children's age in both Interview 1 and Interview 2 (Figure 1).
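As a rough worked example, the mean cumulative amount of new information after two interviews can be obtained by adding the two reported means. This assumes that details coded as new in the second interview had not been reported in the first, which is how the novelty coding is described; the variable names are illustrative only.

```python
# Mean numbers of new details reported in the text.
mean_new_first = 121.75
mean_new_second = 99.29

# Assumption: 'new' details in the second interview were new relative to the
# first interview too, so the two means can be added.
mean_new_overall = mean_new_first + mean_new_second

# Ratio of information available after two interviews to the first alone.
ratio = mean_new_overall / mean_new_first
```

On these figures the two interviews together yield about 221 new details on average, roughly 1.8 times the amount available after the first interview alone.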

| Number of details
To assess whether significantly more details were reported over the course of the two interviews than in the first one only, an RM-ANCOVA was conducted assessing the potential effect of interview number (first interview, overall) on the number of details witnesses reported. Children's age was added to the analysis as a covariate. Significant main effects were found for interview number, F(1, 26) = 20.44, p < .001, ηp² = 0.44, and children's age, F(1, 26) = 11.74, p = .002, ηp² = 0.31.
There was a significant interaction between interview number and children's age, F(1, 26) = 4.32, p = .048, ηp² = 0.14. Inspection of Figure 1 suggests that the difference between the number of details reported in the first interview and over the two interviews increased as the age of witnesses increased.

| Content
Pairwise comparisons using the Bonferroni adjustment (adjusted alpha levels p < .001) for multiple comparisons revealed that significantly more details were reported related to actions, the victim and the suspect than related to objects, witnesses, co-victims, locations and time. Significantly more location-related details were reported than temporal details. All pairwise comparisons are presented in Table 7.
The two-way interaction between interview number and content was followed up with eight paired samples t tests comparing the number of details reported in each category in Interview 1 and overall (adjusted alpha level p < .006). Significantly more details were reported overall than in the first interview only in all content categories with the exception of information related to witnesses.

| Subjectivity
On average, witnesses reported 16.71 new subjective details in both the first interview and the second interview (Table 3). In both interviews, emotions were the most frequently mentioned subjective details, followed by perceptions and cognitions.
To assess whether significantly more subjective details were reported altogether in the two interviews than from the first interview only, a two-way RM ANOVA was conducted investigating the potential effect of interview number (first interview, overall) on the number

TABLE 6
The number of details reported in each content category in Interview 1 and Interview 2
Note: * denotes significance at p < .05, ** denotes significance at adjusted alpha level p < .001, *** denotes significance at adjusted alpha level p < .006.
Pairwise comparisons using the Bonferroni adjustment (adjusted alpha levels p < .017) for multiple comparisons revealed that significantly more subjective details focused on emotions than on perceptions or cognitions. All pairwise comparisons are presented in Table 8.
The two-way interaction between interview number and subjectivity was followed up with three paired samples t tests comparing the number of details reported in each subjective content category in Interview 1 and overall (adjusted alpha level p < .017). Significantly more details were reported overall than in the first interview about emotions, perceptions and cognitions.

| Consistency
In the first interview, 23% (SD = 12%) of all information reported was repeated, whilst in the second interview this percentage increased to 38.5% (SD = 15%). Overall, 32% (SD = 12%) of details reported in the two interviews were repeated. In the second interview, information was more frequently repeated within the interview (M = 0.28, SD = 0.12) than across the two interviews (M = 0.17, SD = 0.16). Some information was repeated both within the same interview and across the two interviews.
The percentage of consistent repeated details decreased from 87.1% (SD = 15%) in Interview 1 to 84.0% (SD = 16%) in Interview 2. Considering details reported over the course of the two interviews, 84.1% (SD = 16%) of repeated details were consistent with previously reported information. Accordingly, inconsistencies within the same interview (M = 0.08, SD = 0.12) were also more common than inconsistencies across interviews (M = 0.04, SD = 0.06). Some repeated information was inconsistent with previously reported information in the same interview as well as in the previous interview (M = 0.04, SD = 0.06). Older children reported a higher percentage of consistent information than younger children in the first interview, although not overall (Figure 2).
To assess whether the percentage of consistent responses was affected by interview number (first interview, overall), an RM ANCOVA was conducted with children's age included as a covariate. One witness provided no repeated information in either interview and was therefore excluded from the analysis. The effect of interview number on consistency was non-significant, F(1, 26) = 0.47, p = .50, ηp² = 0.02. There was no significant effect of age, F(1, 26) = 2.50, p = .13, ηp² = 0.09. There were no significant interactions.
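The effect sizes reported alongside each F statistic can be cross-checked with the standard identity that recovers partial eta squared from F and its degrees of freedom, ηp² = (F × df_effect) / (F × df_effect + df_error). The following is a minimal sketch of that check; the function name `partial_eta_squared` is hypothetical, and the input values are the ones quoted in this section.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    if f_value < 0 or df_effect < 1 or df_error < 1:
        raise ValueError("F must be non-negative and both df must be positive")
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (F, df_effect, df_error, reported effect size) as quoted in this section.
reported = [
    (4.32, 1, 26, 0.14),  # interview number x age interaction (number of details)
    (0.47, 1, 26, 0.02),  # interview number on consistency
    (2.50, 1, 26, 0.09),  # age on consistency
]

for f_value, df_effect, df_error, quoted in reported:
    recovered = partial_eta_squared(f_value, df_effect, df_error)
    print(f"F({df_effect}, {df_error}) = {f_value:.2f}: "
          f"recovered {recovered:.2f}, reported {quoted:.2f}")
```

The recovered values agree with the reported ones to two decimal places, which suggests the F statistics, degrees of freedom and effect sizes are internally consistent.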

| Markers of connectedness
In the first interview, 15.6% (SD = 9%) of details mentioned by children contained markers of connectedness whilst this number increased to 17.9% (SD = 9%) in the second interview. The frequency of simple temporal markers, complex temporal markers and causal connections increased from the first to the second interview whilst the frequency of markers of optional states decreased (Table 9).
Markers of connectedness were more frequently used by older witnesses than younger witnesses (Figure 3).

TABLE 8 Summary of analyses of the subjective content of children's responses in Interview 1 and overall
Note: * denotes significance at p < .05, ** denotes significance at adjusted alpha level p < .017.
To assess whether children's accounts were significantly more connected in the second interview than in the first one, an RM-ANCOVA was conducted assessing the potential effect of interview number (first interview, overall) on the frequency of the four types of connectedness markers (simple temporal, complex temporal, causal, optional) witnesses reported. Children's age was added to the analysis as a covariate. Significant main effects were found for connectedness, F(3, 78) = 4.26, p = .008, ηp² = 0.14, and children's age, F(1, 26) = 25.02, p < .001, ηp² = 0.49. No main effect was found for interview number, F(1, 26) = 1.41, p = .25, ηp² = 0.05. No significant interactions were found.
To follow up the main effect of connectedness, six pairwise comparisons were conducted on the types of connectedness markers using the Bonferroni adjustment (adjusted alpha levels p < .008) for multiple comparisons. Markers of causal connections were significantly more frequently used than markers of optional states. All pairwise comparisons are presented in Table 10.
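The count of six comparisons follows from pairing each of the four marker types with every other type once. The sketch below enumerates those pairs and derives the matching Bonferroni threshold; the helper name `pairwise_comparisons` and the family-wise alpha of .05 are assumptions for illustration.

```python
from itertools import combinations

# The four marker types named in the text.
MARKER_TYPES = ("simple temporal", "complex temporal", "causal", "optional")


def pairwise_comparisons(levels, family_alpha=0.05):
    """List every unordered pair of levels and the Bonferroni per-test alpha.

    family_alpha = 0.05 is an assumed family-wise error rate.
    """
    pairs = list(combinations(levels, 2))
    if not pairs:
        raise ValueError("need at least two levels to compare")
    return pairs, family_alpha / len(pairs)


pairs, alpha = pairwise_comparisons(MARKER_TYPES)
for first, second in pairs:
    print(f"{first} vs {second}")
print(f"{len(pairs)} comparisons, per-test alpha = {alpha:.4f}")
```

Four marker types give six unordered pairs, and .05/6 ≈ .0083 corresponds to the reported adjusted alpha level of p < .008.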

| DISCUSSION
Consistent with the hypotheses, results of the present study indicated that the completeness of children's testimonies increased significantly when information was collected over two interviews rather than a single interview. Not only did children report more new details over the course of two interviews, they also mentioned a higher number of crucial time and location details as well as subjective details related to emotions, cognitions and perceptions, and sensitive details which were directly related to the alleged abuse. In fact, children reported a higher proportion of forensically relevant details in the second interview than in the first interview. Contrary to expectations, when both contradictions within the same interview and contradictions across interviews were considered, there was no significant difference in the proportion of contradictory repeated details between the first and second interviews. Exploratory analyses of linguistic connectedness revealed that the proportion of markers of connectedness remained stable across interviews. Children's age influenced the completeness and connectedness of their testimonies, and the impact of multiple interviews on the completeness of narratives.

TABLE 9 Percentage of responses including linguistic markers of connectedness in Interview 1 and Interview 2

coherence; in multidimensional models, they provide the context of the narrative (Reese et al., 2011), whilst in story grammar approaches, they describe the temporal and physical setting for the story (Brown et al., 2018). Westcott and Kynan (2004) reported that in 26% of single forensic interviews analysed, children did not provide sufficient information about the physical setting of the events, and this number rose to 50% for temporal setting. Time and location details are vital for the particularisation of the alleged offences, which is a prerequisite of the successful prosecution of child abuse cases (Powell & Thomson, 1997). Reflecting frequencies found in previous research (Connolly & Read, 2006), most child witnesses in this sample alleged that they had been victims of multiple instances of abuse.
Particularisation is especially difficult when adults and children testify about multiple offences (e.g., Brubacher et al., 2014; Connolly & Gordon, 2014), as memories about repeated occurrences of an event are often organised into 'scripts' of general features and selecting a specific occurrence from several repetitions poses a source monitoring challenge (Johnson, Hashtroudi, & Lindsay, 1993). Therefore, the increase in the number of time and location details resulting from a second interview may have important implications for developing techniques to aid the process of particularisation when interviewing child witnesses alleging multiple abuse.
In addition to time and location details, subjective details relating to children's emotions, cognitive states and physical perceptions also increased in number over the course of two interviews. In the storytelling framework of legal decision making, first-person descriptions are essential components of credible testimonies (Pennington & Hastie, 1992). Subjective descriptions are also required for coherent narratives in the multidimensional model (Peterson et al., 2014; Reese et al., 2011) and in story grammar approaches (Brown et al., 2018). In the multidimensional model, subjective first-person descriptions are aspects of the 'evaluative' or 'emotional' dimension (Peterson et al., 2014; Reese et al., 2011), while in story grammar approaches, they are divided into the categories of 'internal response', referring to the emotions, cognitions and goals of the characters before the event, and 'reaction', describing the emotions, cognitions and goals of the characters in response to the event (Brown et al., 2018). Previous field studies have shown that children only infrequently describe subjective reactions when describing sexual or physical abuse (Lyon, Scurich, Choi, Handmaker, & Blank, 2012; Westcott & Kynan, 2004), despite their ability to verbalise emotions in laboratory settings (Ahern & Lyon, 2013). The results of the present study indicate that in some contexts, multiple interviews can increase the number of subjective reactions children describe, including references to the emotional states, cognitions and perceptions of themselves and others.
Not only did child witnesses report a higher number of sensitive details overall than in a single interview, they also referred to a higher proportion of details directly related to the alleged offence in the second interview than in the first one. This tendency could potentially relate to children's increased comfort with the interview situation and increased trust in the interviewer during the second interview compared with the first one. This explanation is consistent with recommendations of the Extended Forensic Interview guide that interviewers only approach the topic of the abuse in the second or third interview when talking to children alleging sexual abuse to allow sufficient rapport building (Carnes, Wilson, & Nelson-Gardell, 1999).
Alternatively, the increased proportion of sensitive details might reflect children's increased understanding of the format of forensic interviews and the type of information interviewers are interested in. The way witnesses are expected to recall their memories is highly unusual for children, who are rarely asked to describe past events in this level of detail, especially to listeners who themselves are not knowledgeable about the event. Due to the question-answer format of forensic interviews, skilled interviewers can guide the conversation towards crucial topics even when using exclusively open-ended questions and carefully chosen follow-up questions (Lamb et al., 2007). In addition, previous research suggests that interviewers may ask more sensitive questions in further interviews, shifting from contextual to abuse-related details gradually over the course of multiple interviews (Patterson & Pipe, 2009). Whether due to better rapport between the child and the interviewer or to children's increased understanding of forensic interviews, the increased proportion of sensitive details in the second interview suggests that interviewers may gain a large amount of abuse-related information in multiple interviews which would not surface in single interviews.

| Consistency and connectedness
Based on the low consistency found in multiple interviews with children in previous research (Baugerud et al., 2014; Price et al., 2016), contradictions in children's recall were expected to increase in frequency across the two interview occasions. In contrast, when both contradictions within the same interview and across the two interviews were taken into account, results showed no significant difference in the proportion of consistent repeated details between the first and second interview. In line with previous findings (Katz & Hershkowitz, 2012; Waterhouse et al., 2016), there were few contradictions between interviews. The frequency of within-interview contradictions was similar in the two interviews and higher than the frequency of contradictions between interviews. Thus, although the overall number of contradictions increased in multiple interviews, the proportion of contradictions remained constant.
Research in forensic psychology cautions against the use of inconsistencies as correlates for accuracy (Gilbert & Fisher, 2006), but pointing out inconsistencies in witnesses' accounts is a common credibility-challenging strategy during cross-examination (Szojka et al., 2017). The present findings suggest that multiple interviews do not necessarily damage the credibility of children's accounts by increasing the proportion of contradictions. However, inconsistencies are also introduced to children's accounts via omissions of previously mentioned details and additions of new details, and the proportion of these types of inconsistencies was very high in the current sample. Children's credibility may also be challenged on the basis of details that were 'left out' from earlier interviews and emerged in later interviews only (Szojka et al., 2017), but this is an inherent result of multiple interviewing: additional interviews are conducted to reveal new details, but new details always mean more inconsistencies.
As no previous research had investigated the extent to which forensic interviews with children elicit linguistically connected accounts, hypotheses regarding this component of narrative coherence were exploratory. In the present study, the proportion of linguistic connections in children's accounts remained stable across interviews. This result is consistent with those reported by Habermas and de Silveira (2008), who found that repeated recall opportunities did not affect linguistic connectedness in children's autobiographical accounts.

| Age and narrative coherence
The present study provided an insight into developmental differences in children's ability to provide coherent testimonies in response to multiple forensic interviews. Consistent with other studies, older children provided more complete responses in both interviews than younger children. Additionally, the linguistic connectedness of testimonies also increased as the age of witnesses increased. The age differences in the narrative coherence of children's testimony found in the present study are consistent with findings from research using a multidimensional model of narrative coherence indicating that thematic, chronological, contextual and evaluative elements of coherence develop at different rates across the lifespan (Habermas & de Silveira, 2008; Reese et al., 2011).
Although children of all ages provided more complete accounts over the course of the two interviews than in the first interview only, the difference in the number of details increased with children's age. These results are contrary to the increased benefit of multiple interviews for the youngest children reported by Baugerud et al. (2014) and Katz and Hershkowitz (2012). However, even though the increase in the number of details was proportionally higher for older children, a small increase can be valuable in forensic investigations where younger children often provide only brief descriptions in a single interview (Faller, Cordisco-Steele, & Nelson-Gardell, 2010; Langballe & Davik, 2017).
(Lamb, 2016), few details in the present study were elicited by invitations and a substantial minority of details were provided in response to suggestive questions. Consistent with previous research (Waterhouse et al., 2016), there was no difference between the first interview and the second interview in the proportion of details elicited by any of the question types. However, in the second interview, invitations were less frequent than option-posing questions. This was not the case in the first interview, suggesting that interviewers may rely more on closed-ended questions and put less emphasis on free recall in the second interview. As children's free recall is generally more accurate than their response to closed-ended questions (Lamb et al., 2007), this may raise concerns about the accuracy of children's recall in the second interview.

| Limitations and future research
First, the field research design of the current study prevented analyses involving accuracy, as the 'ground truth' regarding the events children described was not known to the researchers. The impact of multiple interviews on the accuracy of children's accounts is a much-debated topic, and laboratory research needs to explore the relationship between accuracy and narrative coherence further before any recommendations can be made about using multiple interviews as a method to scaffold the narrative coherence of child witnesses' accounts. Gaining complete, coherent and consistent accounts is essential for investigations, but inaccurate details can lead to miscarriages of justice (Ceci & Bruck, 1993).
Second, the sampling method of this study did not allow for the control of age and other witness characteristics. Due to the small number of witnesses and the large age differences in the current sample, it was not possible to investigate the relationship between age and more complex measures of narrative coherence, such as the content of children's responses. A larger sample size would allow researchers to categorise witnesses into meaningful age groups and offer more nuanced conclusions about the impact of multiple interviews in different age groups. In addition to age, there was also large variability in other case characteristics, including the type of abuse, the interview protocol and the identity of the interviewer; thus, it is not possible to draw general conclusions from the present data.
Third, the type and quantity of complex real-life data provided in multiple forensic interviews prevented the use of coding schemes developed for multidimensional models of coherence (Reese et al., 2011), and local, quantitative measures of narrative coherence were used instead. Further research is needed to investigate the relationship between different models and coding approaches of narrative coherence and to determine whether legal and psychological models of coherence can be integrated.
In addition to exploring the relationship between narrative coherence and accuracy in multiple interviews, future research should clarify the impact of the three components of narrative coherence on the credibility of children's testimony. The story-telling model of legal decision making suggests that completeness, consistency and connectedness contribute independently to the credibility of witnesses' accounts (McAdams, 2006; Pennington & Hastie, 1988; Rideout, 2008). Although previous studies have reported that adults judge narratively coherent accounts as 'better stories' (Schneider & Winship, 2002), research has not yet investigated the effect of narrative coherence on mock jurors' credibility judgements of children's reports elicited in multiple interviews.

| Conclusion and implications
The present study was the first to investigate the narrative coherence of children's accounts elicited in multiple forensic interviews. Results suggest that multiple interviews increase children's ability to provide coherent testimonies, through providing children with an opportunity to report more details related to crucial aspects of the allegations. Multiple interviews differentially affected the components of narrative coherence; while the completeness of children's accounts increased, their consistency and connectedness remained stable.
These results imply that investigators may conduct an additional forensic interview to increase the completeness of children's testimony, particularly if it lacks forensically relevant details related to the location and timing of the events and children's emotions, perceptions and cognitive appraisals. The findings also indicate that multiple interviews do not inherently increase the proportion of contradictions in children's accounts, suggesting that concerns about the potential negative effect of additional interviews on children's credibility may not be justified.
However, as 'ground truth' is not known in field studies, the results of the present study should be interpreted in the context of laboratory research showing that the accuracy of children's recall across multiple interviews is contingent upon interview quality. Further research is needed to establish the relationship between narrative coherence and the accuracy and credibility of children's testimonies.

CONFLICT OF INTEREST
The authors declare no conflict of interest in relation to this manuscript.

DATA AVAILABILITY STATEMENT
Data sharing not applicable due to confidentiality and ethical considerations.

APPENDIX
As the probability of Type 1 errors is elevated in within-subject designs involving an interaction between the covariate and the dependent variable (Schneider et al., 2015), alternative statistical tests were conducted corresponding to each ANCOVA, without the covariate.

Number of details
To assess whether children reported significantly more details overall in the two interviews than in the first one only, a paired t test was con-

Connectedness
To assess whether children's accounts were significantly more connected in the second interview than in the first one, an RM-ANOVA was conducted assessing the potential effect of interview number (first interview, overall) on the frequency of the four types of connectedness markers (simple temporal, complex temporal, causal, optional) witnesses reported. A significant main effect was found for connectedness, F(3, 81) = 4.04, p = .01, ηp² = 0.13. No main effect was found for interview number, F(1, 27) = 1.39, p = .25, ηp² = 0.05. No significant interactions were found.