Training flexibility in fixed expressions in non-fluent aphasia: A case series report

Background: Many speakers with non-fluent aphasia (NFA) are able to produce some well-formed word combinations such as ‘I like it’ or ‘I don’t know’, although they may not use variations such as ‘He likes it’ or ‘I don’t know that person’. This suggests that these utterances represent fixed forms. Aims: This case series investigation explored the impact of a novel intervention aimed at enhancing the connected speech of individuals with NFA. The intervention, motivated by usage-based principles, involved filling open slots in semi-fixed sentence frames. Methods & Procedures: Five participants with NFA completed a 6-week intervention programme. The intervention trained participants to insert a range of different lexical items into the open slots of high-frequency phrases such as ‘I like it’ to enable more productive sentences (e.g., ‘they like flowers’). The outcomes and acceptability were examined: The primary outcome measure focused on changes in connected narrative, and the availability of trained constructions (e.g., ‘I like it’) was explored through a story completion test. Two baseline measures of behaviour were taken prior to intervention


INTRODUCTION
Despite reduced ability to build grammatically correct sentences, many speakers with non-fluent aphasia (NFA) are able to produce some well-formed word combinations. Many of these residual combinations represent familiar, high-frequency phrases (Bruns et al., 2019; Van Lancker Sidtis, 2012; Zimmerer et al., 2018). Investigating the use of the familiar expression 'I don't know', Bruns et al. (2019) found that individuals with Broca's aphasia employed the phrase for a variety of conversational functions, mostly with a turn-constructional use. However, it was predominantly used with the same linguistic form. The present study investigates whether residual, familiar expressions can be used as a starting point to stimulate greater creative capacity in speakers with NFA, such as from 'I don't know' to 'I don't know [PERSON/LOCATION/SOMETHING]'.
Familiar expressions, often referred to as formulaic expressions, are at the heart of usage-based grammar theories, where linguistic form such as 'I don't know' is paired with semantic-pragmatic meaning (e.g., expressing lack of knowledge). This is referred to as a form-meaning pairing or construction. Constructions can be of any length and degree of abstractness. For instance, the familiar phrase 'I like it' may represent a stored exemplar of a more abstract schema '[REFERENT] like-TENSE [THING]'. This means that storage is not limited to single words, but whole sentences, parts of sentences and schemas (e.g., "frames with slots"; Dąbrowska, 2014, p. 619) can also be stored as constructions. In usage-based theories such as Construction Grammar (CxG; Goldberg, 2003), lexicon and syntax are not strictly separated. If a construction is encountered frequently, it is more likely to be represented in the speaker's 'construct-i-con' (Goldberg, 2003). This makes CxG a fruitful framework to inform aphasia therapy as it allows targeting words, word combinations or sentences, depending on a speaker's resources. Some elements of existing speech and language therapy interventions for NFA are in accordance with usage-based assumptions. For instance, familiar expressions such as 'thank you' are targeted in Melodic Intonation Therapy (Helm-Estabrooks et al., 1989) and are also employed in Intensive Language-Action Therapy (Stahl et al., 2016); semi-fixed frames with open slots (e.g., 'I'd like X') are practised in script training (Bilda, 2011); and Reduced Syntax Therapy (Ruiter et al., 2010; Schlenck et al., 1995; Springer et al., 2000) trains abstract constructional schemas (e.g., the elliptical [DOING/DONE] + [WHAT] construction).
Such sentence-level therapies for aphasia often show improvement of trained structures, but limited generalization to spoken discourse. For example, a summary of 13 script training studies by Kaye and Cherney (2016) concluded that while scripts were successfully acquired in all studies, some of which found maintenance effects, there was limited generalization to untrained scripts or interactions with different communication partners. Similarly, Murray et al. (2007) evaluated Treatment of Underlying Forms (TUF; Thompson & Shapiro, 2005) within a case study and found generalization of treated sentences to connected speech, but limited (cross-modal) generalization to untreated structures.
As yet, no intervention programme has explicitly tested the application of usage-based principles to aphasia. For example, constructional grounding (Israel et al., 2001; Riches, 2013) refers to a lexically specific and high-frequency construction (e.g., 'I like it') which may act as a source construction, representing an instance of a more abstract constructional schema (e.g., '[REFERENT] like-TENSE [THING]'). Superimposition (Dąbrowska, 2014) is a process where lexical items or chunks are inserted into an open slot of a semi-fixed construction (e.g., 'all of us like cake'). Employing such principles in the design of sentence therapies might help speakers with aphasia produce longer and more flexible utterances.
The present study tested a novel behavioural intervention for NFA aimed at increasing the productivity of multiword expressions. Introducing flexibility into source constructions was approached by aiming to loosen open slots in semi-fixed constructions such as '[REFERENT] like-TENSE [THING]', starting with a high-frequency phrase (e.g., 'I like it') and then superimposing lexical items or chunks (e.g., 'You like it' → 'You like watching football').
The intervention incorporated psycholinguistic and neuroscientific learning principles such as structural priming (Kaschak et al., 2014) and errorless learning (e.g., Conroy & Scowcroft, 2012). Structural priming is a phenomenon where a speaker is more likely to reuse a structure they have recently encountered (Kaschak et al., 2014). In the structural priming paradigm, a participant is primed with a specific syntactic structure or a schema (e.g., 'he gave me a book'), for instance by reading aloud. The second part of a structural priming task typically uses a picture description task where the participant, despite no explicit instruction, tends to describe the event employing the structure with which they were primed (e.g., 'she gave him an apple' as opposed to 'she gave an apple to him'). Lee and Man (2017) used implicit structural priming training with an individual with Broca's aphasia and found improved sentence production of prepositional dative structures (e.g., 'the boy is giving a guitar to the singer'). Errorless learning methods employed in aphasia studies include decreasing cues (Conroy & Scowcroft, 2012) or, as in a 'pure' errorless learning method, direct repetition of a target structure. Similar principles were applied in the 'Sheffield WORD-Structured speech therapy' program (SWORD; Varley et al., 2016; Whiteside et al., 2012), which was designed for individuals with acquired apraxia of speech (AoS). Here, fluent and flexible use of constructions (words in isolation and embedded in phrases such as 'Where is ___?') is practised in a stepwise way, using error-reduction strategies. In the first step, the participant with AoS is presented with a video clip of a neurotypical speaker producing a target word. This is followed by imagined production of the word. In a third step, the participant attempts production (repetition) of the target. This procedure seeks to facilitate fluent language production by minimizing errors. A description of how these two principles (structural priming and error-reduction principles) were incorporated into the current intervention can be found in the Methods section.
Moreover, the current study incorporated social-motivational learning elements (Varley, 2011). Constructions were practised by making use of a participant's own voice: Target phrases were recorded and modified to enhance fluency (which we will refer to as 'self-voice'). Listening to one's own voice producing a more fluent target sentence may be particularly motivating for people with NFA. Moreover, self-voice, which brain imaging data suggests is processed in the right hemisphere (Kaplan et al., 2008), has been found to positively affect learning and remembering in university students (Forrin & MacLeod, 2017) and may have the same benefits for other populations.
Finally, the intervention was computerized to allow opportunities for self-managed home practice. This may increase the frequency with which participants engage with intervention activities (Nobis-Bosch et al., 2011). High-intensity aphasia therapy (8.8 h per week for 11 weeks) has been found to be more effective compared with lower intensity therapy, administered over a longer time span (Bhogal et al., 2003). The principle of increasing therapeutic dose (total number, length and frequency of sessions) through home practice has become a common component of aphasia interventions. Varley et al. (2016), for instance, found a positive relationship between dose (the time spent on a self-administered computerized intervention) and treatment outcome (correctly named words post-intervention) in a large group of participants with AoS.
Change in connected speech production was explored through use of the automated Frequency in Language Analysis Tool (FLAT; Zimmerer et al., 2016, 2018). This is the first use and exploration of the value of this automated software as an outcome measure in intervention research. It has particular advantages in that it characterizes features of connected speech in a fast and blind fashion. We examined the value of combination ratio as an outcome measure (number of trigrams (three-word combinations) produced by a speaker divided by the number of words).
The current study set out to evaluate the outcomes of a case series, to test the potential of this computerized, usage-based intervention for improving sentence processing abilities in NFA. In addition, participants with NFA and their regular conversation partners (CPs) were asked to comment on the acceptability of the intervention. The following main research questions guided the analysis:

Participants
Following ethical approval from the UCL Language & Cognition Departmental Ethics Committee (Project ID: LCRD.2017.01), five participants and their regular CPs were recruited via a university research register. Written consent to participate in this study was obtained both from participants with aphasia and their CPs. Three participants were female and two were male. They presented with chronic, post-stroke NFA (average of 91 months post-stroke onset, range = 24-165) as identified by experienced SLTs, had no known neurodegenerative illness, used English as their main language and, based on self-report, had sufficient auditory and visual acuity to interact with a laptop. The mean age was 60 years (range = 48-68). They did not receive other speech and language therapy interventions directed at impairment level during their involvement in the study, although they might continue participation in social support groups. Participants' language output was characterized by grammatically impoverished, non-fluent, and effortful spontaneous speech and narrative production. Table 1 presents an overview of participant demographic characteristics and performance on background/profiling assessments. All participants showed relatively preserved auditory lexical comprehension (range = 24-28 out of 30), as determined by the spoken word comprehension subtest of the CAT (Swinburn et al., 2004). However, the group was heterogeneous with regard to object naming, digit span and non-verbal IQ performance, as measured by the BNT (Kaplan et al., 2001), subtest 13 from the PALPA battery (Kay et al., 1992) and the WASI-II Matrix Reasoning subtest (Wechsler, 2011), respectively. The background profiles revealed variability with respect to aphasia severity: P4 presented with the most severe aphasia, followed by P3 and P2 (Table 1). P1's and P5's background profiles indicated less severe aphasia compared with the rest of the group, although P1's object naming was considerably more affected by her aphasia than P5's naming performance.

Design and intervention
Employing a case series design, each participant was involved in a 16-week study, consisting of three phases: baseline (weeks 1-3), intervention (weeks 4-9) and post-intervention probes (weeks 10-16). The intervention was delivered by a researcher with a speech and language therapy qualification (first author). During the 6-week intervention phase, five individual 60-min sessions alternated with five 60-min home visits. Home visits facilitated home practice and ensured that participants could complete self-managed activities on a laptop supplied by the research team. A description of the intervention using the TIDieR checklist (Hoffmann et al., 2014) is presented in Appendix A.
Table 1 notes: a Boston Naming Test (BNT; Kaplan et al., 2001). b Subtest 13 from the Psycholinguistic Assessments of Language Processing in Aphasia (PALPA; Kay et al., 1992). c Comprehension of spoken words subtest from the Comprehensive Aphasia Test (CAT; Swinburn et al., 2004). d Matrix Reasoning Subscale from the Wechsler Abbreviated Scale of Intelligence-Second Edition (Wechsler, 2011).
The main aim of the intervention was to train flexible use of 12 constructions. Each represented a familiar expression with high usage frequency (a source construction such as 'I like it') which mapped onto a semi-fixed frame with open slots (e.g., '[REFERENT] like [THING/PERSON/LOCATION/PROCESS]', as in 'you like cake'). The high-frequency source constructions were derived from the spoken British National Corpus (BNC, 2007). The constructions corresponded to overarching interactional functions (e.g., giving an opinion, asking a question) relevant to everyday talk about experiences, opinions and exchanging information. To reduce lexical-semantic demands, open slots were mostly filled by proforms (e.g., 'that' or pronouns such as 'I'), that is, words of high frequency that are versatile as to the conversational contexts in which they can be used.
The same high-frequency constructions were employed for all participants, although when practising inserting new lexical material into the open slots, personalized items could be used (e.g., 'I like CDs/Elton John'). The intervention was divided into three phases (Figure 1), and the following underlying techniques and principles were applied: self-voice, structural priming, superimposition, error-reduction principles and increasing dose through home practice. Phase 1 elicited versions of a participant's own productions of the target phrases that were as fluent as possible (self-voice). Each participant recorded the same set of 59 phrases (names were individualized, e.g., '[NAME] had dinner') supported by the researcher (e.g., by direct repetition or reading aloud). If parts of a recording were effortful and non-fluent, the researcher modified the phrase using the sound-editing software Audacity (version 2.1.2; https://www.audacityteam.org/). Modifications through waveform editing included deleting fillers (e.g., 'um' and 'er') and reducing long vowel durations or long pauses between words, similar to the procedures described by Harmon et al. (2016). In Phase 2 (structural priming), participants were exposed to variations of the target constructions within a gamified reaction time task using an auditory word monitoring paradigm. In this word monitoring game (WMG), the participant reacts as quickly as possible via button press to a prespecified target word (e.g., 'like') once it is encountered in an auditory sentential context ('I like it when spring is coming'). The WMG was self-managed (i.e., practised at home) and the program automatically recorded when and for how long the WMG was practised. The aim was to activate constructional schemas by repeated exposure, which may prime and thus facilitate production of these constructions in Phase 3. Phase 3 involved production training. The self-voice sound files from Phase 1 were incorporated into a program (using

Outcome measures
Table 2 shows the assessment battery. All assessments were conducted by the first author, who was not masked to phase (i.e., baseline versus outcome). Personal narratives (e.g., 'Can you tell me about the last time you went on holiday, or the last trip you took?') and picture-based narratives ('Dinner Party' and 'Jogging' cartoon series; retelling eight black-and-white line drawings that together depicted a complex event; Fletcher & Birt, 1983) were audio-recorded and served as primary outcome measures to assess participants' language production at the multiword level. These were administered at two baselines (approximately 2 weeks apart), once after the intervention and after a 6-week no-treatment period to assess the stability of any communicative change.
Outcome variables on narratives were frequency-based measures from the FLAT (version 2; Zimmerer & Wibrow, 2015; Zimmerer et al., 2018). The main outcome variable was combination ratio, which quantifies the amount of connected speech by dividing the number of well-formed trigrams by the number of words. For instance, the sentence 'A man phones his friend' includes five words and three trigrams (a man phones; man phones his; phones his friend; combination ratio = 0.6). A higher combination ratio indicates a higher proportion of trigrams in a speaker's connected speech. Since several samples were collected at each time point (Table 2), the average combination ratio per assessment point was calculated across each participant's samples.
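The combination ratio arithmetic described above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that every run of three consecutive words in a well-formed utterance counts as one trigram; the FLAT itself may apply further criteria, and the function name `combination_ratio` is hypothetical rather than part of the tool.

```python
def combination_ratio(utterance: str) -> float:
    """Trigrams per word for a single well-formed utterance.

    Assumption: every window of three consecutive words counts as a trigram.
    """
    words = utterance.split()
    if not words:
        return 0.0
    trigram_count = max(len(words) - 2, 0)
    return trigram_count / len(words)


# Worked example from the text: five words, three trigrams.
print(combination_ratio("A man phones his friend"))  # 0.6
```

Utterances of one or two words contribute words but no trigrams, which is why fragmented speech lowers the ratio.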
The Story Completion Test was originally developed by Goodglass et al. (1972) to study the availability of abstract syntactic structures in NFA. It was adapted to probe a participant's ability to produce the target constructions and was administered twice at baseline, immediately post-treatment and at a 6-week maintenance assessment. As in Goodglass et al. (1972), the researcher presented a brief scenario where the final sentence or phrase was missing, for example, to elicit the trained 'I like it' construction: 'I'm fond of ball games. My friend asks me: "What are your thoughts on football?", so I say ...?'. Participants were asked to complete each of the 12 stories with a single sentence with the aim of eliciting the trained source construction. Responses were scored by the research team (unmasked) by applying two criteria: an 'expected answer' and a 'well-formed utterance'. As an example, for the item where the expected construction was 'I like it', the answer 'it's fun' would be scored with a 0 in the 'expected answer' category, while a 1 would be noted in the 'well-formed utterance' category.
Although story situations were designed to elicit trained constructions within predictable sentential contexts, a norming sample of 41 native speakers of English revealed that the probabilities with which the target constructions were produced varied considerably across items (range = 7-100%). The 'expected answer' scores were therefore weighted according to the normative cloze probability (i.e., the proportion of participants that finished a probe sentence with the expected response). For instance, if an item elicited a normative cloze probability of 39% (i.e., 39% of neurotypical speakers answered with the target construction), the score that a participant with NFA would receive for producing the expected construction for that item was 39, while a normative cloze probability of 100% would mean that a score of 100 would be assigned if the participant answered with the target construction. The score was given for the target construction or a construction with the same grammatical structure, disregarding semantics (e.g., 'I like it'/'I enjoy football'/'I hate football'), and a score of zero was assigned if the answer represented another structure (e.g., 'it's fun' instead of 'I like it'). The maximum score was 708 across the 12 items.
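The weighting scheme can be sketched in a few lines of Python. The function name and the three-item example are hypothetical; only the values 39 and 100 and the 7-100% range come from the text, and the judgement of whether a response matches the expected structure is assumed to have been made by a human rater.

```python
def weighted_expected_score(cloze_percent, is_expected):
    """Sum the normative cloze percentages of items answered with the expected structure."""
    if len(cloze_percent) != len(is_expected):
        raise ValueError("one cloze value and one judgement are needed per item")
    return sum(p for p, hit in zip(cloze_percent, is_expected) if hit)


# Hypothetical three-item test (the real test had 12 items, maximum 708).
cloze_percent = [39, 100, 7]       # illustrative values within the reported 7-100% range
is_expected = [True, False, True]  # hypothetical rater judgements

score = weighted_expected_score(cloze_percent, is_expected)
maximum = sum(cloze_percent)
print(score, maximum)  # 46 146
```

Under this scheme, producing the expected construction on an item that few neurotypical speakers answer as expected adds little to the total, so the score is dominated by highly predictable items.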
Similarly to Goodglass et al. (1972), in addition to the 'expected answer' score, the number of well-formed utterances was recorded in the following way. For every answer reflecting a well-formed utterance (disregarding lexical semantics), a score of 1 was given. Ellipsis (a one-word response) for target 'I like it' was allowed (e.g., 'fantastic'). Formulas such as 'oh dear' and 'come on' were not scored as well-formed utterances. Other formulas ('I don't know', '(I'm) sorry', 'no way'), however, counted as well-formed utterances for certain items (for more details, see Bruns, 2018).
The TROG-2 (Bishop, 2003), a receptive language test to identify the areas of grammatical comprehension difficulty, was administered once before and after the intervention, to investigate whether there was change in spoken sentence comprehension (number of blocks correct, out of 20). We selected the TROG-2 as the Phase 2 and 3 intervention components involved listening to (and reading) sentences, and as it probes a wide range of sentence structures and is less subject to ceiling effects than screening assessments of sentence comprehension. Additionally, the AIQ-21 (Swinburn et al., 2018) was used as a participant-reported measure of communication abilities, emotional state/well-being, and participation. It was administered at baseline 2 and immediately after the intervention. The AIQ-21 uses a Likert-type scale from 0 (most positive rating) to 4 (most negative rating) for all questions.
Finally, a shortened form of a written Synonym Matching Task, taken from the ADA (Franklin et al., 1992), was created as a control measure. This was administered twice at baseline, post-intervention and at maintenance. It was devised to assess written word semantic processing, which was not directly targeted in the intervention. Performance was measured according to number of items correct (out of 40).
In addition, participants videotaped weekly 10-20-min conversation samples with a regular CP during the baseline and post-intervention phases. These conversation samples are not analysed as part of this report.

Acceptability of the intervention
Acceptability of the intervention to both participants with NFA and their CPs was investigated through post-intervention study-specific questionnaires (see Appendix B). There were separate questionnaires for participants with NFA and CPs. These were designed to capture the views of participants regarding different aspects of the intervention (e.g., which elements were helpful/unhelpful; the overall usefulness of the intervention). Open questions allowed both participants with NFA and their CPs to add further comments. Participants filled in the questionnaires in their own time without the researcher present, and participants with NFA were encouraged to fill in the questionnaire together with their CP. The questionnaires were returned anonymously in pre-stamped addressed envelopes directly to the second author.

Data analysis
Audio-recorded connected speech samples (personal and picture-based narratives) were orthographically transcribed by a research assistant (MSc student of Linguistics) blinded as to the sample collection point. Each participant's pre-, post-intervention and maintenance data were analysed by the first author using the FLAT version 2 (Zimmerer & Wibrow, 2015; Zimmerer et al., 2018). Before performing the frequency-based analysis, the first author checked all transcriptions for accuracy and annotated them in line with FLAT transcription conventions. For instance, clause boundaries (e.g., in the case of false starts or ungrammatical word combinations) were marked with separators ('<.>') to ensure that only grammatically well-formed multiword utterances were included in the FLAT analysis (Bruns et al., 2019; Zimmerer et al., 2018). For example, the utterance 'upstairs, him is, erm, tie and everything' is separated into 'upstairs <.> him is (erm) <.> tie and everything'.
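How separators restrict which word combinations are counted can be sketched as follows. This is an illustration rather than the FLAT's own parser: it assumes that parenthesised material such as '(erm)' is a filler to be dropped and that word combinations may not span a '<.>' separator; both function names are hypothetical.

```python
import re

SEPARATOR = "<.>"


def split_segments(transcript: str) -> list[list[str]]:
    """Split an annotated transcript into word lists, one per well-formed segment."""
    # Assumption: parenthesised material such as "(erm)" is a filler and is dropped.
    without_fillers = re.sub(r"\([^)]*\)", " ", transcript)
    segments = [part.split() for part in without_fillers.split(SEPARATOR)]
    return [words for words in segments if words]


def count_trigrams(segments: list[list[str]]) -> int:
    """Count three-word windows that do not cross a separator."""
    return sum(max(len(words) - 2, 0) for words in segments)


segments = split_segments("upstairs <.> him is (erm) <.> tie and everything")
print(segments)                  # [['upstairs'], ['him', 'is'], ['tie', 'and', 'everything']]
print(count_trigrams(segments))  # 1
```

Without the separators, the same six words would yield four three-word windows, including ill-formed ones such as 'upstairs him is'.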
Since the group represented a relatively small, heterogeneous sample of individuals with NFA, all results were evaluated at the individual rather than the group level. Group means were plotted for the main outcome variables and the control measure, to identify patterns across individuals. Individual recurrent phrases (e.g., 'oh dear me', frequently produced by P4) were included in analysis of spontaneous speech samples as each participant acted as their own control. For the primary outcome measure, the mean combination ratio was compared within participants across the four assessment points (baseline 1, baseline 2, post-intervention and maintenance) using descriptive statistics. The mean combination ratio at each time point was calculated by averaging values for each narrative task (e.g., time point 'baseline 1' consisted of the average of the two mean combination ratios of the 'Last Holiday' and 'Dinner Party' narratives). In terms of the effectiveness of the intervention, we were interested in the difference between baseline 2 and outcome 1. This difference was compared against the pre-therapy stability (difference between baseline 1 and 2). Stability of post-intervention and maintenance performance was also assessed.
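The within-participant comparison can be expressed as a small Python sketch. The decision rule shown (a gain counts only if it is larger than the absolute change between the two baselines) is one reading of the description above, not a formal criterion stated by the authors, and the function names are hypothetical. The example uses the combination ratios reported for P1 in the Results.

```python
from statistics import mean


def time_point_mean(task_ratios):
    """Average the per-task combination ratios collected at one assessment point."""
    return mean(task_ratios)


def exceeds_baseline_variation(baseline1, baseline2, post):
    """True if the gain from baseline 2 to post-intervention is larger than
    the absolute change between the two baselines."""
    gain = post - baseline2
    baseline_variation = abs(baseline2 - baseline1)
    return gain > baseline_variation


# P1's reported mean combination ratios: 0.46 (baseline 1), 0.51 (baseline 2), 0.46 (post).
print(exceeds_baseline_variation(0.46, 0.51, 0.46))  # False
```

For P1 the post-intervention value falls below baseline 2, so the comparison indicates no gain beyond pre-therapy fluctuation.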
Performance on secondary outcome measures was analysed by comparing pre- and post-intervention scores for each participant. For the adapted story completion test, McNemar tests for related samples were conducted individually to investigate whether baseline performance was stable (baseline 1 versus 2). Where baseline performance was found to be stable, Cochran's Q tests were used to examine each participant's performance prior to and following the intervention (baseline 2 versus post-intervention versus maintenance). For these comparisons, the baseline 2 measurement was used as the pre-intervention measure, as participants were more familiar with the tasks and the researcher than at baseline 1. The same analysis was applied to the control measure.
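A baseline stability check of this kind can be sketched with an exact McNemar test on paired binary item scores (e.g., the 0/1 well-formed utterance scores). The source does not state which variant of the test was used; the exact binomial form is assumed here because it suits a 12-item test, and the item scores below are hypothetical.

```python
from math import comb


def mcnemar_exact(first, second):
    """Two-sided exact McNemar p-value for paired binary scores (0/1 per item)."""
    if len(first) != len(second):
        raise ValueError("both probes must score the same items")
    # Discordant pairs: items whose score changed between the two probes.
    gained = sum(1 for a, b in zip(first, second) if a == 0 and b == 1)
    lost = sum(1 for a, b in zip(first, second) if a == 1 and b == 0)
    n = gained + lost
    if n == 0:
        return 1.0  # no item changed, so there is no evidence of a difference
    tail = sum(comb(n, k) for k in range(min(gained, lost) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical scores for the 12 story items at two probes.
baseline_1 = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0]
baseline_2 = [1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0]
print(mcnemar_exact(baseline_1, baseline_2))  # 0.0625
```

Only items whose score changes between probes inform the test, so with 12 items even five changes in the same direction do not reach p < 0.05 here.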

RESULTS
All five participants completed the intervention study with an intervention phase of 6-7 weeks (P5 had to cancel two therapy sessions, which prolonged their intervention phase by 1 week). Results are reported measure by measure, evaluating individual change patterns after intervention.

Narratives
For the primary outcome variable, combination ratio, it was expected that the intervention would lead to increased connected speech, that is, higher values at post-intervention and maintenance probes. Figure 2 displays participants' scores over time as well as a group mean for each assessment point. As shown, all five participants presented with combination ratios below normative values (Table 2), indicating less connected speech compared with neurotypical speakers (i.e., lower proportion of multiword utterances).
While within-participant comparisons indicate an increase in combination ratios between baseline 2 and the first post-intervention probe for four participants (P2-P5), these differences did not exceed the baseline variation in three of these participants (P2-P4). For P5, however, the increase in values between baseline 2 and post-intervention was higher than the baseline variation and remained stable across post-intervention and maintenance probes. P1's combination ratio was 0.46 and 0.51 at baseline 1 and 2, respectively, dropping back to 0.46 after intervention, with a value of 0.48 at maintenance. Thus, over time, P1's amount of connected speech appeared relatively stable.

Story completion test
Figure 3 shows 'expected answers' scores in the four assessment probes. As noted for the narratives, participants' raw scores at baseline varied considerably, and only P5's performance after the intervention showed a marked increase compared with pre-intervention scores.
In terms of well-formed utterances, participants' patterns over time are shown in Figure 4. P1's and P2's proportions of well-formed utterances increased significantly after the intervention. P3 was unable to retrieve any constructions at baseline ('dunno' answers for all items, resulting in scores of 0 at baseline 1 and 2), but did so following intervention. Although P3 was able to give differentiated responses after the intervention and produced more well-formed utterances across post-intervention and maintenance probes, this difference was not statistically significant.

TROG-2
A comparison of TROG-2 scores before and after intervention, as shown in Table 3, revealed a relatively stable performance for P2. While P4's and P1's scores decreased by one block, P3's score increased by two blocks and P5's score by three blocks. This tentatively suggests that some participants with NFA (P3 and P5) may show improved spoken sentence comprehension after intervention.

AIQ-21
AIQ-21 ratings (sampled at weeks 3 and 16), presented in Table 4, suggest an increase in perceived well-being and emotional state following intervention. However, with regard to communication, the ratings became more negative for some participants, which may point to increased awareness of communication difficulties after intervention.
Four out of five participants showed improved ratings in emotional state/well-being (P1, P3-P5), while there was no change in this area for P2. In the area of participation, no change was detected in three participants (P1, P3, P4). P5, however, showed an improved average rating and P2's average rating decreased. In terms of communication, three participants rated their communicative skills more negatively after the intervention (P3-P5), and for P1 there was no change. P2, on the other hand, showed improved ratings.
Note: The AIQ-21 uses a Likert-type scale from 0 (most positive rating) to 4 (most negative rating). Difference scores were expected to be zero or negative, e.g., 1.00 - 1.36 = -0.36.

Acceptability
The overall perceived usefulness of the intervention was rated on a scale from 1 ('not at all helpful') to 5 ('very helpful'). The resulting average rating from participants with NFA was 4.7, while CPs rated the intervention's overall usefulness for their friend/family member with an average of 4.0. All five participants found 'one-to-one sessions to practise speaking' particularly helpful, and four participants also selected 'practising speaking at home'; 'making recordings of own voice and listening back to them'; and 'three steps to practise speaking' as helpful elements. Participants' answers regarding the frequency of using target constructions included 'every day' (one participant with NFA) and 'a few times a week' (four participants with NFA). This is suggestive of the constructions' everyday relevance.
For perceived effects of therapy (self-reported changes in communication), one participant with NFA circled 'my speaking got much easier' (P2), while the remaining four participants circled 'my speaking got a bit easier'. CPs reported that, after intervention, the expressive abilities of their friend or family member 'got a bit easier' (two CPs) or 'no change' (three CPs).

Summary of the results by participant
For P1, combination ratio values did not change in a meaningful way following intervention. P1's outcomes in the adapted story completion test were more positive, with a greater number of well-formed utterances and some evidence for increased availability of target constructions.
P2 showed positive change in combination ratio post-intervention. On the story completion test, P2 displayed significant gains in number of well-formed utterances from baseline through to maintenance. P2's TROG-2 scores showed no change following intervention. Increased communication ratings on the AIQ-21 were evident, while participation ratings decreased after the intervention.
P3 had more severe aphasia, resulting in a restricted inventory of word combinations. P3 displayed considerable variation in combination ratio across assessment points. With regard to the adapted story completion test, P3 was unable to retrieve any answers other than 'dunno' prior to the intervention, but could give differentiated responses after the intervention, where he showed promising gains, especially with regard to well-formed utterances (increased number of well-formed utterances across post-intervention probes). TROG-2 scores improved after the intervention, as did his AIQ-21 ratings for well-being/emotional state. However, P3 rated his communication more negatively after the intervention.
Despite more impaired language output compared with other participants, P4 showed higher values in combination ratio post-intervention.P4's performance in the adapted story completion test revealed little evidence of availability of target constructions after the intervention, but a slight increase in the number of well-formed utterances.There was a slight decrease in number of TROG-2 blocks correct after the intervention.AIQ-21 post-therapy ratings were more positive for well-being/emotional state, but more negative for communicative skills.
P5 showed increased combination ratio between the baseline and post-intervention probe, and this performance stayed stable until the final maintenance assessment.There were significant gains in well-formed utterances and expected answers in the story completion test.However, this result was not stable across postintervention and maintenance assessment points.In addition to these positive results, P5's TROG-2 performance was enhanced after the intervention, while AIQ-21 ratings of communicative skills decreased after the intervention.

DISCUSSION
The present investigation piloted a novel computerized intervention to enhance the connected speech of individuals with NFA. The intervention was motivated by usage-based principles, training semi-fixed constructions with high functional value and then extending them by the process of superimposition of new lexical information into construction 'slots'. Moreover, the three intervention phases employed psycholinguistic and neuroscientific learning elements (e.g., structural priming, error-reduction strategies) and self-voice, a social-motivational learning component. Findings from a case series of five individuals with NFA revealed promising changes in some participants which encourage further development of the intervention programme and future testing of its effectiveness in a larger trial. This will allow exploration of the profiles of individuals who benefit most (and least) from the intervention. The intervention appears to be acceptable to this client group. Although this is a small case series, participants with NFA and their CPs rated the intervention as helpful, with slightly higher ratings from participants with NFA.
This study, for the first time, used the FLAT variable 'combination ratio' (amount of connected speech) to evaluate outcomes of aphasia intervention. Regarding the research question, 'Is there evidence that after intervention participants with NFA demonstrate enhanced connected speech, as measured by a higher proportion of multiword utterances in narratives?', the intervention showed potential to enhance the ability to combine words into well-formed utterances for a subset of participants. However, since there was considerable variability in the baseline probes and between the participants, further investigations should explore how pre-/post-difference scores correspond to small/medium/large effects.
Combination ratio is a useful variable to characterize an individual's ability to combine single words into well-formed word combinations. It is therefore well suited to profile spontaneous speech before and after therapy. FLAT offers the advantage that once a sample is transcribed and formatted, it analyses words and combinations automatically (and therefore objectively). Aside from combination ratio, the FLAT also provides other frequency-based variables, for instance association strength between the component words of two- and three-word combinations. Such variables may be useful to detect change in the creativity and flexibility of combinations. Since frequency-based measures are well-suited for naturalistic, spontaneous speech samples, they are worth further exploration as an outcome measure in usage-based interventions.
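To make the notion of a 'proportion of multiword utterances' concrete, the following minimal Python sketch counts how many utterances in a transcribed sample contain two or more words. It is a deliberately simplified illustration and should not be read as the FLAT algorithm: the function name `multiword_proportion`, the whitespace tokenization, and the example utterances are all assumptions introduced here, and the samples are invented rather than participant data.

```python
from typing import List


def multiword_proportion(utterances: List[str]) -> float:
    """Return the share of non-empty utterances that contain two or more words.

    Hypothetical helper for illustration only; words are split on whitespace.
    """
    # Word count per utterance, ignoring empty or whitespace-only entries.
    counts = [len(u.split()) for u in utterances if u.strip()]
    if not counts:
        return 0.0
    return sum(1 for n in counts if n >= 2) / len(counts)


# Invented samples (not study data): one sample before and one after therapy.
baseline_sample = ["dunno", "I like it", "cat", "yes"]
post_sample = ["I like flowers", "they like it", "dunno", "he is funny"]

print(multiword_proportion(baseline_sample))  # 0.25
print(multiword_proportion(post_sample))      # 0.75
```

A frequency-based tool may define and count combinations differently (for instance at the level of two- and three-word sequences rather than whole utterances), so values from this sketch should not be compared with the combination ratios reported in this study.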
Analysis of combination ratio values does not answer the question of whether participants showed an increased use of trained constructions after intervention. To address this question, performance on an adapted story completion test was analysed. The results showed trends towards greater availability of target constructions in the majority of participants, but these were significant for only one of the five participants. Results revealed more evidence for positive change with regard to the number of well-formed utterances.
The limited outcomes on the 'expected answers' category of the adapted story completion test might be related to the variable cloze probabilities that the items elicited in the norming sample. Despite adjusting the scoring procedure to account for this, 'expected answer' scores fluctuated considerably across baseline and post-intervention probes. The findings suggest that the second layer of scoring, number of well-formed utterances, was more sensitive to grammatical change following intervention. While this task was well-suited for the purposes of studies by Goodglass et al. (1972) and Gleason et al. (1975), the adapted version was not ideal to evaluate the availability of the target constructions used in the present intervention. Additionally, the nature of the instruction (completing a scenario with a phrase or sentence) may not be suitable for participants with global aphasia and/or AoS. However, despite the challenges in the present study, a story completion format might be the only valid method available for eliciting target constructions. In future studies, it would be desirable to devise an improved story completion task and extend the method to include both trained and untrained items, as is common in the assessment of naming performance (e.g., Nardo et al., 2017).
The availability of constructions and their variations could be compared before and after intervention and would thereby add a valuable layer to an evaluation of whether this type of intervention facilitates more productive use of target constructions. For example, P1's narrative revealed the presence of the '[REFERENT] is-TENSE [PROPERTY]' construction, with variations including: 'he's really funny' and 'it's okay'. Such a qualitative analysis would provide insight into the availability and use of specific constructions in spontaneous speech. Such a procedure would be similar to the techniques applied in Dąbrowska and Lieven (2005) and Lieven et al. (2009), who analysed fixed phrases and frames with slots in the language of children.
Whitworth et al. (2018) suggest that one baseline probe might be sufficient for some connected speech measures (e.g., narratives). If sampling takes place twice within a baseline period, Whitworth et al. (2018) showed that samples were more stable when taken 3 weeks apart than when taken within a shorter interval (1 week). In contrast to Whitworth et al. (2018), the current case series indicates considerable intra-individual variation in connected speech across two baseline assessment points (2 weeks apart). However, the present study reported descriptive statistics, while Whitworth et al. (2018) applied inferential statistics. Future evaluations of this intervention should further examine intra- and inter-individual stability of connected speech probes (potentially comparing picture-based narratives with monologic speech elicited by open questions), ideally using a baseline of at least 4 weeks to ensure sampling occurs with a minimum interval of 3 weeks.
The intervention resulted in some change in TROG-2 in two participants, indicating that with further development it might facilitate gains in both sentence comprehension and production. Furthermore, there is tentative evidence that the intervention may raise awareness of communicative disabilities in some participants, as reflected in higher scores on the AIQ-21 communication domain (i.e., communication was judged to be worse after intervention). The results also reflect positive changes in participants' perceived well-being and emotional state, a finding in line with the suggestion that practising familiar expressions might have non-linguistic effects such as improved well-being and quality of life (Stahl & Van Lancker Sidtis, 2015).
Finally, Bhogal et al. (2003) recommend at least 8.8 h per week over 11 weeks for speech and language therapy to be effective. The present intervention was delivered at a lower dose (eight 60-min intervention sessions over 4 weeks). This was supplemented by self-managed home practice. However, the exact amount of time that participants spent on these self-managed activities was not recorded in the current version of the software. Future investigations could measure therapeutic dose more accurately and explore the relationship between dose and outcome.
A larger trial based on the current, promising findings is warranted. Ideally, a researcher blind to the assessment point should conduct the assessment battery. A future trial should consider including standardized assessments as suggested by the ROMA consensus statement (Wallace et al., 2019), to support the availability of big data enabling the exploration of effectiveness of aphasia interventions.

CONCLUSIONS
This novel intervention for NFA, underpinned by usage-based CxG, was based on common constructions and their communicative functions. It was acceptable to participants and their CPs. The intervention shows promise as an effective therapy for some individuals with NFA, with gains in both sentence comprehension and production. Results appear to reveal potential for transfer to participants' connected speech, an important outcome given that it is often difficult for aphasia therapies to achieve transfer to the multiword level (Webster et al., 2015). Outcome measures for future trials could examine further the value of frequency-based variables and refine the story completion task, probing treated and untreated constructions. This study can inform larger trials in which hypotheses, generated by the present case series, could be tested, for instance with regard to the relationships between aphasia severity, dose and outcome.

TABLE 1 Demographic characteristics and background assessments of participants with NFA

FIGURE 1 Three phases of the intervention

Microsoft PowerPoint) designed to practise more flexible use of constructions. The main underlying usage-based principle in Phase 3 was superimposition (Dąbrowska, 2014), where lexical items (e.g., 'Claire') were superimposed over an open slot (e.g., 'Where is ____?' → 'Where is Claire?'). Each target sentence involved three steps applying SWORD (Whiteside et al., 2012) error-reduction strategies: (1) seeing the written form of the target phrase while listening to an audio-recording of it; (2) imagining saying the sentence; and (3) saying the sentence and making a recording of own attempt. During the 'listening' step, variations of a core construction were animated. For instance, in the 'you like it'-'listen'-slide, the animation started with the text box '______ like it', accompanied by a sound file where the open slot was filled by a beep ('*beep* like it', created with Audacity). Next, the pronoun 'you' option was flown into the open slot, accompanied by the sound file of the whole phrase 'you like it'. Open slots in source constructions were often filled with pro-forms (e.g., pronouns and deictic expressions such as 'that') as well as high-frequency verbs and nouns (e.g., 'dinner', 'swim') to reduce lexical demands.

FIGURE 2 Mean combination ratio by participant over time

FIGURE 3 Story completion test: Overall score based on expected answers by participant over time

FIGURE Story completion test: Number of well-formed utterances by participant over time

FIGURE Synonym matching task: Number of correct items by participant over time

following intervention. Performance in the TROG-2 slightly decreased post-intervention, while post-therapy AIQ-21 ratings indicated improved emotional state/well-being.