Semantic fluency in aphasia: clustering and switching in the course of 1 minute

Background: Verbal fluency tasks are included in a broad range of aphasia assessments. It is well documented that people with aphasia (PWA) produce fewer items in these tasks. Successful performance on verbal fluency relies on the integrity of both linguistic and executive control abilities. It remains unclear whether limited output in aphasia is due solely to lexical retrieval difficulties or also has a basis in executive control abilities. Analysis techniques, such as temporal characteristics of the words retrieved, clustering and switching, are better positioned to inform the debate surrounding the lexical and/or executive control contribution to success in verbal fluency. Aims: To investigate the differences in quantitative (i.e., number of correct words) and qualitative (i.e., switching, clustering and word-retrieval times) performance on an animal fluency task as a function of time between PWA and healthy control speakers (CS). Methods & Procedures: Animal fluency data for 60 s were collected from 34 PWA and 34 CS, and responses were time stamped. The 60-s period was divided into four equal intervals of 15 s each (i.e., 15, 30, 45 and 60 s). The number of correct words, cluster size, number of switches, within-cluster pause and between-cluster pause were evaluated as a function of four 15-s time intervals between PWA and CS. Outcomes & Results: Compared with CS, PWA produced fewer words, had smaller cluster sizes and switched fewer times. A decrease in the number of switches correlated with an increase in between-cluster pause durations. PWA showed longer within- and between-cluster pauses than CS. 
The two groups showed specific differences in the temporal pattern of the responses: as time evolved both PWA and CS showed decreased productivity for the number of correct words, but PWA reached the asymptote earlier in the time course than CS, neither group showed a change in cluster size, and the number of switches decreased as a function of time only for CS. Conclusions & Implications: The findings suggest that for PWA the search and retrieval process is less productive and more effortful. This is indicated by smaller cluster size, fewer switches associated with increased between-cluster pause durations, as well as overall slowed retrieval times for the words. This shows that the difficulties with verbal fluency performance in aphasia have a strong basis in their lexical retrieval processes, as well as some difficulties in the executive component of the task.


Introduction
The verbal fluency test is an extensively used word-retrieval task that relies on both linguistic and cognitive processes, including accessing the mental lexicon and engaging various executive processes such as initiation, monitoring, organization, rule implementation and set-shifting. Typical administration requires participants to produce as many unique words as possible within a limited amount of time, usually 60 s, according to a given criterion. The most common criteria are letter (or phonemic) fluency and semantic (or category) fluency (e.g., Strauss et al. 2006). Successful performance depends on the use of specific cognitive strategies to initiate a systematic search and retrieve words within the mental lexicon. One such strategy is clustering, which is the production of words within a subcategory, and the other is switching, which is the ability to shift efficiently to a new subcategory when the current one is exhausted (Troyer et al. 1997, Tröster et al. 1998). These two components determine the overall number of words generated (Troyer 2000, Troyer et al. 1997). In addition to search strategies, there is a need to focus on the task, select words meeting certain constraints and avoid repetition, all of which rely on executive control processes (Luo et al. 2010, Shao et al. 2014, Troyer 2000). Therefore, the integrity of both linguistic and executive control abilities is essential for successful performance on a verbal fluency task. This hybrid nature of the verbal fluency task has made it an appealing quick test for linguistic and/or executive control abilities in various typical and atypical populations. Verbal fluency tasks are included in a broad range of aphasia assessments in both clinical and research studies. 
Despite its widespread use, only a handful of studies have investigated both the quantitative (i.e., number of correct words) and qualitative (i.e., switching, clustering and/or temporal characteristics of recall) aspects of this task in aphasia (e.g., Adams et al. 1989, Arroyo-Anlló et al. 2011, Baldo et al. 2010, Helm-Estabrooks 2002, Kiran et al. 2014, Roberts and Le Dorze 1994, Sarno et al. 2005). It is well established that PWA produce fewer exemplars than healthy controls, but limited research exists with regard to the qualitative nature of the performance (Baldo et al. 2010, Kiran et al. 2014). For example, Baldo et al.'s (2010) participant with Wernicke's aphasia had reduced cluster size, whilst the participant with Broca's aphasia demonstrated unimpaired cluster size. Kiran et al. (2014), in their group comparison of bilingual aphasia and controls, found that individuals with aphasia showed smaller cluster sizes and switched fewer times than controls in both languages.
While sparse lexical retrieval is not a surprising finding in PWA, the mechanisms underlying such impairment are less understood. With the exception of Adams et al. (1989), no study has systematically investigated the temporal characteristics of the retrieved words. Adams et al. compared the productivity and representativeness (common versus uncommon words) of the produced words between PWA and CS over the course of four time quarters in 1 min. They found that, compared with the controls, PWA produced fewer common and uncommon words in each of the time quarters. As verbal fluency tasks place a premium on rapid search and retrieval, a process which is generally affected in brain-damaged populations including aphasia, temporal measures of the performance (i.e., timing for the correct words, clustering and switching) and information processing speeds (i.e., time interval required to produce each word as a function of its position in the sequence) provide valuable insights into the linguistic and executive control strategies in brain-damaged individuals (e.g., Crowe 1998, Hurk et al. 2004, Tröster et al. 1998). Research has shown that the time interval required to access new subcategories (i.e., between-cluster time) is long and increases during the time course, whereas the time required to produce items within semantic clusters (i.e., within-cluster time) is short and tends to remain constant (e.g., Gruenewald and Lockhead 1980). Accordingly, the time interval for switching between the clusters would increase over time, as it reflects an effortful and controlled retrieval process from the word store. This is associated more with the executive component of the verbal fluency task (Gruenewald and Lockhead 1980, Raboutet et al. 2010, Rosen et al. 2005). On the other hand, the time interval for production of words within a semantic cluster depends more on the lexical component of the task. For example, Rosen et al. (2005) have illustrated that lower switching abilities in the presence of increased between-cluster pauses can be argued to reflect a true difficulty in the executive component of the verbal fluency task (see Mayr 2002 and Raboutet et al. 2010 for a similar argument). Therefore, documenting the exact retrieval times of the words to generate within- and between-cluster times is crucial in understanding how clustering and switching behaviours are driven by slowed processing speed and/or faulty executive control mechanisms.
Our research uses both the quantitative (i.e., number of correct words) and qualitative (i.e., switching, clustering and word-retrieval times) measures for a semantic fluency task (i.e., animals) and tracks changes in performances over the course of 60 s in PWA and healthy CS. This study fills a significant gap in the aphasia literature, contributing to the debate of whether performance differences between aphasia and control participants are a consequence of lexical retrieval difficulties, slowed processing speed, executive control difficulties, or a combination of some or all three aspects.

Characterizing verbal fluency performance and time-course analysis
The most commonly used metric for verbal fluency performance has been the total number of correct words within 1 min, excluding repetitions and errors. Commonly used qualitative analysis methods include clustering and switching analysis, as well as time-course analysis (Crowe 1998, Mayr 2002, Troyer et al. 1997). The production of words is not evenly distributed over time, but tends to occur in 'spurts', or temporal clusters, with a short time interval between words in a cluster and a longer pause between clusters (Gruenewald and Lockhead 1980, Troyer et al. 1997). On semantic fluency tasks, the words that comprise these temporal clusters tend to be semantically related (e.g., first name farm animals, then switch to pets, then to birds). This response pattern has led to the suggestion that performance involves two processes: a search for semantic subcategories, which corresponds to a pause between clusters, followed by an output mechanism to produce as many words as possible from the subcategories (Gruenewald and Lockhead 1980, Tröster et al. 1998). The metrics of switching and clustering have been suggested to quantify these two processes (Troyer et al. 1997). Specifically, clustering involves accessing and using the word store, and cluster size is a measure of the ability to access words within semantic subcategories. Switching involves the search processes and is a measure of the ability to shift efficiently from one subcategory to another; reduced switching has been attributed to an executive function difficulty in shifting between subcategories (Troyer et al. 1997, Tröster et al. 1998). Both switching and clustering contribute equally to semantic fluency performance in healthy controls (Troyer 2000, Troyer et al. 1997).
These assumptions of a distinction between clustering and switching are supported by several neuropsychological studies showing a decrease in the number of switches in patients with frontal-lobe lesions or Parkinson's disease (Troyer et al. 1998, Tröster et al. 1998), a smaller number of clusters in schizophrenia (e.g., Bozikas et al. 2005), smaller cluster sizes in patients with temporal-lobe lesions or Alzheimer's disease (Nutter-Upham et al. 2008), as well as impaired switching up to 5 years before dementia onset but no significant difference in cluster size when compared with elderly controls (e.g., Raoux et al. 2008). These studies have shed light on the organization and structure of the word store, and on the possible reason for the deficit in different disorders in relation to lexical versus executive control processes. Although the verbal fluency task can tap into lexical as well as executive processes, with the exception of a few studies (i.e., Arroyo-Anlló et al. 2011, Baldo et al. 2010, Kiran et al. 2014, Sarno et al. 2005), detailed clustering and switching analysis has not been undertaken in aphasia. As mentioned above, the number of retrieved items declines as a function of time. Typically, participants produce more items at the beginning of the recall period than during later periods and eventually reach asymptote (e.g., Troyer 2000). Usually in the first 15-20 s of the 60-s period, a ready store of frequently used words appears to be available and is automatically activated for production. As time passes, the store becomes exhausted and the search for new words becomes more effortful and less productive (Crowe 1998, Luo et al. 2010, Hurk et al. 2004). This pattern of decline has been captured in healthy controls by plotting the number of responses over time (Crowe 1998, Luo et al. 2010, Hurk et al. 2004) and the frequency distribution of items (Crowe 1998). 
Examining the rate of decline in word production can provide important insights into the mechanisms underlying verbal fluency impairment in aphasia. For instance, despite pre-existing differences in the number of words produced, a comparable rate of decline could indicate that PWA do not differ from controls in their underlying mechanisms of retrieval. An exaggerated rate of decline could indicate impaired search and access mechanisms in these individuals relative to controls.

The present investigation
The current study investigates the differences in quantitative and qualitative performance on a semantic fluency task as a function of time between PWA and healthy CS. We collected animal fluency data for 60 s from 34 PWA and 34 CS. For this research, we primarily focused on the semantic fluency task as this task is often included in standardized aphasia examinations (e.g., Western Aphasia Battery), and the literature has shown that semantic fluency is an easier task than letter fluency (Luo et al. 2010, Troyer 2000). Since our goal was to document more than just the number of correct responses, we chose a task that would provide enough of a corpus to allow qualitative analyses of clustering and switching over time. The responses were transcribed and tagged with the time when they were produced. The 60-s period was divided into four equal intervals of 15 s each (i.e., 15, 30, 45 and 60 s). Quantitative and qualitative analyses were conducted evaluating the number of correct words, cluster size, number of switches, within-cluster pause and between-cluster pause, as a function of the four 15-s time intervals, between PWA and CS. We closely followed Troyer et al.'s (1997) method of clustering and switching analysis; any deviations from the original procedure are indicated in the methods.
Based on the literature on healthy adults and left hemisphere focal brain-damaged populations, we predict the following for the variables measured. First, PWA would produce fewer words than CS, and they would follow a similar curve from high productivity in the earlier intervals to lower productivity and asymptote in later intervals. Second, PWA would show an overall smaller cluster size as word production difficulties in aphasia are thought to reflect lexical retrieval difficulties. We offer no specific prediction for cluster size as a function of time. Third, if poor performance in PWA were due to executive control difficulties, then they might show fewer switches than CS, and the number of switches would decrease over time as fewer subcategories remain to switch between. In the case of a true switching difficulty in PWA, they would show a relationship between switching and between-cluster pause. Fourth, the within- and between-cluster pause durations would be longer for PWA due to their generally slow performance, and the pause durations would continue to grow as a function of time because the search becomes more effortful as the exemplars are exhausted during the retrieval process.

Participants
Thirty-four PWA (11 male, 23 female) and 34 age-, gender- and education-matched healthy control participants (CS) took part in this study. The inclusion criteria for the PWA were: a single left-hemisphere cerebrovascular accident as determined by neuroradiological and/or neurological examinations, a diagnosis of aphasia on standardized clinical tests (Boston Diagnostic Aphasia Examination: Goodglass et al. 2001, or Western Aphasia Battery: Kertesz 1982), at least 8 months post-stroke, monolingual English speaker, no history of other neurological illness, psychiatric disorders or substance abuse, and no other significant sensory and/or cognitive deficits that could interfere with the individual's performance in the investigation. The ages for the PWA ranged from 27 to 86 years (mean = 61.6, SD = 14.9), education level ranged from 11 to 19 years (mean = 14.4, SD = 1.8), and time since stroke ranged from 8 to 253 months (mean = 66, SD = 65.5). The PWA group included various aphasia types: 11 fluent, 17 non-fluent and six mixed aphasias. The appendix provides each PWA's demographic and aphasia details along with their performance on the five variables. The CS participants were native monolingual English-speaking individuals with no reported history of speech, language or hearing problems, or any other neurological deficits. The ages for the CS ranged from 25 to 82 years (mean = 54.9, SD = 15.3) and level of education ranged from 11 to 25 years (mean = 15.32, SD = 3.9). There was no significant difference between the groups with regard to age [t(66) = 1.82, p = 0.13] or level of education [t(66) = 1.17, p = 0.25]. Written informed consent procedures in accordance with the university research ethics boards of the respective institutes were followed for all participants.

Procedure and scoring
As part of a larger research battery, the semantic fluency (i.e., animal) task was administered on an individual basis. Participants were instructed to generate names of as many animals as they could within 60 s. To ensure the cognitive and search strategies were spontaneous, no guidelines were provided regarding how the participants should generate and organize their production. All responses were recorded with a high-quality digital recorder. Every verbal response was transcribed, including correct and well-articulated names of animals, self-corrections, repetitions, non-words, non-animal names, errors and unintelligible words. The time (s) at which each word was produced was recorded and tagged to the word to identify the time quarter in which it was produced. Using this information, the 60-s period was divided into four 15-s intervals (i.e., 15, 30, 45 and 60 s). Words initiated at a 15-s boundary were attributed to the prior interval. Error responses such as non-words, non-animal names and unintelligible productions were excluded prior to analysis. Repetition errors were excluded from the number of correct responses, but were retained for clustering and switching analysis as these were thought to be reflective of underlying cognitive processes regardless of whether they were included in the total number of words generated (Troyer et al. 1997).
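The interval assignment described above can be illustrated with a minimal Python sketch. The function names, the `(onset_s, word)` tuple layout and the sample responses are assumptions made for illustration only; they are not part of the original scoring materials. The sketch assumes error responses have already been removed and treats a repeated word as a repetition to be skipped.

```python
import math

def interval_label(onset_s, width=15.0, total=60.0):
    """Return the upper bound (15, 30, 45 or 60) of the interval for an onset.

    A word initiated exactly on a 15-s boundary goes to the prior interval,
    which is what rounding up with ceil gives (15.0 -> 15, 15.1 -> 30).
    """
    if not 0 <= onset_s <= total:
        raise ValueError("onset outside the 60-s response window")
    index = max(1, math.ceil(onset_s / width))  # onset 0.0 maps to interval 1
    return int(index * width)

def correct_words_per_interval(responses):
    """Count correct words per interval, skipping repetitions.

    responses: list of (onset_s, word) with error responses already removed
    (hypothetical input format).
    """
    counts = {15: 0, 30: 0, 45: 0, 60: 0}
    seen = set()
    for onset_s, word in responses:
        if word in seen:
            continue  # repetition: not counted as a correct word
        seen.add(word)
        counts[interval_label(onset_s)] += 1
    return counts

# Hypothetical responses, for illustration only.
sample = [(0.6, "dog"), (2.4, "cat"), (15.0, "lion"), (16.1, "dog"), (31.0, "cow")]
counts = correct_words_per_interval(sample)
```

Here the word at 15.0 s falls in the first interval and the repeated "dog" at 16.1 s is not counted.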

We measured a total of five variables (number of correct words, cluster size, number of switches, within-cluster pause and between-cluster pause) over the four 15-s time intervals. The detailed procedure for clustering and switching analysis was based on the work of Troyer and colleagues (e.g., Troyer 2000, Troyer et al. 1997). Briefly, clusters were defined as successively generated words belonging to the same subcategory, e.g., pets, wild animals, African animals, Arctic animals, birds etc. (for more details, see Troyer 2000, appx). According to Troyer et al. (1997), the semantic subcategories were derived from the actual patterns of words generated by their participants during the task rather than from an a priori organizational scheme. The following were the operational definitions of the variables:
• Number of correct words. This typical quantitative score measured the total number of correct words produced (i.e., animal names), excluding repetitions and errors.
• Cluster size. Cluster size was counted beginning with the second word in each cluster. That is, two words 'cat, dog' had a cluster size of one as they are both pets, three words 'horse, pig, chicken' form a cluster of two as all three instances are farm animals, and so forth. Repetitions were included. Since we examined clustering as a function of time, we had to add a few further rules to our calculations. In cases when clusters straddled interval boundaries, we attributed the score of that cluster to the quarter containing the majority of the words of that cluster. For example, the cluster '42.0 s (45 s) - lion, 44.4 s (45 s) - tiger, 46.8 s (60 s) - leopard' would belong to the 45 s quarter as two out of the three responses were given in that time period. This occurred in a total of 23 cases (CS, 18; PWA, 5). In cases where the cluster contained an equal number of responses on either side of the boundary, e.g., '42.0 s (45 s) - lion, 44.4 s (45 s) - tiger, 46.8 s (60 s) - leopard, 52.8 s (60 s) - panther', the score was attributed to the quarter in which the cluster was initiated, so in the example it would belong to the 45 s quarter. This happened 17 times (CS, 11; PWA, 6).
• Number of switches. Switches were calculated as the number of transitions between clusters. For example, 'dog, cat, gorilla, orang-utan, pig, cow, sheep' contains two switches: before gorilla and before pig. 'Lion, tiger, skunk, badger, parrot, sparrow, crow, horse, cow' has three switches: before skunk, parrot and horse. Repetitions were included.
• Within-cluster pause. This was the mean time (s) between the productions of consecutive words within the same cluster. If the cluster contained two words, the within-cluster pause was calculated as the time of production of the second word minus the time of production of the first word (e.g., '4.00 s - tiger, 5.00 s - lion' has a pause of 1.00 s, '12.00 s - goat, 13.20 s - sheep' has a pause of 1.20 s). If the cluster contained more than two words, the within-cluster pause was calculated by taking the mean of the time differences between consecutive words. For example, in a three-word cluster such as '4.00 s - tiger, 5.00 s - lion, 7.00 s - leopard', the within-cluster pause would be 1.50 s (i.e., '4.00 s - tiger, 5.00 s - lion' has a pause of 1.00 s, '5.00 s - lion, 7.00 s - leopard' has a pause of 2.00 s, therefore the mean within-cluster pause was 1.50 s).
• Between-cluster pause. This was the time (s) between two consecutive words that belong to different clusters, signalling a switch. It was calculated in the same way as the within-cluster pause. For example, '0.60 s - rooster, 2.40 s - elephant' contains a pause of 1.80 s between clusters, while '13.40 s - dog, 24.40 s - cheetah' has a pause of 11.00 s.
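The operational definitions above can be sketched in Python as follows. This is an illustrative sketch, not the scoring procedure actually used: the function names and the `(onset_s, word, subcategory)` layout are assumptions, and each word is assumed to carry a single hand-coded subcategory label. A single word flanked by other subcategories is treated as a cluster of size zero.

```python
import math
from collections import Counter
from itertools import groupby
from statistics import mean

def split_clusters(responses):
    """Group successive responses that share a subcategory into clusters."""
    return [list(run) for _, run in groupby(responses, key=lambda r: r[2])]

def cluster_size(cluster):
    """Size is counted from the second word, so 'cat, dog' scores 1."""
    return len(cluster) - 1

def number_of_switches(clusters):
    """Number of transitions between successive clusters."""
    return max(0, len(clusters) - 1)

def within_cluster_pause(cluster):
    """Mean gap (s) between consecutive words; None for a single word."""
    onsets = [r[0] for r in cluster]
    gaps = [later - earlier for earlier, later in zip(onsets, onsets[1:])]
    return mean(gaps) if gaps else None

def between_cluster_pauses(clusters):
    """Gap (s) from the last word of one cluster to the first of the next."""
    return [nxt[0][0] - prev[-1][0] for prev, nxt in zip(clusters, clusters[1:])]

def quarter(onset_s, width=15.0):
    """Upper bound of the 15-s interval; boundary onsets go to the prior one."""
    return int(max(1, math.ceil(onset_s / width)) * width)

def cluster_quarter(cluster):
    """Quarter holding most of the cluster's words.

    For an even split across two quarters, the earlier quarter is returned,
    which is the quarter in which the cluster was initiated.
    """
    counts = Counter(quarter(r[0]) for r in cluster)
    top = max(counts.values())
    return min(q for q, n in counts.items() if n == top)

# Hypothetical responses with hand-coded subcategory labels.
responses = [
    (4.0, "tiger", "wild"), (5.0, "lion", "wild"), (7.0, "leopard", "wild"),
    (9.5, "dog", "pets"), (10.5, "cat", "pets"),
]
clusters = split_clusters(responses)
sizes = [cluster_size(c) for c in clusters]
switches = number_of_switches(clusters)
within = [within_cluster_pause(c) for c in clusters]
between = between_cluster_pauses(clusters)
```

Labelling each word with one subcategory is a simplification; where a word could plausibly belong to more than one subcategory, the hand-coded label decides where the cluster boundary falls.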

Reliability
All scoring was performed by the second author; the first author performed the reliability checks for at least 40% of the data. The point-by-point interrater agreement was 98.3%; disagreements were resolved by reviewing and discussing the scoring definitions.

Analysis
A mixed analysis of variance (ANOVA) design with repeated measures was used on the dependent variables. In the design, Group (Aphasia, Control) was a between-subject factor, and Time interval (15, 30, 45 and 60 s) was a within-subject factor. Since we performed five ANOVAs, we set our significance level at p ≤ 0.01, instead of p ≤ 0.05. For any significant Group × Time interaction, a one-way within-group ANOVA was performed separately for PWA and CS with Time (15, 30, 45 and 60 s) as the within-group factor to compare the performance over the time intervals. Since we had a large number of PWA (N = 34), as a secondary analysis we compared the performance of individuals with fluent aphasia (PWA-fluent, N = 11) versus non-fluent aphasia (PWA-non-fluent, N = 17) across the five variables to determine whether there were any differences between the subgroups.
To examine the relationship of the retrieval times of the words (i.e., within- and between-cluster pauses) with clustering and switching, correlations were performed separately for PWA and CS. Using Fisher r-to-z transformations, significant correlation coefficients for the two groups were compared to determine whether the difference between the coefficients was significant. Although the groups were matched for age and years of education, we performed correlational analyses between the demographic variables for each group (age, years of education and, for PWA only, time post-onset) and the dependent variables to investigate whether any of the demographic variables were related to the performance. Figure 1(A-E) illustrates the performance of the two groups for the five dependent variables as a function of time (i.e., 15, 30, 45 and 60 s). The results of the repeated-measures ANOVA with Group (PWA, CS) as a between-subject factor, and Time (15, 30, 45 and 60 s) as a within-subject factor are presented in the text. The appendix provides the data for each PWA as well as group means and SDs (PWA-all, PWA-fluent and PWA-non-fluent) across the five variables, along with the group means and SDs for CS. It also provides the results of t-tests comparing the performance of PWA-fluent versus PWA-non-fluent. Table 1 presents the correlation matrix amongst the retrieval times (within- and between-cluster pauses), number of correct words, cluster size and the number of switches.
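The comparison of correlation coefficients from two independent groups via the Fisher r-to-z transformation can be sketched as below. This is the standard textbook form of the test, given under the assumption that it matches the comparison described above; the function name and the example coefficients are hypothetical and are not values from this study.

```python
import math

def compare_independent_correlations(r1, n1, r2, n2):
    """Two-tailed z test for the difference between two independent Pearson r.

    Each r is transformed with Fisher's z = atanh(r); the difference is
    divided by its standard error sqrt(1/(n1 - 3) + 1/(n2 - 3)).
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    # Two-tailed p from the standard normal: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# Hypothetical coefficients for two groups of 34, for illustration only.
z, p = compare_independent_correlations(-0.70, 34, -0.50, 34)
```

With equal group sizes the standard error depends only on n, so a given difference between coefficients is harder to detect in small samples.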

Results
The between-cluster pause showed significant negative correlations with the number of correct words and the number of switches for both CS and PWA (table 1). The strength of these significant correlations was large for CS and moderate for PWA. Testing for a significant difference between the correlation coefficients using the Fisher r-to-z transformation did not reveal any difference, suggesting that both CS and PWA showed similar correlation patterns between between-cluster pause and both the number of correct words and the number of switches. The correlational analysis amongst the demographic and dependent variables for each group did not reveal any significant correlation.

Discussion
This research investigated the quantitative and qualitative differences in performance on a semantic (animal) fluency task as a function of time (i.e., 15, 30, 45 and 60 s) between PWA and CS. To summarize the main findings, compared with CS, PWA retrieved and produced fewer words, had smaller cluster size and switched fewer times. A decrease in the number of switches correlated with an increase in between-cluster pause durations. As expected, for both measures of lexical retrieval times (within- and between-cluster pause durations) PWA showed significantly longer durations than CS, but there was no Group × Time interaction, indicating that both groups showed similar patterns of durational measures, only that PWA were slower than CS. This substantiates the idea of an overall generalized slowing in processing speed. The two groups showed specific differences with respect to the temporal pattern of performance: as time evolved both PWA and CS showed decreased productivity for number of correct words, but PWA reached asymptote earlier in the time course than CS, neither group showed a change in cluster size, and the number of switches decreased as a function of time only for CS.
PWA retrieved and produced fewer correct animal names compared with CS. This corroborates the literature in aphasia, which has indicated a lexical retrieval and production difficulty for PWA in verbal fluency tasks (Adams et al. 1989, Baldo et al. 2010, Kiran et al. 2014, Roberts and Le Dorze 1994, Sarno et al. 2005). The present study examined the temporal pattern in performance by using four time intervals (15, 30, 45 and 60 s), allowing comparison of the groups' retrieval patterns. Results revealed a significant Group × Time interaction. Both groups retrieved and produced a significantly higher number of correct words in the first 15 s, followed by a gradual decrease in the number of words as time evolved, subsequently reaching an asymptote (figure 1A). Importantly, the decrease in productivity and the point of reaching an asymptote were different for the two groups. PWA showed decreased productivity and reached asymptote by 30 s (i.e., second quarter), whereas CS reached asymptote only by 45 s (i.e., third quarter). This corroborates the observation by several researchers that production in earlier time periods involves the retrieval of readily available, frequently used words from a store of possible words, and as time passes, the store is exhausted, and production becomes more difficult and less productive (Crowe 1998, Hurk et al. 2004, Raboutet et al. 2010). This exaggerated rate of decline in PWA was evidenced by reaching the asymptote earlier than CS, possibly indicating impaired search and access mechanisms in PWA.
The correlation analyses revealed a significant negative correlation between number of words and between-cluster pause for both groups. As it takes longer to switch between clusters, the number of correct words produced decreases. The between-cluster pause has been suggested to tap into the executive component of the verbal fluency task. A strong negative relationship could be indicative of the importance of executive control abilities for overall correct production in both typical and atypical populations (Rosen et al. 2005, Raboutet et al. 2010).
Clustering involves accessing and using the word store within semantic subcategories, whereas switching is thought to reflect the ability to shift efficiently between clusters (Troyer 2000, Tröster et al. 1998, Raboutet et al. 2010). Results indicated that PWA showed significantly smaller cluster size than CS, potentially reflecting a decreased word store and/or inefficiency in access of the word store. This once again highlights the lexical retrieval and production difficulty for PWA. Smaller cluster size has been reported in several neurological conditions, including Parkinson's disease.

Results also indicated that the cluster size did not change for either group as a function of time (i.e., lack of Group × Time interaction). Once participants were able to access a semantic subcategory, the number of words within that subcategory (i.e., cluster size) did not decrease over the course of 1 min. This supports the observation of time invariance for the clustering scores (Gruenewald and Lockhead 1980, Raboutet et al. 2010). Moreover, within-cluster pause, which is thought to reflect search time within a subcategory, did not show a significant interaction with Time. Thus once a subcategory is accessed, the retrieval of words within that cluster appears to be automatic, as within-cluster retrieval time did not increase during the time course (Rosen et al. 2005, Raboutet et al. 2010). Switching is thought to reflect the executive component of verbal fluency tasks. Impaired switching has been observed in normal ageing (Troyer 2000) as well as in a number of neurological conditions, including focal frontal lobe lesions (Troyer et al. 1998), early stages of dementia (Raoux et al. 2008), Parkinson's disease with dementia (Tröster et al. 1998), and aphasia (Baldo et al. 2010, Kiran et al. 2014). Results from this study showed main effects of Group and Time, and an interaction between Group and Time. Compared with CS, PWA showed a decreased switching score. This could imply that inefficient executive control processes are in part responsible for the poor performance of PWA on this semantic fluency task. Researchers have suggested that a decreased switching score cannot be convincingly taken as evidence of a switching deficit, as lower switching scores in PWA could be a result of a lower total number of correct words. Several previous studies have shown that switching is a strong predictor of total correct in both typical and atypical populations (e.g., Troyer et al. 1997), but the debate remains whether fewer switches result in fewer correct words or vice versa. 
A better way to understand whether fewer switches truly reflect executive control difficulties is to interpret the switching data in the context of the lexical retrieval times of the words (Mayr 2002, Raoux et al. 2008). Our correlation analysis revealed that switching scores correlated strongly with between-cluster pause but not with within-cluster pause. This finding of increased between-cluster retrieval times along with fewer switches reveals the involvement of an effortful and controlled retrieval process from the word store (e.g., Gruenewald and Lockhead 1980, Rosen et al. 2005). PWA experienced greater difficulty with effective search strategies for subcategories, highlighting the possible difficulties with the executive component of the task. This once again demonstrates that as the search gets more effortful, executive control components are stretched.
Hence, the concomitant observation of lower switching scores and a strong correlation between between-cluster retrieval times and switching provides evidence that PWA indeed had a difficulty in switching that could potentially be mediated by some difficulties in executive control. This adds to the body of research in aphasia which has demonstrated that, in addition to linguistic deficits, PWA show executive control difficulties (Keil and Kaszniak 2002, Murray 2012). Differences in switching have been shown to reflect differences in executive control abilities in populations where it is typically declining (e.g., ageing population, Troyer 2000) or where it is assumed to be superior (e.g., bilingual populations, Luo et al. 2010, Shao et al. 2014). From a clinical perspective, this research highlights the usefulness of the switching measure of a verbal fluency task in aphasia for tapping into executive control abilities. This type of evidence is currently lacking in the aphasia literature.
Taken together, the results suggest that PWA had difficulty in both lexical (fewer words, smaller clusters) and executive (fewer switches) components, as well as overall slowed processing speed (increased retrieval times). It is not surprising that we found difficulties in both components of this semantic fluency task, as these data are from a large group of 34 PWA which included participants with different aphasia types. Only Baldo et al.'s (2011) study had reported cluster size data, comparing one participant with Wernicke's aphasia and one with Broca's aphasia. They found that the individual with Wernicke's aphasia showed reduced cluster size, whilst cluster size in Broca's aphasia was unimpaired. However, our analysis comparing fluent versus non-fluent aphasias did not reveal any significant differences on any of the variables. Future studies with lesion data can inform whether different aphasia types would demonstrate distinctions in the lexical and executive components of verbal fluency tasks.
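To make the switching and pause measures discussed above concrete, the following is a minimal illustrative sketch in Python of how they could be derived from time-stamped responses. It is not the scoring procedure used in this study. It assumes that each response has already been hand-coded with a semantic subcategory label (the labels and the names `Response` and `score_fluency` are hypothetical), that a switch is any transition between consecutive words with different labels, that cluster size is simply the length of a run of same-label words, and that pauses are differences between successive response time stamps. Published scoring schemes differ on each of these points.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Response:
    word: str
    onset_s: float    # time stamp of the response, in seconds from task start
    subcategory: str  # hypothetical hand-coded semantic subcategory label


def _mean(values: List[float]) -> Optional[float]:
    # Return None rather than failing when there is nothing to average.
    return sum(values) / len(values) if values else None


def score_fluency(responses: List[Response]) -> dict:
    """Count switches and average cluster size and pause durations.

    Simplifying assumptions (not taken from the study): a cluster is a run
    of consecutive responses sharing a subcategory label, and a pause is the
    difference between the time stamps of two successive responses.
    """
    switches = 0
    run_lengths: List[int] = []
    within: List[float] = []   # pauses between words of the same cluster
    between: List[float] = []  # pauses spanning a switch to a new cluster
    run = 0
    prev: Optional[Response] = None
    for cur in responses:
        if prev is None:
            run = 1
        elif cur.subcategory == prev.subcategory:
            run += 1
            within.append(cur.onset_s - prev.onset_s)
        else:
            switches += 1
            between.append(cur.onset_s - prev.onset_s)
            run_lengths.append(run)
            run = 1
        prev = cur
    if prev is not None:
        run_lengths.append(run)  # close the final cluster
    return {
        "switches": switches,
        "mean_cluster_size": _mean([float(n) for n in run_lengths]),
        "mean_within_pause": _mean(within),
        "mean_between_pause": _mean(between),
    }


# Invented example data, for illustration only.
example = [
    Response("dog", 1.0, "pets"),
    Response("cat", 2.0, "pets"),
    Response("lion", 5.0, "wild"),
    Response("tiger", 6.0, "wild"),
    Response("zebra", 7.5, "wild"),
    Response("cow", 12.0, "farm"),
]
result = score_fluency(example)
print(result)
```

Under these assumptions, the same function could be applied separately to the responses falling within each 15-s interval to examine the measures as a function of time, and the per-participant switch counts could then be related to the mean between-cluster pauses.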
Although the findings suggest that executive control difficulties in PWA might be responsible for the switching difficulties observed in them, it is unclear which aspect of executive control underlies this behaviour. We urge caution in interpreting 'executive control difficulties', as executive control is a broad term encompassing several cognitive skills, for example, shifting, memory and updating, inhibition and suppressing interference. Specific experimental and clinical measures of different components of executive control (e.g., working memory span, Trail Making Test, Stroop) can be used in future studies to explore the components of executive control that relate to switching behaviours in aphasia. Similar attempts to relate verbal fluency performance to specific components of executive control have been made in healthy participants (e.g., Luo et al. 2010, Shao et al. 2014) as well as in some clinical populations (e.g., Alzheimer's disease: Rosen et al. 2005, psychiatric disorders: Whiteside et al. 2016).
In addition, to better understand the role of executive control in fluency tasks in aphasia, future research should include a comparison of semantic and letter fluency tasks. Semantic and letter fluency tasks place different cognitive demands on word retrieval. Generating words in a semantic fluency task is akin to accessing a lexical item in an interconnected network. It has been suggested that generating words based on semantic categories is an over-learned process of language production that is largely automatic and relies primarily on linguistic representations (Luo et al. 2010). In contrast, the letter fluency task is more effortful, as letter generation is not a common strategy in word retrieval, nor is there an obvious congruency with the organization of words in some representational system (Strauss et al. 2006). The demands for executive control are increased in the letter fluency task (Delis et al. 2001). Components of executive control that have been evident in fluency tasks range from inhibiting inappropriate responses, self-monitoring, shifting, updating and memory to avoiding perseveration. Although many of these processes are involved in semantic fluency tasks, their role is more decisive in letter fluency. Therefore, to address the question of the contribution of executive control to verbal fluency tasks in aphasia, a comparison of semantic and letter fluency will be essential.

Conclusions
Verbal fluency tasks remain a widely used measure in healthy and neuropsychological populations. This study investigated the temporal characteristics of quantitative (number of correct words) and qualitative (clustering, switching and retrieval times) differences in a semantic fluency task between a large group of PWA and healthy CS. The findings suggest that for PWA the search and retrieval process is less productive and more effortful, as indicated by smaller cluster sizes and overall slowed retrieval times for the words. Furthermore, PWA showed fewer switches associated with increased between-cluster pause durations, indicating some difficulties in the executive control processes required to search for and access semantic subcategories. The temporal pattern of performance for the two groups was distinct, with PWA demonstrating more effortful searches and reaching asymptote in performance earlier than CS. These findings suggest that difficulties in verbal fluency performance in aphasia have a strong basis in lexical retrieval and production, as well as some difficulties in executive components of the task. Although this study was the first of its kind to document the time course of performance for quantitative and qualitative measures of verbal fluency, there remain important issues to be investigated by future research (e.g., comparison of semantic versus letter fluency, investigating the components of executive control influencing fluency performance). This in turn can ensure the diagnostic validity of the task for different types of neurological populations. Despite the limitations, we believe the present study provides useful data and motivates several lines of future research which will have both theoretical and clinical implications.