Incidence of RNA viruses infecting taro and tannia in East Africa and molecular characterisation of dasheen mosaic virus isolates

Abstract Taro (Colocasia esculenta) and tannia (Xanthosoma sp.) plants growing in 25 districts across Ethiopia, Kenya, Tanzania and Uganda were surveyed for four RNA viruses. Leaf samples from 392 plants were tested for cucumber mosaic virus (CMV), dasheen mosaic virus (DsMV), taro vein chlorosis virus (TaVCV) and Colocasia bobone disease‐associated virus (CBDaV) by RT‐PCR. No samples tested positive for TaVCV or CBDaV, while CMV was only detected in three tannia samples with mosaic symptoms from Uganda. DsMV was detected in 40 samples, including 36 out of 171 from Ethiopia, one out of 94 from Uganda and three out of 41 from Tanzania, while none of the 86 samples from Kenya tested positive for any of the four viruses. The complete genomes of nine DsMV isolates from East Africa were cloned and sequenced. Phylogenetic analyses based on the amino acid sequence of the DsMV CP‐coding region revealed two distinct clades. Isolates from Ethiopia were distributed in both clades, while samples from Uganda and Tanzania belong to different clades. Seven possible recombination events were identified from the analysis carried out on the available 15 full‐length DsMV isolates. Nucleotide substitution ratio analysis revealed that all the DsMV genes are under strong negative selection pressure.

DsMV is transmitted in a nonpersistent manner by several aphid species and can also be transmitted by vegetative propagation or sap inoculation (Elliott et al., 1997; Nelson, 2008). The virus has a worldwide distribution and infects both edible and ornamental members of the Araceae family (Elliott et al., 1997). Infection typically results in characteristic feathery mottle and mosaic symptoms on the leaves, but symptoms may vary considerably between cultivars and with the season (Alconero & Zettler, 1971; Elliott et al., 1997). DsMV infection is reported to affect both the quality and quantity of the edible corms, with production losses of more than 25% (Reyes, Rönnberg-Wästljung, & Nyman, 2006; Valverde et al., 1997).
Colocasia bobone disease-associated virus (CBDaV) is a member of the family Rhabdoviridae based on sequence analysis and the presence of characteristic, enveloped, bullet-shaped particles of approximately 300 × 50 nm in infected plants (Higgins et al., 2016; Pearson et al., 1999). CBDaV has only been reported from Papua New Guinea (PNG) and the Solomon Islands, where it has been associated with the severe diseases bobone and alomae (Gollifer, Jackson, Dabek, Plumb, & May, 1977; Revill, Jackson, et al., 2005). Bobone disease is thought to be caused by CBDaV alone and is characterised by stunting and gall formation on the pseudostem (Gollifer et al., 1977; Higgins et al., 2016; Pearson et al., 1999; Revill, Jackson, et al., 2005), whereas alomae is a lethal disease usually caused by the dual infection of taro with CBDaV and TaBV (Revill, Jackson, et al., 2005).
Several other viruses have also been reported from aroids worldwide. Taro reovirus (TaRV), a putative member of the genus Oryzavirus in the family Reoviridae, has been partially characterised based on near full-length sequences of four different genomic segments of an isolate from PNG (Revill, Jackson, et al., 2005;Revill, Trinh, et al., 2005). However, no symptoms have been associated with TaRV infection and the virus has only been detected in symptomless taro plants and plants infected with other viruses (Revill, Jackson, et al., 2005) (Dong et al., 2008;Manikonda et al., 2011;Sivaprasad, Reddy, Kumar, Reddy, & Gopal, 2011;Wang, Wang, Wang, & Hong, 2014). Of the known viruses reported to infect aroids, DsMV and TaBV are the most widespread (Gollifer et al., 1977;Revill, Jackson, et al., 2005).
We have previously reported the incidence, distribution and molecular characterisation of TaBV and TaBCHV infecting taro and tannia in East Africa, but there is no information on the incidence, distribution and diversity of RNA viruses. In this article, we report the results of surveys

| Sequence analysis
Sanger-derived sequences were trimmed to remove primer-binding sites and analysed using CLC Main Workbench v6.9.2 (Qiagen, USA) and Geneious v11.0.2 (Biomatters, New Zealand). For RNAseq data, adapter sequences were removed using the fastx_clipper and reads were further trimmed to attain optimum quality using the

| Evolutionary analysis
The evolutionary relationship of East African DsMV isolates with previously reported isolates from the NCBI database was determined through phylogenetic, recombination, selection pressure and pairwise sequence comparison (PASC) analyses. The conserved core coat protein (CP)-coding region, excluding the heterogeneous N-terminal sequences, was aligned and analysed using the ClustalW multiple alignment application in the BioEdit sequence alignment editor version 7.1.9 (http://www.mbio.ncsu.edu/BioEdit/bioedit.html). Signals identified by RDP4 as potentially being the result of evolutionary processes other than recombination were disregarded. Selective pressure within each distinct protein-coding region was determined using the average ratio of non-synonymous to synonymous substitutions (dN/dS) in MEGA7.
Pairwise Sequence Comparison (PASC) analysis for DsMV isolates from East Africa together with previously characterised DsMV isolates was carried out on aligned core CP amino acid sequences using Geneious v11.0.2 (Biomatters). In addition, PASC was carried out for each individual protein-coding region using the nucleotide sequences of the 15 available full-length DsMV isolates. Pairwise distances of all available full-length DsMV nucleotide sequences were also assessed using Sequence Distances in the SSE platform (Simmonds, 2012).
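The pairwise comparison described above can be illustrated with a minimal Python sketch. It is not the Geneious or SSE implementation: it assumes the sequences are already aligned to equal length, ignores any column in which either sequence has a gap (one of several possible conventions), and uses made-up toy sequences with hypothetical isolate names rather than real DsMV data.

```python
from itertools import combinations


def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are excluded from the comparison
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


def pairwise_identities(aligned: dict) -> dict:
    """Percent identity for every unordered pair of named, aligned sequences."""
    return {
        (x, y): percent_identity(aligned[x], aligned[y])
        for x, y in combinations(sorted(aligned), 2)
    }


# Hypothetical toy alignment, for illustration only (not real CP sequences).
toy_alignment = {
    "iso_A": "MKTAYIAK",
    "iso_B": "MKTAYIAR",
    "iso_C": "MK-AYLAR",
}

for pair, identity in pairwise_identities(toy_alignment).items():
    print(pair, round(identity, 1))
```

Applied to each protein-coding region in turn, the minimum and maximum of such a table give similarity ranges of the kind reported for the PASC analysis; the values produced by dedicated tools may differ depending on how gaps and ambiguous residues are treated.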

| Sample collection and symptoms
Four surveys were conducted covering a total of 25 taro and tannia growing regions of Ethiopia, Kenya, Tanzania and Uganda (Figure 1;

| Evolutionary analysis
Phylogenetic analysis was carried out using the amino acid sequences [...]
Four of the seven recombination events were identified in East African sequences (events 1, 2, 5 and 6), including two events in Et5, one event in Et41 and one event in Tz34. Of the remaining three events (3, 4 and 7), one was identified in an isolate from the USA (KY242358), while the other two were in isolates from China (NC003537 and JX083210). Three of the four events identified in East African sequences involved only East African isolates as putative parents (events 1, 5 and 6). The putative recombination event in isolate Tz34 (event 2) included an isolate from the USA as the major putative parent. All three recombination events in isolates from either China or the USA have an African isolate as one (events 3 and 7) or both (event 4) of the putative parents, while event 7 includes a putative minor parent from India (KT026108) (Table S3).
Analysis of the dN/dS substitution ratios for the individual coding regions of the 15 full-length DsMV sequences revealed moderate to high negative (purifying) selection pressure (dN/dS < 1) for all coding regions (Figure S1A). The ratios range from 0.04 in the 6K1 region, indicating very high purifying selection pressure, to 0.54 in the P3 gene, indicating relatively moderate purifying selection pressure (Figure S1A). The HC-Pro, 6K1, CI, 6K2, VPg, NIa and NIb genes are under high to very high negative selection pressure, whereas the P1, P3 and CP genes are under relatively moderate negative selection pressure.
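As a reading aid for these ratios, the following minimal Python sketch applies the conventional interpretation of dN/dS (below 1: purifying selection; about 1: neutral; above 1: positive selection). It is not part of the MEGA7 analysis. The function name and the tolerance around 1 are illustrative assumptions, and only the two ratios quoted in the text (6K1 and P3) are used.

```python
def selection_regime(dn_ds: float, tol: float = 1e-9) -> str:
    """Classify a dN/dS ratio using the conventional threshold of 1.

    `tol` is an illustrative tolerance for treating a ratio as equal to 1;
    it is an assumption of this sketch, not a value from the study.
    """
    if dn_ds < 0:
        raise ValueError("dN/dS cannot be negative")
    if dn_ds < 1 - tol:
        return "purifying (negative) selection"
    if dn_ds > 1 + tol:
        return "positive (diversifying) selection"
    return "neutral evolution"


# Only the two ratios stated in the text; the other genes are not quoted there.
reported_ratios = {"6K1": 0.04, "P3": 0.54}

# A lower ratio means fewer non-synonymous changes are tolerated,
# that is, stronger purifying selection.
for gene, ratio in sorted(reported_ratios.items(), key=lambda item: item[1]):
    print(f"{gene}: dN/dS = {ratio:.2f} -> {selection_regime(ratio)}")
```

Both reported values fall below 1, so both regions are classified as being under purifying selection, with 6K1 the more strongly constrained of the two.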
PASC analysis based on the CP amino acid sequences of DsMV isolates from East Africa together with previously reported isolates from the NCBI database revealed similarity ranging between 88.3% and 100% (Table S4). Subsequently, PASC analysis of the individual protein-coding regions of the 15 available full-length DsMV isolates was carried out at both the nucleotide and amino acid levels.
When only East African DsMV isolates were considered, amino acid similarities ranging from 90.4 to 100% were observed within the 6K1, CI, VPg and CP regions (Table S4). In contrast, amino acid similarities in the P1, P3 and 6K2 regions for East African isolates varied between 72 and 100%. Comparison of East African DsMV sequences with previously reported isolates from NCBI showed the highest similarity (>90%) within the 6K1 and CI regions, while the P1 and P3 regions showed the lowest amino acid similarities (Table S4). PASC analysis based on nucleotide sequences revealed that the P1 region has the lowest level of identity (61.4%), while the remaining nine gene products showed more than 70% nucleotide sequence identity, either among the DsMV isolates from East Africa sequenced in this study or together with previously characterised sequences (Wang et al., 2017). Further comparison of DsMV full-length nucleotide sequences using the SSE platform also identified the highest mean pairwise distances (both within and between geographic groups) within the P1 region, ranging from 0.25 to 0.33, followed by the P3 region and the NIb/CP junction (Figure S1B), consistent with the PASC analysis described above.

| DISCUSSION
Of the 392 samples collected from 25 regions in the four countries, a total of 91 (68 taro and 23 tannia) samples showed virus-like symptoms similar to those associated with virus infection in a range of aroids (Elliott et al., 1997; Revill, Jackson, et al., 2005; Zettler et al., 1970) [...] with DsMV infection (Figure 2a-k) (Nelson, 2008). Of these 45 symptomatic samples, 26 were confirmed to be infected with DsMV, including 23 samples from Ethiopia, one from Uganda and two from Tanzania. In addition, 13 asymptomatic plants from Ethiopia together with an asymptomatic sample from Tanzania (Tz24) also tested positive for DsMV. Such symptomless infection is consistent with previous studies and may occur as a consequence of seasonal effects or differences in symptom expression in different host plant species (Elliott et al., 1997; Nelson, 2008). [...] (Table S2). Interestingly, de novo assembly, followed by BLAST analysis, failed to identify any viruses in the selected samples. This result indicates that, in some cases, the observed symptoms are probably not associated with viral infection.
In work associated with the current study, the same leaf samples were tested for badnaviruses using PCR and rolling circle amplification, and full-length sequences were characterised. Interestingly, there were no mixed infections between the badnavirus TaBCHV and DsMV. Although partial NIb-coding region sequences of DsMV isolates from Ethiopia are available (Kidanemariam, Macharia, et al., 2018), no complete genomic sequences of East African DsMV isolates have been reported before. Therefore, the complete genome sequences of nine East African isolates were determined and analyses were carried out to determine the evolutionary relationship of these and previously reported DsMV isolates. The genome organisation of the nine DsMV isolates was consistent with that of previously characterised DsMV isolates. Phylogenetic analysis carried out using the core CP-coding amino acid sequences was also consistent with previous reports, with DsMV isolates grouping into two distinct clades (Babu & Hegde, 2014; Wang et al., 2017). The distribution of Ethiopian DsMV isolates across the two clades, each of which contains isolates from different geographic locations including China, Nicaragua, Taiwan, India, the USA and Japan, suggests that the virus has most likely been introduced from different sources on multiple occasions. Similarly, the isolates sequenced in the present study from Uganda (clade I) and Tanzania (clade II) are members of distinct subgroups and most probably have distinct origins. The phylogenetic analysis also revealed no relationship between clade membership and geographic origin or host plant among the DsMV isolates included in this study, which is also consistent with previous work (Wang et al., 2017).
Recombination contributes to the evolution of potyviruses and facilitates adaptation to new hosts or environments (Gagarinova, Babu, Strömvik, & Wang, 2008; Revers, Le Gall, Candresse, Le Romancer, & Dunez, 1996). Analysis of the nine full-length genome sequences reported in the present study, together with the six previously published complete genome sequences of DsMV, revealed a high rate of recombination between virus isolates. This finding is consistent with the previous study by Wang et al. (2017), in which three recombination events were detected among the six full-length DsMV isolates available at that time. The analysis also revealed that some recombinant isolates are themselves major or minor parents in other recombination events, showing that a high degree of recombination has occurred in DsMV. For instance, isolate Et5 was identified as a recombinant (events 1 and 5) and as a parent in other recombination events (events 2 and 6) (Table S3).
[...] Development Agency (SIDA). We are also thankful to all the farmers for allowing us to inspect their fields and collect samples. D.B.K. is the recipient of an Australia Awards Scholarship.