Signatures of adaptation to a monocot host in the plant-parasitic cyst nematode Heterodera sacchari

synthesis using the High-Capacity cDNA Reverse Transcription Kit (Applied Biosystems) and oligo-dT primer. The cDNA was tested for expression changes during the nematode life stages using the Stepone Plus Real-Time PCR System (Applied Biosystems) with cycling parameters of 95° C for 15 s and 60°C for 30 s (40 cycles) for amplification. Each sample well contained 10 μl of Fast SYBR Green qPCR Master Mix (Invitrogen), 9 μl of the gene specific primer mixture with a final concentration 1 μM for each primer and 1 μl of cDNA. The data was analysed using the Stepone Plus Real-Time PCR software to create Ct values and relative expression was calculated following Pfaffl (2001). Elongation Factor 1 alpha (EF1 alpha) was used as internal control for all experiments. Three biological replicates

Introduction have subsequently been identified as cyst nematode effectors including pectate lyase (Popeijus et al., 2000), GHF43 Arabinase (Cotton et al., 2014), GH53 Arabinogalactan endo-1,4 beta galactosidase (Vanholme et al., 2009), expansins (Qin et al., 2004) and proteins encoding carbohydrate binding domains (Hewezi et al., 2008). All of these genes, as well as others encoding chorismate mutase (Jones et al., 2003) and proteins potentially involved in vitamin biosynthesis (Craig et al., 2008), have been acquired by horizontal gene transfer from bacteria (reviewed in Kikuchi et al., 2017). Cyst nematode effectors have also been identified that suppress host defence responses, most notably several members of the SPRYSEC family of effectors (Postma et al., 2012;Mei et al., 2015) and a modified ubiquitin extension protein (Chronis et al., 2013). The details of how cyst nematodes induce the formation of their syncytium in the roots of their hosts are less clear, although effectors that are likely to be important in this process have been characterised. Another novel effector (19C07) has been identified from Heterodera glycines and Heterodera schachtii that interacts with the LAX3 auxin influx transporter (Lee et al., 2011). In addition, all cyst nematodes studied to date produce effectors that include variable numbers of C-terminal repeats encoding peptides similar to CLAVATA3/EMBRYO SURROUNDING REGION-like (CLE) peptides. Functional studies have shown that the nematode peptides can complement mutant Arabidopsis lacking these peptides (Wang et al., 2005). The CLE proteins are modified by plant cell machinery in a manner similar to that of the endogenous proteins and the CLE peptides themselves subsequently interact with the CLAVATA2 receptor protein, which is required for nematode parasitism (Replogle et al., 2011;Replogle et al., 2013). In plants, CLE peptides regulate cell differentiation and thus contribute to the control of meristem maintenance in shoots, roots and vascular tissues. The ability to produce endogenous CLE peptides is likely to be a key factor in the ability to induce a feeding structure in plants.
Many plant species are parasitised by cyst nematodes, including monocots and dicots. However, each individual species of cyst nematode tends to have relatively narrow host range, with some exceptions.
Although little is known about the molecular determinants of host range in plant parasitic nematodes, effectors have been shown to have a central role in this process in other pathogens (reviewed by Stam et al., 2014). A view has emerged "non-host resistance" of closely related species is most likely due to recognition of effectors by a resistance gene, while failure to infect a more distantly related species is most likely due to the incompatibility of effectors with their cognate targets (Schulze-Lefert & Panstruga, 2011). For example, comparisons of the genomes of Phytophthora infestans and a closely related species Phytophthora mirabilis, which infects Mirabilis jalapa revealed that 82 of 345 genes which showed signs of positive selection could encode effector sequences (Raffaele et al., 2010).
Subsequent work on orthologous effectors that encode protease inhibitor from the two species showed that the protease inhibitor effectors from each of these species interact specifically with protease targets from their respective host plants (Dong et al., 2014). Host-pathogen co-evolution is therefore reflected in adaptations of effectors for function in the host.
Heterodera sacchari is an increasingly economically important pathogen of several monocot species including rice and sugarcane. Crop losses due to H. sacchari can exceed 40%, with more severe damage reported under rain fed upland conditions (Kyndt et al., 2014). Compared to other characterised cyst nematodes, the biology of H. sacchari is unusual in that it is restricted to monocots and reproduces mainly by mitotic parthenogenesis (CABI, 2014). Here we sequenced and assembled the transcriptome of H. sacchari and use these data to reconstruct the evolutionary history of this and related species.
These data suggest that H. sacchari and a related monocot parasite H. avenae evolved to parasitise monocot plants secondarily, from a dicot-parasitic ancestor. To explore the genetic changes associated with the evolutionary transition from dicot to monocot parasite, we identified homologues of previously identified effectors in H. sacchari. We show that in general the effectors have diversified in sequence more than non-effectors when compared to their most similar homologue in a dicotparasite, and that while some effectors show conserved expression profiles and likely function, specific aspects of the effector repertoire of H. sacchari appear to be adapted to monocots.

Results and Discussion
The transcriptome of H. sacchari A total of 17,086,132 paired end reads were obtained from the cDNA extracted from second stage juveniles (J2) and parasitic stage nematodes 15 days after infection. These sequence reads have been submitted to SRA (accession number PRJEB28025). The sequence reads were pooled and assembled into a single reference transcriptome of 44,230 transcripts after filtering with Transrate and removing contaminants. Assemblies are available at https://zenodo.org/deposit/1324265. CEGMA analysis showed that 88% of the core eukaryotic genes were present as full-length transcripts, with a further 6% represented by partial length transcripts. BUSCO analysis using the metazoan data set suggests that the assembly contains 76% complete BUSCO sequences and a further 6.0% fragmented sequences. Seventeen percent of the BUSCO genes were not identified. Comparisons with published gene models from cyst nematode genome sequences, showed that 34,203 (77.3%) and 30,482 (68.9%) of the H. sacchari transcripts matched sequences in G. rostochiensis and G. pallida respectively (evalue 1e -10 ). Taken together, these data suggest that the assembly produced represents a large proportion of H. sacchari transcriptome of these life stages.
H. sacchari evolved to parasitise monocots from a dicot-parasitic ancestor. We used a subset of 96 core eukaryotic genes conserved in H. sacchari and 18 related species to reconstruct a multi-gene phylogeny (Figure 1). This phylogeny robustly positions H. sacchari and the related cereal cyst nematode H. avenae in a mono-phyletic sub-clade of monocot parasites, nested within a clade of related dicot-parasites of the genera Heterodera and Globodera. The most parsimonious explanation for this is that 1) H. sacchari and H. avenae share a common, monocotparasitic, ancestor, and 2) this last common monocot-parasitic ancestor evolved to parasitise a monocot host secondarily, from a dicot-parasitic ancestor. This provides the comparative framework to explore the genes conserved, and the genes diverged, during adaptation to monocot parasitism by nematodes.

Genes encoding cell wall modifying enzymes acquired via horizontal gene transfer in the H. sacchari transcriptome
The plant cell wall is the first significant barrier to an invading pathogen, and while largely similar between dicots and monocot, there are noticeable differences in composition. Plant-parasitic nematodes in general are well equipped with proteins that allow them to modify and degrade specific components of the plant cell wall. Many of these genes were acquired via horizontal gene transfer from bacteria (Danchin et al., 2010).
The transcriptome of H. sacchari contains representatives of most previously described cases of horizontal gene transfer in related plant-parasitic nematodes, including a wide range of cell wall degrading enzymes (Table 1) and several other sequences putatively acquired by horizontal gene transfer (e.g. the GH32 invertases and chorismate mutase (Danchin et al., 2016;Jones et al., 2003)).
Analysis of the expression profiles of one of the H. sacchari GHF5 cellulases and the chorismate mutase showed that, as in other cyst nematodes, expression was restricted to the subventral pharyngeal gland cells in J2s and while the cellulase was upregulated at J2 the chorismate mutase was expressed throughout the life cycle ( Figure 2). These sequences may therefore play a similar role in the biology of H. sacchari and other cyst nematodes.
Notably however, both the H. sacchari and H. avenae transcriptomes lack sequences similar to GH53 Arabinogalactan endo-1,4 beta-galactosidase, in spite of the fact that all other cyst nematodes analysed to date, which parasitise dicots, have such proteins. While the absence of evidence in transcriptome datasets is not necessarily evidence of absence in the genome, it nevertheless reflects the host range of these species. Cell walls of commenlinoid monocots (which include the main hosts of H. sacchari and H. avenae) have a different composition compared to those of dicots and contain relatively low amounts of pectic polysaccharides, including the substrate of the GH53 enzymes (Vogel, 2008). Without genomic resources we cannot conclude whether the conspicuous absence in the transcriptomes of the monocot parasites is because these genes have been lost entirely or that they are not expressed under these conditions.

An overview of H. sacchari effector-like sequences
Effectors modulate plant process to promote disease, and are often finely tuned to their host. To determine whether the effector repertoire of H. sacchari reflects its secondary adaptation to a monocot host, we first identified and characterised effectors in the H. sacchari transcriptome by building on a detailed genome wide analysis of cyst nematode effectors performed for G. rostochiensis (Eves-van den Akker et al., 2016a). Using G. rostochiensis effectors as a starting point for comparative purposes, 185 of the 295 identified a similar sequence in the H. sacchari transcriptome (at e value <1e -10 ). We compared the similarity of H. sacchari effector-like sequences and non-effector-like sequences to their most similar homologue in the G. rostochiensis genome. This showed that, on average, putative effectors of H. sacchari are more different to their closest homologue in G. rostochiensis than non-effectors are to their corresponding closest homologue ( Figure 3). Taken together, these data suggest that while most effectors are apparently conserved, they have nevertheless accumulated more mutations that than non-effectors since these species diverged. Further analysis of the differences between the variation in effectors and non-effectors was undertaken. 21% of the mutations were classed as nonsynonymous for the putative effectors, versus 22% for other transcripts, however, non-synonymous SNPs in the putative effectors were more often predicted to have a significant effect on the coding sequence compared to other transcripts: 0.93% vs 0.82% were classed as "high impact" for putative effectors and other transcripts respectively. All of the following high impact categories were more common in the putative effectors compared to other transcripts: gain of premature start codon, in frame deletions, frameshift variants, start loss, stop gain. Moreover, the rate of variants (i.e. length of transcript/ number of variants) was significantly higher for the putative effectors versus non-effectors (mean=1754 vs mean=1243 for putative effectors vs other transcripts respectively, Mann Whitney U test: 1.98e-41).
Consistent with this observation, analysis of effectors from a range of pathogens, including PPN, has shown that they are under strong diversifying selection pressure. For example, whole genome resequencing of five pathotypes of G. rostochiensis showed that effectors contain significantly more variants and more nonsynonymous variants per gene than do randomly selected non-effector genes (Eves-van den Akker et al., 2016a). On a more detailed scale, the SPRYSEC effector (RBP1) from G. pallida, which is recognised by the Gpa2 resistance gene of potato, has been subjected to positive selection at several different residues, including the residue that determines recognition or evasion by Gpa2 (Sacco et al., 2009).
The SPRY domain family is greatly expanded in cyst nematodes with 295 sequences predicted in G.  Figure 4) and this contrasts with the G. pallida SPRYSECs which tend to be upregulated in J2s (Mei et al., 2015). Several SPRYSECs have been shown to suppress host defence signalling (Postma et al., 2012;Mei et al., 2015;Mei et al., 2018). What role the H.
sacchari SPRYSECs play in infection is yet to be determined. pallida shows that these sequences have both highly conserved regions that flank the variable domain but they are the first to encode a major structural variant outside the hypervariable domain, and the region corresponding to the "hypervariable domain" contains a novel sequence that has a very limited repeat structure (sequence features summarised in Figure 5).

Specialisation of H. sacchari CLE-like effectors to parasitism of a monocot host
One class of H. sacchari effectors shows signs of adaptation to a monocot host. The CLAVATA3/EMBRYO SURROUNDING REGION-like (CLE) effectors mimic plant peptide hormones and have been characterised in a number of plant-parasitic nematodes. The transcriptome of H. sacchari contained several partial transcripts that encode proteins with similarity to CLE-like effectors from other plant-parasitic nematodes. Using a novel approach based on identifying reads that map to partial transcripts and carrying out a local assembly, we were able to computationally assemble six unique transcripts to recapitulate the full length open reading frame, resulting in 5 unique polypeptide sequences from methionine to stop. As for other cyst nematodes, the H. sacchari sequences each encoded a signal peptide at the N-terminus followed by an N-terminal domain ( Figure S2), which in other cyst nematodes enables translocation into the apoplast after the protein is secreted into the plant cell (Wang et al., 2010a). All six of these H. sacchari transcripts encode a single canonical CLE domain at their C terminus. Another transcript (DN37996_c0_g2_i1) encodes a tandemly repeated motif with no clear homology to canonical CLE domains (despite the similarity of the rest of the protein sequence to CLE-effectors of other cyst nematodes). The six H. sacchari CLE-effector-like sequences can be divided into two groups based at least in part on their signal peptide and CLE domains. The CLE domains within each group are identical in protein and nucleic acid sequence ( Figure 6A).
Given that CLE peptides vary between plant species in general, and monocots and dicots in particular, we hypothesised that the CLE domains of CLE-like effectors in H. sacchari may have specialised prior to/concurrent with the transition from dicot-to monocot-parasite. To test this hypothesis, we analysed a database of 391 CLE peptides from 20 plant species collated from Zhang et al. (2014) and Oelkers et al. (2008). We created an all by all matrix of similarity between plant CLE peptides and H. sacchari CLE peptides based on a normalized BLOSUM62 score. We then used this matrix to generate a CLE similarity network ( Figure 6B), highlighting the host (rice Oryza sativa) and the nematode (H. suggesting selective mimicry of a subset of this family. It is known that CLE family members in dicots have diverse roles (e.g. Mitchum et al., 2008) and mimicry of a subset of rice CLEs by H. sacchari may reflect the need to target the function of a subset of the full rice CLE complement.
The functional significance of the similarity between the H. sacchari CLE peptides and those from rice was experimentally investigated by taking advantage of the short root phenotype observed from overexpression or exogenous application of CLEs (e.g. Fiers et al., 2004 andChen et al., 2015). We synthesised a synthetic version of the H. sacchari CLE peptide that had the highest connectivity with rice CLEs in the network (sequence identical to the rice CLEs in the 13 terminal amino acids). In order to confirm that this is indeed a true H. sacchari sequence we cloned the gene encoding this peptide from H. sacchari gDNA ( Figure S3). We analysed the effect of the peptide when applied exogenously to rice seedlings on root growth when compared to a randomised version of this CLE, or a CLE from the dicot-parasitic H. glycines (Wang et al., 2005). This analysis showed that the peptide from H. sacchari induced a short root phenotype in rice whereas peptides from H. glycines and a shuffled peptide used as a control had no effect (n=24 per condition, p <0.001, Tukey's HSD, Figure 7).

Conclusions
We use whole transcriptome sequencing to show that H. sacchari and the related H. avenae evolved to parasitise a monocot host from a last common dicot-parasitic ancestor. We mine these data to identify and characterise the cell wall degrading enzyme and effector complement of H. sacchari.
Finally, we show that while H. sacchari has a similar effector arsenal to the related cyst nematodes that parasitise dicots, the CLE effectors provide a paradigm of functional adaptation for parasitism of a monocot host. These data build a foundation on which to explore novel effectors involved in monocot-parasitism.

Biological material
Heterodera sacchari was cultured on rice cv Nipponbare as described in Pokhare et al. (2019). Briefly, plants were grown in a potting mixture of sand, field soil and organic matter (70:29:1) and were infected with second stage juveniles. After 12 weeks, watering of the plants was stopped and the plants were allowed to dry for 2 weeks. Cysts were collected by Cobb's decanting and sieving method using standard protocols (Cobb, 1918). The cysts were surface sterilised and placed in 3 mM ZnCl 2 to initiate hatching. The resulting J2s were collected every 5 days and either frozen in liquid nitrogen or Although the original population was derived from a field population of cysts, this nematode reproduces by mitotic parthenogenesis meaning that genetic diversity within the population is likely to be low, facilitating much of the bioinformatic analysis.
Transcriptome sequencing RNA was extracted from second stage juvenile and parasitic stage female nematodes using a Nucleospin RNA XS kit (Macherey Nagel) following the manufacturer's instructions. The quantity and integrity of RNA were assessed using a Bioanalyser. Library preparation for RNAseq was performed using the TruSeq RNA Library Prep Kit v2 (Illumina) as recommended by the manufacturer (Illumina; Protocol # 15026495, revision F). Separate libraries were generated from the juvenile (1 g) and female (500 ng) total RNA samples, using single-end TruSeq Index Adapters AR002 (CGATGT) and AR004 (TGACCA), respectively. Each library was quality checked using a Bioanalyzer 2100 (Agilent) and quantified using a Qubit fluorometer (Thermo Fisher). Equal molarities of each library were combined and run at 12 pM on a MiSeq (Illumina) generating paired-end 2x 250 bp reads, as recommended. A fastq file was generated for each sample using MiSeq Control Software (version 2.6) for downstream QC and analysis. and that had a BLAST hit identity greater than 70% to a non-metazoan was flagged as putative contamination. Putative contaminant sequences were removed, and the corresponding transcripts were removed from the assembly. BUSCO version 1.1b (Simão et al., 2015) and CEGMA version 2.4 (Parra et al., 2007) were used to quantify the completeness of the assembly. The resulting coding sequences were annotated using Trinotate (Grabherr et al., 2011) elegans sequencing consortium, 2018), C. briggsae (Hillier et al., 2007) and L. elongatus (Danchin et al., 2017). The protein sequences of CEGMA genes were aligned and refined individually using MUSCLE (Edgar, 2004). Individual alignments were concatenated and submitted to the IQtree online server with associated partition file. Model selection was carried out on each partition and a concatenated multi-gene phylogeny generated using the ultra-fast mode and 1000 bootstraps (Trifinopoulos et al., 2016).

CLE effector identification and network analysis
Several partial transcripts encoding proteins with similarity to CLE-effectors from various plantparasitic nematodes were identified in the transcriptome assembly. Partial transcripts were computationally extended using an iterative approach of mapping and overlap assembly using the wrapper script provided with MITObim (Hahn et al., 2013). Only the deduced amino acid sequences of full length CLEs were used for further analyses. The gene encoding the CLE peptide used for functional analyses was cloned and sequenced in order to ensure the validity of the computational approach. To compare H. sacchari CLE domains to those of plants a database of CLE peptides was collated from Zhang et al. (2014), and Oelkers et al. (2008). All CLE domains were combined into a fasta file and filtered for those with missing information. Custom python script 1 was used to construct an all vs all similarity matrix based on the BLOSUM62 scores between amino acids. Custom python script 2 was used to parse the matrix and normalise self-normalise similarity scores. Custom python script 3 was used to convert the matrix to gefx, which was loaded into Gephi to visualise the network (Bastian et al., 2009). Custom python scripts are available under github repository https://github.com/sebastianevda/.

Cloning and characterisation of candidate effector sequences
The complete ORFs of selected genes were amplified from cDNA of pre-parasitic second stage juveniles, or complete genes from gDNA extracted from cysts. PCR products were purified using the Qiagen PCR Purification kit and cloned into the pGEMT Easy or pCR8/TOPO/GW vectors following the manufactures guidelines.

Analysis of expression profiles of candidate effectors
The spatial expression patterns of candidate effectors from H. sacchari were investigated using in situ hybridisation of digoxigenin labelled probes to juvenile nematodes as previously described (Thorpe et al., 2014). Negative controls for in situ hybridisations were carried out using sense probes and gave no signal ( Figure S1) 150 parasitic stage nematodes were collected from rice grown in pluronic gel as described above.
Nematodes were stored at -80° C and used subsequently for RNA extraction. Total RNA was extracted as above and were processed using Agilent 2100

Functional analysis of H. sacchari CLE sequences
To analyse the in vivo function of CLE peptides, we developed a protocol for exogenous application to rice seedlings, similar to that described for A. thaliana (Wang et al., 2010).

Data availability statement
All sequence data, including assemblies, raw sequence reads and scripts used for processing, are publicly available as described in the materials and methods section. All other materials described in this MS are available upon request. Eves-van den Akker, S., Laetsch, D.R., Thorpe, P. et al. (2016a). The genome of the yellow potato cyst nematode, Globodera rostochiensis, reveals insights into the bases of parasitism and virulence.            Table S1: Primers used in this study. Figure 2: Analysis of spatio-temporal expression patterns of cellulase (left panels) and chorismate mutase (right panels) identified in the H. sacchari transcriptome. In situ hybridisation analysis showed that both genes are expressed in the subventral gland cells of second stage juveniles (J2s) of H. sacchari (purple staining; upper panels). Analysis using qPCR shows that the cellulase is expressed specifically at the J2 life stage while the chorismate mutase is expressed at all life stages tested (lower panels). Error bars represent standard errors across the biological replicates.      249x515mm (300 x 300 DPI)