Identification and functional characterization of effectors in expressed sequence tags from various life cycle stages of the potato cyst nematode Globodera pallida


  • Note: EST sequences reported for the first time in this article are available in dbEST.

  • Accession numbers of genes used in this study: FJ810122 (13G11), FJ810123 (22E10), FJ810124 (24D4), FJ810125 (33H17), ACJ14490 (RBP1).



In this article, we describe the analysis of over 9000 expressed sequence tags (ESTs) from cDNA libraries obtained from various life cycle stages of Globodera pallida. We have identified over 50 G. pallida effectors from this dataset using bioinformatics analysis, by screening clones in order to identify secreted proteins up-regulated after the onset of parasitism and using in situ hybridization to confirm the expression in pharyngeal gland cells. A substantial gene family encoding G. pallida SPRYSEC proteins has been identified. The expression of these genes is restricted to the dorsal pharyngeal gland cell. Different members of the SPRYSEC family of proteins from G. pallida show different subcellular localization patterns in plants, with some localized to the cytoplasm and others to the nucleus and nucleolus. Differences in subcellular localization may reflect diverse functional roles for each individual protein or, more likely, variety in the compartmentalization of plant proteins targeted by the nematode. Our data are therefore consistent with the suggestion that the SPRYSEC proteins suppress host defences, as suggested previously, and that they achieve this through interaction with a range of host targets.


The analysis of expressed sequence tags (ESTs) by single-pass sequencing of cloned cDNA fragments provides information relating to the biology of any organism for which a cDNA library is available. The use of stage- or tissue-specific libraries allows more targeted analysis of ESTs and, when coupled with bioinformatics tools, can be a rapid and cost-effective route for the discovery of novel genes (for example, Parkinson et al., 2004a). In addition, large-scale EST analysis can be used as a means of assessing gene expression in specific biological life stages or tissues and to help in the annotation of genome sequences through the identification of expressed genes and subsequent training of gene prediction software.

EST analysis has been applied widely to the study of the biology of nematodes. Well over 500 000 ESTs from a range of nematodes, including free-living (mainly Caenorhabditis elegans), animal-parasitic and plant-parasitic species, are present in dbEST (reviewed in Parkinson et al., 2004b). ESTs have also been used extensively in plant nematology with substantial EST datasets reported from the soybean cyst nematode Heterodera glycines (Elling et al., 2007), root knot nematodes Meloidogyne spp. (included in Dautova et al., 2001; Mitreva et al., 2005; Roze et al., 2008), Bursaphelenchus xylophilus and B. mucronatus (Kikuchi et al., 2007), Xiphinema index (Furlanetto et al., 2005), Radopholus similis (Jacob et al., 2008), Pratylenchus penetrans (Mitreva et al., 2004) and, to a lesser extent, the potato cyst nematodes Globodera rostochiensis and G. pallida (Popeijus et al., 2000a). Other EST datasets that have not been described in papers also exist; at the time of writing, in excess of 150 000 ESTs derived from plant-parasitic nematodes have been deposited in dbEST.

ESTs from plant-parasitic nematodes have been used extensively for the identification of genes important in host–parasite interactions. In sedentary endoparasitic nematodes, such effectors are usually expressed in the subventral or dorsal pharyngeal gland cells and the protein products are secreted into the host via the stylet (reviewed by Gheysen and Jones, 2006). Effectors, including pectate lyase (Popeijus et al., 2000b), expansin (Qin et al., 2004), chorismate mutase (Jones et al., 2003; Lambert et al., 1999) and polygalacturonase (Jaubert et al., 2002), have been identified from cyst and root knot nematodes using this approach, as have cellulase (Kikuchi et al., 2004), pectate lyase (Kikuchi et al., 2006) and β-1,3-endoglucanase (Kikuchi et al., 2005) from B. xylophilus. Detailed bioinformatics analysis of ESTs from plant-parasitic nematodes has led to the identification of other effectors (Roze et al., 2008; Scholl et al., 2003; Vanholme et al., 2006). Other approaches, including RNA fingerprinting and subtractive hybridization, have been used to identify novel effectors, including those encoding secreted proteins with an SPRY domain (SPRYSECs: Blanchard et al., 2005; Qin et al., 2000; Rehman et al., 2009) and genes encoding ubiquitin-like proteins with unusual C-terminal extensions (Tytgat et al., 2004). In addition, the analysis of cDNA libraries obtained from purified gland cell contents has led to the identification of a large number of novel effectors of unknown function from the soybean cyst nematode H. glycines (Gao et al., 2003) and from a root knot nematode Meloidogyne incognita (Huang et al., 2003).

Globodera pallida, the white potato cyst nematode, is the most economically important plant-parasitic nematode in the UK, and also causes problems for growers in many other parts of the world. Repeated use of cultivars containing the H1 gene, which provides complete control of the G. rostochiensis pathotypes present in the UK, has led to the selection of G. pallida to such an extent that a recent survey found that it is present in 65% of fields used for growing potatoes (Minnis et al., 2002). A lack of major gene resistance against G. pallida, coupled with legislative and consumer pressures for reduced inputs of effective nematicides, has led to problems in controlling this nematode. Like other sedentary endoparasitic plant-parasitic nematodes, G. pallida has complex interactions with its hosts. However, compared with other plant-parasitic nematodes, fewer ESTs have been described from G. pallida and very few effectors have been identified to date.

In this article, we describe the analysis of over 9000 ESTs from cDNA libraries obtained from various life stages of G. pallida. We have identified G. pallida effectors from this dataset using bioinformatics analysis, by screening clones in order to identify secreted proteins up-regulated after the onset of parasitism and using in situ hybridization to confirm the expression in pharyngeal gland cells. We have identified a substantial gene family of G. pallida SPRYSEC proteins and, in addition, have examined the subcellular localization in plants of several members of this family of secreted proteins.


ESTs from G. pallida

A total of 9242 ESTs from G. pallida were analysed. Two cDNA libraries, one from invasive (second)-stage juveniles (J2s) and one from early parasitic-stage juvenile nematodes, were sequenced in our laboratory as part of this study; 1600 high-quality sequences were obtained from the J2 library, which formed 389 contigs and 569 singletons after assembly; 3264 high-quality sequences were obtained from the parasitic-stage library, which formed 344 contigs and 1369 singletons. In addition, 4378 sequences were downloaded from dbEST [the majority of which were sequenced from a library produced by one of us (CJL) from adult female-stage nematodes] and subjected to analysis. After the removal of poor-quality sequences, these ESTs formed 383 contigs and 1190 singletons. When all the ESTs were pooled and contigged, a total of 3754 unique sequences was identified, composed of 3349 singletons and 405 contigs. The distribution of these contigs and singletons is summarized in Fig. 1. These pooled unique sequences are available through

Figure 1.

Venn diagram showing the distribution of singletons and contigs across the three cDNA libraries in the pooled dataset. J2, second-stage juvenile.
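The totals reported for the EST dataset can be checked with a few lines of arithmetic. The sketch below uses only the counts given in the text; the variable names are illustrative and are not part of the original analysis.

```python
# EST counts per source, as reported in the text.
library_ests = {
    "J2": 1600,
    "parasitic stage": 3264,
    "dbEST (mainly adult female)": 4378,
}

# Total number of ESTs analysed across the three sources.
total_ests = sum(library_ests.values())

# Pooled assembly, as reported: singletons plus contigs give the unique sequences.
pooled_singletons = 3349
pooled_contigs = 405
pooled_unique = pooled_singletons + pooled_contigs

print(total_ests)     # 9242
print(pooled_unique)  # 3754
```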

The profile of EST numbers within the contigs from each library was examined in more detail. The numbers of ESTs within contigs were in the ranges 2–31 (J2 library), 2–54 (parasitic-stage library) and 2–79 (sequences within dbEST generated from adult female libraries) (Fig. 2). Assuming that a representative cDNA library has been produced, the most highly abundant transcripts are likely to give rise to large numbers of ESTs, and thus form the contigs containing the largest numbers of ESTs. Details of the most abundantly represented genes in each library are shown in Table 1. Transcripts encoding collagens were among the most abundantly represented in both the parasitic-stage and adult female libraries, but were not abundant in the J2 library, presumably reflecting the fact that the former stages are actively moulting, whereas J2s are not. An mRNA that could encode a G. pallida homologue of H. glycines gland cell protein 4 (Wang et al., 2001) was among the most abundantly represented transcripts in the parasitic-stage library.

Figure 2.

Distribution of contig content in expressed sequence tags (ESTs) from second-stage juvenile (J2), parasitic and adult female libraries.
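The ranking of contigs by EST content can be illustrated as follows. This is a minimal sketch that assumes an assembler has already assigned each EST to a contig; the function name `rank_contigs` and the toy identifiers are hypothetical and do not come from the assembly software used in this study.

```python
from collections import Counter


def rank_contigs(est_to_contig, top_n=10):
    """Return (contig_id, est_count) pairs for contigs, most abundant first.

    est_to_contig maps each EST identifier to the identifier of the
    group it was assembled into. Groups with a single EST are treated
    as singletons and are excluded from the ranking.
    """
    counts = Counter(est_to_contig.values())
    contigs = {cid: n for cid, n in counts.items() if n >= 2}
    # Sort by descending EST count, then by identifier for a stable order.
    return sorted(contigs.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Toy assignment (hypothetical identifiers): c1 has 3 ESTs, c2 has 2, c3 is a singleton.
toy = {
    "est1": "c1", "est2": "c1", "est3": "c1",
    "est4": "c2", "est5": "c2",
    "est6": "c3",
}
ranked = rank_contigs(toy)
print(ranked)  # [('c1', 3), ('c2', 2)]
```

Under the assumption of a representative library, the contigs at the top of such a ranking correspond to the most abundant transcripts.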

Table 1. Abundantly represented genes in second-stage juvenile (J2), parasitic- and female-stage libraries of Globodera pallida.

J2
No. | Accession* | No. ESTs | Best match descriptor | E value
1 | GO249912 | 31 | Acyl CoA dehydrogenase (XP_001497474) | 1 × 10^−31
2 | GO250183 | 30 | RNA helicase (XP_002002333) | 0.001
3 | GO250206 | 25 | GpFAR1 (CAA70477) | 1 × 10^−92
4 | GO249321 | 19 | Caenorhabditis elegans hypothetical protein (CAE57640) | 3 × 10^−28
5 | GO249765 | 16 | G. pallida polyubiquitin (CAL30085) | 6 × 10^−80
6 | GO249062 | 16 | G. rostochiensis actin (AF539593) | 0
7 | GO249268 | 16 | C. elegans matrix metalloprotease (AAC46708) | 2 × 10^−59
8 | GO249251 | 15 | Ribosomal protein S11 (AAV34867) | 2 × 10^−62
9 | GO250198 | 13 | HMG protein family member (AAC78598.1) | 2 × 10^−23
10 | GO249237 | 12 | No matches (other than Globodera ESTs) | 0

Parasitic stage
No. | ID | No. ESTs | Best match descriptor | E value
1 | GO252399 | 54 | Heterodera glycines C-type lectin domain (AF498244_1) | 0
2 | GO251425 | 30 | H. glycines gland cell protein 4 (AAG21334.2) | 3 × 10^−54
3 | GO251821 | 23 | GpFAR1 (CAA70477) | 1 × 10^−92
4 | GO252047 | 22 | Collagen family member (AAA97982.1) | 8 × 10^−7
5 | GO252384 | 15 | Collagen family member (CAA90188.1) | 8 × 10^−4
6 | GO250561 | 15 | C. elegans hypothetical (CAE72086.1) | 1 × 10^−16
7 | GO250553 | 15 | G. pallida collagen (CAA65474.1) | 2 × 10^−5
8 | GO250965 | 11 | Elongation factor 1α (AAY17222.1) | 0
9 | GO251825 | 11 | G. rostochiensis major sperm protein (AAA29147.1) | 2 × 10^−68
10 | GO251696 | 11 | Vacuolar H ATPase (NP500188.1) | 1 × 10^−66

Adult female
No. | ID | No. ESTs | Best match descriptor | E value
1 | BM416507.1 | 53 | Ribosomal protein S23 (ABC58767.1) | 3 × 10^−68
2 | CV577526.1 | 43 | Collagen (CAA65506.1) | 7 × 10^−21
3 | BM415040.1 | 27 | GpFAR1 (CAA70477) | 1 × 10^−92
4 | BM416504.1 | 26 | No matches |
5 | BM415716.1 | 26 | Meloidogyne incognita collagen (AAC47437.1) | 2 × 10^−41
6 | BM415092.1 | 26 | G. rostochiensis amphid protein (CAB66341.1) | 2 × 10^−66
7 | CV579211.1 | 23 | Vitellogenin (AAB49749.1) | 2 × 10^−38
8 | BM416381.1 | 21 | M. incognita collagen (AAC47437.1) | 4 × 10^−28
9 | CV577338.1 | 18 | Polyubiquitin (XP001599434.1) | 1 × 10^−116
10 | CV577016.1 | 17 | G. pallida collagen (CAB88203.1) | 5 × 10^−45

* Accession number of one G. pallida EST forming part of the contig.
EST, expressed sequence tag.

Four hundred genes highly up-regulated during the transition to parasitism were identified by filter screening and were sequenced. Analysis of these sequences backed up the data obtained from comparisons of genes abundantly represented in ESTs from the libraries, with changes in gene expression reflecting changes in the biology of the nematode. Genes encoding collagens were included in this dataset, as were digestive proteases. A novel secreted protein was also identified which was shown by in situ hybridization to be expressed in the digestive system (Fig. 3a), reflecting the increased activity of this structure in (feeding) parasitic nematodes. Genes associated with general metabolism were also present in this dataset. In addition, candidate effectors were identified in these screens (see below).

Figure 3.

In situ hybridizations showing expression patterns of cDNAs encoding: (a) novel secreted protein (digestive system); (b) ubiquitin extension protein (dorsal gland cell); (c) novel secreted protein (dorsal gland cell); (d) cellulase (subventral gland cell); (e–g) SPRYSEC proteins (dorsal gland cell); (h) ABA1 lipid-binding protein (digestive system). Antisense controls show no binding to nematode structures (i).


The main purpose of this work was to identify G. pallida effectors. This was achieved using several approaches. First, BLAST searches against the G. pallida ESTs were performed using the complete collection of candidate effectors identified from H. glycines (T. Baum, Iowa State University, Ames IA). In addition, the results of BLAST searches against nonredundant databases for all the ESTs, including those up-regulated in parasitic nematodes when compared with J2, were inspected manually in order to identify homologues of known nematode effectors. Finally, the full EST dataset was analysed for the presence of sequences that could encode secreted proteins without transmembrane domains as described previously (for example, Elling et al., 2009). The full list of candidate effectors from G. pallida identified in these experiments is summarized in Tables 2 and 3. Homologues of effectors previously characterized from G. pallida or, more often, G. rostochiensis were identified, including β-1,4-endoglucanases (cellulases), expansin and chorismate mutase, as well as homologues of effectors identified from other plant-parasitic nematodes (but not previously identified in G. pallida or G. rostochiensis). These included cellulose-binding proteins and ubiquitin extension proteins, as well as a large number of novel H. glycines proteins that are expressed in the gland cells of this species. In situ hybridization was used to confirm that the expression of some of these genes was restricted to the subventral or dorsal pharyngeal gland cells (Fig. 3b–d). One of the most notable features of the G. pallida EST dataset was the presence of a large family of secreted SPRY domain (SPRYSEC) proteins. Sixteen different genes were identified in the EST dataset that could encode different SPRYSEC proteins, and subsequent work (E. Grenier, unpublished work) suggests that the SPRYSEC gene family is considerably larger than this, consisting of at least 30 members in the (introduced) G. pallida present in Europe, but of more than 100 members in indigenous South American populations. A similarly large family of genes encoding dorsal gland proteins has recently been described from G. rostochiensis (Rehman et al., 2009). In situ hybridization confirmed that the expression of several different G. pallida SPRYSEC genes is also restricted to the dorsal pharyngeal gland cell (Fig. 3e–g). Intriguingly, a far greater range of different SPRYSEC proteins was present in the J2 dataset (14 unigenes) than in the (much larger) parasitic (three unigenes) or adult female (two unigenes) datasets.
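The selection of a best database match per EST can be sketched as below. This is an illustration only: it assumes that search results are available in the standard 12-column tab-separated BLAST tabular layout (query, subject, percentage identity, alignment length, mismatches, gap openings, query start, query end, subject start, subject end, E value, bit score). The function name `best_hits`, the E value cut-off and the toy records are assumptions and do not reflect the settings used in this study, in which matches were also inspected manually.

```python
def best_hits(tabular_lines, max_evalue=1e-5):
    """Keep the highest-scoring hit per query with an E value at or below max_evalue.

    tabular_lines is an iterable of tab-separated lines in the 12-column
    BLAST tabular layout. max_evalue is an illustrative threshold.
    Returns {query_id: (subject_id, evalue, bitscore)}.
    """
    best = {}
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > max_evalue:
            continue
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


# Toy records with hypothetical identifiers and values.
toy_lines = [
    "est_A\thit_1\t90.0\t200\t20\t0\t1\t200\t1\t200\t1e-50\t180.0",
    "est_A\thit_2\t95.0\t210\t10\t0\t1\t210\t1\t210\t1e-80\t260.0",
    "est_B\thit_3\t40.0\t60\t36\t0\t1\t60\t1\t60\t0.5\t25.0",
]
hits = best_hits(toy_lines)
print(hits)  # {'est_A': ('hit_2', 1e-80, 260.0)}
```

Weak matches, such as those with E values close to 1, would not pass such a filter and would need to be assessed by hand.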

Table 2. Candidate effectors from Globodera pallida expressed sequence tag (EST) dataset. Where more than one entry is listed for a matching sequence, more than one different G. pallida gene was identified.

G. pallida accession number* | Best match | E value
GO249336 | Heterodera glycines eng-2 (AAC48326.1) | 5 × 10^−9
GO249479 | Globodera tabacum eng-1 (AAD56392.1) | 2 × 10^−40
54549806 | G. rostochiensis eng-2 (AAC63989.1) | 0
GO250090 | G. rostochiensis EXPB1 (AJ311901.1) | 1 × 10^−141
GO250317 | G. pallida chorismate mutase (AW505693.1) | 1 × 10^−104
GO253107 | Annexin (001022756.1) | 1 × 10^−73
GR367886 | Ubiquitin extension protein (AAP37976) | 1 × 10^−63
54548961 | H. glycines cellulose-binding protein (ABY49997.1) | 6 × 10^−7
GO250277 | H. glycines cellulose-binding protein (AAN32887.1) | 7 × 10^−10
GO250301 | Venom allergen protein (AAK60209.1) | 1 × 10^−25
GO250137 | Similar to G. pallida IA7 (BM276573.1) | 6 × 10^−8
GO248962 | G. pallida IA7 (ABF51008.1) | 7 × 10^−29
GO249400 | G. mexicana IVG9 effector (DQ493453.1) | 2 × 10^−3
GO248987 | G. mexicana IVG9 effector (ABF51007.1) | 6 × 10^−29
GO249318 | G. mexicana IVG9 effector (ABF51007.1) | 7 × 10^−13
GO249742 | G. mexicana IVG9 effector (ABF51007.1) | 8 × 10^−25
GO249805 | G. pallida IVG9 effector (ABF51007.1) | 6 × 10^−13
GO249878 | G. pallida IVG9 effector (ABF51007.1) | 2 × 10^−11
54546985 | H. glycines gland cell secreted protein 29D09 (AAP30755.1) | 6 × 10^−40
GO252878 | H. glycines gland cell secreted protein 16B09 (AA085454.1) | 1 × 10^−33
GO252477 | H. glycines gland cell secreted protein 4G05 (AA033477.1) | 2 × 10^−6
GO251672 | H. glycines gland cell secreted protein 30G12 (AAP30757.1) | 6 × 10^−17
GO251644 | H. glycines gland cell secreted protein 29D09 (AAP30755.1) | 4 × 10^−26
GO248977 | H. glycines gland cell secreted protein 19C07 (AA085458.1) | 7 × 10^−2
GO248915 | H. glycines gland cell secreted protein 12H04 (AA08542.1) | 9 × 10^−14
54547851 | H. glycines gland cell secreted protein 7E05 (AF500023) | 1 × 10^−36
54545761 | H. glycines gland cell secreted protein 4D06 (AF469063) | 5 × 10^−29
GO250257 | G. rostochiensis SPRYSEC (AJ251758.1) | 3 × 10^−37
GO251586 | G. rostochiensis SPRYSEC (AJ251758.1) | 3 × 10^−20
GO249657 | SPRYSEC protein (ACO35733.1) | 7 × 10^−14
GO249681 | SPRYSEC protein (AY769949.2) | 5 × 10^−2
GO248887 | SPRYSEC protein (ACQ55285.1) | 4 × 10^−55
GO249108 | SPRYSEC protein (ACO35733.1) | 6 × 10^−41
GO248906 | SPRYSEC protein (ACO35733.1) | 2 × 10^−26
GO249693 | SPRYSEC protein (ACJ14491.1) | 2 × 10^−19
GO249739 | SPRYSEC protein (ACO35731.1) | 8 × 10^−77
GO250120 | SPRYSEC protein (ACO35731.1) | 2 × 10^−15
GO250257 | SPRYSEC protein (CAC21848.1) | 5 × 10^−37
GO250276 | SPRYSEC protein (ACO35733.1) | 6 × 10^−17
GO250370 | SPRYSEC protein (CAC21848.1) | 2 × 10^−19
GO252211 | SPRYSEC protein (ACO35733.1) | 1 × 10^−28
GO252009 | SPRYSEC protein (ACJ14492.1) | 6 × 10^−35
18381136 | SPRYSEC protein (ACO35733.1) | 4 × 10^−50
54545747 | SPRYSEC protein (CAC21848.1) | 2 × 10^−16
54545652 | SPRYSEC protein (ACQ55284.1) | 5 × 10^−9

* Accession numbers are from an EST generating a match against this effector. Not all matches for each effector are shown.

Table 3. Genes identified by screening the Globodera pallida expressed sequence tags (ESTs) for the presence of a signal peptide at the start of a predicted open reading frame coupled with the absence of a transmembrane domain. Sequences identified in this analysis, but listed in Table 2, are not included.

G. pallida accession number | Best match
GO252147.1 | Heterodera glycines hypothetical gland cell protein 12 (AF159591.1)
GO250293.1 | Brugia malayi transthyretin-like protein (XP001894510.1)
54548721 | Radopholus similis transthyretin-like protein (CAM84513.1)
CV577838.1 | Caenorhabditis elegans hypothetical secreted protein (NP 509391.1)
GO251019 | Physcomitrella patens SIN3 histone deacetylase (XP 001776865.1)
CV578979.1 | H. glycines lectin domain protein (AAM18623.1)
CV577812.1 | Meloidogyne incognita cathepsin (ABC88426.1)
G. pallida pioneers meeting signal peptide and no transmembrane domain criteria

Screening of the EST dataset for genes encoding proteins with a signal peptide, but lacking a transmembrane domain, identified genes that could include novel effectors (Table 3). This bioinformatics process identified several genes that could encode transthyretin-like proteins; these proteins have been reported as being expressed in the gland cells or excretory/secretory products of other nematode species (Furlanetto et al., 2005; Mulvenna et al., 2009). Secreted proteases were also identified using this process. In addition, 43 G. pallida genes were identified that produced no matches in databases. As strict filtering criteria were used to generate this list, these pioneer genes represent novel secreted proteins, which may include novel G. pallida effectors, although further analysis of these genes is required to confirm this. Previous studies have shown that nematode secreted proteins and effectors may evolve faster than nonsecreted proteins (Harcus et al., 2004), and this also seems to be true for G. pallida, as 68% of the candidate effectors were novel compared with 41% novel genes for the full dataset.
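The screening logic (signal peptide present, no transmembrane domain) and the comparison of the proportion of novel sequences can be sketched as follows. The sketch assumes that per-sequence predictions have already been produced by external tools; the `Prediction` record, its field names and the toy values are hypothetical and do not represent the output format of any particular program.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    """Pre-computed predictions for one sequence (hypothetical record layout)."""
    seq_id: str
    has_signal_peptide: bool
    tm_domains: int           # number of predicted transmembrane domains
    has_database_match: bool  # False for "pioneer" sequences


def candidate_secreted(predictions):
    """Keep sequences with a signal peptide and no transmembrane domain."""
    return [p for p in predictions if p.has_signal_peptide and p.tm_domains == 0]


def fraction_novel(predictions):
    """Fraction of sequences that produced no database match."""
    if not predictions:
        return 0.0
    return sum(not p.has_database_match for p in predictions) / len(predictions)


# Toy records (hypothetical values).
toy = [
    Prediction("seq1", True, 0, False),   # secreted candidate, novel
    Prediction("seq2", True, 2, True),    # membrane protein, excluded
    Prediction("seq3", False, 0, True),   # no signal peptide, excluded
    Prediction("seq4", True, 0, True),    # secreted candidate, known match
]
candidates = candidate_secreted(toy)
print([p.seq_id for p in candidates])  # ['seq1', 'seq4']
print(fraction_novel(candidates))      # 0.5
print(fraction_novel(toy))             # 0.25
```

Comparing `fraction_novel` for the candidates with that for the full set mirrors the comparison of 68% and 41% reported above.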

A G. pallida homologue of ABA1, a lipid-binding protein characterized from a range of animal-parasitic nematodes and from C. elegans, was present in the G. pallida EST dataset. The expression pattern of this gene was also examined by in situ hybridization in order to determine whether, like another G. pallida lipid-binding protein (GpFAR1), it has been modified for a role in parasitism. However, these experiments showed that, like the C. elegans and animal-parasitic nematode homologues, it is expressed in the digestive system of G. pallida (Fig. 3h), arguing against a role in parasitism.

Subcellular localization of SPRYSECs in plants

The size of the predicted SPRYSEC gene family in our EST dataset and in G. rostochiensis (Rehman et al., 2009) suggests that the proteins encoded by this gene family may play important roles in the host–parasite interaction. One gene family member from G. rostochiensis interacts with a protein similar to resistance genes of the SW5 cluster, leading to the suggestion that these proteins may suppress host defences (Rehman et al., 2009). We therefore examined the subcellular localization in both leaf and root cells of several of the G. pallida SPRYSEC proteins using a modified TRV vector.

WoLF PSORT and PSORTII were first used to predict the subcellular localization of five SPRYSEC homologues with signal peptides (predicted by SignalP) removed. Some SPRYSEC proteins were predicted to be cytoplasmic, but nuclear localization signals were predicted for the majority of the proteins (Table 4).

Table 4. Subcellular localizations for SPRYSECs.

Gene name (accession number) | PSORTII prediction | WoLF PSORT prediction | Experimental localization
 | 56.5% cytoplasmic | 7 nucleus; 5 cytoplasm; 1 mitochondria |
 | 47.8% nuclear | 6 nucleus; 6 cytoplasm; 1 chloroplast | Nuclear (nucleolar)
 | 52.2% nuclear | 9.5 nucleus; 6 cytoplasm and nucleus; 2 chloroplast; 1.5 cytoplasm | Nuclear (nucleolar)
 | 60.9% cytoplasmic | 10 cytoskeleton; 3 cytoplasm |
 | 65.2% nuclear | 9 nucleus; 3 chloroplast; 1 cytoplasm |

Three of the five genes tested (RBP1, Blanchard et al., 2005; 24D4 and 33H17) showed a cytoplasmic localization similar to that observed for free green fluorescent protein (GFP) from a control construct (Fig. 4a–g). However, two of the genes, 22E10 and 13G11, showed strong nuclear localization (Fig. 4h–m) with both showing localization to the nucleolus (for example, Fig. 4l). Observations in root tissues confirmed those obtained from leaves, with RBP1, 24D4 and 33H17 showing a cytoplasmic localization and with 22E10 and 13G11 localizing to the nucleus. No differences in localization patterns were observed irrespective of whether the SPRYSEC proteins were fused to the N or C terminus of GFP (not shown). Members of the SPRYSEC family of proteins from G. pallida therefore show different subcellular localization patterns. Even very similar sequences may show differential localization patterns, as was the case for two of the genes examined here (33H17 and 22E10, which differ by only 7% of their amino acid sequences).
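A percentage difference between two protein sequences can be computed from a pairwise alignment. The sketch below is illustrative: it assumes that the two sequences have already been aligned to equal length with "-" as the gap character, and it counts any column in which the two sequences differ (including gaps in one sequence) as a difference. The method used to derive the 7% figure is not stated in the text, and the toy sequences are hypothetical.

```python
def percent_difference(aligned_a, aligned_b):
    """Percentage of aligned columns in which two sequences differ.

    Both inputs must be pre-aligned strings of equal length, with '-'
    marking gaps. Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    differing = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * differing / compared


# Toy example: one substitution in ten aligned residues.
print(percent_difference("MKTAYIAKQR", "MKTAYIAKQK"))  # 10.0
```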

Figure 4.

Different Globodera pallida SPRYSEC proteins show different subcellular localizations within leaves and roots. RBP1, 24D4 and 33H17 are localized within the cytoplasm within leaves (a–c) and roots (d,e), whereas 22E10 (h–j) and 13G11 (k–m) are localized to the nucleus and nucleolus in leaves and roots. Free green fluorescent protein (GFP) controls show cytoplasmic localization in the leaves (f) and roots (g). Fluorescence from GFP is green, autofluorescence of chloroplasts is red.


In recent years, our understanding of the molecular aspects of plant–nematode interactions has greatly benefited from the identification of effectors from plant-parasitic nematodes (for example, Davis et al., 2008). We have sequenced and bioinformatically analysed over 9000 ESTs from G. pallida with the dual aims of providing background information on the biology of this nematode and identifying G. pallida effectors. We have also screened two of the libraries used for EST generation in order to identify genes up-regulated in early-stage parasitic nematodes, with the expectation that further effectors would be identified. In addition, we have performed functional studies on a potentially important family of effectors through an analysis of the subcellular localization of a range of the family members.

Differences in the most abundant genes between J2s and other life stages seem to reflect changes in the biology of the organism. The invasive J2 is a nonfeeding stage that utilizes its lipid reserves as a source of energy. In aerobic conditions, stored fatty acids undergo β-oxidation, releasing acetyl CoA, which can then enter the glyoxylate cycle as a substrate for gluconeogenesis (Barrett and Wright, 1998). A key enzyme in the β-oxidation pathway, acyl CoA dehydrogenase, is encoded by the most abundant transcript in the J2 EST dataset, highlighting the importance of lipid catabolism prior to the onset of feeding. Once successful parasitism has been established, the digestive system becomes more pronounced and shows structural changes and an increase in size, consistent with the onset of feeding by the nematode. For example, in J2s, few microvillar folds are observed on the surface of the gut facing the intestinal lumen but, after feeding begins, the number of microvillar folds increases dramatically (Bird and Bird, 1991). The most abundant gene within the parasitic-stage library was a gene encoding a C-type lectin domain protein, and similar proteins have been identified as gut proteins in animal-parasitic nematodes and in C. elegans (Jasmer et al., 2001). In the adult female, the gut absorbs food and may also function as a storage organ (Bird and Bird, 1991). It is also the site of synthesis of vitellogenin, a major yolk protein in many organisms, including nematodes (Vercauteren et al., 2003). The presence of vitellogenin-like transcripts in the list of the most abundant genes present in the adult female library therefore also reflects the changing biology of G. pallida.

Feeding nematodes develop through a series of moults to J3, J4 and adult stages, and the most important protein component of the cuticle is collagen. Collagens form a large multigene family in both C. elegans (Kramer, 1997) and plant-parasitic nematodes, with 119 cuticle collagens predicted in the genome sequence of M. incognita (Abad et al., 2008). In C. elegans, different collagens are transcribed in discrete temporal periods during cuticle synthesis prior to moulting (Johnstone and Barry, 1996). Various collagens were among the most abundantly expressed genes in both parasitic and adult female nematodes (Table 1).

Nematodes are unusual in producing a wide range of secreted lipid-binding proteins (Kennedy, 2000). Some of these have modified expression patterns in parasitic nematodes when compared with free-living nematodes, suggesting that their function has been adapted for a role in parasitism. For animal parasites, it has been suggested that these modified functions may include modulation of the host immune response (Kennedy, 2000). A fatty acid-binding protein, GpFAR-1, shows a modified expression pattern in plant-parasitic nematodes, being present on the nematode surface, suggesting a role in host–parasite interactions (Prior et al., 2001). The transcript encoding GpFAR-1 was among the most abundant in the datasets from each life stage, and homologues are similarly abundant in other cyst nematodes (Vanholme et al., 2006) and root knot nematodes (McCarter et al., 2003). It has been suggested that this protein may bind precursors of the jasmonate signalling pathway during the interaction between cyst nematodes and their hosts, thus suppressing defences (Prior et al., 2001). In order to investigate whether, as in animal-parasitic nematodes, other lipid-binding proteins may be adapted for a role in parasitism, we examined the expression pattern of another lipid-binding protein in G. pallida. However, the gene encoding this protein (ABA-1) was expressed only in the digestive system, with no evidence of expression in the gland cells or in the hypodermis. FAR-1 may therefore be the exception in plant parasites in having a role in host–parasite interactions.

One of the most notable findings was the presence of a large gene family encoding SPRYSEC proteins. Considerably more genes from this family were present in the J2 dataset than in those from parasitic nematodes. Although these genes have been described as being down-regulated in adult females (Qin et al., 2000), the observation that far fewer ESTs are present in the (much larger) parasitic nematode dataset was surprising, given that the dorsal gland cell (the site of expression of these genes) is considerably larger and more active in parasitic-stage nematodes when compared with J2s (Hussey and Mims, 1990). Our parasitic nematodes were sampled at 7 days post-infection, and these observations suggest that many of the SPRYSEC proteins have a functional role in the very early stage of the plant–nematode interaction. However, as other gene family members are expressed throughout parasitism, a subset of the gene family may be important at later stages. A more detailed analysis of the expression patterns of all the SPRYSEC proteins identified to date is currently underway within our group. A similar gene family has recently been described from G. rostochiensis, and one gene family member has been shown to interact with a CC-NB-LRR protein, similar to the resistance genes of the SW5 cluster (Rehman et al., 2009). A role for this protein in suppression of host defences was suggested. The observed size of the SPRYSEC gene family in both G. rostochiensis and G. pallida is consistent with this idea, and also makes the SPRYSECs a potential source of nematode avirulence genes. It is interesting to note that at least two members of this gene family may be involved in interactions with plant resistance genes (Rehman et al., 2009; Sacco et al., 2009).

Effectors have been identified from H. glycines that contain functional nuclear localization signals and are therefore transported to the nucleus when expressed in plants. It has been suggested that nematode secreted proteins targeted to the plant nucleus may be important in modifying host gene expression or in regulating changes in the cell cycle that accompany the development of the syncytium (Elling et al., 2007). It is also possible that nematode proteins targeted to the nucleus may suppress plant defence signalling if their host targets are normally localized to this structure or targeted to the nucleus as a result of their interaction with the nematode protein. We therefore examined the subcellular localization of several of the G. pallida SPRYSEC proteins.

The members of the SPRYSEC family of proteins from G. pallida tested in this study showed different subcellular localization patterns, with some localized to the cytoplasm and others to the nucleus and nucleolus. This variation may reflect different functional roles for each individual protein or, more likely, differences in the localization of plant proteins targeted by the nematode. Our data are therefore consistent with the suggestion that SPRYSEC proteins suppress host defences (Rehman et al., 2009) and that they achieve this through interaction with a range of host targets.

A genome sequencing project is currently underway for G. pallida, and this will reveal the full complement of potential effectors present in this species. Assigning function to these effectors will be the subsequent challenge, and a range of tools for functional analysis, including RNA interference (Chen et al., 2005), will need to be applied. The availability of the virus vector described here, modified to allow subcellular localization of nematode proteins in roots and leaves in a high-throughput manner, will also be of value for this process.


cDNA libraries

Three cDNA libraries were produced from second-stage juveniles (J2), a mixture of parasitic stages J2 and J3 (J2/J3) and young feeding females (female) of G. pallida. Second-stage juveniles were hatched from cysts in tomato root diffusate as described previously (Jones et al., 2003). The hatched J2 were cleaned by flotation on 1 : 1 (w/v) sucrose in sterile distilled water and mRNA was extracted using Trizol (Sigma, Poole, UK) and an Oligotex mRNA extraction kit (Qiagen, Crawley, UK). Double-stranded oligo(T)-primed cDNA was synthesized from 2 µg of mRNA using a ZAP-cDNA synthesis kit (Stratagene, Cambridge, UK) and ligated directionally into the EcoRI/XhoI sites of the λ Uni-ZAP XR vector (Stratagene) following the manufacturer's instructions. The phages were packaged using MaxPlax lambda packaging extract (Epicentre, Madison, WI, USA). A standard mass excision protocol (Stratagene) was used to generate a copy of the library in a plasmid vector for high-throughput sequence analysis. To obtain J2 and J3, Solanum tuberosum (cv. Desirée) plants were grown in compost in root trainers (Ronaash, Kelso, UK). When the roots were approximately 10 cm long, the root trainers were opened, the roots were inoculated with J2s (∼500 nematodes/plant) and the root trainers were closed again. After 7 days, the plants were removed from the root trainers, the roots were washed free of compost, cut into 0.5-cm pieces with a scalpel and blended for 10 s. Parasitic stages were recovered with a glass micropipette and collected on ice prior to long-term storage at −80 °C. mRNA was extracted from these nematodes using the mRNA DIRECT micro kit (Invitrogen, Paisley, UK). This was converted to cDNA and cloned directionally into the pDNR LIB plasmid vector using a SMART cDNA library construction kit following the manufacturer's instructions. For this library, 20 cycles of polymerase chain reaction (PCR) were required to generate sufficient material for cloning.

To obtain young females, potato plantlets (S. tuberosum cv. Désirée) were potted into a sand–loam mix infested with cysts of G. pallida at 20 eggs/g soil. Adult female stages were released from harvested, washed roots by brief blending, followed by sieving, at intervals from 3 to 5 weeks after transfer to infested soil. Young female nematodes were hand-picked from root debris under a stereo-binocular microscope, collected on ice and stored at −80 °C. Any female nematodes that were observed to be gravid were discarded. mRNA was isolated from approximately 500 mg fresh weight of young female G. pallida using a Quick-Prep mRNA Purification Kit (GE Healthcare, Little Chalfont, UK); 5 µg of mRNA was used to prepare a cDNA library in the Uni-ZAP XR vector, as described for pre-parasitic J2s.

The representation of each library was assessed by the number of primary transformants and, in each case, was considerably greater than 10⁶. The range of insert sizes in each library was checked by PCR on 20 colonies selected at random. These reactions were set up containing 1× Taq buffer (Promega, Southampton, UK), 1.5 mM MgCl2, 200 µM of each deoxynucleoside triphosphate (dNTP), 1 µM of vector primers (Table 5) and 1 unit of Taq DNA polymerase (Promega). Thirty cycles of PCR were performed in an Applied Biosystems (Foster City, CA, USA) PE9700 thermal cycler with an annealing temperature of 55 °C. PCR products were separated on 1% agarose gels and stained with ethidium bromide. For each library, no inserts of less than 500 bp were amplified in these tests. Analysis of the ESTs obtained from each library showed that the average GC content was between 45% and 47.5%. ESTs below 50 bp were discarded and the maximum contig sizes for each library were 1453 (J2), 2072 (parasitic-stage) and 4033 (adult female) nucleotides. The mean EST lengths (after processing) for each library were 519.5 (J2), 406 (parasitic) and 694.9 bp (adult female), with medians of 540.7, 413.1 and 616.5 bp, respectively.
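The per-library summary statistics quoted above (GC content, mean and median EST length after discarding reads below 50 bp) can be reproduced with a few lines of Python. The following is a minimal sketch and not the authors' pipeline: the function names are hypothetical, and it assumes that GC content is computed over the pooled bases of all retained ESTs rather than averaged per read.

```python
from statistics import mean, median

MIN_LENGTH = 50  # ESTs shorter than this are discarded, as stated in the text


def gc_percent(seq):
    """Return the percentage of G and C among the unambiguous bases of seq."""
    seq = seq.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / called


def summarize_library(ests):
    """Summarise a list of processed EST sequences given as plain strings."""
    kept = [s for s in ests if len(s) >= MIN_LENGTH]
    if not kept:
        raise ValueError("no ESTs of at least MIN_LENGTH bp")
    lengths = [len(s) for s in kept]
    return {
        "n_kept": len(kept),
        "n_discarded": len(ests) - len(kept),
        "mean_length": mean(lengths),
        "median_length": median(lengths),
        # Assumption: GC content is pooled over all retained bases.
        "gc_percent": gc_percent("".join(kept)),
    }
```

Calling `summarize_library` once per library on its list of trimmed reads gives one dictionary of statistics per life stage.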

Table 5.  Sequences of primers used in this study. The 5′ ends of RanBPM clones 22E10 and 33H17 are identical and the 22E10 F primers were therefore used for both of these genes.
Primer | Sequence (5′–3′) | Gene and use
--- | --- | ---
J21C1F | CAAAGACACAGGCAGCGTAA | J21C1 (SPRYSEC homologue) in situ hybridization
J21C1R | TCCGTTCAATCCAAAGTTCG | J21C1 (SPRYSEC homologue) in situ hybridization
GpUBIF | AAGACACTGACCGGCAAAAC | Ubiquitin-like protein in situ hybridization
GpUBIR | CGAAGGACCAGGTGAAGAGT | Ubiquitin-like protein in situ hybridization
Up2-5F | GGACAAGGAGCACAAAGAGC | Up-regulated Contig2–5 in situ hybridization
Up2-5F | ATGGACGTCTCCGAGTTCAC | Up-regulated Contig2–5 in situ hybridization
ABAF | CAAAGAGGCTGCGAGTCCAT | ABA1 homologue in situ hybridization
ABAR | TTCAGCGTGCCCCGCCGATC | ABA1 homologue in situ hybridization


Sequences from the female cDNA library were generated by the Washington University Genome Sequencing Center (St Louis, MO, USA). These sequences (2546) were downloaded from dbEST as processed files and analysed as described below. An additional 1723 sequences in dbEST from a cDNA library generated from female nematodes by another group were also included in this analysis.

For the J2 and parasitic-stage libraries, colonies were picked into 384-well plates using a Qbot (Genetix, New Milton, UK). Plates were grown overnight at 37 °C and stored at –80 °C until use. Plasmids were prepared in 96-well format using Millipore (Watford, UK) clearing and binding plates following the manufacturer's instructions. DNA inserts were sequenced from the 5′ end using the appropriate vector primer in 1/16 size reactions of the Big Dye terminator 3 kit (Applied Biosystems). Sequencing reaction products were run on an ABI 3730 sequence analyser.

Bioinformatics analysis

Sequences were processed using Phred, and vector sequences and poly(A) tails were removed using local Perl scripts. The trimmed sequences were then assembled into contigs using CAP3 (Huang and Madan, 1999), and the resulting contigs and singletons were used to search non-redundant nucleotide and protein databases and dbEST using Netblast. Sequences from each individual library were analysed separately in the first instance and then combined and analysed using Blast2GO (Conesa et al., 2005).
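The trimming scripts themselves are not given in the text. As an illustration of the poly(A) step only, the sketch below removes a terminal run of adenines from a read. It is written in Python rather than Perl, and both the function name and the 10-base threshold are assumptions made for the example.

```python
import re

# Assumed threshold: a terminal run of at least 10 A's counts as a poly(A) tail.
MIN_TAIL = 10


def trim_poly_a(seq, min_tail=MIN_TAIL):
    """Return seq in upper case with any 3' poly(A) run of >= min_tail removed."""
    seq = seq.upper()
    # Anchor the run of A's to the end of the read so internal A-rich
    # stretches are left untouched.
    return re.sub(r"A{%d,}$" % min_tail, "", seq)
```

Reads ending in shorter A runs are returned unchanged, which avoids clipping genuine coding sequence that happens to end in a few adenines.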

Unique sequences (singletons and contigs) of at least 50 bp from the three life stages were assembled using CAP3 (Huang and Madan, 1999), and these data were used to generate a Venn diagram (Fig. 1). Using Biopython (Cock et al., 2009), 4039 potential peptides were identified by simple gene finding. Peptides were retained if they were at least 30 amino acids long and at least 90% of the length of the longest peptide in the same parent sequence. Of these, 149 had signal peptides according to both the neural network and hidden Markov model (HMM) outputs of SignalP 3.0 (Emanuelsson et al., 2007). After cutting at the cleavage site predicted by the SignalP HMM, analysis by TMHMM 2.0 (Krogh et al., 2001) yielded 63 genes with a signal peptide and no transmembrane domain, a set that may include candidate effectors.
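The two peptide filters described above can be sketched as follows. This example avoids a Biopython dependency and instead uses a hand-built standard codon table. It assumes that "simple gene finding" means a six-frame translation split at stop codons. The function and constant names are hypothetical, and the authors' actual script may differ in detail.

```python
from itertools import product

# Standard genetic code in TCAG order (first base varies slowest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

MIN_PEPTIDE_LEN = 30           # minimum length in amino acids (from the text)
MIN_FRACTION_OF_LONGEST = 0.9  # relative to the longest peptide in the parent


def translate(dna):
    """Translate codon by codon; codons with ambiguous bases become 'X'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def candidate_peptides(seq):
    """Return stop-free peptides from all six frames passing both filters."""
    seq = seq.upper()
    strands = (seq, seq.translate(COMPLEMENT)[::-1])  # forward, reverse complement
    peptides = []
    for strand in strands:
        for frame in range(3):
            for pep in translate(strand[frame:]).split("*"):
                if len(pep) >= MIN_PEPTIDE_LEN:
                    peptides.append(pep)
    if not peptides:
        return []
    longest = max(len(p) for p in peptides)
    return [p for p in peptides if len(p) >= MIN_FRACTION_OF_LONGEST * longest]
```

Applying the relative filter per parent sequence keeps near-equal alternatives from different frames while dropping short spurious reading frames that sit alongside a much longer one.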

Screening of J2 and parasitic-stage libraries

High-density filters, each containing two duplicates of 9216 clones from the J2 cDNA library and 9216 clones from the parasitic-stage cDNA library, were produced using a Qbot (Genetix). These filters were screened with 32P-labelled cDNA extracted from J2 or parasitic-stage nematodes. cDNA was produced using the Dynabeads mRNA direct kit (Dynal) and a Superscript III cDNA synthesis kit (Invitrogen, Paisley, UK), as described above, and labelled with 32P-dCTP using Ready-to-go labelling beads (GE Healthcare, Little Chalfont, UK) following the manufacturer's instructions. Clones (from either library) that showed greater hybridization with probe generated from parasitic-stage cDNA than with probe from J2 cDNA were selected for further analysis. Plasmid was produced from each of these clones and resequenced. The resulting sequences were assembled into contigs and used in BLAST searches as described above.
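The text does not say how hybridization signals were compared, and the selection may well have been made by eye. If spot intensities were quantified, the selection rule could be expressed along the following lines. This is purely an illustrative sketch: the data layout, the function name and the `min_ratio` parameter are assumptions and are not taken from the study.

```python
def select_upregulated(j2_signal, parasitic_signal, min_ratio=1.0):
    """Return sorted clone IDs with stronger parasitic-probe than J2-probe signal.

    Both arguments map a clone ID to a (spot_1, spot_2) tuple of intensities
    for the two duplicate spots on a filter. A clone is selected only if both
    duplicates exceed min_ratio times the corresponding J2 signal.
    """
    selected = []
    for clone, para_spots in parasitic_signal.items():
        j2_spots = j2_signal.get(clone)
        if j2_spots is None:
            continue  # clone not measured on the J2-probed filter
        if all(p > min_ratio * j for p, j in zip(para_spots, j2_spots)):
            selected.append(clone)
    return sorted(selected)
```

Requiring agreement between the duplicate spots is one simple way to guard against artefacts on a single spot. A ratio threshold above 1.0 would make the screen more conservative.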

In situ hybridization

Some candidate effectors (see Fig. 3) were used for in situ hybridization experiments as described previously (Jones et al., 2000). The sequences of primers used to amplify fragments of these genes for probe synthesis are given in Table 5.

Subcellular localization of SPRYSEC-like genes

The subcellular localization of G. pallida proteins similar to SPRYSEC proteins from G. pallida and G. rostochiensis (Blanchard et al., 2005; Qin et al., 2000) was investigated by making fusions of five of the genes to the 5′ and 3′ ends of the GFP gene in a tobacco rattle virus (TRV) RNA2 vector. The open reading frame of each of the five SPRYSEC genes was amplified using a proofreading polymerase (KOD, Novagen, Beeston, UK) employing primers (which incorporated attB sites for BP Gateway cloning, Table 5), and cloned into pDONR221 using the BP Clonase system (Invitrogen) according to the manufacturer's instructions. An AatII/EcoRI fragment from TRV-2b-GFP (Valentine et al., 2004) containing the 2b and GFP genes was inserted between the same sites of the binary vector pTRV2 (Liu et al., 2002). Subsequently, Gateway™ cassettes were inserted at the 5′ or 3′ end of the GFP gene to allow virus-based expression of GFP fusions. LR Clonase was used to introduce SPRYSEC sequences, with and without stop codons, at the 3′ and 5′ ends of the GFP gene, respectively. Agrobacterium cultures containing pTRV1 (Liu et al., 2002) and the pTRV2 derivatives were resuspended to an optical density at 600 nm (OD600) of 0.1 and mixtures in a 1 : 1 ratio were infiltrated into Nicotiana benthamiana leaves. The fluorescence of expressed GFP fusions was imaged in infected leaf and root tissue using a Leica (Milton Keynes, UK) SP2 confocal laser-scanning microscope with excitation at 488 nm and emission collection between 510 and 550 nm.


The authors thank Clare McQuade for assistance with DNA sequencing and Anne Holt, Alison Paterson and Ailsa Smith for technical assistance. Funding for this work was provided by the Scottish Executive Environment and Rural Affairs Department potato pathology work package (1.5), SA Link project LK0933, a NATO–Royal Society postdoctoral award (LP and VB), the Franco–British research partnership programme Alliance (French Ministry of European and Foreign Affairs/British Council) (VB/EG) and the Biotechnology and Biological Sciences Research Council, UK (CJL). This work benefited from interactions stimulated and funded through COST Action 872.