Correspondence: David J. Studholme, Geoffrey Pope Building Biosciences, University of Exeter, Stocker Road, Exeter, EX4 4QD, UK. Tel.: +44 (0) 1392 724678; fax: +44 (0) 1392 263434; e-mail: D.J.Studholme@exeter.ac.uk
Phytophthora lateralis is a fungus-like (oomycete) pathogen of trees in the family Cupressaceae, including Chamaecyparis lawsoniana (Lawson cypress or Port Orford cedar). The pathogen has been known in North America since the 1920s, presumably having been accidentally introduced from its assumed East Asian centre of origin. Until recently, it had not been identified causing disease in Europe except for a few isolated outbreaks. However, since 2010, there have been several reports of infection of C. lawsoniana by P. lateralis in the United Kingdom, including Northern Ireland. We sequenced the genomes of four isolates of P. lateralis from two sites in Northern Ireland in 2011. Comparison with the closely related tree and shrub pathogen P. ramorum (cause of ramorum disease of larch and other species in the UK) shows that the two species share 91.47% nucleotide sequence identity over the core conserved compartments of the genome. The genomes of the four Northern Ireland isolates are almost identical, but we identified several single-nucleotide polymorphisms (SNPs) that distinguish between isolates, thereby presenting potential molecular markers of use for tracking routes of spread and in epidemiological studies. Our data reveal very low rates of heterozygosity (compared with P. ramorum), consistent with inbreeding within this P. lateralis population.
The genus Phytophthora includes many devastating pathogens of plants; the most notorious example is P. infestans, which causes late blight in potatoes and was responsible for the Great Famine of Ireland in the 1840s (Kroon et al., 2012). This genus belongs to the oomycetes, which have closer affinity with photosynthetic heterokonts such as the brown algae and diatoms than with the true fungi, despite superficial similarity to the latter (Cavalier-Smith & Chao, 2006). Several Phytophthora species are emerging as increasingly important pathogens of trees in Europe and North America (Brasier & Webber, 2010), probably imported via international trade (Brasier, 2008), and in the last 10 years, four have been reported in the UK and Ireland for the first time. Phytophthora kernoviae was identified in South West England in 2005 and Ireland in 2008 (hosts include beech, rhododendron and bilberry; Brasier et al., 2005); P. pseudosyringae was recorded on species including beech, hornbeam and Nothofagus spp. (Scanu et al., 2012); and perhaps the best known, P. ramorum, which is responsible for sudden oak death in North America, has caused sudden larch death in the UK since 2009 (Brasier & Webber, 2010). Most recently, Green et al. (2013) reported the first occurrence of P. lateralis in the UK.
Phytophthora lateralis (Tucker & Milbrath, 1942) is a pathogen of trees in the family Cupressaceae (Robin et al., 2011), including the Lawson cypress or Port Orford cedar (Chamaecyparis lawsoniana). It is closely related to P. ramorum, with which it occurs in the 8c clade together with P. hibernalis, a pathogen of citrus (Blair et al., 2008; Kroon et al., 2012). Its relatively narrow host range contrasts markedly with the wide host range of P. ramorum, which infects North American oak species, larch, rhododendron and several other plants in the Ericaceae (Hansen, 2008). Phytophthora lateralis also differs from P. ramorum in that it infects its host via the root, whereas P. ramorum is primarily a foliar pathogen, its deciduous sporangia making it well adapted for aerial spread (Davidson et al., 2005). Phytophthora lateralis has been known in North America since the 1920s, presumably having been accidentally introduced from its assumed East Asian centre of origin (Grünwald et al., 2008; Brasier et al., 2010; Webber et al., 2012). Until recently, this pathogen had not been found in Europe except for a few isolated outbreaks in France and the Netherlands (Green et al., 2013). However, since 2010, there have been several reports of P. lateralis in the United Kingdom, from Scotland, England and Northern Ireland (Green et al., 2013).
Complete genome assemblies, of at least draft quality, are available for several species of Phytophthora including P. sojae (Tyler et al., 2006), P. ramorum (Tyler et al., 2006), P. infestans (Haas et al., 2009) and P. capsici (Lamour et al., 2012). Genome-wide resequencing data are also available for additional strains of P. infestans and several species very closely related to P. infestans (P. ipomoeae, P. mirabilis and P. phaseoli) (Raffaele et al., 2010; Cooke et al., 2012) all belonging to Phytophthora clade 1c (Blair et al., 2008). Here, we present the first draft genome sequence for P. lateralis.
Materials and methods
We used the Illumina HiSeq 2000 instrument, following the manufacturer's instructions, to generate genome-wide sequence data for four isolates of P. lateralis collected from two sites in Northern Ireland, one in the north west in County Londonderry (isolates MPF4, MPF6) and one in the south east in County Down (isolates SMST21, SMSTG). We generated paired reads of 100 bases each. All four isolates came from Lawson cypress (Chamaecyparis lawsoniana), and the pathogen was isolated from phloem tissue just beneath the outer bark. We generated a draft genome assembly for each isolate using Velvet 1.2.03 (Zerbino & Birney, 2008) (with k-mer size = 65 and coverage cut-off = 2). We removed sequences that showed significant nucleotide sequence similarity to bacterial sequences in the RefSeq database (Pruitt et al., 2012). Accession numbers for the sequence data are provided in Table 1.
Table 1. Accession numbers and summary statistics for the Phytophthora lateralis sequence data described in this article. N50 lengths were calculated on the basis of an assumed genome size of 65 Mb. Only contigs and scaffolds of at least 500 base pairs long are included in the assembly statistics
We identified SNPs by aligning the sequence reads from each isolate against the de novo P. lateralis genome assembly using BWA (Li & Durbin, 2009) and generating a pileup-formatted alignment using SamTools (Li et al., 2009). The pileup file is essentially a table of frequencies of nucleotides aligned at each genomic position. We used a very conservative method for calling SNPs from the alignment: for each SNP position, there had to be at least 10X coverage in all four datasets and at least 95% consensus within each of the four datasets at that position. Any site in the genome that had less than 10X coverage or less than 95% consensus, in any of the four isolates, was considered to be ambiguous and excluded from SNP analysis. Of 43 172 449 base pairs in the MPF4 contigs, 1 765 945 were considered to be ambiguous, leaving 41 406 504 base pairs that are unambiguous across all four isolates.
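The site-filtering rule described above can be illustrated with a minimal Python sketch. It is not the script used in this study: the function names, the dictionary-based input (per-isolate nucleotide counts at one position, as might be tabulated from a pileup file) and the example counts are all assumptions made for illustration. Only the two thresholds (10X coverage and 95% consensus) come from the text.

```python
MIN_DEPTH = 10          # minimum coverage required in every isolate
MIN_CONSENSUS_PCT = 95  # minimum percentage of reads supporting the major base


def consensus_base(counts, min_depth=MIN_DEPTH, min_pct=MIN_CONSENSUS_PCT):
    """Return the consensus base for one isolate at one site, or None if ambiguous.

    counts maps a nucleotide to the number of aligned reads showing it.
    """
    depth = sum(counts.values())
    if depth < min_depth:
        return None
    base, support = max(counts.items(), key=lambda item: item[1])
    # Integer arithmetic avoids floating-point rounding at the 95% boundary.
    if support * 100 < min_pct * depth:
        return None
    return base


def classify_site(counts_by_isolate):
    """Classify one genomic position as 'ambiguous', 'invariant' or 'snp'.

    counts_by_isolate maps an isolate name to its nucleotide counts.
    A site is ambiguous if any single isolate fails either threshold.
    """
    calls = [consensus_base(c) for c in counts_by_isolate.values()]
    if any(call is None for call in calls):
        return "ambiguous"
    return "invariant" if len(set(calls)) == 1 else "snp"


# Hypothetical counts for a single position (illustrative values only).
example = {
    "MPF4": {"A": 25},
    "MPF6": {"A": 31},
    "SMST21": {"G": 40},
    "SMSTG": {"G": 28, "A": 1},
}
print(classify_site(example))
```

Requiring every isolate to pass both thresholds discards more sites than per-isolate filtering would, but it means that any difference reported between isolates is supported by deep, near-unanimous read evidence on both sides.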
Results and discussion
Overview of genome sequencing and assembly
We generated draft genome assemblies for each isolate using Velvet (Zerbino & Birney, 2008). Table 1 gives the summary statistics for the assemblies. The total lengths of assembled sequences in our P. lateralis assemblies are comparable to that of the previously sequenced P. ramorum (Tyler et al., 2006). The total length of the P. ramorum assembly (Tyler et al., 2006) was 66 655 834 base pairs. However, about 18 Mb of the P. ramorum genome consists of sequences that are difficult to assemble and fall within short contigs or into gaps between contigs. When we remove gaps between contigs and remove contigs shorter than 2 kb, the remaining length of the P. ramorum assembly is just 48 430 824 bp; the corresponding figure for our P. lateralis SMST21 assembly is 46 644 229 base pairs. This observation is consistent with the two closely related Phytophthora species having genomes of similar size and sequence complexity.
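For readers unfamiliar with the statistic, the following Python sketch shows one common way to compute an N50 length relative to an assumed genome size, as stated in the caption of Table 1 (65 Mb, counting only sequences of at least 500 bp). The function name and the toy contig lengths are hypothetical, and the exact procedure used to produce Table 1 is not described in the text, so this is an illustration rather than a reproduction.

```python
def n50(lengths, genome_size=None, min_length=500):
    """Return the N50 of a set of contig or scaffold lengths.

    Sequences shorter than min_length are ignored. If genome_size is given,
    the target is half of that assumed genome size; otherwise it is half of
    the total retained assembly length. Returns None when the retained
    sequences sum to less than the target.
    """
    kept = sorted((n for n in lengths if n >= min_length), reverse=True)
    total = genome_size if genome_size is not None else sum(kept)
    running = 0
    for n in kept:
        running += n
        # Doubling the running sum avoids dividing an odd total by two.
        if 2 * running >= total:
            return n
    return None


# Toy example with made-up contig lengths (in base pairs).
toy_contigs = [9000, 7000, 5000, 3000, 1000, 400]
print(n50(toy_contigs))                     # relative to the assembly length
print(n50(toy_contigs, genome_size=40000))  # relative to an assumed genome size
```

Using a fixed assumed genome size makes N50 values comparable between assemblies of different total length, at the cost of depending on how good the size estimate is.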
Genetic variation among P. lateralis isolates from Northern Ireland
Our genome assemblies of the four Northern Ireland isolates appear to be almost identical to each other. However, we found several single-nucleotide polymorphisms (SNPs) that distinguish between isolates (Fig. 2). Most of the SNPs distinguished isolates between the two outbreak locations (MPF4 and MPF6 vs. SMST21 and SMSTG), while isolates at a single site shared a common genotype. The exceptions are at position 17 260 in GenBank: AMZP01003473.1, at which MPF4 and MPF6 differ from each other, and position 1890 in GenBank: AMZP01003269.1, at which isolate SMST21 differs from isolate SMSTG. Knowledge of such SNPs within an otherwise genetically monomorphic pathogen may be useful for tracking pathogen transmission and spread. This is particularly valuable for a quarantine pathogen of non-native species that may be spread through international plant trade.
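The distinction drawn above, between SNPs that separate the two outbreak sites and SNPs that separate isolates within a site, can be expressed as a short Python sketch. The isolate-to-site mapping follows the Materials and methods; the function name and the example base calls are hypothetical and do not represent the actual alleles at any reported position.

```python
# Outbreak site of each isolate, as given in Materials and methods.
SITE_OF = {
    "MPF4": "County Londonderry",
    "MPF6": "County Londonderry",
    "SMST21": "County Down",
    "SMSTG": "County Down",
}


def snp_pattern(calls, site_of=SITE_OF):
    """Describe how a set of consensus base calls partitions the isolates.

    calls maps an isolate name to its consensus base at one position.
    Returns 'within-site' if isolates from the same site disagree,
    'between-site' if each site is uniform but the sites differ,
    and 'invariant' if all isolates share the same base.
    """
    bases_by_site = {}
    for isolate, base in calls.items():
        bases_by_site.setdefault(site_of[isolate], set()).add(base)
    if any(len(bases) > 1 for bases in bases_by_site.values()):
        return "within-site"
    distinct = {next(iter(bases)) for bases in bases_by_site.values()}
    return "between-site" if len(distinct) > 1 else "invariant"


# Hypothetical base calls (illustrative only).
print(snp_pattern({"MPF4": "A", "MPF6": "A", "SMST21": "G", "SMSTG": "G"}))
print(snp_pattern({"MPF4": "C", "MPF6": "T", "SMST21": "C", "SMSTG": "C"}))
```

Between-site markers are the ones of most use for tracing spread between outbreaks, whereas within-site markers indicate variation that has arisen or persisted locally.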
Comparison of P. lateralis genome vs. previously sequenced Phytophthora genomes
Phytophthora ramorum is the known species most closely related to P. lateralis (Blair et al., 2008, 2012; Martin et al., 2012). The genome of P. lateralis shows extensive synteny with that of P. ramorum NA1 (data not shown). Using the dnadiff tool in the MUMmer package (Jung et al., 2002), we found that 72% of the P. lateralis genome assembly was alignable at the DNA sequence level against the previously published genome of P. ramorum NA2 (Tyler et al., 2006). Over this alignable portion of the genome, P. lateralis and P. ramorum share 91.47% nucleotide sequence identity. Sequence identity is lower for P. lateralis vs. P. capsici (84.5%), P. infestans (84.4%) and P. sojae (85.6%). Over the ‘core orthologue’ set of proteins (Haas et al., 2009) that are conserved between P. ramorum, P. sojae and P. infestans, the mean amino acid sequence identity between proteins from P. lateralis and P. ramorum is 94%.
The repertoire of RxLR effectors in P. lateralis
The previously sequenced P. ramorum genome (Tyler et al., 2006) encodes 309 RxLR proteins (Haas et al., 2009) that are believed to act as virulence effectors, being secreted and translocated into the host cell to subvert host defences (Birch et al., 2009; Schornack et al., 2009; Wawra et al., 2012). For 80% of these (248 of 309), we found at least one homologue in P. lateralis (sharing at least 80% amino acid sequence identity over at least 90% of the sequence); see Fig. 1a. These 248 RxLR proteins are listed in the Supporting Information. It is possible that the remaining 61 RxLR proteins are absent from P. lateralis, or are too divergent in their sequences to be detected at this stringency, or that their sequences are incomplete in our draft genome assemblies. As an indicator of the completeness of our assemblies, we counted how many of the Phytophthora ‘core orthologue’ protein sequences could be recovered from our P. lateralis assemblies. These core orthologues comprise 7107 P. ramorum proteins that are conserved between P. infestans, P. ramorum and P. sojae (Haas et al., 2009), and so we would expect most of these to be conserved in P. lateralis as well. We found that 98.5% (7006 of 7107) of these ‘core orthologue’ proteins had at least one homologue in P. lateralis (sharing at least 80% amino acid sequence identity over at least 90% of the sequence); see Fig. 1d. This greater variability in RxLR proteins compared with core conserved proteins is consistent with previous findings (Tyler et al., 2006; Haas et al., 2009) of accelerated evolution of these putative virulence effectors.
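The homologue criterion used above (at least 80% amino acid identity over at least 90% of the sequence) can be sketched in Python as a filter over alignment hits. The search tool used is not stated in the text, so the `Hit` record, its field names and the example values are assumptions, as is the choice to measure coverage against the length of the query protein.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One alignment between a query protein and a P. lateralis sequence.

    Hypothetical record; the fields mirror what most protein search tools report.
    """
    query_id: str
    subject_id: str
    percent_identity: float
    aligned_query_residues: int
    query_length: int


def passes(hit, min_identity=80.0, min_coverage=0.90):
    """True if the hit meets both the identity and the coverage thresholds."""
    coverage = hit.aligned_query_residues / hit.query_length
    return hit.percent_identity >= min_identity and coverage >= min_coverage


def conserved_queries(hits, min_identity=80.0, min_coverage=0.90):
    """Return the set of query proteins with at least one qualifying hit."""
    return {h.query_id for h in hits if passes(h, min_identity, min_coverage)}


# Made-up hits for three hypothetical query proteins.
hits = [
    Hit("query1", "contigA", 85.0, 95, 100),   # passes both thresholds
    Hit("query2", "contigB", 85.0, 80, 100),   # coverage too low
    Hit("query3", "contigC", 79.9, 100, 100),  # identity too low
    Hit("query3", "contigD", 90.0, 91, 100),   # second hit rescues query3
]
found = conserved_queries(hits)
print(sorted(found), f"{len(found)} of 3 queries conserved")
```

Applying the same thresholds to both the RxLR set and the core orthologue set is what makes the two recovery rates directly comparable.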
Although P. lateralis is much more closely related to P. ramorum than to other previously sequenced Phytophthora species, the P. lateralis genome encodes several RxLR proteins not found in P. ramorum; see Fig. 1b and c. For example, seven P. sojae RxLR proteins are conserved in P. lateralis: PsG_158999, PsG_159007, PsG_136284, PsG_139250, PsG_139445, PsG_159166 and PsG_159114; see Fig. 1b. Similarly, five P. infestans RxLR proteins are conserved in P. lateralis but not in P. ramorum (Fig. 1c); these are: PITG_02918, PITG_21107, PITG_18609, PITG_15152 and PITG_06419. It is likely that some of these RxLR proteins are ancestral and that they have been lost in P. ramorum. For example, P. sojae RxLR PsG_139250 is conserved in P. capsici and P. infestans as well as P. lateralis. It is more parsimonious to propose loss in the P. ramorum lineage than independent acquisitions in several different species. This is further supported by the retention of a short fragment of this gene in P. ramorum (GenBank: AAQX01000350.1 positions 41331-41441).
The genome of P. lateralis shows a high level of homozygosity
The genome of P. lateralis showed very low levels of heterozygosity compared with P. ramorum (Fig. 3). This is to be expected given that P. lateralis is assumed to be homothallic (Tucker & Milbrath, 1942; Kroon et al., 2012; Martin et al., 2012) and therefore highly inbred. On the other hand, P. ramorum is heterothallic (Kroon et al., 2012; Martin et al., 2012) but does not appear to reproduce sexually in Europe and North America (Goss et al., 2009) and so has no mechanism for eliminating heterozygosity through sexual recombination (Charlesworth & Wright, 2001). Interestingly, we found one SNP site that was heterozygous in one of the P. lateralis isolates (Fig. S1). This presumably represents a spontaneous point mutation that arose in a recent ancestor of MPF4, the isolate in which the site is heterozygous. However, the site is homozygous in the other three isolates, suggesting that the heterozygosity has been eliminated via inbreeding during the time period since these isolates diverged from each other.
Availability of data
The raw sequence data have been deposited in Sequence Read archive (SRA), and the de novo assemblies have been deposited in GenBank where they are freely available under the accession numbers listed in Table 1.
We hereby present the draft genome sequence of P. lateralis, specifically the sequences of four isolates from Northern Ireland. Despite very low levels of genetic diversity among these isolates, we identified several SNPs that are candidates for molecular markers to distinguish between ‘individuals’ of the pathogen and thereby allow tracking of the pathogen's spread between outbreak sites. The availability of complete genome sequence data has been invaluable in understanding the fundamental biology of other Phytophthora species (Grünwald, 2012), notably the evolution and species specificity of the pathogens' effector complements (Kamoun, 2006; Tyler et al., 2006; Birch et al., 2009). It has also been useful in more practical applications such as developing molecular markers that can be used for surveillance and detection as well as revealing patterns of geographical spread of the pathogen (Grünwald, 2012). We hope that the availability of whole genome sequence data from several isolates will accelerate progress in research and control of this pathogen.
We are grateful to Karen Moore and Brendan Moreland for their expert technical assistance. This work was supported by The Gatsby Charitable Foundation. Paul O'Neill was supported by a doctoral studentship jointly funded by the Food and Environment Research Agency and the College of Life and Environmental Sciences, University of Exeter.