Genome-wide sequencing of Phytophthora lateralis reveals genetic variation among isolates from Lawson cypress (Chamaecyparis lawsoniana) in Northern Ireland


Correspondence: David J. Studholme, Geoffrey Pope Building Biosciences, University of Exeter, Stocker Road, Exeter, EX4 4QD, UK. Tel.: +44 (0) 1392 724678; fax: +44 (0) 1392 263434; e-mail:


Phytophthora lateralis is a fungus-like (oomycete) pathogen of trees in the family Cupressaceae, including Chamaecyparis lawsoniana (Lawson cypress or Port Orford cedar). It has been known in North America since the 1920s, presumably having been accidentally introduced from its assumed East Asian centre of origin. Until recently, this pathogen had not been identified causing disease in Europe except for a few isolated outbreaks. However, since 2010, there have been several reports of infection of C. lawsoniana by P. lateralis in the United Kingdom, including Northern Ireland. We sequenced the genomes of four isolates of P. lateralis collected from two sites in Northern Ireland in 2011. Comparison with the closely related tree and shrub pathogen P. ramorum (cause of ramorum disease of larch and other species in the UK) shows that the two species share 91.47% nucleotide sequence identity over the core conserved compartments of the genome. The genomes of the four Northern Ireland isolates are almost identical, but we identified several single-nucleotide polymorphisms (SNPs) that distinguish between isolates, thereby presenting potential molecular markers of use for tracking routes of spread and in epidemiological studies. Our data reveal very low rates of heterozygosity (compared with P. ramorum), consistent with inbreeding within this P. lateralis population.


The genus Phytophthora includes many devastating pathogens of plants; the most notorious example is P. infestans, which causes late blight in potatoes and was responsible for the great famine of Ireland in the 1840s (Kroon et al., 2012). This genus belongs to the oomycetes, which have closer affinity with photosynthetic heterokonts such as brown algae and diatoms and are not closely related to true fungi despite superficial similarity (Cavalier-Smith & Chao, 2006). Several Phytophthora species are emerging as increasingly important pathogens of trees in Europe and North America (Brasier & Webber, 2010), probably imported via international trade (Brasier, 2008), and in the last 10 years, four have been reported in the UK and Ireland for the first time. Phytophthora kernoviae was identified in South West England in 2005 and Ireland in 2008 (hosts include beech, rhododendron and bilberry; Brasier et al., 2005), and P. pseudosyringae was recorded on species including beech, hornbeam and Nothofagus spp. (Scanu et al., 2012). Perhaps the best known, P. ramorum, which is responsible for sudden oak death in North America, has caused sudden larch death in the UK since 2009 (Brasier & Webber, 2010). Most recently, Green et al. (2013) reported the first occurrence of P. lateralis in the UK.

Phytophthora lateralis (Tucker & Milbrath, 1942) is a pathogen of trees in the family Cupressaceae (Robin et al., 2011), including the Lawson cypress or Port Orford cedar (Chamaecyparis lawsoniana). It is closely related to P. ramorum, with which it occurs in the 8c clade together with P. hibernalis, a pathogen of citrus (Blair et al., 2008; Kroon et al., 2012). Its relatively narrow host range contrasts markedly with the wide host range of P. ramorum, which infects North American oak species, larch, rhododendron and several other plants in the Ericaceae (Hansen, 2008). Phytophthora lateralis also differs from P. ramorum in that it infects its host via the root, whereas P. ramorum is primarily a foliar pathogen, its deciduous sporangia making it well adapted for aerial spread (Davidson et al., 2005). Phytophthora lateralis has been known in North America since the 1920s, presumably having been accidentally introduced from its assumed East Asian centre of origin (Grünwald et al., 2008; Brasier et al., 2010; Webber et al., 2012). Until recently, this pathogen had not been found in Europe except for a few isolated outbreaks in France and the Netherlands (Green et al., 2013). However, since 2010, there have been several reports of P. lateralis in the United Kingdom, from Scotland, England and Northern Ireland (Green et al., 2013).

Complete genome assemblies, of at least draft quality, are available for several species of Phytophthora including P. sojae (Tyler et al., 2006), P. ramorum (Tyler et al., 2006), P. infestans (Haas et al., 2009) and P. capsici (Lamour et al., 2012). Genome-wide resequencing data are also available for additional strains of P. infestans and several species very closely related to P. infestans (P. ipomoeae, P. mirabilis and P. phaseoli) (Raffaele et al., 2010; Cooke et al., 2012) all belonging to Phytophthora clade 1c (Blair et al., 2008). Here, we present the first draft genome sequence for P. lateralis.

Materials and methods

We used the Illumina HiSeq 2000 instrument, following the manufacturer's instructions, to generate genome-wide sequence data for four isolates of P. lateralis collected from two sites in Northern Ireland, one in the north west in County Londonderry (isolates MPF4, MPF6) and one in the south east in County Down (isolates SMST21, SMSTG). We generated paired reads of 100 bases each. All four isolates came from Lawson cypress (Chamaecyparis lawsoniana), and the pathogen was isolated from phloem tissue, found just beneath the outer bark. We generated a draft genome assembly for each isolate using Velvet 1.2.03 (Zerbino & Birney, 2008) (with k-mer size = 65 and coverage cut-off = 2). We removed sequences that showed significant nucleotide sequence similarity to bacterial sequences in the RefSeq database (Pruitt et al., 2012). Accession numbers for the sequence data are provided in Table 1.

Table 1. Accession numbers and summary statistics for the Phytophthora lateralis sequence data described in this article. N50 lengths were calculated on the basis of an assumed genome size of 65 Mb. Only contigs and scaffolds at least 500 base pairs long are included in the assembly statistics.

GenBank accession                  AMZP00000000   ANHN00000000   ANHO00000000   AOFH00000000
Number of 100-bp read pairs        53 251 214     51 030 286     36 962 010     71 128 586
Total sequence length of assembly  43 981 379     41 960 579     43 878 234     46 415 064
Number of scaffolds                6372           3777           3221           5628
Scaffold N50                       30 385         19 838         25 831         16 640
Number of contigs                  14 940         4337           4358           9074
Contig N50                         13 440         17 152         17 738         11 894

We identified SNPs by aligning the sequence reads from each isolate against the de novo assembly of the P. lateralis genome using BWA (Li & Durbin, 2009) and generating a pileup-formatted alignment using SamTools (Li et al., 2009). The pileup file is essentially a table of the frequencies of the nucleotides aligned at each genomic position. We used a very conservative method for calling SNPs from the alignment: for each SNP position, there had to be at least 10X coverage in all four datasets and at least 95% consensus within each of the four datasets at that position. Any site in the genome that had less than 10X coverage or less than 95% consensus in any of the four isolates was considered to be ambiguous and excluded from SNP analysis. Of 43 172 449 base pairs in the MPF4 contigs, 1 765 945 were considered to be ambiguous, leaving 41 406 504 base pairs that are unambiguous across all four isolates.
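The filtering rule described above can be sketched in a few lines of Python. This is a minimal illustration only, not the pipeline used in this study: the function names and the dictionary layout of per-base read counts are assumptions made for the example, and the read counts in the usage lines are invented.

```python
MIN_DEPTH = 10        # minimum coverage required in every isolate
MIN_CONSENSUS = 0.95  # minimum fraction of reads supporting the major base


def consensus_base(counts):
    """Return the major base at one site, or None if the site is ambiguous.

    `counts` maps each base (e.g. "A", "C", "G", "T") to the number of
    aligned reads supporting it, as could be tallied from a pileup line.
    """
    depth = sum(counts.values())
    if depth < MIN_DEPTH:
        return None
    base, support = max(counts.items(), key=lambda item: item[1])
    if support / depth < MIN_CONSENSUS:
        return None
    return base


def classify_site(per_isolate_counts):
    """Classify one genomic position across all isolates.

    Returns "ambiguous" if any isolate fails either threshold, "snp" if
    the isolates' consensus bases differ, and "invariant" otherwise.
    """
    calls = [consensus_base(counts) for counts in per_isolate_counts.values()]
    if any(call is None for call in calls):
        return "ambiguous"
    return "snp" if len(set(calls)) > 1 else "invariant"


# Invented read counts for illustration; only the isolate names are real.
example_site = {
    "MPF4": {"A": 30},
    "MPF6": {"A": 28, "G": 1},
    "SMST21": {"G": 25},
    "SMSTG": {"G": 40},
}
print(classify_site(example_site))
```

Requiring every isolate to pass both thresholds means that a single low-coverage dataset removes the site from the analysis, which is why this approach is conservative.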

Results and discussion

Overview of genome sequencing and assembly

We generated draft genome assemblies for each isolate using Velvet (Zerbino & Birney, 2008). Table 1 gives the summary statistics for the assemblies. The total lengths of assembled sequences in our P. lateralis assemblies are comparable to that of the previously sequenced P. ramorum (Tyler et al., 2006). The total length of the P. ramorum assembly (Tyler et al., 2006) was 66 655 834 base pairs. However, about 18 Mb of the P. ramorum genome consists of sequences that are difficult to assemble and fall within short contigs or into gaps between contigs. When we remove gaps between contigs and remove contigs shorter than 2 kb, the remaining length of the P. ramorum assembly is just 48 430 824 bp; the corresponding figure for our P. lateralis SMST21 assembly is 46 644 229 base pairs. This observation is consistent with the two closely related Phytophthora species having genomes of similar size and sequence complexity.
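As a reminder of what the N50 values in Table 1 measure, the following Python sketch computes an N50 relative to an assumed genome size rather than to the total assembly length. It is one common way of defining the statistic, offered here as an assumption; the function name and parameters are illustrative and the exact procedure used to produce Table 1 is not specified in the text.

```python
def n50(lengths, genome_size=None, min_length=500):
    """Return the N50 of a set of contig or scaffold lengths.

    Sequences shorter than `min_length` are ignored. If `genome_size` is
    given, the N50 is the length of the sequence at which the running
    total (longest first) reaches half of that assumed genome size;
    otherwise half of the total retained assembly length is used.
    Returns 0 if the retained sequences never reach the target.
    """
    kept = sorted((n for n in lengths if n >= min_length), reverse=True)
    total = genome_size if genome_size is not None else sum(kept)
    target = total / 2
    running = 0
    for length in kept:
        running += length
        if running >= target:
            return length
    return 0


# Toy lengths, not taken from the assemblies described here.
toy_lengths = [100, 600, 800, 1000, 2000]
print(n50(toy_lengths))                    # relative to assembly length
print(n50(toy_lengths, genome_size=8000))  # relative to an assumed genome size
```

Because the assumed genome size (65 Mb) is larger than the assembled length, an N50 computed this way is smaller than one computed against the assembly length alone.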

Genetic variation among P. lateralis isolates from Northern Ireland

Our genome assemblies of the four Northern Ireland isolates appear to be almost identical to each other. However, we found several single-nucleotide polymorphisms (SNPs) that distinguish between isolates (Fig. 2). Most of the SNPs distinguished isolates between the two outbreak locations (MPF4 and MPF6 vs. SMST21 and SMSTG), while isolates at a single site shared a common genotype. The exceptions are at position 17 260 in GenBank: AMZP01003473.1, at which MPF4 and MPF6 differ from each other, and position 1890 in GenBank: AMZP01003269.1, at which isolate SMST21 differs from isolate SMSTG. Knowledge of such SNPs within an otherwise genetically monomorphic pathogen may be useful for tracking pathogen transmission and spread. This is particularly valuable for a quarantine pathogen of non-native species, because such a pathogen may be spread through international plant trade.

Comparison of P. lateralis genome vs. previously sequenced Phytophthora genomes

Phytophthora ramorum is the known species most closely related to P. lateralis (Blair et al., 2008, 2012; Martin et al., 2012). The genome of P. lateralis shows extensive synteny with that of P. ramorum NA1 (data not shown). Using the dnadiff tool in the MUMmer package (Jung et al., 2002), we found that 72% of the P. lateralis genome assembly was alignable at the DNA sequence level against the previously published genome of P. ramorum NA1 (Tyler et al., 2006). Over this alignable portion of the genome, P. lateralis and P. ramorum share 91.47% nucleotide sequence identity. Sequence identity is lower for P. lateralis vs. P. capsici (84.5%), P. infestans (84.4%) and P. sojae (85.6%). Over the ‘core orthologue’ set of proteins (Haas et al., 2009) that are conserved between P. ramorum, P. sojae and P. infestans, the mean amino acid sequence identity between proteins from P. lateralis and P. ramorum is 94%.

The repertoire of RxLR effectors in P. lateralis

The previously sequenced P. ramorum genome (Tyler et al., 2006) encodes 309 RxLR proteins (Haas et al., 2009) that are believed to act as virulence effectors, being secreted and translocated into the host cell to subvert host defences (Birch et al., 2009; Schornack et al., 2009; Wawra et al., 2012). For 80% of these (248 of 309), we found at least one homologue in P. lateralis (sharing at least 80% amino acid sequence identity over at least 90% of the sequence); see Fig. 1a. These 248 RxLR proteins are listed in the Supporting Information. It is possible that the remaining 61 RxLR proteins are absent from P. lateralis, or are too divergent in their sequences to be detected at this stringency, or that their sequences are incomplete in our draft genome assemblies. As an indicator of the completeness of our assemblies, we counted how many of the Phytophthora ‘core orthologue’ protein sequences could be recovered from our P. lateralis assemblies. These core orthologues comprise 7107 P. ramorum proteins that are conserved between P. infestans, P. ramorum and P. sojae (Haas et al., 2009), and so we would expect most of these to be conserved in P. lateralis as well. We found that 98.5% (7006 of 7107) of these ‘core orthologue’ proteins had at least one homologue in P. lateralis (sharing at least 80% amino acid sequence identity over at least 90% of the sequence); see Fig. 1d. This greater variability in RxLR proteins compared with core conserved proteins is consistent with previous findings (Tyler et al., 2006; Haas et al., 2009) of accelerated evolution of these putative virulence effectors.

Figure 1.

Conservation of RxLR effector proteins between Phytophthora lateralis and previously sequenced Phytophthora species. We performed BLAT (Kent, 2002) searches using as queries each of a previously curated catalogue of 1207 RxLR sequences (Haas et al., 2009). These 1207 RxLR proteins included 309 from Phytophthora ramorum, 563 from Phytophthora infestans and 335 from Phytophthora sojae (Haas et al., 2009). We searched against previously published genome assemblies of P. ramorum, P. sojae, P. infestans and P. capsici (Tyler et al., 2006; Haas et al., 2009; Lamour et al., 2012) as well as against the four combined P. lateralis assemblies. We counted a protein as conserved if BLAT recovered a hit covering at least 90% of the query sequence with an amino acid sequence identity of at least 80%. Venn diagrams in a, b and c summarise the numbers of conserved RxLR proteins from each of the three sets of query sequences, from P. ramorum, P. sojae and P. infestans, respectively. As a control, in d, we also include results of the same analysis using as queries a set of 7107 P. ramorum proteins that are conserved between P. ramorum, P. sojae and P. infestans (Haas et al., 2009).
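The conservation criterion stated in this legend (a hit covering at least 90% of the query at 80% or greater amino acid identity) can be expressed as a simple filter. The sketch below is illustrative only: the `Hit` record and its field names are hypothetical and do not correspond to the BLAT output format, and the query identifiers and values in the usage lines are invented.

```python
from dataclasses import dataclass

MIN_QUERY_COVERAGE = 0.90  # fraction of the query that the hit must cover
MIN_IDENTITY = 0.80        # minimum amino acid identity within the hit


@dataclass
class Hit:
    """One search hit, reduced to the fields needed for the filter."""
    query_id: str
    query_length: int    # length of the query protein in amino acids
    aligned_length: int  # number of query residues covered by the hit
    identity: float      # fraction of identical residues within the hit


def is_conserved(hit):
    """Apply the coverage and identity thresholds to a single hit."""
    coverage = hit.aligned_length / hit.query_length
    return coverage >= MIN_QUERY_COVERAGE and hit.identity >= MIN_IDENTITY


def conserved_queries(hits):
    """Return the set of query IDs with at least one qualifying hit."""
    return {hit.query_id for hit in hits if is_conserved(hit)}


# Invented hits for three hypothetical query proteins.
toy_hits = [
    Hit("query1", 200, 190, 0.85),  # passes both thresholds
    Hit("query1", 200, 60, 0.99),   # a second, partial hit for the same query
    Hit("query2", 200, 150, 0.99),  # covers too little of the query
    Hit("query3", 100, 95, 0.79),   # identity too low
]
print(sorted(conserved_queries(toy_hits)))
```

Counting queries rather than hits ensures that a protein with several matches in the target assemblies is counted once.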

Although P. lateralis is much more closely related to P. ramorum than to other previously sequenced Phytophthora species, the P. lateralis genome encodes several RxLR proteins not found in P. ramorum; see Fig. 1b and c. For example, seven P. sojae RxLR proteins are conserved in P. lateralis but not in P. ramorum: PsG_158999, PsG_159007, PsG_136284, PsG_139250, PsG_139445, PsG_159166 and PsG_159114; see Fig. 1b. Similarly, five P. infestans RxLR proteins are conserved in P. lateralis but not in P. ramorum (Fig. 1c); these are: PITG_02918, PITG_21107, PITG_18609, PITG_15152 and PITG_06419. It is likely that some of these RxLR proteins are ancestral and that they have been lost in P. ramorum. For example, P. sojae RxLR PsG_139250 is conserved in P. capsici and P. infestans as well as P. lateralis. It is more parsimonious to propose loss in the P. ramorum lineage than independent acquisitions in several different species. This interpretation is further supported by the retention of a short fragment of this gene in P. ramorum (GenBank: AAQX01000350.1 positions 41331-41441).

The genome of P. lateralis shows a high level of homozygosity

The genome of P. lateralis showed very low levels of heterozygosity compared with P. ramorum (Fig. 3). This is to be expected given that P. lateralis is assumed to be homothallic (Tucker & Milbrath, 1942; Kroon et al., 2012; Martin et al., 2012) and therefore highly inbred. On the other hand, P. ramorum is heterothallic (Kroon et al., 2012; Martin et al., 2012) but does not appear to reproduce sexually in Europe and North America (Goss et al., 2009) and so has no mechanism for eliminating heterozygosity through sexual recombination (Charlesworth & Wright, 2001). Interestingly, we found one SNP site that was heterozygous in one of the P. lateralis isolates (Fig. S1). This presumably represents a spontaneous point mutation that arose in a recent ancestor of MPF4, the isolate in which the site is heterozygous. However, the site is homozygous in the other three isolates, suggesting that the heterozygosity has been eliminated via inbreeding during the time period since these isolates diverged from each other.

Figure 2.

Homozygous SNPs that distinguish between the four Phytophthora lateralis isolates. Background shading is used to clarify the distributions of genotypes across the four different P. lateralis isolates.

Figure 3.

Comparison of levels of heterozygosity in Phytophthora lateralis and in Phytophthora ramorum. We aligned genome-wide Illumina sequence reads (SRA: SRX202264) from P. lateralis SMST21 against its draft genome assembly (GenBank: ANHO00000000) using BWA (Li & Durbin, 2009). We examined 430 744 single-nucleotide sites in the alignment and, at each site, calculated the relative frequency of the most common nucleotide at that position; we would expect this frequency to be close to 1 at homozygous sites and close to 0.5 at heterozygous sites. The black line indicates the distribution (density) of the frequency of the most abundant nucleotide at each site. The unimodal distribution for P. lateralis indicates very high levels of homozygosity. The grey line indicates the results of the same analysis performed on 422 016 sites in the genome of P. ramorum. The bimodal distribution, with the minor mode at 0.5 indicated by an arrow, indicates that a significant minority of sites are heterozygous. For P. ramorum, the sequence reads were from SRA accession SRX202258 and the genome assembly was GenBank accession AMZZ00000000.
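The per-site calculation behind Fig. 3 can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the number of bins and the dictionary layout of read counts are choices made for the example rather than details taken from the analysis, and the counts in the usage lines are invented.

```python
def major_allele_frequency(counts):
    """Return the fraction of aligned reads carrying the most common base.

    `counts` maps each base to the number of reads supporting it at one
    site. Values near 1 suggest a homozygous site; values near 0.5
    suggest a heterozygous site.
    """
    depth = sum(counts.values())
    if depth == 0:
        raise ValueError("site has no aligned reads")
    return max(counts.values()) / depth


def frequency_histogram(sites, n_bins=20):
    """Bin major-allele frequencies over [0, 1] into equal-width bins.

    Returns the proportion of sites falling in each bin; a frequency of
    exactly 1 is placed in the last bin.
    """
    bins = [0] * n_bins
    for counts in sites:
        freq = major_allele_frequency(counts)
        index = min(int(freq * n_bins), n_bins - 1)
        bins[index] += 1
    total = len(sites)
    if total == 0:
        return [0.0] * n_bins
    return [count / total for count in bins]


# Invented read counts: two homozygous-looking sites, one heterozygous-looking.
toy_sites = [{"A": 20}, {"A": 10, "G": 10}, {"C": 30}]
histogram = frequency_histogram(toy_sites)
print(histogram[10], histogram[19])
```

A population with appreciable heterozygosity would show a second peak around the bin containing 0.5, as described for P. ramorum in the legend.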

Availability of data

The raw sequence data have been deposited in the Sequence Read Archive (SRA), and the de novo assemblies have been deposited in GenBank, where they are freely available under the accession numbers listed in Table 1.


We hereby present the draft genome sequence of P. lateralis, specifically the sequences of four isolates from Northern Ireland. Despite very low levels of genetic diversity among these isolates, we identified several SNPs that are candidates for molecular markers to distinguish between ‘individuals’ of the pathogen and thereby allow tracking of the pathogen's spread between outbreak sites. The availability of complete genome sequence data has been invaluable in understanding the fundamental biology of other Phytophthora species (Grünwald, 2012), notably the evolution and species specificity of the pathogens' effector complements (Kamoun, 2006; Tyler et al., 2006; Birch et al., 2009). It has also been useful in more practical applications such as developing molecular markers that can be used for surveillance and detection as well as revealing patterns of geographical spread of the pathogen (Grünwald, 2012). We hope that the availability of whole genome sequence data from several isolates will accelerate progress in research on and control of this pathogen.


We are grateful to Karen Moore and Brendan Moreland for their expert technical assistance. This work was supported by The Gatsby Charitable Foundation. Paul O'Neill was supported by a doctoral studentship jointly funded by the Food and Environment Research Agency and the College of Life and Environmental Sciences, University of Exeter.