Intraspecific comparative genomics to identify avirulence genes from Phytophthora

Authors


Author for correspondence: Sophien Kamoun. Tel: +1 330 263 3847; Fax: +1 330 263 3841; Email: kamoun.1@osu.edu

Summary

Members of the oomycete genus Phytophthora cause some of the most devastating plant diseases in the world and are arguably the most destructive pathogens of dicot plants. Phytophthora research has entered the genomics era. Current genomic resources include expressed sequence tags from a variety of developmental and infection stages, as well as sequences of selected regions of Phytophthora genomes. Genomics promises to advance our understanding of the molecular basis of infection by Phytophthora, for example, by facilitating the isolation of genes encoding effector molecules with a role in virulence and avirulence. Based on prevalent models of plant–pathogen coevolution, some of these effectors, notably those with avirulence functions, are predicted to exhibit significant sequence variation within populations of the pathogen. This and other features were used to identify candidate avirulence genes from sequence databases. Here, we describe a strategy that combines data mining with intraspecific comparative genomics and functional analyses for the identification of novel avirulence genes from Phytophthora. This approach provides a rapid and efficient alternative to classical positional cloning strategies for identifying avirulence genes that match known resistance genes. In addition, this approach has the potential to uncover ‘orphan’ avirulence genes for which corresponding resistance genes have not previously been characterized.

Introduction

Induced resistance in plant–microbe interactions is regulated by recognition of pathogen molecules by the plant. This is illustrated by the gene-for-gene concept, which implies that a ligand (elicitor) encoded by an avirulence (Avr) gene from the pathogen is recognized by a receptor encoded by a resistance (R) gene from the plant, resulting in recognition of the pathogen and activation of plant defense mechanisms such as the hypersensitive response (HR) (Staskawicz et al., 1995; Hammond-Kosack & Jones, 1997; Dangl & Jones, 2001). In its simplest illustration, the biochemical basis of the gene-for-gene model consists of direct interaction between Avr and R gene products. However, recent studies indicate a more complex basis for recognition, in which perception of Avr products by R proteins is indirect and involves at least a third component (Dixon et al., 2000; Leister & Katagiri, 2000; Rivas et al., 2002a,b). According to the ‘Guard hypothesis’ (Van der Biezen & Jones, 1998) this component is a virulence target (VT) that is recognized by the Avr product in both susceptible and resistant plants. The R protein thus acts as a ‘guard’, monitoring alterations in the VT mediated by the Avr product, and promoting further defense signaling. Supporting evidence for the Guard Hypothesis is accumulating with the identification of the VT in several gene-for-gene interactions (Dangl & Jones, 2001; Innes, 2001; Swiderski & Innes, 2001; Kim et al., 2002; Kruger et al., 2002; Rivas et al., 2002a,b; Schneider, 2002).

Avirulence genes of eukaryotic plant pathogenic microbes exhibit a number of common structural and functional features (Lauge & De Wit, 1998; van’t Slot & Knogge, 2002). For example, the majority of fungal Avr genes described to date encode extracellular proteins with a type II secretion peptide. Many eukaryotic Avr genes, such as Cladosporium fulvum Avr2, Avr4, and Avr9, Rhynchosporium secalis nip1, and Phytophthora elicitins, encode small secreted proteins with an even number of cysteine residues that can induce defense responses when infiltrated into plant tissues (Lauge & De Wit, 1998; van’t Slot & Knogge, 2002). Several of these common structural features, most notably secretion and the disulfide bridges formed by the pairs of cysteines, are essential for HR induction and avirulence function (Joosten et al., 1997; Kooman-Gersmann et al., 1997; Lauge & De Wit, 1998; Kamoun et al., 1999a; Luderer et al., 2002a; van’t Slot & Knogge, 2002). The disulfide bridges could enhance stability in the plant apoplast, which is known to be rich in degradative proteases (Joosten et al., 1997; Luderer et al., 2002a). However, despite the structural features of signal peptides for secretion and the presence of cysteine residues, there is no primary DNA sequence similarity between these Avr genes. The Avr genes AVR-Pita and PWL from the rice blast fungus Magnaporthe grisea also encode extracellular proteins that are secreted via a type II signal peptide (Kang et al., 1995; Sweigard et al., 1995; Jia et al., 2000; Orbach et al., 2000). AVR-Pita encodes a metalloprotease that is thought to be delivered inside rice cells where it can interact with the Pi-ta R protein (Jia et al., 2000; Orbach et al., 2000). The PWL genes encode small proline and glycine rich proteins with unknown function (Kang et al., 1995; Sweigard et al., 1995). In addition to the described structural features, all fungal avirulence genes, including the M. grisea AVR-Pita gene, are actively expressed during infection of the host plant. Some are expressed exclusively during infection, whereas expression of other avirulence genes, such as Avr9, may also be induced by nutritional stress (Perez-Garcia et al., 2001).

By definition, Avr genes that define cultivar-specific resistance exhibit significant sequence variation among races of the pathogen. For example, in virulent races, the C. fulvum avirulence gene Avr4 occurs in various isoforms and the Avr2 gene exhibits various mutations that typically result in truncated proteins (Joosten et al., 1997; Luderer et al., 2002b). The majority of the recessive virulence alleles of Avr2 and Avr4 carry single or simple nucleotide polymorphisms (SNPs) that result in unstable or nonfunctional HR elicitors. By contrast, the Avr9 gene is deleted from races of C. fulvum that infect Cf-9 plants (Lauge & De Wit, 1998). In the R. secalis Avr gene, nip1, single nucleotide changes, resulting in amino acid substitutions, were detected in the coding regions of nip1 alleles from virulent races (Rohe et al., 1995). As a consequence of these changes, interaction of R. secalis with barley is no longer controlled by Rrs1, indicating that recognition by the host plant can be circumvented by alteration of the primary structure of the NIP1 gene product (Rohe et al., 1995). AVR-Pita is located in the unstable telomeric region of chromosome 3 of M. grisea and is frequently deleted in virulent strains (Orbach et al., 2000). In some races of M. grisea that are virulent on Pi-ta plants, the avr-Pita alleles carry insertions (Kang et al., 2001) or point mutations, one of which is within the predicted active site of the protease (Orbach et al., 2000). The PWL genes are highly polymorphic among rice blast isolates from diverse grass species and geographic regions, and appear to contribute to species-specific avirulence (Sweigard et al., 1995). This gene family appears to be highly dynamic and rapidly evolving (Kang et al., 1995).

Oomycetes, such as Phytophthora, downy-mildews, and Pythium, form a unique branch of eukaryotic plant pathogens with an independent evolutionary history (Sogin & Silberman, 1998; Baldauf et al., 2000; Margulis & Schwartz, 2000). Among the oomycetes, Phytophthora species cause some of the most destructive plant diseases in the world, and are arguably the most devastating pathogens of dicot plants (Erwin & Ribeiro, 1996). The most notable and best-studied oomycete is Phytophthora infestans, the Irish famine pathogen. P. infestans causes late blight, a devastating and re-emerging disease of potato and tomato (Fry & Goodwin, 1997a,b; Birch & Whisson, 2001; Schiermeier, 2001; Smart & Fry, 2001; Shattock, 2002). Despite their peculiar phylogenetic affinities and economic importance, oomycetes were chronically under-studied at the molecular level. This trend has dramatically reversed in recent years with significant technical developments, such as routine DNA transformation, use of reporter genes, genetic manipulation using gene silencing, and expanding genomic resources that promise to facilitate gene discovery and functional analyses (Kamoun, 2003).

Genetic analyses of P. infestans and P. sojae indicate that in many cases race/cultivar specificity follows the gene-for-gene model and involves segregating Avr genes. Several Avr genes have been targeted for positional cloning in P. infestans and P. sojae (van der Lee et al., 2001a,b; Whisson et al., 2001; MacGregor et al., 2002; May et al., 2002; Tyler, 2002). Nevertheless, so far, none of these race-specific Avr genes from Phytophthora have been described in the literature mainly because of the numerous difficulties encountered with positional cloning in Phytophthora, such as high levels of repetitive DNA and aberrant segregation at the target locus. To date, the only Avr genes described from Phytophthora are members of the elicitin family, which are thought to condition avirulence to the nonhost plant tobacco (Kamoun et al., 1998; Kamoun, 2001) and were first identified biochemically as elicitors of the HR (Ponchet et al., 1999). In this paper, we describe an alternative approach to positional cloning for the identification of Avr genes in Phytophthora. This strategy consists of several steps (Fig. 1) and takes complete advantage of genome sequence data by combining expressed sequence tag (EST) data mining, intraspecific comparative genomics (association genetics, or linkage disequilibrium), and functional analyses. First, candidate genes exhibiting structural features indicative of known Avr genes are selected from sequence databases. Then, these genes are amplified and sequenced from a panel of phenotypically characterized Phytophthora races to identify polymorphic genes and establish associations between allele variation and avirulence phenotype. Finally, candidate Avr genes are validated using functional assays. Here, we describe the experimental steps required for this strategy as currently applied to the identification of race-specific Avr genes from P. infestans in our laboratories at the Ohio State University and the Scottish Crop Research Institute. We also present an update of our efforts to validate this strategy by providing preliminary analyses of polymorphisms for five candidate effector genes of P. infestans. Finally, we discuss the potential limitations of this approach and the outlook for future applications.

Figure 1.

Schematic illustration of strategy for identification of avirulence genes using intraspecific comparative genomics. Four steps are illustrated: (i) PCR amplification of candidate avirulence genes from panels of races of the target pathogen; (ii) DNA sequencing of amplicons; (iii) cloning in expression vectors; and (iv) functional assays on a collection of resistance gene containing plants. See text for details.

Experimental approach

Selection of candidate Avr genes from sequence databases

We have focused on P. infestans genes encoding proteins with structural features characteristic of known fungal and oomycete Avr and elicitor proteins. These features were used as a basis for data mining criteria to identify candidate Avr genes from expressed sequence tag (EST) or genome sequence databases (Kamoun et al., 2002). Some of these criteria were discussed elsewhere (Qutob et al., 2000; Birch & Whisson, 2001; Kamoun et al., 2002; Torto et al., 2003), and include genes encoding extracellular and small cysteine-rich proteins, as well as genes up-regulated during preinfection and infection stages. Computational tools and algorithms have been developed to facilitate the identification of these features from sequence databases. For example, PexFinder, an algorithm that identifies genes encoding extracellular proteins from ESTs, has been used to select a set of 142 Pex (Phytophthora extracellular proteins) cDNAs, many of which are considered candidate effector genes (Torto et al., 2003; http://www.oardc.ohio-state.edu/phytophthora/pexfinder). In addition, since ESTs from multiple races of P. infestans are available, comparative sequence analyses can be performed to identify allelic polymorphisms between the different races.
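The structural criteria above (secreted, small, even number of cysteines) can be sketched as a simple filter. This is a minimal illustration and not PexFinder itself: the `Candidate` class, the function name, and the size and cysteine thresholds are hypothetical, and the records reuse the predicted lengths and cysteine counts reported in Table 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    protein_len: int       # predicted full-length protein (amino acids)
    signal_len: int        # predicted signal peptide (amino acids); 0 if none
    mature_cysteines: int  # cysteines in the predicted mature protein

    @property
    def mature_len(self) -> int:
        # Mature protein = full-length protein minus the signal peptide.
        return self.protein_len - self.signal_len


def is_small_cys_rich_secreted(c, max_mature_len=100, min_cysteines=2):
    """Hypothetical filter; both thresholds are illustrative assumptions."""
    return (
        c.signal_len > 0                      # predicted to be secreted
        and c.mature_len <= max_mature_len    # small mature peptide
        and c.mature_cysteines >= min_cysteines
        and c.mature_cysteines % 2 == 0       # cysteines can pair into disulfide bridges
    )


# Values taken from Table 1 of this article.
CANDIDATES = [
    Candidate("scr50", 50, 23, 4),
    Candidate("scr58", 58, 18, 2),
    Candidate("scr76", 76, 29, 4),
    Candidate("scr91", 91, 21, 6),
    Candidate("pex208", 208, 18, 18),
]

selected = [c.name for c in CANDIDATES if is_small_cys_rich_secreted(c)]
print(selected)
```

With the assumed 100 amino acid cutoff, the four scr genes pass and pex208 (190 amino acid mature protein) does not, which shows how a strict size threshold can exclude a gene that is otherwise of interest; loosening `max_mature_len` is the obvious adjustment.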

Primer design, PCR amplification, and high-throughput sequencing

To identify polymorphic genes, pairs of oligonucleotide primers were designed for the amplification of the entire open reading frame (ORF) from each selected candidate gene. Restriction enzyme sites were appended to the primers to allow facile cloning of the ORFs in plant or microbial expression vectors for functional assays. PCR amplifications were performed on genomic DNA from panels of up to 30 isolates of P. infestans that have been assessed for virulence on 11 (R1–R11) potato R gene differential genotypes. To increase the throughput of the PCR amplifications and subsequent DNA sequencing, all reactions were performed in panels of 8 primer pairs × 12 P. infestans isolates in 96-well microtiter plates. Amplicons were purified using 96-well PCR purification kits, checked on agarose gels, and their concentrations were measured as absorbance at 260 nm by spectrophotometry. Appropriately diluted samples were then submitted in 96-well microtiter plates for high-throughput DNA sequencing at our core sequencing facilities.
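The 8 primer pairs × 12 isolates arrangement can be sketched as a well map. This is only an illustration: assigning primer pairs to rows A to H and isolates to columns 1 to 12 is an assumption (the text does not state the orientation), and the isolate names are placeholders.

```python
from string import ascii_uppercase

ROWS = ascii_uppercase[:8]  # "ABCDEFGH": assumed one row per primer pair
COLS = range(1, 13)         # 1..12: assumed one column per isolate


def plate_layout(primer_pairs, isolates):
    """Map well IDs (e.g. 'A1') to (primer pair, isolate) for one 96-well plate."""
    if len(primer_pairs) > len(ROWS) or len(isolates) > len(COLS):
        raise ValueError("a 96-well plate holds at most 8 primer pairs x 12 isolates")
    return {
        f"{row}{col}": (pair, isolate)
        for row, pair in zip(ROWS, primer_pairs)
        for col, isolate in zip(COLS, isolates)
    }


pairs = ["scr50", "scr58", "scr76", "scr91", "pex208"]
isolates = [f"isolate_{i:02d}" for i in range(1, 13)]  # placeholder names
layout = plate_layout(pairs, isolates)
print(layout["A1"], layout["E12"], len(layout))
```

Keeping the same map for PCR, purification, and sequencing submission means a well ID alone identifies both the gene and the isolate at every step.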

Sequence data analysis and interpretation

We used established bioinformatic platforms for sequence analysis and single nucleotide polymorphism (SNP) identification. Base calling, Quality Values (QV), and trimming were obtained with the Phred algorithm (Ewing & Green, 1998; Ewing et al., 1998). Sequences were aligned and examined using Sequencher™ 4.1 (Gene Codes Corp, Ann Arbor, MI, USA) or other programs that allow viewing of sequence calls, QV, and chromatogram data. Visual comparison of chromatogram data with the sequence alignment data was performed to evaluate whether differences in nucleotide sequences were likely to be polymorphisms or caused by base calling errors. Depending on the genes examined, several outcomes were observed. Occasionally, no amplification was obtained in some races, suggesting that the gene might be missing or that a polymorphism corresponding to the primer sequences might interfere with PCR amplification. This was then tested by DNA blot hybridization and/or using different primer pairs. Some genes were not polymorphic within the set of P. infestans races examined. Other genes were polymorphic, ranging from a few to a fairly large number of nucleotide substitutions. Considering that P. infestans is diploid and outbreeding, identified SNPs are frequently in a heterozygous state, leading to mixed sequence peaks at polymorphic nucleotides. Some genes resulted in more complex mixed DNA sequences, suggesting that multiple alleles or paralogs were amplified to yield a mixture of amplicons. For these genes, a cloning step was introduced and was followed by sequencing the inserts of a set of randomly picked clones.
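Scoring a mixed peak at a polymorphic site can be sketched with the standard IUPAC two-base ambiguity codes. This is a minimal sketch that assumes each mixed peak has already been recorded as an IUPAC code after inspection of the chromatogram; the function name and the category labels are hypothetical.

```python
# Standard IUPAC codes for single bases and two-base mixtures.
IUPAC = {
    "A": {"A"}, "C": {"C"}, "G": {"G"}, "T": {"T"},
    "R": {"A", "G"}, "Y": {"C", "T"}, "S": {"C", "G"},
    "W": {"A", "T"}, "K": {"G", "T"}, "M": {"A", "C"},
}


def classify_site(reference_base, called_code):
    """Classify one diploid site relative to the reference base."""
    bases = IUPAC[called_code.upper()]
    ref = reference_base.upper()
    if bases == {ref}:
        return "homozygous reference"
    if len(bases) == 1:
        return "homozygous variant"
    if ref in bases:
        return "heterozygous (reference/variant)"
    return "heterozygous (two variants)"


# Second codon position of a reference ATC codon (reference base T):
for code in ("T", "K", "S", "G"):
    print(code, "->", classify_site("T", code))
```

A site with three or more bases mixed is not covered by this table; in the workflow described above, such complex traces are the cases that are resolved by cloning the amplicon and sequencing individual inserts.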

We catalogued all identified SNPs and examined the virulence typing data sets for correlations between polymorphisms that result in amino acid substitutions, other structural alterations of the encoded protein, and specific avirulence phenotypes. A simple statistical test was used to determine the extent of linkage disequilibrium between the polymorphic marker and avirulence phenotype (Lewontin, 1964):

$$ D = p_{AB}\,p_{ab} - p_{Ab}\,p_{aB} $$

where pAB is the frequency of AB haplotypes, and similarly for pAb, paB and pab, as a measure of departure from linkage equilibrium. The value of D varies from 0 (linkage equilibrium) to 1 (complete association) and is taken as an absolute number due to the arbitrary assignment of symbols A and B. To date, all tested P. infestans avirulence loci have been determined to be dominant, although some loci can appear recessive in certain crosses. Consequently only those comparisons where a case could be made for avirulent individuals being heterozygous at a locus were considered.
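The disequilibrium statistic can be sketched from the four haplotype frequencies as follows. The raw coefficient is computed as p_AB·p_ab − p_Ab·p_aB. Because a scale running from 0 to 1 corresponds to the normalized form |D|/D_max introduced by Lewontin (1964), the sketch also returns that value; whether the reported D values were normalized in exactly this way is an assumption, and the function names are hypothetical.

```python
def linkage_disequilibrium(p_AB, p_Ab, p_aB, p_ab):
    """Raw coefficient D = p_AB*p_ab - p_Ab*p_aB from haplotype frequencies."""
    total = p_AB + p_Ab + p_aB + p_ab
    if abs(total - 1.0) > 1e-6:
        raise ValueError("haplotype frequencies must sum to 1")
    return p_AB * p_ab - p_Ab * p_aB


def normalized_d(p_AB, p_Ab, p_aB, p_ab):
    """|D| / D_max, which runs from 0 (equilibrium) to 1 (complete association)."""
    d = linkage_disequilibrium(p_AB, p_Ab, p_aB, p_ab)
    p_A = p_AB + p_Ab  # frequency of allele A at the first locus
    p_B = p_AB + p_aB  # frequency of allele B at the second locus
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    if d_max == 0:
        return 0.0  # a monomorphic locus carries no association signal
    return abs(d) / d_max


print(normalized_d(0.25, 0.25, 0.25, 0.25))  # all haplotypes equally frequent
print(normalized_d(0.5, 0.0, 0.0, 0.5))      # only AB and ab haplotypes present
```

The first call describes linkage equilibrium and the second complete association, the two ends of the scale described in the text.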

Validation using functional assays

Polymorphic genes identified by comparative sequencing were selected for functional assays, using one or several of the methods available for Phytophthora genes (Kamoun et al., 2002). For example, functional assays can be performed through ectopic expression of candidate Avr genes in plants. To this end, the candidate genes were cloned in plant expression vectors, such as the binary vector pGR106 that allows agroinfection of potato virus X (PVX) in solanaceous plants (Jones et al., 1999) or standard binary expression cassettes that allow transient expression by agroinfiltration (Van der Hoorn et al., 2000). Both of these vectors allow assaying for HR-like symptoms and have been successfully used with several Phytophthora genes (Kamoun et al., 1999a; Kamoun et al., 2002; Qutob et al., 2002; Torto et al., 2003). All of the genes tested for linkage disequilibrium with avirulence have previously been assayed for HR-eliciting activity in the nonhost tobacco, using the PVX expression system. None were shown to exhibit any HR-eliciting activity in this plant.

Alternatively, candidate genes can be cloned in vectors for bacterial or yeast expression. Recombinant proteins purified from E. coli or yeast can then be injected into plant leaves to assay for induction of the HR. For all these experiments, a panel of resistant plant genotypes, including varieties of tomato and potato known to carry R genes, as well as wild relatives of these host plants, will be selected for the functional assays.

A standard complementation approach can also be taken for functional analysis of candidate Avr genes. Currently, candidate genes are being cloned in oomycete transformation vectors and used to transform virulent strains that lack the particular allele. Transformed strains expressing the candidate gene can then be assessed for avirulence on the appropriate plant genotype to validate the Avr function. A similar gain of function experiment has been reported in which functional expression of the Phytophthora cryptogea gene encoding the basic elicitin cryptogein in P. infestans resulted in altered interaction with tobacco plants (Panabieres et al., 1998).

SNPs in Phytophthora extracellular proteins (Pex) and association with avirulence

Current work in our laboratories aims at validating the described strategy and applying it to the discovery of Avr genes from the late blight pathogen P. infestans. Here, we present preliminary sequence polymorphism data for five candidate effector genes identified by Torto et al. (2003) as encoding secreted and cysteine-rich proteins. The general characteristics of the five selected genes, scr50, scr58, scr76, scr91, and pex208, are described in Table 1. All five genes encode proteins with a predicted signal peptide ranging from 18 to 29 amino acids in length. The predicted mature peptides for scr50, scr58, scr76, and scr91 are relatively small, ranging from 27 to 70 amino acids, and contain an even number of cysteines (range 2–6). Pex208 encodes a predicted 190 amino acid protein with a total of 18 cysteines. All five genes appear restricted to Phytophthora and do not show significant similarity to sequences from other organisms based on BLASTP searches against the GenBank nonredundant database (E-value cutoff 1e-02). Scr91 shows similarity to PcF, a 52 amino acid extracellular protein from Phytophthora cactorum, that causes necrosis in strawberry and tomato (Orsomando et al., 2001).

Table 1.  List of candidate Phytophthora infestans scr (small cysteine rich) and pex (Phytophthora extracellular) genes analyzed for single nucleotide polymorphisms

| Gene | GenBank accession | Length of protein (a) | Length of signal peptide (a) | Number of cysteines (b) | Best BLASTP hit (c) | E-value |
|---|---|---|---|---|---|---|
| scr50 | AAN31504 | 50 | 23 | 4 | No significant hits | NA |
| scr58 | AAN31505 | 58 | 18 | 2 | No significant hits | NA |
| scr76 | AAN31495 | 76 | 29 | 4 | No significant hits | NA |
| scr91 | BE775988 | 91 | 21 | 6 | AAK63068 phytotoxic protein PcF precursor (Phytophthora cactorum) | 1e-08 |
| pex208 | AAN31499 | 208 | 18 | 18 | No significant hits | NA |

(a) Predicted sequence features are based on Torto et al. (2003); sequence lengths are in amino acids.
(b) Number of cysteines in the predicted mature protein.
(c) Best BLASTP hit to non-P. infestans sequences deposited in the GenBank nonredundant database (April 2003). E-values larger than 1e-02 were not considered significant.

We used PCR amplification with primers flanking the ORFs of the five selected genes and DNA sequencing to identify SNPs. A minimum of 16 isolates of P. infestans were examined and scored for SNPs in the coding region. A total of nine polymorphic sites corresponding to seven SNPs were identified (Table 2). All five genes displayed at least one SNP. Five SNPs consisted of single nucleotide substitutions, whereas two SNPs in scr58 and scr91 consisted of two-nucleotide changes. Two SNPs consisted of more than two polymorphic nucleotides. The SNP in scr50 occurred as a C, G, or T, whereas the second SNP of pex208 occurred as a T, G, or C. Six of the seven SNPs were nonsynonymous, resulting in amino acid replacements. The distribution frequency of the identified SNPs was examined. Heterozygosity was assumed when two SNPs were identified in the same isolate and was detected at variable frequency for all seven SNPs (range 0.06–0.75). However, we cannot exclude at this stage that some of these genes occur as gene families and the two SNPs were amplified from two paralogs and not alleles.

Table 2.  Catalog of single nucleotide polymorphisms identified in examined Phytophthora infestans genes

| Gene (a) | SNP (b): nucleotide (codons) | SNP (b): amino acid | Frequency (c): homozygous (reference) | Frequency (c): heterozygous | Frequency (c): homozygous (mutant) |
|---|---|---|---|---|---|
| scr50 | CTC to GTC | Leu8 to Val8 | 0.88 | 0.06 | 0 |
| | CTC to TTC | Leu8 to Phe8 | | 0.06 | 0 |
| scr58 | GCT to GTC | Ala12 to Val12 | 0.31 | 0.69 | 0 |
| scr76 | TGT to TGC | Cys59 to Cys59 (silent) | 0.38 | 0.56 | 0.06 |
| scr91 (SNP #1) | GAA to GTA | Glu52 to Val52 | 0.86 | 0.14 | 0 |
| scr91 (SNP #2) | GCT CCC to GCG TCC | Ala76 Pro77 to Ala76 Ser77 | 0 | 0.19 | 0.81 |
| pex208 (SNP #1) | ACG to ATG | Thr29 to Met29 | 0.19 | 0.75 | 0.06 |
| pex208 (SNP #2) | ATC to AGC | Ile83 to Ser83 | 0.13 | 0.75 | 0 |
| | ATC to ACC | Ile83 to Thr83 | | 0.06 | 0 |
| | AGC to ACC | Ser83 to Thr83 | | 0.06 | |

(a) GenBank sequence accessions are listed in Table 1.
(b) The SNPs are underlined. Reference sequences correspond to the first codon listed.
(c) A minimum of 16 P. infestans isolates were examined for each SNP. Heterozygosity was assumed when two versions of a SNP were detected in one isolate. However, we cannot exclude at this stage that the two SNPs were amplified from paralogs and not alleles.
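The genotype frequencies in Table 2 can be converted to allele frequencies with a short calculation. This is a sketch under two assumptions: that each site is biallelic in a diploid genome (so only the two-allele rows of Table 2 are used), and that mixed peaks reflect two alleles of one gene and not paralogs, a caveat noted in the table itself. The function and variable names are hypothetical.

```python
# (homozygous reference, heterozygous, homozygous mutant) from Table 2.
GENOTYPE_FREQUENCIES = {
    "scr58": (0.31, 0.69, 0.00),
    "scr76": (0.38, 0.56, 0.06),
    "scr91 SNP #1": (0.86, 0.14, 0.00),
    "scr91 SNP #2": (0.00, 0.19, 0.81),
    "pex208 SNP #1": (0.19, 0.75, 0.06),
}


def allele_frequencies(hom_ref, het, hom_alt, tolerance=0.02):
    """Return (reference, variant) allele frequencies for a biallelic diploid site."""
    total = hom_ref + het + hom_alt
    if abs(total - 1.0) > tolerance:  # tolerance allows for rounding in the table
        raise ValueError("genotype frequencies should sum to about 1")
    # Each heterozygote carries one copy of each allele.
    p_ref = (hom_ref + het / 2) / total
    return p_ref, 1 - p_ref


for snp, freqs in GENOTYPE_FREQUENCIES.items():
    p_ref, p_alt = allele_frequencies(*freqs)
    print(f"{snp}: reference {p_ref:.2f}, variant {p_alt:.2f}")
```

For scr76, for instance, the reference allele frequency comes out at about 0.66 (0.38 + 0.56/2). Sites with a large excess of heterozygotes, such as scr58 or pex208 SNP #1, are the ones where the paralog explanation deserves the closest look.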

We used the test for linkage disequilibrium described above to assess whether any of the seven SNPs is associated with one of the 11 known Avr genes of P. infestans. The C/T SNP detected at base 54 in the scr58 gene resulted in an alanine to valine replacement (Table 2) and exhibited complete association (D = 1) with avirulence on the potato R1 differential genotype. However, no reliable conclusion can be reached at this point since D is sensitive to low allele frequencies and small sample sizes. In this case the complete association may have resulted from the absence of one of the four possible haplotypes. Importantly, the majority of associations were between virulence and heterozygous SNPs, a result at variance with the genetics of avirulence towards the R1 genotype.
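The sensitivity of complete association to a missing haplotype class can be illustrated numerically. The counts below are hypothetical (they are not the scr58 data), each isolate is treated as contributing one marker/phenotype combination, and the normalized form |D|/D_max is assumed as the 0 to 1 statistic.

```python
def d_prime_from_counts(n_AB, n_Ab, n_aB, n_ab):
    """Normalized disequilibrium |D|/D_max computed from four class counts."""
    n = n_AB + n_Ab + n_aB + n_ab
    if n == 0:
        raise ValueError("no observations")
    p_AB, p_Ab, p_aB, p_ab = (x / n for x in (n_AB, n_Ab, n_aB, n_ab))
    d = p_AB * p_ab - p_Ab * p_aB
    p_A = p_AB + p_Ab
    p_B = p_AB + p_aB
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    return 0.0 if d_max == 0 else abs(d) / d_max


# Hypothetical panel of 16 isolates in which the Ab class was never observed.
complete = d_prime_from_counts(5, 0, 3, 8)
# The same panel after reclassifying a single isolate from AB to Ab.
shifted = d_prime_from_counts(4, 1, 3, 8)
print(round(complete, 3), round(shifted, 3))
```

In this example the statistic is 1 whenever one of the four classes is empty, even though three isolates (the aB class) break the marker/phenotype match, and reclassifying a single isolate lowers it to 29/45 (about 0.64). With panels of this size, an apparent complete association therefore needs confirmation in a larger set of isolates.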

The pex208 gene exhibited a C/T SNP (SNP #1, Table 2) which showed stronger evidence for linkage disequilibrium with Avr11 (D = 0.43). The strength of the association is not enough to warrant selecting this gene as a candidate Avr11, but the principle of combining SNP discovery with evaluation using test statistics such as D was validated.

Other tested candidate genes typically yielded only SNPs dissociated with the avirulence phenotypes tested. This was particularly of interest for scr91, which exhibits significant similarity to PcF, a necrosis-inducing protein from P. cactorum (Fig. 2; Orsomando et al., 2001). Scr91 is an attractive candidate Avr gene for two reasons. Firstly, scr91 was shown in our laboratories to be strongly up-regulated during host plant infection by P. infestans (unpublished results), and, secondly, the Pro to Ser substitution caused by SNP #2 is in a potentially key residue. In P. cactorum, the specific activity of this protein may involve a unique 4-hydroxyproline residue near the C-terminal end of the protein (Orsomando et al., 2001). In scr91, there is also a proline at this position, although it is not known whether it is hydroxylated (Fig. 2). SNP #2, which leads to a proline-serine (P/S) substitution, was detected at high frequency in the examined P. infestans isolates and could affect the biological activity of this protein. However, no association of this polymorphism with any of the 11 characterized avirulence phenotypes was observed. Nevertheless, it is worth noting that the proline substitution was not detected in a homozygous form and was always observed in older populations of the US-1 clonal lineage of P. infestans, as well as in several race 0 isolates that cannot infect potato differential lines that carry at least one of the 11 known R genes.

Figure 2.

Alignment of Phytophthora cactorum phytotoxic PcF protein with two Phytophthora infestans SCR91 predicted proteins. Identical amino acids are shaded in dark gray and similar amino acids shaded in light gray. Residue numbers are indicated above the sequences. Only the mature sequence of the proteins was included. The conserved cysteine residues are in pink and the proline residue that is hydroxylated in P. cactorum is highlighted in blue. Residues polymorphic in P. infestans are underscored with an asterisk.

Promising polymorphism-to-phenotype associations will be further tested across a wider range of P. infestans isolates and across virulence-tested progeny from appropriate mapping populations. This will allow rigorous statistical evaluation of the significance of the associations. In addition, functional assays are under way to determine whether the examined polymorphic genes induce HR-like responses in late blight resistant Solanum.

Potential limitations and considerations

Selection criteria

The series of selection criteria used to identify candidate Avr genes forms the central assumption behind this strategy. It is possible that P. infestans or other Phytophthora spp. carry some Avr genes with novel or unexpected structural features that cannot presently be identified by data mining. For instance, the Avr elicitor may not be the direct product of an Avr gene but could be a secondary metabolite produced by a biochemical pathway, making it difficult to predict the genes involved. Nevertheless, it is reasonable to expect that many Avr genes will exhibit some of the obvious structural features, such as the presence of a secretion signal. Ultimately, as technologies for sequencing and SNP identification continue to improve, it will be possible to loosen the criteria and assay larger numbers of candidate genes.

Noncoding and complex polymorphisms

It is possible that in some races, the virulence allele of the Avr gene does not carry polymorphisms that result in structural alterations of the protein but rather mutations that affect gene expression and regulation. These polymorphisms might be harder to identify and interpret, but nevertheless they can still be used in classical and association genetic analyses. In fungal Avr genes, such as those cloned from M. grisea, the change in interaction phenotype from avirulent to virulent may result from many different types of polymorphisms in the Avr gene, or its expression. Therefore, for any candidate Avr gene it is essential to relate any polymorphism (presence/absence, insertions/deletions, or SNPs) within the gene, as a whole, to phenotype. In the P. infestans genes assayed thus far, only SNPs have been identified. In the future, differences in gene expression can also be assessed for correlation with avirulence.

Gene families

The genomes of many oomycete pathogens are large and complex. Of these, P. infestans is one of the largest at 240 Mb and contains high levels of repetitive DNA. Many of the P. infestans genes that have been determined to be up-regulated in planta are also members of gene families. Some candidate Avr genes occur as complex gene families with several closely related paralogs identified in one individual isolate. This creates technical difficulties in assessing the particular genotype of a given strain since PCR amplifications occasionally result in mixed amplicons that cannot be sequenced directly. Additional steps are then needed for cloning and sequencing the corresponding genomic regions in order to identify the various haplotypes and design specific markers for genotyping.

Genetic hitchhiking effect

In identifying polymorphisms associated with avirulence, the genetic hitchhiking effect must be considered (Barton, 2000). For instance, genetic loci conditioning avirulence/virulence can be considered as potentially under strong selective pressure by deployment of resistance genes in potato and tomato. An R gene selective sweep through a given P. infestans population may not only increase the prevalence of the virulence allele, but would also increase the prevalence of linked but selectively neutral loci. P. infestans can reproduce both sexually and asexually, depending on whether both mating types are present in a population. In sexual populations, the hitchhiking effect breaks down with physical distance due to sexual recombination over many generations. In asexually reproducing populations, which are effectively clonal, the hitchhiking effect would act across the entire selected genotype. However, an R gene selective sweep would select against an Avr allele and any hitchhiking effect in the surrounding genomic regions would be due to population founder effects. To test whether a particular association between a SNP and an Avr allele is due to the hitchhiking effect, the association should be tested in multiple P. infestans populations that have evolved in relative isolation from each other. If a single gene is responsible for the avirulence phenotype, the evolution to virulence should have occurred independently in separate populations. Ultimately, functional assays for avirulence are needed to validate whether the identified gene is an Avr gene.

Orphan Avr genes

In determining the extent of linkage disequilibrium between alleles of P. infestans genes and interaction phenotypes, the number of host genotypes considered and the robustness of the correlation must be taken into account. For instance, P. infestans isolates are routinely tested against potato genotypes carrying the resistance alleles R1–R11, and occasionally against differential tomato genotypes. However, many species of Solanum exhibit both HR-based resistance and susceptibility to P. infestans infection and are likely to bear a fairly large number of uncharacterized R genes (Kamoun et al., 1999b; Vleeshouwers et al., 2000). Such species include, among others, accessions of S. papita, S. chacoense, S. verrucosum, and S. bulbocastanum. Interestingly, the approach we describe in this article has the potential to identify P. infestans Avr genes that do not correspond to the known potato or tomato R genes, but correspond to unknown R genes that have not been genetically characterized. Such Avr genes are still expected to be polymorphic but may not associate with the known races of P. infestans. These genes might be termed ‘orphan’ Avr genes until screening of plant germplasm uncovers the matching R genes. Extensive infection surveys of host genotypes with a diverse collection of P. infestans isolates will permit orphan Avr gene–SNP associations to be tested in parallel with the typical R1/Avr1–R11/Avr11 interactions.

Outlook

Compared to bacteria, the number of Avr genes cloned from fungal and oomycete plant pathogens remains relatively low (van’t Slot & Knogge, 2002). This is mainly caused by the relatively large size of the genomes of phytopathogenic eukaryotes, and the lack of large scale transformation procedures that hinders the use of the functional complementation approach that has proven successful in the cloning of many bacterial Avr genes. In the oomycete Phytophthora, positional cloning has been the technique of choice for the cloning of race-specific Avr genes. However, even though projects for positional cloning of Avr genes from P. infestans and P. sojae have been running for 5–10 yr in several laboratories, to date no race-specific Avr genes from Phytophthora have been described in the literature.

In this paper, we describe an alternative strategy for cloning Phytophthora Avr genes. This approach exploits emerging sequence data for Phytophthora and combines intraspecific comparative genomics with data mining and functional analyses. In a collaborative effort between our two groups, we have been applying this strategy to the cloning of Avr genes from the late blight pathogen P. infestans and have obtained promising preliminary results. Eventually, this work will lead to the identification of novel genes from P. infestans that condition avirulence to resistant potato and tomato and will help unravel the molecular basis of host specificity in this important pathosystem. Interestingly, this approach also has the potential to identify P. infestans ‘orphan’ Avr genes that do not correspond to the known potato or tomato R genes, but correspond to uncharacterized R genes. A promising ‘orphan’ Avr gene candidate is scr91 (Fig. 2). This gene encodes a secreted small cysteine-rich protein with similarity to a necrosis-inducing protein, is strongly up-regulated during infection of host plants, and is polymorphic in a key amino acid residue. Future work, including functional assays, will help confirm that scr91 encodes an avirulence product and will unravel the nature of the matching R gene.

Acknowledgements

This research is part of a consortium of oomycete–plant interaction laboratories that includes the groups of Jim Beynon, Horticulture Research International, Wellesbourne, Warwick, UK; Vivianne Vleeshouwers, Wageningen University, Wageningen, The Netherlands; and Pieter van West, The University of Aberdeen, Aberdeen, Scotland, UK. We are grateful to these colleagues and their coworkers for useful discussions and access to unpublished information. Research at OSU is supported by NSF Plant Genome grant DBI-0211659, Syngenta Biotechnology, and State and Federal Funds appropriated to the Ohio Agricultural Research and Development Center, the Ohio State University. The SCRI group are grateful to the Scottish Executive Environment and Rural Affairs Department (SEERAD) for continuing financial support. JIBB, MA, and SCW contributed equally to this work and should be considered as cofirst authors.
