Mention of trade names or commercial products in this article is solely for the purpose of providing specific information and does not imply recommendation or endorsement by the U.S. Department of Agriculture.
Trichothecenes are terpene-derived secondary metabolites produced by multiple genera of filamentous fungi, including many plant pathogenic species of Fusarium. These metabolites are of interest because they are toxic to animals and plants and can contribute to pathogenesis of Fusarium on some crop species. Fusarium graminearum and F. sporotrichioides have trichothecene biosynthetic genes (TRI) at three loci: a 12-gene TRI cluster and two smaller TRI loci that consist of one or two genes. Here, comparisons of additional Fusarium species have provided evidence that TRI loci have a complex evolutionary history that has included loss, non-functionalization and rearrangement of genes as well as trans-species polymorphism. The results also indicate that the TRI cluster has expanded in some species by relocation of two genes into it from the smaller loci. Thus, evolutionary forces have driven consolidation of TRI genes into fewer loci in some fusaria but have maintained three distinct TRI loci in others.
Filamentous fungi can produce numerous secondary metabolites that include pigments, compounds that are toxic to plants and/or animals, plant growth regulators and antibiotics or other pharmaceuticals. The majority of fungal secondary metabolites described to date are derived from the activities of one of three classes of enzymes: terpene synthases, polyketide synthases and non-ribosomal peptide synthetases (Keller et al., 2005). Each enzyme class utilizes relatively simple primary metabolites as substrates and rearranges or condenses them into structurally more complex molecules, i.e. terpenes, polyketides or non-ribosomal peptides. These more complex molecules can undergo enzyme-catalysed modifications, such as oxygenation, cyclization, isomerization and/or condensation with other metabolites to form the myriad of fungal secondary metabolites that, collectively, are highly variable in chemical structure and biological activity. Secondary metabolite biosynthetic genes are often clustered in filamentous fungi (Keller et al., 2005). The clusters typically include a gene encoding a synthase as well as genes encoding structure-modifying enzymes. The clusters can also include genes encoding metabolite transport proteins and proteins that regulate transcription of cluster genes.
The fungal genus Fusarium (teleomorph Gibberella) consists of over 70 species, many of which are plant pathogens (Desjardins, 2006; Leslie and Summerell, 2006). Some species can produce secondary metabolites, known as mycotoxins, that are toxic to animals. Trichothecenes are a family of terpene-derived mycotoxins produced by some species of Fusarium and several other fungal genera in the order Hypocreales. Trichothecenes are among the most economically significant mycotoxins worldwide because of their widespread occurrence in important grain crops such as barley, maize and wheat (Council for Agricultural Science and Technology, 2003). Trichothecenes are also an agricultural concern because they are toxic to plants and can contribute to pathogenesis of Fusarium on some crops (Desjardins et al., 1996; Maier et al., 2006).
The trichothecene biosynthetic gene cluster has been well characterized in Fusarium graminearum sensu stricto and F. sporotrichioides, two species that have served as models to elucidate the genetics and chemistry of trichothecene biosynthesis (Brown et al., 2001; 2002; Lee et al., 2002). In both species, the core cluster consists of 12 genes that are responsible for synthesis of the core trichothecene molecule and several modifications to it. The cluster genes are: the terpene synthase gene TRI5, the cytochrome P450 monooxygenase genes TRI4, TRI11 and TRI13, the acyl transferase genes TRI3 and TRI7, the esterase gene TRI8, the regulatory genes TRI6 and TRI10, and the transporter gene TRI12. TRI9 and TRI14 are also located in the core TRI cluster, but they lack similarity to genes with known functions and their specific functions in trichothecene biosynthesis are not known.
Fusarium graminearum and F. sporotrichioides also have two smaller loci that encode trichothecene biosynthetic enzymes. Linkage analysis and genome sequence data indicate that these two loci and the core TRI cluster are all on different chromosomes in F. graminearum (Cuomo et al., 2007; Lee et al., 2008). The first of these smaller loci consists of a single acyl transferase gene, TRI101, that is responsible for esterification of acetate to the hydroxyl function at carbon atom 3 (C-3) of trichothecenes (Fig. 1) (Kimura et al., 1998a; McCormick et al., 1999). The second locus consists of two genes: TRI1, which encodes a cytochrome P450 monooxygenase, and TRI16, which encodes an acyl transferase. In F. sporotrichioides, the TRI1 enzyme catalyses hydroxylation of trichothecenes at C-8 (Fig. 1), and the TRI16 enzyme catalyses esterification of a five-carbon carboxylic acid, isovalerate, to the C-8 oxygen (Brown et al., 2003; Meek et al., 2003; Peplow et al., 2003). In F. graminearum, the TRI1 enzyme catalyses hydroxylation of trichothecenes at both C-7 and C-8 (Fig. 1), but TRI16 is non-functional because of the presence of frame shifts and stop codons in its coding region (Brown et al., 2003; McCormick et al., 2004; 2006). As a result, in F. graminearum-produced trichothecenes the C-8 oxygen is not esterified but instead undergoes oxidation to form a carbonyl function by an as yet unknown mechanism. The functional differences of the TRI1 and TRI16 enzymes in F. graminearum and F. sporotrichioides are responsible for important structural differences of trichothecenes produced by the two species. F. graminearum can produce trichothecenes (e.g. nivalenol and deoxynivalenol) with a C-7 hydroxyl and a C-8 carbonyl, whereas F. sporotrichioides can produce trichothecenes (e.g. T-2 toxin) that have an isovalerate ester at C-8 and lack the C-7 hydroxyl.
Comparisons of nucleotide sequences from multiple species have provided insight into evolution of fungal secondary metabolite gene clusters. Some clusters appear to have moved into fungal genomes by horizontal gene transfer from either prokaryotes or other fungi (Brakhage et al., 2005; Khaldi et al., 2008). In addition, vertical inheritance has resulted in distribution of clusters across multiple genera of fungi, and differential inheritance and/or deletion has resulted in discontinuous distribution of clusters among groups of related fungi (Kroken et al., 2003; Proctor et al., 2004; Patron et al., 2007; Glenn et al., 2008). High levels of sequence identity among some cluster genes indicate that gene duplication has contributed to formation of plant secondary metabolite biosynthetic clusters and clusters that regulate development in animals (Gierl and Frey, 2001; Benderoth et al., 2006; Lemons and McGinnis, 2006). In addition, intergenus comparisons indicate that gene relocation has contributed to formation of an allantoin utilization cluster in yeast (Wong and Wolfe, 2005) and some secondary metabolite gene clusters in plants (Field and Osbourn, 2008).
Gene duplication has been proposed to contribute to formation and growth of secondary metabolite gene clusters in filamentous fungi (Cary and Ehrlich, 2006; Saikia et al., 2008). However, there is limited evidence for the contribution of either gene duplication or relocation in the formation of such clusters (Cary and Ehrlich, 2006; Carbone et al., 2007). In this study, we compared the organization of TRI loci in fungi that represent a wide range of the phylogenetic and chemical diversity of trichothecene-producing species of Fusarium. We found that TRI1 and TRI101 are located in the core TRI cluster in some species but not in others. This difference among species provided an ideal system to address two competing hypotheses: (i) the core TRI cluster has grown by relocation of TRI1 and TRI101 into it from elsewhere in the genome versus (ii) the core cluster has grown by gene duplication, which was followed by relocation of TRI1 and TRI101 to other loci in the genome.
Fusarium equiseti TRI cluster
In F. graminearum and F. sporotrichioides, the core TRI cluster is identical with respect to order and orientation of genes in and immediately flanking it (Brown et al., 2002; 2004). In the current study, phylogenetic analysis with five primary metabolic genes indicated that F. graminearum and F. sporotrichioides are more closely related to one another than to some other trichothecene-producing species, such as F. longipes, F. equiseti and F. semitectum (Fig. 2A). Therefore, we sequenced the core cluster in a strain (NRRL 13405) of the more distantly related, morphological species, F. equiseti. The sequence analysis revealed significant structural differences compared with the cluster in F. graminearum and F. sporotrichioides (Fig. 3). The most striking difference is the presence of orthologues of TRI1 and TRI101 in the F. equiseti cluster. As noted above, TRI1 and TRI101 are at other loci in F. graminearum and F. sporotrichioides. The F. equiseti cluster also differs from the F. graminearum and F. sporotrichioides core cluster by the following: (i) it lacks the transporter gene TRI12, (ii) the region spanning TRI3, TRI7 and TRI8 is in the opposite orientation and located at the opposite end of the cluster and (iii) it includes a putative Cysteine 6 (Cys6) transcription factor gene (gene F in Fig. 3) between TRI5 and TRI6.
We also obtained limited sequence data for regions flanking the TRI cluster in F. equiseti NRRL 13405: 3.8 kb flanking the TRI4 end of the cluster and 4.6 kb flanking the TRI8 end. These data indicate that the genes immediately flanking the F. equiseti cluster are not the same as those flanking the core cluster in F. graminearum and F. sporotrichioides (Fig. 3) (Brown et al., 2004). However, among the genes flanking the TRI4 end of the F. equiseti cluster is an orthologue of a gene (gene D in Fig. 3) designated as FGSG_00072 in the F. graminearum genome sequence in the Fusarium Comparative Database. In F. graminearum, FGSG_00072 is located in the 3′-flanking region of TRI1 (Fig. 3). Thus, both FGSG_00072 and TRI1 appear to have been transferred between a locus corresponding to the TRI1 region of F. graminearum and one corresponding to the core TRI cluster region of F. equiseti. The deduced amino acid sequences of the F. graminearum and F. equiseti FGSG_00072 orthologues are 62% identical across the carboxy-terminal halves of the proteins but exhibit only 35% identity across their amino-terminal halves. In addition, there are multiple insertions and deletions in the amino-terminal halves of the protein orthologues relative to each other. In contrast, the FGSG_00072 orthologues in F. oxysporum (FOXG_05799) and F. verticillioides (FVEG_03669) are 58 and 64% identical, respectively, to the F. graminearum orthologue over the entire lengths of the proteins. This suggests that the amino-terminal half of the F. equiseti orthologue is either a product of rapid divergence from the orthologues in the other three fusaria or a product of recombination with another locus.
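The comparison of identity across the two halves of the FGSG_00072 orthologues can be illustrated with a minimal sketch. It assumes a pairwise alignment is already available as two gapped strings of equal length, counts identity only over columns where neither sequence has a gap, and splits the alignment at its midpoint as an approximation of the amino- and carboxy-terminal halves. The function names and the two short sequences are hypothetical and are not the actual protein sequences; the denominator used for percent identity in the analysis reported here is not specified, so this is only one common convention.

```python
def percent_identity(aln_a, aln_b):
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aln_a, aln_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


def identity_by_halves(aln_a, aln_b):
    """Return (N-terminal half identity, C-terminal half identity)."""
    mid = len(aln_a) // 2  # midpoint of the alignment, not of either protein
    return (
        percent_identity(aln_a[:mid], aln_b[:mid]),
        percent_identity(aln_a[mid:], aln_b[mid:]),
    )


# Hypothetical toy alignment: divergent first half, conserved second half.
toy_a = "MKT-LLVAGHWQERTY"
toy_b = "MRSALL-PGHWQERTY"
n_half, c_half = identity_by_halves(toy_a, toy_b)
print(f"N-terminal half: {n_half:.0f}%  C-terminal half: {c_half:.0f}%")
```

A marked difference between the two values, as in the toy case, is the pattern described above for the F. equiseti orthologue.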
We also amplified and sequenced a fragment of the F. equiseti TRI16 orthologue with primers 1472 and 1477 (Table 1) and used GenomeWalker analysis to obtain the entire TRI16 coding region as well as 3.3 and 1.8 kb of sequence for the 5′- and 3′-flanking regions respectively. Unlike in F. graminearum, the TRI16 coding region in F. equiseti did not include any frame shifts or stop codons relative to the F. sporotrichioides TRI16 coding region. Therefore, the F. equiseti TRI16 may be functional. The genes (A and B in Fig. 3) immediately flanking the F. equiseti TRI16 do not exhibit significant similarity/identity to the genes that flank the TRI1-TRI16 locus or the other TRI loci in F. graminearum or F. sporotrichioides. Together, our sequence analyses of the core TRI cluster and TRI16 regions indicate that TRI1 and TRI16 are a minimum of 12 kb apart in F. equiseti NRRL 13405. However, the exact distance between these two genes or whether they are located on different chromosomes remains to be determined.
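The criteria used to judge whether a TRI16 coding region is potentially functional (absence of frame shifts and premature stop codons) can be sketched as a simple check. This is a simplified illustration, not the procedure used here, which compared each sequence with the F. sporotrichioides TRI16 coding region. The sketch assumes an intron-free coding sequence supplied 5′ to 3′; the function name and the three short test sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def orf_problems(cds):
    """Return a list of reasons why an intron-free coding sequence looks disrupted."""
    cds = cds.upper()
    problems = []
    if len(cds) % 3 != 0:
        problems.append("length is not a multiple of 3 (possible frame shift)")
    # Split into complete codons, ignoring any trailing partial codon.
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    # Every codon except the last one should encode an amino acid.
    for number, codon in enumerate(codons[:-1], start=1):
        if codon in STOP_CODONS:
            problems.append(f"premature stop codon {codon} at codon {number}")
    return problems


# Hypothetical toy sequences.
intact = "ATGGCTGAATAA"
with_stop = "ATGGCTTGAGAATAA"
with_frameshift = "ATGGCTGAATA"

for name, seq in [("intact", intact), ("with_stop", with_stop),
                  ("with_frameshift", with_frameshift)]:
    print(name, orf_problems(seq) or "no problems found")
```

An empty result is only consistent with function; it does not demonstrate that the gene is expressed or that the enzyme is active.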
Table 1. Oligonucleotide primers used for PCR amplification and sequence analysis of target sequences indicated.
The presence of TRI1 and TRI101 in the core TRI cluster in F. equiseti but not in F. graminearum and F. sporotrichioides raises the question, did the two genes originate in the cluster and subsequently move out of it during the evolution of some Fusarium species, or did TRI1 and TRI101 originate outside the cluster and subsequently move into it during the evolution of some species? To address this question with respect to TRI101, we compared the location of TRI101 orthologues in 16 trichothecene-producing species and three non-producing species of Fusarium as well as a more distantly related, trichothecene non-producing fungus, Nectria haematococca (synonym Haematonectria haematococca, anamorph F. solani) (Fig. 2A).
In F. graminearum and F. sporotrichioides, TRI101 is located between orthologues of a phosphate permease gene, PHO5, and a UTP-ammonia ligase gene, URA7 (Kimura et al., 1998b). We used a PCR approach with primers shown in Table 1 to amplify PHO5-TRI101 and TRI101-URA7 intergenic regions in 13 trichothecene-producing fusaria in addition to F. graminearum, F. sporotrichioides and F. equiseti. The ends of amplicons were sequenced to confirm that the expected products were amplified. With this approach, we obtained evidence that PHO5 and TRI101 are in the same orientation as in F. graminearum and are located within 2.5 kb of each other in 10 of the species examined (Fig. 2B). We also obtained evidence that TRI101 and URA7 were in the same orientation and within 2.5 kb of each other in seven of the species examined (Fig. 2B).
The PCR approach did not provide evidence for linkage of TRI101, PHO5 and URA7 in F. scirpi, F. semitectum or F. camptoceras. However, GenomeWalker analysis of these species yielded sequence data demonstrating that, in all three species, TRI101 is located between TRI3 and TRI14, the same position as in F. equiseti (Fig. 2B).
To examine the location of TRI101 in the trichothecene non-producing species F. oxysporum and F. verticillioides, we utilized the genome sequence databases of these fungi in the Fusarium Comparative Database. The databases indicate that these two species have a non-functional remnant or pseudogene of TRI101 (ΨTRI101) located between orthologues of PHO5 and URA7. These findings are consistent with previous reports of a ΨTRI101 between orthologues of PHO5 and URA7 in F. oxysporum and another trichothecene non-producing species, F. fujikuroi (Kimura et al., 2003a; Tokai et al., 2005). We also examined the N. haematococca genome sequence database at the Joint Genome Institute and identified orthologues of TRI101 (protein ID 97980) and PHO5 (protein ID 64827) located 3.5 kb apart on chromosome 13. We found no evidence of a URA7 orthologue in this region of the N. haematococca genome sequence.
Our phylogenetic analysis with five primary metabolic genes resolved the 16 trichothecene-producing species of Fusarium into a well-supported clade that is distinct from the clade formed by the three trichothecene non-producing fusaria (Fig. 2A). The analysis also resolved F. equiseti, F. scirpi, F. semitectum and F. camptoceras into a well-supported clade, which, for the purposes of this study, we will hereafter refer to as the F. equiseti clade. The F. equiseti clade was part of the larger clade of trichothecene producers but formed a sister group to all the other species within it. Comparison of the phylogenetic tree and the locations of TRI101 suggests that TRI101 was located between PHO5 and URA7 prior to divergence of trichothecene-producing and non-producing species of Fusarium, and that TRI101 was located next to PHO5 prior to divergence of fusaria with the Gibberella teleomorph and those with the Nectria teleomorph (Fig. 2A and B). Because the F. equiseti clade lies within the trichothecene-producing clade, it follows that the presence of TRI101 in the core TRI cluster in the F. equiseti clade is a more recently derived condition than its presence in the PHO5-URA7 region.
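The inference that the PHO5-linked position of TRI101 is ancestral can be formalized with Fitch parsimony, which is used here only as an illustration and was not part of the analysis reported above. The sketch assumes a reduced, rooted tree with seven taxa whose branching order follows the relationships described in the text, and it codes TRI101 location as one of two states: "PHO5" (adjacent to PHO5, including ΨTRI101) or "cluster" (within the core TRI cluster). The tree encoding, state labels and function name are hypothetical simplifications.

```python
def fitch(node, leaf_states):
    """Bottom-up Fitch pass: return (candidate ancestral states, minimum changes)."""
    if isinstance(node, str):  # leaf
        return {leaf_states[node]}, 0
    left, right = node
    left_set, left_cost = fitch(left, leaf_states)
    right_set, right_cost = fitch(right, leaf_states)
    shared = left_set & right_set
    if shared:
        return shared, left_cost + right_cost
    # No shared state: one change is needed on a branch below this node.
    return left_set | right_set, left_cost + right_cost + 1


# Reduced tree as nested pairs (assumed topology, simplified from the text).
producers = (("F. equiseti", "F. scirpi"), ("F. graminearum", "F. sporotrichioides"))
non_producers = ("F. oxysporum", "F. verticillioides")
tree = ("N. haematococca", (non_producers, producers))

tri101_location = {
    "N. haematococca": "PHO5",
    "F. oxysporum": "PHO5",
    "F. verticillioides": "PHO5",
    "F. graminearum": "PHO5",
    "F. sporotrichioides": "PHO5",
    "F. equiseti": "cluster",
    "F. scirpi": "cluster",
}

root_states, changes = fitch(tree, tri101_location)
print("root state(s):", root_states, "minimum changes:", changes)
```

Under these assumptions, the root is assigned the PHO5-linked state and a single change suffices, which is consistent with one relocation of TRI101 into the cluster in the lineage leading to the F. equiseti clade.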
The F. oxysporum, F. verticillioides and N. haematococca genome sequence databases were uninformative with respect to TRI1 movement, because a TRI1 orthologue was not detected in them. Therefore, we compared the genetic environment in which TRI1 occurs in the 16 trichothecene-producing species shown in Fig. 2. First, PCR with degenerate primers 1285 and 1292 (Table 1) was employed to amplify a ∼1200 bp fragment of the TRI1 coding region from a representative strain of each species for which we lacked data. Sequence data from the amplification products were then used in GenomeWalker analysis to obtain 3–6 kb of flanking sequence on each side of TRI1.
The resulting sequence data revealed that TRI1 can be located within one of four distinct genetic environments, which we have designated GE1, GE2, GE3 and GE4 (Fig. 2C). GE1 occurs in eight species, including F. graminearum (Brown et al., 2003; McCormick et al., 2004). In these species, an orthologue of FGSG_00070 and TRI16 (or a TRI16 pseudogene, ΨTRI16) are located in the 5′-flanking region of TRI1, and an orthologue of FGSG_00072 and, in some species, an orthologue of FGSG_00073 are located in the 3′-flanking region (Fig. 2C). GE2 occurs in F. sporotrichioides (Brown et al., 2003; Meek et al., 2003; Peplow et al., 2003), F. armeniacum and Fusarium sp. NRRL 36351. In GE2, the gene PDB1 is consistently located in the 5′-flanking region of TRI1, but the 3′-flanking region varies. In F. armeniacum and F. sporotrichioides, an orthologue of TRI16 and an orthologue of a gene previously designated as Orf1/orf1 (Brown et al., 2003; McCormick et al., 2004) are located in the TRI1 3′-flanking region. However, in Fusarium sp. NRRL 36351, a gene encoding a putative WD repeat protein is located in the 3′-flanking region, and there is no evidence for either TRI16 or Orf1/orf1 in the 3 kb region 3′ to the TRI1 stop codon. GE3 occurs only in F. longipes among the species examined here. In GE3, genes encoding a putative transcription factor and a cytochrome P450 monooxygenase are located in the TRI1 5′-flanking region, and a gene encoding a β-mannosidase-like protein is located in the 3′-flanking region. GE4 occurs in the four species that form the F. equiseti clade, i.e. F. equiseti, F. scirpi, F. semitectum and F. camptoceras (Fig. 2C). In GE4, TRI1 is located in the core TRI cluster, with TRI10 and TRI9 in the 5′-flanking region of TRI1, and TRI11 and TRI13 in the 3′-flanking region. The exception to this is F. camptoceras, in which TRI11 and TRI14 are located in the 3′-flanking region.
The analysis of TRI1-flanking regions indicated that an apparently functional TRI16 is located adjacent to TRI1 in five species: F. sambucinum, F. venenatum, F. kyushuense, F. sporotrichioides and F. armeniacum (Fig. 2C). That the gene is functional is evident by the absence of insertions, deletions or stop codons in its coding region. The analysis also indicated that a ΨTRI16 is adjacent to TRI1 in five species: F. boothii, F. graminearum, F. crookwellense, F. culmorum and F. poae (Fig. 2C). PCR analysis with primers 1472 and 1477 (Table 1) followed by GenomeWalker analysis revealed that F. scirpi and F. semitectum have an apparently functional TRI16. However, the presence of TRI1 in the core cluster of F. scirpi and F. semitectum indicates that TRI16 is not adjacent to TRI1 in these two species.
The PCR analysis with multiple pairs of degenerate TRI16 primers (Table 1) failed to yield a TRI16 amplicon for F. camptoceras, F. longipes or Fusarium sp. NRRL 36351. In addition, TRI16 was not detected in F. camptoceras, F. longipes or Fusarium sp. NRRL 36351 by low-stringency Southern blot analysis (data not shown). Thus, TRI16 may be absent from the genomes of these three species or may be highly diverged compared with TRI16 orthologues in the other fusaria included in this study.
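The TRI16 results for the 16 trichothecene-producing species can be summarized in a small lookup table. The sketch below records only the status of each species as stated in the text above; the dictionary name and the status labels are introduced here for illustration.

```python
from collections import Counter

# TRI16 status per species, as described in the text (labels are illustrative).
TRI16_STATUS = {
    "F. sambucinum": "functional, adjacent to TRI1",
    "F. venenatum": "functional, adjacent to TRI1",
    "F. kyushuense": "functional, adjacent to TRI1",
    "F. sporotrichioides": "functional, adjacent to TRI1",
    "F. armeniacum": "functional, adjacent to TRI1",
    "F. boothii": "pseudogene, adjacent to TRI1",
    "F. graminearum": "pseudogene, adjacent to TRI1",
    "F. crookwellense": "pseudogene, adjacent to TRI1",
    "F. culmorum": "pseudogene, adjacent to TRI1",
    "F. poae": "pseudogene, adjacent to TRI1",
    "F. equiseti": "functional, not adjacent to TRI1",
    "F. scirpi": "functional, not adjacent to TRI1",
    "F. semitectum": "functional, not adjacent to TRI1",
    "F. camptoceras": "not detected",
    "F. longipes": "not detected",
    "Fusarium sp. NRRL 36351": "not detected",
}

counts = Counter(TRI16_STATUS.values())
functional = sorted(
    species for species, status in TRI16_STATUS.items()
    if status.startswith("functional")
)

for status, n in counts.items():
    print(f"{n:2d}  {status}")
print("apparently functional TRI16:", len(functional), "species")
```

Here "functional" means apparently functional on the basis of sequence alone, as in the text.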
TRI1 location and species phylogenies
Comparison of the genetic environment in which TRI1 is located and the phylogenetic relationship of the 16 trichothecene-producing fusaria revealed that TRI1 location tends to be the same in more closely related species and different in more distantly related species (Fig. 2A and C). As noted above, TRI1 is located in the core cluster (GE4) in the F. equiseti clade. The three species in which TRI1 is located in GE2 formed another strongly supported clade, but this clade was resolved within a larger clade that included species with TRI1 in the GE1 context. The analysis also indicated that F. longipes, with its unique TRI1 location (GE3), represents a distinct lineage from species with TRI1 located in GE1 and GE2 (Fig. 2A and C). These findings are consistent with movement of TRI1 into or out of the core TRI cluster during the evolution of trichothecene-producing fusaria.
TRI1 and TRI16 trans-species polymorphism
We conducted a maximum parsimony analysis of a ∼1200 bp fragment of the TRI1 coding region in 51 strains that represent 16 trichothecene-producing species of Fusarium. We also analysed the entire ∼1800 bp coding region in one representative strain of each species. These analyses resolved TRI1 orthologues into four major clades, each of which was supported by bootstrap values of 97–100 (Fig. 4). A comparison of the TRI1 phylogeny with the species phylogeny, inferred from combined sequences of five primary metabolic genes, revealed incongruencies between the two phylogenies (Fig. 5). For example, F. kyushuense and F. venenatum were most closely related to F. poae and F. sambucinum in the species phylogeny, but more closely related to F. camptoceras, F. equiseti, F. longipes, F. scirpi and F. semitectum in the TRI1 phylogeny (Fig. 5). The lack of correlation between TRI1 and species phylogenies is consistent with trans-species polymorphism, a phenomenon that has been described in animals, plants and fungi (Muirhead et al., 2002; Ward et al., 2002; Klein et al., 2007; Powell et al., 2007). Trans-species polymorphism can arise when an ancestral species carries multiple alleles of a gene and when the alleles are inherited differentially during subsequent speciation events (Klein et al., 2007). This can result in closely related species inheriting less similar alleles and distantly related species inheriting more similar alleles.
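Incongruence between a gene tree and a species tree can be made explicit by listing the clades that occur in one tree but not in the other. The sketch below is a toy illustration with six taxa and is not a reproduction of the trees in Fig. 5. The two topologies are assumptions chosen only to mirror the example above, in which F. kyushuense and F. venenatum group with F. poae and F. sambucinum in the species phylogeny but with members of the F. equiseti clade in the TRI1 phylogeny. The function name and tree encoding are hypothetical.

```python
def clades(tree):
    """Return (leaf set, set of clades) for a rooted tree given as nested tuples."""
    if isinstance(tree, str):  # a leaf is not counted as a clade
        return frozenset([tree]), set()
    leaves = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found |= child_clades
    found.add(leaves)
    return leaves, found


# Toy topologies (assumed for illustration only).
species_tree = (
    (("F. kyushuense", "F. venenatum"), ("F. poae", "F. sambucinum")),
    ("F. equiseti", "F. scirpi"),
)
tri1_tree = (
    (("F. kyushuense", "F. venenatum"), ("F. equiseti", "F. scirpi")),
    ("F. poae", "F. sambucinum"),
)

_, species_clades = clades(species_tree)
_, tri1_clades = clades(tri1_tree)
conflicting = species_clades ^ tri1_clades  # clades found in only one tree

for clade in conflicting:
    print(sorted(clade))
```

In this toy case the two trees share all small clades and differ only in the grouping that places F. kyushuense and F. venenatum, which is the kind of conflict that trans-species polymorphism can produce.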
Phylogenies inferred from maximum parsimony analysis of DNA sequences of TRI101 and the representative TRI cluster genes TRI4, TRI5 and TRI11 were highly correlated with each other and with the species phylogeny derived from primary metabolic gene sequences. Because of the similarity of the TRI4 and TRI5 trees and because they employed an outgroup from the same organism (i.e. Myrothecium roridum), a phylogenetic tree was generated with the combined sequence data for TRI4 and TRI5 (Fig. 5). Sequence data for TRI11 and TRI101 were not combined with each other or with other sequences, and the phylogenetic relationships for these genes are presented as separate trees (Fig. S1). In the species phylogeny and the phylogenies based on TRI4, TRI5, TRI11 and TRI101, the 16 fusaria analysed were consistently resolved into five major clades, and the species within each of the five clades were the same in all of the phylogenies (Fig. 5, Fig. S1). These five clades are designated Clades 1 through 5 in the combined TRI4/TRI5 tree in Fig. 5 and in the TRI11 tree in Fig. S1. Given their similarities to the species phylogeny, it is not surprising that the phylogenies based on TRI4, TRI5, TRI11 and TRI101 exhibited marked incongruencies with the TRI1-based phylogeny (Fig. 5, Fig. S1).
We also conducted a maximum parsimony analysis with the DNA sequence of TRI16 and compared the resulting phylogeny with the TRI1 phylogeny (Fig. 6). The analysis of TRI16 was more limited than those described above, because TRI16 is either a pseudogene or not detectable in half of the trichothecene-producing species included in this study. Therefore, the TRI16 phylogeny was inferred from sequence data of the eight species that had an apparently functional TRI16. In contrast to the comparisons noted above, the TRI16 phylogeny was correlated with that of TRI1. For example, F. sambucinum was resolved into a clade distinct from the clade that included F. kyushuense and F. venenatum in both the TRI1 and TRI16 phylogenies. In the species phylogeny and in the TRI4, TRI5, TRI11 and TRI101 phylogenies, however, F. sambucinum, F. kyushuense and F. venenatum always resolved into the same clade (e.g. Clade 1, Fig. 5).
Previous molecular genetic analyses indicated that TRI1, TRI101 and the core TRI cluster are located at three distinct loci in F. graminearum and F. sporotrichioides (Brown et al., 2002; 2003; Kimura et al., 2003b; McCormick et al., 2004). Moreover, linkage analysis (Lee et al., 2008) and genome sequence data (Cuomo et al., 2007) indicate that the three loci are on different chromosomes in F. graminearum. Therefore, it is intriguing that TRI1 and TRI101 are located in the core TRI cluster in the F. equiseti clade. These findings raise the question, did TRI1 and TRI101 originate in the core cluster and subsequently move out of it, or did the two genes originate outside the cluster and move into it? The location of TRI101/ΨTRI101 in the PHO5-URA7 region in both the trichothecene-producing and non-producing clades of Fusarium indicates that TRI101 was located in this region prior to divergence of the two clades. Because the F. equiseti clade is within the trichothecene-producing clade, it follows that TRI101 most likely moved directly or indirectly from the PHO5-URA7 region to the core TRI cluster during evolution of the F. equiseti clade. Critical evidence for the directionality of TRI1 movement is the similar pattern of trans-species polymorphism observed for orthologues of TRI1 and TRI16 but not for orthologues of the TRI cluster genes TRI4, TRI5 and TRI11. This difference among the TRI genes would have arisen more readily if TRI1 and TRI16 had been closely linked to one another but not to the core TRI cluster in the ancestral Fusarium. If this were the case, it follows that TRI1 moved into the cluster during the evolution of the F. equiseti clade. On the other hand, if TRI1 had originated in the cluster, orthologues of other TRI cluster genes would be expected to exhibit a pattern of trans-species polymorphism similar to that exhibited by TRI1. However, they do not. Thus, the most parsimonious explanation for the observed data is that TRI1 originated outside the cluster and subsequently relocated into it during the evolution of the F. equiseti clade. This explanation is consistent with relocation of TRI101 into the cluster.
The mechanism by which TRI1 and TRI101 relocated to the core TRI cluster is not known. However, analysis of TRI locus-flanking genes may provide insight into the mechanism. For example, the presence of TRI1 in the cluster and the FGSG_00072 orthologue flanking the cluster in F. equiseti NRRL 13405 suggests that recombination occurred between an ancestral core cluster and an ancestral locus that included TRI1 in a GE1-like context. The lack of identity of the 5′ half of the F. equiseti FGSG_00072 orthologue relative to orthologues in other fusaria suggests a possible link between relocation of the gene and the marked changes in its sequence in F. equiseti.
The presence of TRI1 and TRI101 in the core TRI cluster in the F. equiseti clade indicates that relocation of the genes into the cluster occurred relatively early in the evolution of this clade. It remains to be determined whether the relocation of the two genes resulted from independent or related events. Likewise, it remains to be determined whether TRI1 and/or TRI101 relocation are related to the absence of TRI12, the rearrangement of the TRI3-TRI7-TRI8 region, and the presence of the Cys6 transcription factor gene in the core cluster in F. equiseti NRRL 13405.
It is not clear what evolutionary forces drove relocation of TRI1 and TRI101 in the F. equiseti clade but maintained these genes at separate loci distinct from the cluster in other species. One hypothesis to explain the existence of secondary metabolite biosynthetic gene clusters is that clustering facilitates co-ordinated regulation of genes responsible for the same biosynthetic pathway (Keller et al., 2005). Although we have no direct evidence for its role in trichothecene biosynthesis, the presence of a Cys6 transcription factor gene between TRI5 and TRI6 in the F. equiseti cluster suggests that the gene may regulate expression of the other TRI cluster genes. This and the absence of a closely related orthologue of the Cys6 transcription factor gene in the F. graminearum genome suggest that there may be fundamental differences in regulation of TRI gene expression in F. equiseti compared with F. graminearum. Functional characterization of the Cys6 gene should provide insight into whether such a difference in TRI gene regulation exists in the two species.
Another question raised by the results of the current study is: which of the TRI1 loci has the more ancestral organization? As discussed above, the presence of TRI1 in the core cluster is more likely a derived rather than ancestral condition. Given the TRI1 and TRI16 trans-species polymorphism, it is likely that within trichothecene-producing fusaria the ancestral TRI1 locus included both TRI1 and TRI16. Moreover, TRI16 was probably functional, because it is more likely for a functional gene to degenerate into a non-functional gene than for a non-functional gene with multiple deletions and insertions to become functional. Among the genetic environments in which TRI1 occurs, both GE1 and GE2 include a functional TRI16. The four major types of TRI1 orthologues resolved by phylogenetic analysis of the gene (Fig. 4) likely correspond to different alleles in the ancestral, trichothecene-producing Fusarium. Thus, there were most likely at least four alternate alleles for the TRI1 coding sequence in the ancestral TRI1-TRI16 locus. Among the species examined, these four types of TRI1 orthologues (ancestral alleles) are represented in GE1 by F. poae, F. graminearum, F. sambucinum and F. kyushuense. Likewise, two of the TRI1 orthologue types occur among species with GE2. Together, these findings suggest that the GE1 and GE2 contexts of TRI1 represent more ancestral organizations than the GE3 and GE4 contexts.
Fusarium is a member of the ascomycetous order Hypocreales, which includes several other genera of trichothecene-producing fungi. Three of these, Myrothecium, Stachybotrys and Trichoderma, produce trichothecenes that lack an oxygen atom at the C-8 position (Jarvis, 1991; Nielsen et al., 2005), whereas species of Fusarium, Spicellum and Trichothecium can produce trichothecenes with an oxygen atom at C-8 (Machida and Nozoe, 1972; Kralj et al., 2007). These differences in trichothecene production among genera suggest that the ability to oxygenate trichothecenes at C-8 arose after divergence of Fusarium from a common ancestor that it shared with Myrothecium, Stachybotrys and Trichoderma. This hypothesis is consistent with the origin of TRI1 outside the TRI cluster, because it is responsible for C-8 oxygenation in Fusarium. It is not known whether C-8 oxygenation of trichothecenes produced by Spicellum and Trichothecium is catalysed by an orthologue of the TRI1 enzyme or an unrelated hydroxylase.
Trichothecene C-3 oxygenation and the TRI101 enzyme-catalysed acetylation of the C-3 oxygen may also be relatively recent innovations in trichothecene biosynthesis, because trichothecenes produced by Myrothecium, Spicellum, Stachybotrys, Trichoderma and Trichothecium lack a C-3 oxygen. Analysis of the phylogenetic relationships of TRI genes in these species should provide further insight into the evolution of the trichothecene C-3, C-7 and C-8 oxygenations and other biochemical reactions in the trichothecene biosynthetic pathway.
Although trans-species polymorphism similar to that observed among orthologues of TRI1 and TRI16 was not observed among core TRI cluster genes, other patterns of trans-species polymorphism have been observed previously for TRI cluster genes in the F. graminearum species complex and F. culmorum (Ward et al., 2002; Chandler et al., 2003). The F. graminearum species complex consists of at least 11 closely related lineages, which have recently been elevated to species rank (O'Donnell et al., 2000; 2006). The complex includes the phylogenetic species F. graminearum sensu stricto (F. graminearum lineage 7) and F. boothii (F. graminearum lineage 3). Although the function(s) of trans-species polymorphism among orthologues of TRI cluster genes has not been demonstrated, the polymorphisms are associated with different trichothecene production profiles. For example, deletions within TRI13 are trans-species in that the same deletions can occur in F. graminearum and F. culmorum (Chandler et al., 2003). Strains of these fungi with a functional TRI13 (i.e. without the deletions) produce NIV, whereas strains with a non-functional TRI13 (i.e. with deletions) produce DON rather than NIV (Lee et al., 2002; Chandler et al., 2003). Similarly, among the trans-species orthologues of TRI1, the F. graminearum TRI1 orthologue leads to DON or NIV production, whereas the F. sporotrichioides TRI1 leads to T-2 toxin production (Meek et al., 2003; McCormick et al., 2004; 2006). Thus, it is possible that trans-species polymorphism of TRI genes contributes to production of structurally diverse trichothecenes within and among species.
Relocation of TRI1 and TRI101 into the core TRI cluster of the F. equiseti clade provides evidence for growth of a fungal secondary metabolite gene cluster by gene relocation rather than by gene duplication. Phylogenetic analysis of 376 cytochrome P450 monooxygenase genes from four species of filamentous fungi provides independent support for this conclusion (Deng et al., 2007). The analysis revealed that the four F. graminearum monooxygenase genes (TRI1, TRI4, TRI11 and TRI13) involved in trichothecene biosynthesis are more closely related to other monooxygenase genes than they are to each other. Thus, even though the four trichothecene biosynthetic monooxygenases have chemically similar substrates and share a similar enzymatic mechanism, it is unlikely that the genes encoding them evolved directly from the same ancestral TRI gene. This contrasts with proposals that fungal cluster genes with similar biochemical functions can evolve by duplication of a preexisting cluster gene (Cary and Ehrlich, 2006; Carbone et al., 2007; Saikia et al., 2008). The TRI gene data also contrast with data showing that gene duplication has contributed to growth of some secondary metabolite gene clusters in plants (Gierl and Frey, 2001; Benderoth et al., 2006).
The results of the current and previous studies (e.g. Brown et al., 2001; Lee et al., 2002; Ward et al., 2002; Chandler et al., 2003) provide evidence for a complex evolutionary history of TRI loci that has included loss, non-functionalization and rearrangement of genes as well as trans-species polymorphism. Together, the studies demonstrate that multispecies comparisons of TRI genes can provide important insights into the evolution of secondary metabolism in filamentous fungi.
Strains and media
Strains of Fusarium used in this study are shown in Table 2. Specific epithets used for these species conform to names used by Leslie and Summerell (2006) except for F. boothii (O'Donnell et al., 2006). With the exception of F. avenaceum, all species listed in Table 2 have been previously reported to produce trichothecenes and were confirmed to produce trichothecenes during the course of this study (S.P. McCormick and R.H. Proctor, unpublished). Strains were stored as 15% glycerol stocks at −80°C and grown on V-8 juice agar medium (Tuite, 1969) and in liquid GYEP medium (2% glucose, 0.1% yeast extract, 0.1% peptone) (Seo et al., 2001).
Strains with FRC designation are from the Fusarium Research Center culture collection at Pennsylvania State University, State College, Pennsylvania. Strains with ITEM designation are from the culture collection at the Institute of Sciences of Food Production, National Research Council, Bari, Italy. Strains with NRRL designation are from the Northern Regional Research Center culture collection at USDA ARS NCAUR, Peoria, Illinois. Strain PH-1 was provided by F. Trail, Michigan State University, and strain Z-3639 was provided by Robert L. Bowden, USDA/ARS, Kansas State University.
Strains used for sequence analysis of TRI1 region and as representative strains of each species in the phylogenetic analyses shown in Figs 2, 4 and 5 and Fig. S1.
To prepare fungal genomic DNA, strains were grown in liquid GYEP medium for 2–4 days depending on the growth rate. Fungal growth was harvested by vacuum filtration, lyophilized and ground to a powder. The ground material was suspended in extraction buffer (200 mM Tris-Cl, pH 8, 250 mM NaCl, 25 mM EDTA, pH 8 and 0.5% SDS) at 50 mg per 250 μl buffer. Subsequently, genomic DNA used for GenomeWalker libraries was purified with the DNeasy Plant Mini Kit as described by the manufacturer (Qiagen). Genomic DNA used for PCR only was sometimes purified by extraction with an equal volume of a 1:1 (v/v) mixture of Tris-equilibrated phenol and chloroform:isoamyl alcohol (24:1). The resulting aqueous phase was mixed with 2 vols of NaI solution and 5 μl of UltraBind solution, and then further purified with the UltraClean DNA Purification kit (Mo Bio Laboratories) as specified by the manufacturer.
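The suspension ratio given above (50 mg of ground material per 250 μl of extraction buffer) scales linearly with the amount of tissue. The following minimal sketch shows that arithmetic only; the function name is hypothetical and is not part of the published protocol.

```python
# Ratio stated in the text: 250 µl extraction buffer per 50 mg ground mycelium.
BUFFER_UL_PER_MG = 250 / 50


def extraction_buffer_volume_ul(tissue_mg: float) -> float:
    """Return the buffer volume (µl) for a given mass of ground tissue (mg).

    Hypothetical helper for illustration; it only rescales the stated ratio.
    """
    if tissue_mg < 0:
        raise ValueError("tissue mass must be non-negative")
    return tissue_mg * BUFFER_UL_PER_MG


if __name__ == "__main__":
    for mass in (50, 120):
        print(f"{mass} mg -> {extraction_buffer_volume_ul(mass):.0f} µl buffer")
```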
Fragments of TRI1, TRI4, TRI5, TRI11 and TRI101 were amplified from multiple Fusarium species and sequenced with primers shown in Table 1. PCR and sequencing primers were synthesized by Integrated DNA Technologies or by Sigma Life Science-Genosys. Standard PCR methods employed Taq DNA polymerase (Qiagen) or Platinum PCR SuperMix polymerase (Invitrogen Life Technologies) using conditions recommended by the manufacturers. PCR products were purified by agarose gel electrophoresis followed by band purification with the UltraClean DNA Purification kit (Mo Bio Laboratories).
The GenomeWalker protocol (Clontech) was used to amplify regions of DNA flanking TRI gene fragments that had been amplified with primers (Table 1) that were designed based on F. graminearum and F. sporotrichioides sequences. GenomeWalker libraries were prepared as specified by the manufacturer. Briefly, genomic DNA was digested separately with each of the restriction endonucleases DraI, EcoRV, PvuII and StuI. Each DNA digestion was then ligated separately to a DNA-adapter fragment supplied in the GenomeWalker kit. Dilutions of the resulting ligation products were employed as templates in nested PCR with Advantage II DNA polymerase (Clontech) and with the cycling conditions specified by the manufacturer. In both the primary and secondary reactions of the nested PCR, primer pairs consisted of one primer complementary to Fusarium DNA and the other complementary to the adapter DNA. The nucleotide sequence of the secondary PCR products was used to design additional primers specific to Fusarium DNA in order to extend the sequence data by the GenomeWalker procedure. Fusarium-specific primers were also designed to sequence the entire length of GenomeWalker PCR products that were longer than ∼2 kb and to obtain second-strand sequence data.
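The digestion step can be illustrated in silico. The sketch below is an illustration only and is not part of the GenomeWalker kit or of the analyses reported here. It uses the standard recognition sequences of the four enzymes named above, all of which cut at the centre of a palindromic six-base site and leave blunt ends, so searching a single strand is sufficient. The function names and the demonstration sequence are hypothetical.

```python
# Recognition site and cut offset (bases from the start of the site) for the
# four blunt-cutting enzymes used to build the libraries.
ENZYMES = {
    "DraI": ("TTTAAA", 3),   # TTT^AAA
    "EcoRV": ("GATATC", 3),  # GAT^ATC
    "PvuII": ("CAGCTG", 3),  # CAG^CTG
    "StuI": ("AGGCCT", 3),   # AGG^CCT
}


def cut_positions(seq: str, site: str, offset: int) -> list[int]:
    """Return 0-based cut coordinates for every occurrence of `site` in `seq`."""
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + offset)
        start = seq.find(site, start + 1)  # step by 1 to catch overlapping sites
    return positions


def digest(seq: str, enzyme: str) -> list[str]:
    """Split a linear sequence into the fragments produced by one enzyme."""
    site, offset = ENZYMES[enzyme]
    bounds = [0] + cut_positions(seq, site, offset) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    demo = "ACGTTTAAAGGGATATCCC"  # invented sequence, not Fusarium data
    for name in ENZYMES:
        print(name, digest(demo, name))
```

Each enzyme is applied separately, mirroring the separate digestions described above; an enzyme with no site in the template simply returns the intact sequence.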
For Southern blot analysis of TRI16, hybridization probes were prepared with the Ready-to-Go DNA labelling kit as described by the manufacturer (Amersham Biosciences). Probe templates consisted of fragments of the TRI16 coding region that were amplified by PCR with primers 1472 and 1477 from genomic DNA of F. equiseti NRRL 13405 and F. sporotrichioides NRRL 3299. Hybridization conditions were as previously described (Proctor et al., 2004).
Nucleotide sequence analysis employed BigDye Terminator version 3.1 (Applied Biosystems) reagents and UltraClean-purified PCR products (Mo Bio Laboratories) as DNA templates. Sequence reactions were purified with the BigDye Xterminator Purification protocol and analysed with a 3730 DNA analyser at the USDA-ARS-NCAUR DNA Sequence Facility. Sequence data were viewed and edited with Sequencher version 4.5 (Gene Codes).
DNA sequence of the core TRI cluster in F. equiseti was determined by the GenomeWalker analysis as described above. Gene-specific primers for this protocol were designed based on the DNA sequences of fragments of TRI1, TRI4, TRI5 and TRI11 amplified with the primers shown in Table 1. The resulting sequence data were then used to design F. equiseti-specific primers for the GenomeWalker protocol to expand and connect sequences for the TRI genes. With this strategy, we obtained 37 kb of contiguous, double-stranded DNA sequence for the TRI cluster region in F. equiseti. For other Fusarium species, primers 1285 and 1292 were used to amplify and sequence a TRI1 fragment from a representative strain of each species. The resulting sequence data were used to design Fusarium-specific primers for the GenomeWalker protocol in order to amplify and sequence the regions flanking TRI1. For most species, multiple steps with the GenomeWalker protocol were necessary to obtain sequence data for 3–6 kb of DNA on both sides of TRI1. In a few instances, the GenomeWalker protocol did not yield a secondary PCR product. In these cases, sequence data were obtained by designing primers based on sequence data from one or more closely related species for which the GenomeWalker protocol did yield secondary PCR products. The primers were then used in standard PCR protocols and for sequence analysis to obtain sequence data of the desired region.
All phylogenetic relationships were determined with DNA sequences. DNA sequence alignments were done with the ClustalW+ programme in GCG version 11.1.3 for Unix (Accelrys). When necessary, the resulting alignments were adjusted manually. Phylogenetic relationships were inferred by maximum parsimony analysis with PAUP* version 4.0b10 for Unix. This programme was also used to determine statistical support for branches within phylogenetic trees by bootstrap analysis. Alignment of sequences revealed a 62-nucleotide intron and a 39-nucleotide partial intron sequence that were present in the M. roridum TRI4 sequence but not in the Fusarium sequences. These M. roridum intron sequences were deleted from the alignment prior to phylogenetic analysis.
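Two of the steps described above, removal of the intron columns from the alignment and bootstrap resampling, can be sketched in a few lines. This is a minimal illustration and not the procedure implemented in the phylogenetics software used in the study: the function names, the column coordinates and the toy alignment are all hypothetical, and the bootstrap function shows only the resampling of alignment columns with replacement, not tree inference.

```python
import random


def drop_columns(alignment: dict[str, str],
                 ranges: list[tuple[int, int]]) -> dict[str, str]:
    """Delete alignment columns given as 0-based, half-open (start, end) ranges."""
    drop = set()
    for start, end in ranges:
        drop.update(range(start, end))
    return {
        name: "".join(base for i, base in enumerate(seq) if i not in drop)
        for name, seq in alignment.items()
    }


def bootstrap_replicate(alignment: dict[str, str],
                        rng: random.Random) -> dict[str, str]:
    """Build one bootstrap pseudo-alignment by sampling columns with replacement."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {
        name: "".join(seq[i] for i in columns)
        for name, seq in alignment.items()
    }


if __name__ == "__main__":
    # Toy alignment: the outgroup carries a 3-column insertion (an "intron").
    toy = {"Fusarium_sp": "ATG---CCA", "M_roridum": "ATGGTACCA"}
    trimmed = drop_columns(toy, [(3, 6)])
    print(trimmed)
    print(bootstrap_replicate(trimmed, random.Random(0)))
```

Each replicate has the same length as the trimmed alignment; branch support is then the proportion of replicates whose inferred tree contains a given branch.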
To determine phylogenetic relationships between species, we employed DNA sequences of five primary metabolic genes that have been used previously for phylogenetic analyses of Fusarium. The genes were: (i) CPR1, an NADPH-dependent cytochrome P450 reductase gene; (ii) HIS3, the histone H3 gene; (iii) RPB2, the gene encoding the second largest subunit of RNA polymerase II; (iv) TEF1 (also known as ef-1α), the translation elongation factor 1α gene; and (v) TUB2, the β-tubulin gene (Steenkamp et al., 2002; Malonek et al., 2005; O'Donnell et al., 2006; 2007). Primers used to amplify and sequence the gene fragments are shown in Table 1. To determine phylogenetic relationships of TRI orthologues from different Fusarium species, we employed sequence data obtained from PCR-amplified fragments of TRI1, TRI4, TRI5, TRI11 and TRI101.
Some analyses employed previously generated sequence data. Data for the TRI1 and TRI101 regions in F. graminearum and F. sporotrichioides and for the TRI101 region in F. fujikuroi were obtained from the National Center for Biotechnology Information GenBank database (http://www.ncbi.nlm.nih.gov). Sequence data for the Neurospora crassa monooxygenase gene NCU05376 and the M. roridum TRI genes were also obtained from the GenBank database. Sequence data for the ΨTRI101 region, CPR1, HIS3, RPB2, TEF1 and TUB2 in F. oxysporum and F. verticillioides were obtained from the Fusarium Comparative Database website (http://www.broadinstitute.org/). Gene designations that include FGSG_, FOXG_ and FVEG_ are for genes identified in the F. graminearum, F. oxysporum and F. verticillioides genome databases, respectively, with the Comparative Database. DNA sequences from N. haematococca (F. solani) were obtained from the N. haematococca v2.0 genomic sequence database website (http://genome.jgi-psf.org).
Other than the trichothecene-producing species that were the subject of this study, we have not yet identified an organism with closely related homologues of all the TRI genes used in our phylogenetic analyses. Therefore, sequences used as outgroups in the different phylogenetic trees were not always from the same organism. For example, the outgroup for the combined TRI4 and TRI5 tree was the combined sequence of the M. roridum orthologues of these genes. However, because we were unable to detect an M. roridum TRI1 orthologue, the outgroup for the TRI1 tree was the N. crassa gene NCU05376, which is the gene in the GenBank database with the highest identity to TRI1.
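A combined data set such as the TRI4 and TRI5 one mentioned above is typically built by joining the per-gene alignments end to end for each taxon. The sketch below assumes this end-to-end concatenation and retains only taxa present in every gene, which is also why an organism lacking one of the genes cannot serve as the outgroup for a combined tree. The function name and the placeholder sequences are hypothetical and do not represent data from this study.

```python
def concatenate_alignments(gene_alignments: dict[str, dict[str, str]]) -> dict[str, str]:
    """Join per-gene alignments end to end for taxa present in every gene.

    `gene_alignments` maps gene name -> {taxon: aligned sequence}. Genes are
    concatenated in the order in which they appear in the dictionary.
    """
    for gene, aln in gene_alignments.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) > 1:
            raise ValueError(f"sequences for {gene} are not aligned to one length")
    shared_taxa = set.intersection(*(set(aln) for aln in gene_alignments.values()))
    return {
        taxon: "".join(gene_alignments[gene][taxon] for gene in gene_alignments)
        for taxon in sorted(shared_taxa)
    }


if __name__ == "__main__":
    # Placeholder sequences for illustration only.
    genes = {
        "TRI4": {"F. equiseti": "ATG", "F. poae": "ATC", "M. roridum": "ATA"},
        "TRI5": {"F. equiseti": "CCA", "M. roridum": "CCG"},
    }
    print(concatenate_alignments(genes))
```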
We are grateful to Marcie L. Moore, April Stanley and Kimberly M. MacDonald for technical assistance, to Jennifer N. Steel and Nathane Orwig at the USDA-ARS-NCAUR DNA Sequence Facility for clean-up and electrophoretic analysis of sequence reactions, and to Stephen W. Peterson for assistance with phylogenetic analyses.