Fragmentation of an aflatoxin-like gene cluster in a forest pathogen

Authors


Author for correspondence:

Rosie E. Bradshaw

Tel: +64 6350 5515

Email: r.e.bradshaw@massey.ac.nz

Summary

  • Plant pathogens use a complex arsenal of weapons, such as toxic secondary metabolites, to invade and destroy their hosts. Knowledge of how secondary metabolite pathways evolved is central to understanding the evolution of host specificity. The secondary metabolite dothistromin is structurally similar to aflatoxins and is produced by the fungal pine pathogen Dothistroma septosporum. Our study focused on dothistromin genes, which are widely dispersed across one chromosome, to determine whether this unusual distributed arrangement evolved from an ancestral cluster.
  • We combined comparative genomics and population genetics approaches to elucidate the origins of the dispersed arrangement of dothistromin genes over a broad evolutionary time-scale at the phylum, class and species levels.
  • Orthologs of dothistromin genes were found in two major classes of fungi. Their organization is consistent with clustering of core pathway genes in a common ancestor, but with intermediate cluster fragmentation states in the Dothideomycetes fungi. Recombination hotspots in a D. septosporum population matched sites of gene acquisition and cluster fragmentation at higher evolutionary levels.
  • The results suggest that fragmentation of a larger ancestral cluster gave rise to the arrangement seen in D. septosporum. We propose that cluster fragmentation may facilitate metabolic retooling and subsequent host adaptation of plant pathogens.

Introduction

Production of toxic secondary metabolites (SMs) is an important adaptation found in many fungi that are pathogenic to plants (Desjardins & Hohn, 1997; Turgeon & Bushley, 2010). Intriguingly, even closely related pathogens often produce quite different sets of SMs, corresponding with specific host adaptations or pathogen lifestyles, suggesting rapid evolution of fungal SM repertoires (Ito et al., 2004; Turgeon & Bushley, 2010; de Wit et al., 2012). An understanding of how these SM pathways have evolved is crucial for making predictions about how plant pathogens might adapt to new hosts and environments.

The biosynthetic pathways involved in the production of SMs can be complex and the currently accepted paradigm is that most SM biosynthetic genes are clustered in fungal genomes (Keller et al., 2005). Clustered genes are involved in the production of phytotoxins such as cercosporin in Cercospora nicotianae (Chen et al., 2007), sirodesmin in Leptosphaeria maculans (Fox & Howlett, 2008) and trichothecenes in Fusarium spp. (Alexander et al., 2009), as well as toxins with unknown effects on plant hosts, such as aflatoxin in the opportunistic pathogen Aspergillus flavus (Yu et al., 1995, 2004; Brown et al., 1996).

Despite the prevalence of clustered SM genes in fungi, a more dispersed arrangement of genes is sometimes seen. For example, in the fungal grass endophyte Neotyphodium lolii 10 indole–diterpene SM genes occur in three groups over a 100 kb region, separated by extensive repetitive sequences (Young et al., 2006). In Aspergillus nidulans, prenyl xanthone biosynthesis involves 10 clustered genes, as well as three single genes on different chromosomes required for the final biosynthetic steps (Sanchez et al., 2011).

A striking example of a dispersed set of secondary metabolite genes is seen in the pine needle pathogen Dothistroma septosporum (Zhang et al., 2007b). This organism causes Dothistroma needle blight disease, the incidence of which has increased substantially in the last two decades, with proven associations with climate change (Brown & Webber, 2008; Woods, 2011). In D. septosporum, 19 genes involved in biosynthesis of dothistromin, a toxin chemically similar to the aflatoxin (AF) precursor versicolorin B, are dispersed across a single chromosome, clustered in six separate regions (hereafter called ‘loci’ numbered 1–6; de Wit et al., 2012). The genes are coregulated (Chettri et al., 2013) but show an expression pattern atypical of secondary metabolites, being expressed mainly during primary exponential growth (Schwelm et al., 2008). Dothistromin accumulates in infected pine needles, is phytotoxic to plant cells (Shain & Franich, 1981; Franich et al., 1986) and is known to play a role in virulence of D. septosporum (M. S. Kabir & R. E. Bradshaw, unpublished data).

Putative dothistromin genes, in a similar fragmented arrangement to that seen in D. septosporum, were also found in the peanut pathogen, Passalora arachidicola (Zhang et al., 2010), and the tomato pathogen, Cladosporium fulvum (de Wit et al., 2012), both close relatives of D. septosporum. The occurrence of AF/sterigmatocystin (ST)/dothistromin (ASD) orthologs in a fragmented arrangement in plant pathogens, but in a clustered arrangement in opportunistic pathogens and saprotrophs such as Aspergillus flavus and A. nidulans, raises questions about which gene arrangement is ancestral and how phytotoxicity evolved.

In recent years, many studies have been devoted to elucidating mechanisms underlying the evolution of fungal gene clusters (Ehrlich & Yu, 2010; Slot & Rokas, 2010, 2011; Khaldi & Wolfe, 2011; Collemare & Lebrun, 2012). Evolution of the well-studied aflatoxin cluster appears to have involved gene duplication, subfunctionalization and relocation. A basal set of early-pathway genes is required for production of anthraquinones, to which later-pathway genes were recruited to add the different chemical decorations (Cary & Ehrlich, 2006; Carbone et al., 2007b; Moore et al., 2009). Indeed, gene relocation into clusters is a common theme in gene cluster evolution (Wong & Wolfe, 2005; Proctor et al., 2009; Slot & Rokas, 2010). Clustering of SM genes has been proposed to confer certain advantages that could account for the stable maintenance of clusters once they have formed, including efficient coordinated regulation of gene expression (Walton, 2000), reduced probability of partial pathway gene loss (Collemare & Lebrun, 2012) and horizontal gene transfer (HGT) of biosynthetic pathway genes between species (Walton, 2000). In contrast to studies of cluster formation, only a few studies document gene cluster fragmentation. For example, there is evidence that aflatrem genes in A. flavus and Aspergillus oryzae (Nicholson et al., 2009), and meroterpenoid genes in A. nidulans (Lo et al., 2012), each involved one cluster that was subsequently split between two chromosomes.

The dispersed arrangement of dothistromin genes in D. septosporum provides a unique opportunity to investigate evolutionary events underlying the array of both clustered and fragmented ASD pathways involved in the production of SMs with different ecological roles. We investigated the origin and evolution of the dothistromin genes using comparative genomics and population genetics, which together represent a broad evolutionary perspective across phylum, class and species timescales. The results from these analyses support a model in which fragmentation and shuffling of a single common ancestral cluster enabled production of a potent phytotoxin and virulence factor, dothistromin.

Materials and Methods

Detection of homologs and related clusters

Homologs of dothistromin genes were detected in a local database of 129 fungal proteomes (see the Supporting Information, Table S1) using blastp. Hits that were 50–150% the length and ≥ 40% similar to the query in at least one high-scoring pair were retained for further analysis. Each homologous group was then reduced to a more similar set using orthomcl (version 1.4; Li et al., 2003) with an inflation rate of 2.0, treating all sequences as within-species. Genomic clustering of homologs of dothistromin genes identified from these 129 fungal genomes was inferred as described previously (Slot & Rokas, 2010) using custom perl scripts (available from the authors on request). Genes were considered clustered if separated by no more than six intervening annotated genes. Comparisons between C. fulvum (Cooke) and D. septosporum (G. Dorog) M. Morelet were performed using the Joint Genome Institute (JGI) synteny tool (http://www.jgi.doe.gov/), with manual curation and synteny map assembly. Dothistromin genes in D. septosporum and other Dothideomycetes are named based on their orthology with AF genes, using the descriptive AF names, but with first letter capitalization to follow the Dothideomycete gene-naming convention (Chettri et al., 2013).
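The two numerical criteria above (hit retention and the six-gene clustering rule) can be illustrated with a minimal Python sketch. This is not the authors' perl code, which is available from them on request. The `Hit` record, the function names and the representation of genes as ordinal positions along a scaffold are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """Hypothetical summary of one blastp hit (names are illustrative)."""
    query_len: int              # length of the dothistromin query protein (aa)
    subject_len: int            # length of the database protein (aa)
    best_hsp_similarity: float  # percent similarity in the best high-scoring pair


def keep_hit(hit, min_ratio=0.5, max_ratio=1.5, min_similarity=40.0):
    """Retain hits that are 50-150% of the query length and >= 40% similar."""
    ratio = hit.subject_len / hit.query_len
    return min_ratio <= ratio <= max_ratio and hit.best_hsp_similarity >= min_similarity


def group_into_clusters(homolog_positions, max_intervening=6):
    """Group homologs on one scaffold into clusters.

    homolog_positions: ordinal indices of homologs in the scaffold's ordered
    list of annotated genes. Two consecutive homologs join the same cluster
    if no more than `max_intervening` annotated genes lie between them.
    """
    clusters = []
    for pos in sorted(set(homolog_positions)):
        if clusters and pos - clusters[-1][-1] - 1 <= max_intervening:
            clusters[-1].append(pos)
        else:
            clusters.append([pos])
    return clusters
```

For example, homologs at gene positions 3, 5, 12, 20 and 40 would form one cluster of three genes (six genes lie between positions 5 and 12) and two singletons (seven genes lie between positions 12 and 20).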

Phylogenetic analyses of homologs

Amino acid sequences in each homologous group were aligned using mafft (v. 6.847; Katoh et al., 2002; Katoh & Toh, 2008) with default parameters. Sequences that shared < 30% position homology with the majority were removed from the alignment using trimAl (v. 1.2; Capella-Gutierrez et al., 2009). Retained sequences were then realigned and characters represented in < 30% of sequences were removed with trimAl. Maximum likelihood analysis was performed in raxml (v. 7.2.0; Stamatakis, 2006) under the protgammajtt model from 100 random parsimony starting trees, and support for the optimal topology was assessed with 100 bootstrap replicates, using the −f a option.
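The column-filtering step (removing characters represented in fewer than 30% of sequences) can be sketched in Python as follows. This is a simplified stand-in for what trimAl does, not its implementation. The function name, the dictionary representation of the alignment and the use of "-" as the only gap symbol are assumptions.

```python
def drop_sparse_columns(alignment, min_fraction=0.3, gap_chars="-"):
    """Remove alignment columns occupied in fewer than `min_fraction` of sequences.

    alignment: dict mapping sequence name -> aligned sequence (equal lengths).
    Returns a new dict containing only the retained columns.
    """
    names = list(alignment)
    if not names:
        return {}
    length = len(alignment[names[0]])
    if any(len(alignment[name]) != length for name in names):
        raise ValueError("all aligned sequences must have the same length")

    kept_columns = []
    for col in range(length):
        # Count sequences with a residue (not a gap) at this column.
        occupied = sum(alignment[name][col] not in gap_chars for name in names)
        if occupied / len(names) >= min_fraction:
            kept_columns.append(col)

    return {name: "".join(alignment[name][c] for c in kept_columns) for name in names}
```

In a four-sequence alignment, a column with a residue in only one sequence (25% occupancy) would be dropped, whereas a column with residues in two sequences (50%) would be kept.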

PCR and sequence analysis of D. septosporum

Strains of D. septosporum used for PCR confirmation of gene arrangement and sequenced for recombination analysis are shown in Table S2. Cultures were grown on Dothistroma medium (DM; Bradshaw et al., 2000) or potato dextrose agar (PDA) and incubated at 22°C for 10–14 d. Fungal mycelia were harvested from cellophane-covered plates and genomic DNA extracted using the method of Moller et al. (1992). The PCR reactions included 0.5 U Platinum Taq polymerase (Invitrogen), 1× Invitrogen PCR buffer, 1.5 mM MgCl2, 50 μM dNTPs, 0.4 μM of each primer and 5 ng genomic DNA template. Cycling conditions included an initial denaturation of 94°C (2 min) followed by 36 cycles of 94°C (30 s), 55°C (40 s), 72°C (3 min) then a final extension of 5 min at 72°C in an Eppendorf Mastercycler Gradient (Eppendorf, Hamburg, Germany). Primer sequences used for PCR amplification of intergenic regions are shown in Table S3. The PCR products were purified using a High Pure PCR Product Purification kit (Roche) and sequenced using an ABI3730 Genetic Analyser (Applied Biosystems Inc., Foster City, CA, USA).

Population analyses

DNA sequences for each isolate, grouped by locus (1–6), were imported into sequencher 5.0 (Gene Codes Corporation, Ann Arbor, MI, USA) where they were manually aligned, edited, and trimmed. Sequence alignments were imported into SNAP Workbench (Price & Carbone, 2005) for population genetic analyses. Patterns of linkage disequilibrium (LD) and the putative positions of intra- and inter-locus recombination events were examined by: combining all sequenced regions using SNAP Combine (Aylor et al., 2006) into a single concatenated sequence alignment; collapsing the alignment to infer multilocus haplotypes using SNAP Map (Aylor et al., 2006) with the options of recoding indels (insertions/deletions) as binary characters and excluding infinite sites violations, both required for genealogical inference using genetree (Bahlo & Griffiths, 2000); and inferring the position of recombination events using the compatibility approach implemented in recmin (Myers & Griffiths, 2003). LD across the entire cluster was then quantified using r2, a measure of LD of allelic states at pairs of sites, as implemented in tassel version 1.1.0 (Bradbury et al., 2007). The presence of all four possible haplotypes for a pair of diallelic sites results in incompatibility, which is an indication of either parallel mutation or recombination (Hudson & Kaplan, 1985). Marker compatibility among pairs of informative sites was determined using the clade program (Bowden et al., 2008). An LD/compatibility plot showing the physical location of variable positions in the cluster was generated using the clade and matrix programs (Bowden et al., 2008). The LD/compatibility plot was examined to identify blocks of variable sites that were highly correlated (0.8 < r2 < 1) and largely compatible. Compatibility among closely linked sites gives rise to blocks of compatible sites along the diagonal of the LD/compatibility plot.
Because highly divergent haplotypes sampled once or at a low frequency could be potential targets of balancing selection in AF clusters (Carbone et al., 2007a; Moore et al., 2009) they were not excluded in the LD/compatibility analysis and the strength of LD was assessed using both r2 and compatibility among pairs of sites. Genealogical analysis across the entire cluster was based on the largest nonrecombining partition extracted using cladeex (Bowden et al., 2008). Coalescent simulations using genetree were performed as described previously (Moore et al., 2009).
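The two pairwise statistics used in the LD/compatibility plot can be illustrated with a minimal Python sketch. It assumes haploid strains, so that haplotype frequencies are observed directly, and uses the standard definition of r2 for two diallelic sites. It is not the code of tassel or clade, and the function names are hypothetical.

```python
def r_squared(site_a, site_b):
    """r2 between two diallelic sites scored in the same haploid strains.

    site_a, site_b: equal-length sequences of allele states, one per strain.
    """
    if len(site_a) != len(site_b) or len(site_a) == 0:
        raise ValueError("sites must be non-empty and scored in the same strains")
    alleles_a = sorted(set(site_a))
    alleles_b = sorted(set(site_b))
    if len(alleles_a) != 2 or len(alleles_b) != 2:
        raise ValueError("both sites must be diallelic")

    n = len(site_a)
    a, b = alleles_a[0], alleles_b[0]          # arbitrary reference alleles
    p_a = sum(x == a for x in site_a) / n      # frequency of allele a
    p_b = sum(y == b for y in site_b) / n      # frequency of allele b
    p_ab = sum(x == a and y == b for x, y in zip(site_a, site_b)) / n
    d = p_ab - p_a * p_b                       # disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


def is_incompatible(site_a, site_b):
    """Four-gamete test: True if all four haplotypes occur at two diallelic sites."""
    return len(set(zip(site_a, site_b))) == 4
```

Under these definitions, two sites with states "AAGG" and "CCTT" across four strains are perfectly correlated (r2 = 1) and compatible, whereas "AAGG" and "CTCT" show all four haplotypes, giving r2 = 0 and an incompatible pair.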

Results

Genes in the dothistromin pathway originated via vertical descent

A search across four fungal classes revealed a sparse cross-species distribution of ASD orthologs (Fig. 1, Table S4). For example, clusters of ASD orthologs are found in just five of 12 complete Dothideomycete genomes and five of 22 complete Eurotiomycete genomes examined. Phylogenetic trees of the ASD orthologs (Fig. S1, Table S4) showed that only one tree supports an origin of Dothideomycete genes by HGT from Eurotiomycetes, previously suggested by a limited taxon sample (Slot & Rokas, 2011), and only two trees support an origin by vertical transmission in both Eurotiomycetes and Dothideomycetes. By contrast, nine gene trees are consistent with ASD genes having a deeper phylogenetic history in Dothideomycetes than in Eurotiomycetes, and the remaining eight gene trees either suggest more complex patterns of inheritance or receive little support. Although the history of the ASD genes is complex, overall the phylogenies are most consistent with vertical inheritance of dothistromin genes, as generally seen for other fungal SM genes (Kroken et al., 2003; Turgeon & Bushley, 2010).

Figure 1.

Evolution of the dothistromin-like gene clusters in fungi. (a) The distribution of AF(aflatoxin)/ST(sterigmatocystin)/DOTH(dothistromin)-like (ASD) gene clusters among four classes of Ascomycota for which whole genomes were investigated. Taxon names in bold red contain one or more clusters of at least three orthologs of ASD genes. Phylogenetic relationships were drawn manually with reference to published analyses (James et al., 2006; Schoch et al., 2006; Zhang et al., 2007a; Gräser et al., 2008; Sharpton et al., 2009; Wang et al., 2009) and an RNA polymerase II subunit RPB2 phylogeny (see the Supporting Information, Fig. S1). (b) Clustering of all orthologs of ASD genes. Orthologous genes are identically colored. Gene pairs with two alternatively ordered colors (AvnA/VerB and AvfA/OrdB) represent paralogous genes. Grey boxes in clusters represent intervening genes not known to be involved in dothistromin synthesis. Known examples of gene truncation and pseudogenization are marked with ** and ψ, respectively. Functionally characterized clusters are labeled AF, ST or DOTH.

Evidence for a larger ancestral cluster in Dothideomycetes

We examined syntenic arrangements of ASD genes across a broad phylogenetic range of fungal species to determine if an ancestral gene order could be determined. However, with the exception of Podospora anserina, which has an ST cluster arrangement very similar to that of A. nidulans as a consequence of HGT (Slot & Rokas, 2011), a high level of gene reordering is evident, both between classes (Eurotiomycetes vs Dothideomycetes) and within Dothideomycetes (e.g. D. septosporum vs Rhytidhysteron rufulum; Fig. 1b). This extensive gene shuffling, coupled with a sparse distribution among fungal genomes, prevents the unambiguous inference of a complete ancestral gene order.

Despite extensive rearrangements, there are two lines of evidence, shown in Fig. 1(b), that suggest clustering of ASD genes was ancestral to the fragmented arrangement seen in D. septosporum. The first argument comes from certain gene linkages that are highly conserved across the ASD gene clusters. For example, the divergently transcribed HexA–HexB [aflA–aflB] gene cluster, responsible for the formation of core hexanoic acid units, is nearly universal. The divergently transcribed AflR–AflJ regulatory module is almost as highly conserved. The probability of these genes being adjacent by chance in such divergent fungi is extremely low, suggesting they were clustered as gene modules in an ancestral species.

The second line of evidence comes from the distribution of two paralogous gene pairs in the ASD cluster. Phylogenetic analyses confirmed paralogy of AvnA/VerB [aflG, aflL], which are P450 monooxygenases, and of AvfA/OrdB [aflI, aflX], which are NAD(P) reductases (Fig. S2). It can be assumed that tandem duplication, which is a prominent mechanism of gene family expansion in fungi (Chow et al., 2012), gave rise to each set of paralogous genes, because a tandem arrangement is evident for AvnA/VerB in AF clusters and for AvfA/OrdB in ST clusters (Fig. 1b). Paralogs AvfA and OrdB occur in different loci (2 and 5, respectively) in D. septosporum. If these two genes arose by tandem duplication, loci 2 and 5 must have originally been physically linked. Similarly, the positions of paralogs AvnA and VerB suggest ancestral linkage of loci 5 and 6. Together, the distribution of homologous gene pairs among clusters across all species supports the hypothesis that an ancestral configuration of clustered ASD genes preceded the fragmented configuration seen in Dothideomycetes today.

Evidence for dynamic gene organization within Dothideomycetes

To understand how the dothistromin gene cluster might have evolved over a shorter time-scale, and to further test the hypothesis of ancestral clustering, we examined differences in ASD gene organization between several Dothideomycete species. First, the saprophytic species R. rufulum, which is distantly related to D. septosporum within the Dothideomycetes, appears to have a full set of genes required for sterigmatocystin biosynthesis (Ohm et al., 2012). Its ASD gene order is different from that of D. septosporum (Fig. 1b), although the extent of gene clustering cannot be accurately assessed because of the highly fragmented R. rufulum genome assembly. However, in addition to the conserved HexA/B, AflR/J gene pairs discussed earlier, both R. rufulum and D. septosporum show linkage of CypX and AvfA to another ‘basal’ pathway gene PksA (Ds locus 2), supporting another clustered ancestral configuration.

The Dothideomycete L. maculans contains orthologs of seven dothistromin genes arranged in two regions adjacent to repetitive DNA (Fig. 1b, Table S5). Of these, six are ‘basal’ pathway genes (HexA/B, PksA, AflR/J, Nor1; Cary & Ehrlich, 2006). The arrangement of these genes is unusual because the gene pairs HexA/B and AflR/J do not show the conserved paired arrangement, although they are located close together. Instead, the pseudogene HexA has broken up into two shorter sequences; the smaller of these sequences remains paired with HexB adjacent to AflJ at one end of a cluster, while the larger is paired with AflR at the other end of the cluster. This arrangement in L. maculans provides evidence for loss of HexA function through gene rearrangement, which may have led to pathway degeneration. The L. maculans cluster is interspersed with typical SM cluster genes such as oxidoreductases, as well as a stress-responsive A/B barrel domain protein gene also associated with other ASD clusters, but with unknown function (Table S5). Overall, these results are suggestive of ancestral clustering, with rearrangements indicating rapid evolution of this region of the genome.

The ASD orthologs in C. fulvum, a close relative of D. septosporum, are also distributed between six loci found on three scaffolds (de Wit et al., 2012; Fig. 2a). However, the C. fulvum genome is not assembled to chromosome level because of its high repeat content and it is not known if these scaffolds map to one chromosome. The configuration of loci seen in D. septosporum is largely conserved in C. fulvum, although they are closer together. For example, loci 4, 5 and 6 are spread over a 98 kb region of one scaffold in C. fulvum, but over a 406 kb region in D. septosporum. Similarly, loci 1 and 2 are separated by only 133 kb in C. fulvum but 507 kb in D. septosporum. Furthermore, part of locus 1 is combined with cluster fragment 3 in C. fulvum. Although these comparisons do not indicate whether the tighter or highly dispersed clustering arrangement is ancestral, they are suggestive of ancestral linkages between these loci and further suggest that gene rearrangements have occurred in the relatively short evolutionary time separating these two species.
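The distances quoted above can be turned into fold differences with a short calculation. The sketch below uses only the four values given in the text; the dictionary layout and the function name are illustrative assumptions.

```python
# Distances in kb, taken from the text above.
distances_kb = {
    "loci 4-6 (span)": {"C. fulvum": 98, "D. septosporum": 406},
    "loci 1-2 (separation)": {"C. fulvum": 133, "D. septosporum": 507},
}


def fold_difference(entry):
    """Ratio of the D. septosporum distance to the C. fulvum distance."""
    return entry["D. septosporum"] / entry["C. fulvum"]


for region, entry in distances_kb.items():
    print(f"{region}: {fold_difference(entry):.1f}-fold larger in D. septosporum")
# loci 4-6 (span): 4.1-fold larger in D. septosporum
# loci 1-2 (separation): 3.8-fold larger in D. septosporum
```

Both comparisons therefore indicate that the corresponding regions are roughly four times more extended in D. septosporum than in C. fulvum.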

Figure 2.

Synteny comparisons between Dothistroma septosporum and Cladosporium fulvum scaffolds. (a) Overview of regions of three C. fulvum scaffolds corresponding to the six loci with dothistromin genes in D. septosporum chromosome 12. (b) Detail of synteny for each D. septosporum locus (1–6). In each case the top bar indicates a region of the D. septosporum chromosome (labeled 1–6) marked in 10 kb intervals. The lower bars correspond to the three C. fulvum scaffolds (colored red, blue and green) shown in (a), with additional scaffolds in brown. Note that none of the C. fulvum dothistromin gene loci is right at the edges of scaffolds, thus breakdown of synteny is not caused by constraints in assembly. Genes common to both species are indicated with black arrows (dothistromin genes) or grey arrows (other genes). Additional genes found at that location only in C. fulvum are shown as white arrows. Asterisks show positions of recombination detected in D. septosporum; a large black asterisk indicates a high level of recombination (Rmin ≥ 3) and a small grey asterisk a lower level (Rmin < 3).

A closer inspection of C. fulvum and D. septosporum genome synteny showed that sites of interspecific genome rearrangement often map to the ends of the D. septosporum dothistromin loci, as shown in Fig. 2(b). Sites of genome rearrangement are seen as loss of synteny, where different C. fulvum scaffolds (or different parts of the same scaffold) align to D. septosporum. Genome rearrangements are particularly evident in and around D. septosporum locus 1. This region, positioned remotely from the other dothistromin gene loci near one of the telomeres (Fig. 2a), includes an insertion/deletion immediately upstream of Ver1 and is split between two scaffolds in C. fulvum (Fig. 2b). Additional genes, DotB and DotC, that are not orthologs of ST or AF genes but are regulated by the pathway regulator AflR in D. septosporum (Chettri et al., 2013) may have been recruited into locus 1 at different times. DotB is a candidate late-pathway dothistromin gene (Chettri et al., 2013) that is also adjacent to Ver1 in C. fulvum; DotC encodes a major facilitator superfamily (MFS) transporter that affects dothistromin production (Bradshaw et al., 2009) and is located on a different scaffold from Ver1 in C. fulvum. Overall, this pattern is consistent with reorganization from an ancestral state, with extensive intrachromosomal rearrangements between these two species (de Wit et al., 2012). The relatively recent split between D. septosporum and C. fulvum indicates the evolutionary speed at which gene rearrangements and recruitment of new cluster genes can occur in fungal SM pathways.

Considering the Dothideomycetes together, the most parsimonious explanation for the highly dispersed arrangement of dothistromin genes in D. septosporum is that it is derived from a more tightly clustered ancestral arrangement, for which there is evidence in both L. maculans and C. fulvum. Furthermore there is extensive evidence for gene order reorganization both within and between loci, with differences in organization being more pronounced between the most distantly related species, as would be expected if there was vertical inheritance concomitant with ongoing genome rearrangements.

Recombination hotspots in D. septosporum are concordant with deeper phylogenetic breakpoints

We sought to determine whether the arrangement of dothistromin genes seen in the genome reference strain, D. septosporum NZE10, is representative of the species worldwide. This strain has been reproductively isolated as an asexually reproducing clone in New Zealand for at least 60 yr (Hirst et al., 1999; Groenewald et al., 2007). PCR amplification between adjacent pairs of dothistromin genes, and between dothistromin genes and flanking genes, showed an identical microsyntenic arrangement of genes in a sample of 17 strains representative of the global population of this species (Table S2, Fig. S3). Included in this sample were strains collected in New Zealand in 1969 (NZE3) and 2005 (NZE10), indicating no evidence of gene rearrangements over that period.

Population haplotype and LD analyses can provide useful insights into the ancestral history and underlying genealogy of chromosomes (Nordborg & Tavaré, 2002). We sequenced 15 regions of the dothistromin gene loci in 17 D. septosporum strains, straddling intergenic and flanking regions, as shown in Fig. 3(a) (details in Table S3). The patterns of mutations in these regions were used to reconstruct the evolutionary history of the strains (Fig. S4). The most divergent strain was from Guatemala (GUA1), and was confirmed to be D. septosporum by ribosomal ITS sequencing (White et al., 1990) and PCR amplification using species-specific mating type primers (Groenewald et al., 2007). Thus our data suggest that the organization of dothistromin genes is conserved even in a deeply divergent strain of D. septosporum.

Figure 3.

Linkage disequilibrium (LD) analysis in Dothistroma septosporum. (a) The block arrows at the top show the direction of transcription of 20 putative AF(aflatoxin)/ST(sterigmatocystin)/DOTH(dothistromin)-like (ASD) genes with orthologs shown in bold type. Grey rectangles indicate the (mostly intergenic) regions sequenced in 17 strains of D. septosporum. Vertical dashed lines indicate inferred recombination events within these sequenced regions. The 20 ASD genes are arranged in six groups (loci 1–6) that are physically separated across the chromosome, and in blocks (A–D) that indicate regions with low recombination. (b) Linkage disequilibrium (lower matrix) and site compatibility (upper matrix) pairwise analysis of 416 polymorphisms along the X and Y axes that span the sequenced regions shown in (a) across chromosome 12 of D. septosporum. The upper triangular matrix indicates incompatible sites (in black) that have different evolutionary histories. In the lower triangular matrix, the red color shows a high value of r2, a statistical coefficient indicative of reduced recombination and shared ancestry. Linkage disequilibrium and site compatibility were assessed for variation among loci and identified five ‘low recombination’ blocks (A, B, C5a, C5b, D), shown within the matrix. The size of a recombination block depended on the number of contiguous pairs of sites that were both strongly correlated (0.8 < r2 < 1) and highly compatible (< 10 incompatible sites). Arrows labeled (i–vii) indicate specific features that are discussed in the text.

We performed LD analysis to determine the extent to which recombination has shuffled alleles from the dothistromin gene loci. A very low level of recombination would indicate a common evolutionary history, suggestive of ancestral clustering. As shown in Fig. 3, the telomeric sides of loci 1 and 6 (arrows (i) and (vii)) show more evidence of recombination (r2 < 0.5) than locus segments located more centrally in the chromosome (r2 > 0.8). Fig. 3 also shows that sequence variation found in the 15 intergenic regions could be separated into at least six distinct blocks within which very little recombination seems to have occurred (shown as absence of shading in the upper compatibility matrix, and red coloration indicating high correlation in the lower matrix in Fig. 3b). Recombination block A includes variation across loci 1, 2 and 3; subblock A1/A2 unites loci 1 and 2 and subblock A2 unites loci 2 and 3 (MoxY/AflR), showing that although these regions are far apart on the chromosome, they show high marker linkage usually associated with physically contiguous genes (arrow (iv) in Fig. 3b). Interestingly, some of the high LD associations correspond to gene linkages that are physically close in C. fulvum. For example, the DotC and PksA/CypX gene regions, which are located far apart but share a common evolutionary history in D. septosporum (loci 1 and 2, respectively), are located much closer together in C. fulvum (the blue scaffold regions in Fig. 2a).

As well as showing some connections between specific loci, the data also suggest an overall low level of recombination and a shared evolutionary history across the entire orthologous cluster. This is shown by strong correlation (intense red shading in the lower portion of the matrix in Fig. 3b), connecting each of the loci even though they span > 1 Mb of sequence. The exception is locus 4 (Block B; arrow (v)), which provides a control because it contains just one gene, Est1, which is not an ASD ortholog but is co-regulated along with other dothistromin genes by the pathway regulator, DsAflR (Chettri et al., 2013). Locus 4 shows a recombination history distinct from the five other loci, with high incompatibility (black shading in top matrix) and low association (lack of red color in bottom matrix) when compared with other loci. Recombination block C5b also shows a high level of incompatibility and an independent history (arrow (vi)), but this is positioned just outside locus 5, downstream of VbsA, and therefore strictly outside the dothistromin gene region. In summary, this analysis suggests a lower than expected amount of recombination across all the loci that contain ASD orthologs, which is consistent with a common evolutionary history in an ancestral cluster.

Comparisons were made between recombination hotspots in the fragmented dothistromin cluster (indicated by asterisks in Fig. 2b and by multiple vertical dashed lines in Fig. 3a) and sites of genome rearrangements between C. fulvum and D. septosporum. Most striking is the high level of recombination associated with locus 1. This occurs both upstream of Ver1 and between DotB and DotC (indicated by arrows (ii) and (iii), respectively, in Fig. 3b); these regions are associated with deletion and extensive rearrangement in the deeper phylogenetic comparison with C. fulvum (Fig. 2b). Hence, a high level of recombination appears to have occurred in a dothistromin-specific region of the fragmented cluster. Other recombination hotspots coincide with the ends of loci 5 and 6, and with D. septosporum genes EpoA and Est1 that are not true ASD orthologs based on phylogenetic analysis (Fig. S1). Taken together, the species-level analysis suggests a shared evolutionary history among all loci with ASD orthologs in D. septosporum, and recombination hotspots that are consistent with reorganization breakpoints between D. septosporum and C. fulvum.

Discussion

A large ancestral ASD cluster is fragmented in its descendants

We questioned whether the highly dispersed arrangement of dothistromin toxin genes seen in the pine pathogen D. septosporum was ancestral to the type of compact clusters seen in Aspergillus species. While evidence builds for gene cluster formation by recruitment of genes, for example in the DAL (Wong & Wolfe, 2005) and GAL (Slot & Rokas, 2010) clusters in yeast and the trichothecene and fumonisin clusters in Fusarium spp. (Proctor et al., 2009; Khaldi & Wolfe, 2011), there have been few investigations into the mechanisms of gene cluster fragmentation (Nicholson et al., 2009; Lo et al., 2012).

In D. septosporum, evidence from three levels of analysis supports a hypothesis of fragmentation from a more closely clustered ancestor (Fig. 4). At the phylum level the locations of paralogous ASD genes suggest ancestral linkage of genes within loci 2, 5 and 6. At the class level comparisons with gene arrangements in C. fulvum suggest linkage of loci 1 and 2, as well as 4, 5 and 6, while in L. maculans clustering of basal early-stage genes (PksA, HexA/B, AflR/J and Nor1) suggests ancestral linkage of loci 2, 3, 5 and 6 that contain these genes. Finally, at the species level LD analysis suggests a shared evolutionary history of loci 1, 2 and 3, with evidence for an overall shared history of all loci except 4, which contains just one non-orthologous gene (Est1). So, although it is not possible to infer direction (i.e. whether dispersed genes were clustered, or whether clustered genes were dispersed over time) in every line of evidence, overall the most parsimonious explanation is that the D. septosporum arrangement is a fragmented version of a tighter, although not necessarily contiguous, ancestral SM cluster. The alternative hypothesis, that a fragmented state was ancestral in ASD clusters, would involve several unlikely events. First, ASD paralogs (AvnA/VerB; AvfA/OrdB) would have to relocate from distant sites to become adjacent to each other in the genomes of the AF/ST-producing fungi as shown in Fig. 1(b). Second, relocation of ASD genes into tighter clusters would have to have occurred independently in both L. maculans and C. fulvum to explain their observed gene distributions compared with D. septosporum. This seems unlikely as of these three Dothideomycetes only D. septosporum has a functional ASD cluster; the first pathway gene, HexA, is truncated in L. maculans (Table S5) and pseudogenized in C. fulvum such that dothistromin is not produced (Chettri et al., 2013). Therefore, it is difficult to envisage selective mechanisms promoting cluster formation in those species. Third, population analyses would be expected to provide evidence of higher levels of recombination between the D. septosporum loci than is seen, judging by the levels of recombination observed with the ‘control’ (nonorthologous) locus 4.

Figure 4.

Evidence for ancestral linkage between Dothistroma septosporum dothistromin loci. Grey shading indicates pairwise combinations of loci 1–6 that share a common history based on evidence from linkage disequilibrium analysis. Letters within the shaded boxes indicate where there is further evidence of ancestral linkage from studies of gene associations in the related fungi Cladosporium fulvum (C) and Leptosphaeria maculans (L) or from paralogs (P) (AvfA/OrdB or AvnA/VerB) predicted to be ancestrally linked based on the assumption of tandem duplication, as explained in the text. Note that locus 4, which has the least evidence for ancestral clustering, contains only one gene (Est1) that is not an ASD ortholog but is thought to be involved in dothistromin biosynthesis based on its coregulation with other dothistromin genes.
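The lines of evidence described in the text can be tabulated as a pairwise matrix of loci. The sketch below is an illustration built only from the groupings stated in the Discussion. It assumes that every pair of loci within a stated group counts as supported, and that the species-level 'shared history of all loci except 4' covers loci 1, 2, 3, 5 and 6. It is therefore a simplification and may not reproduce Fig. 4 cell for cell. The labels P, C and L follow the figure legend, and 'LD' is a label introduced here for the linkage disequilibrium evidence.

```python
from itertools import combinations

LOCI = range(1, 7)  # dothistromin loci 1-6

# Groups of loci described in the text as ancestrally linked, per line of evidence.
# Treating each group as "all pairs supported" is an assumption of this sketch.
EVIDENCE = {
    "P": [{2, 5, 6}],           # paralogs (AvnA/VerB, AvfA/OrdB), phylum level
    "C": [{1, 2}, {4, 5, 6}],   # gene arrangement in Cladosporium fulvum
    "L": [{2, 3, 5, 6}],        # clustering of early-stage genes in L. maculans
    "LD": [{1, 2, 3, 5, 6}],    # species-level shared history, all loci except 4
}


def pairwise_support(evidence):
    """Map each pair of loci to the set of evidence labels supporting linkage."""
    support = {pair: set() for pair in combinations(LOCI, 2)}
    for label, groups in evidence.items():
        for group in groups:
            for pair in combinations(sorted(group), 2):
                support[pair].add(label)
    return support


support = pairwise_support(EVIDENCE)
for (a, b), labels in sorted(support.items()):
    print(f"loci {a}-{b}: {', '.join(sorted(labels)) or 'none'}")
```

Under these assumptions, loci 5 and 6 are supported by all four lines of evidence, whereas locus 4 is linked to other loci only through the C. fulvum comparison.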

Concordance of recombination sites and genome rearrangements suggests a mechanism for dothistromin cluster fragmentation

There is a general association between divergence time and the extent of ASD gene rearrangement among the Ascomycetes, which is consistent with a gradual process of gene shuffling over time since the last common ancestor. Reconstruction of a putative ancestral gene order was not possible because of extensive rearrangements that have occurred at species level and above. Mesosynteny, in which a chromosome contains the same set of genes but in different orders and orientations, is common in Ascomycetes, and particularly in Dothideomycetes (Hane et al., 2011). Mesosynteny is speculated to occur by inversions during meiosis (Hane et al., 2011) and the same mechanism may have contributed to cluster reorganization. Transposons and repeat sequences associated with some gene clusters have also been implicated in genome rearrangements (Young et al., 2006; Ehrlich & Yu, 2010) but dothistromin gene regions in D. septosporum do not exhibit these features (de Wit et al., 2012; Fig. S5).

We have shown that rearrangements of ASD gene regions between species are associated with recombination hotspots among D. septosporum populations. For example, the breakdown of gene synteny between D. septosporum and C. fulvum in regions flanking the dothistromin loci, or within locus 1, corresponds with sites of active recombination within D. septosporum (Figs 2, 3). We therefore propose that cluster fragmentation occurred as a result of intrachromosomal recombination events. Similar population studies involving AF clusters have previously revealed concordance between recombination and gene rearrangements. The AF cluster in A. parasiticus, for example, has five recombination blocks similar to those shown in Fig. 3 for dothistromin. Some of these blocks also showed a shared history despite being well separated from each other within the AF cluster. Intriguingly, orthologs of genes bordering these blocks (aflB, aflL in one pair of blocks and aflE, aflM in another) are adjacent to each other in the ST cluster. This has served as evidence that an ancestral ST-type arrangement of adjacent gene pairs was followed by rearrangement and separation, which resulted in the current AF cluster (Carbone et al., 2007a).

Recruitment of new genes may be facilitated by cluster fragmentation

The prevalence of SM gene clustering in fungi has fuelled many hypotheses about selective advantages of clustering (Ehrlich & Yu, 2010; Collemare & Lebrun, 2012). Dispersed dothistromin genes in D. septosporum clearly function effectively and the arrangement has been maintained in the species but there is no clear explanation for why this dispersed arrangement persists. Dothistromin is unusual among SMs in being expressed in exponential rather than stationary phase (Schwelm et al., 2008). It is possible that the genes can escape tight localized chromatin-level or other telomere-associated regulation (Palmer & Keller, 2010; Shaaban et al., 2010) by being dispersed to positions distant from telomeres. However, the expression level of dothistromin genes is independent of (i.e. not correlated with) their distance from the telomere (Chettri et al., 2013).

Another explanation for cluster fragmentation could be pathway diversification. In D. septosporum the most intense intraspecific recombination and interspecific cluster rearrangement occurred within and around locus 1, where both gene function reassignment and gene recruitment appear to have occurred. The region contains a single ASD ortholog, Ver1, whose isolation from other ASD orthologs contrasts with the locations of Ver1 genes within AF/ST clusters. Dothistroma septosporum Ver1 is required near the final stage of dothistromin biosynthesis (Bradshaw et al., 2002) but is thought to catalyse a different reaction from its AF/ST orthologs (Henry & Townsend, 2005). Alongside Ver1 are two genes, DotB and DotC, that are not orthologs of ASD genes and therefore may represent new recruits. The MFS transporter gene DotC is of particular interest as it is strongly regulated by the pathway regulator AflR (Chettri et al., 2013) and has a role in regulating dothistromin production (Bradshaw et al., 2009). It is immediately adjacent to a recombination hotspot in D. septosporum and to a site of interspecific gene rearrangement compared with C. fulvum. The other gene in this locus, DotD, has low expression and is not predicted to have a role in dothistromin production (Chettri et al., 2013). Because the dothistromin pathway is nonfunctional in C. fulvum owing to HexA pseudogenization, it is not known if differences in DotB/DotC gene arrangements in this species would have functional consequences.

A model for the evolution and retooling of the ASD gene clusters

Retooling an ancient cluster could lead to a wide range of related SMs with different biological properties and may facilitate adaptation to new niches, such as different pathogenic lifestyles and alternative hosts. The ASD pathways share common intermediates and enzymes until versicolorin A (VA; Schwelm & Bradshaw, 2010). The conversion of VA to ST, and subsequently into AF, requires several additional genes, including verA, omtB, omtA and ordA, for which D. septosporum does not have orthologs (Ohm et al., 2012). Fig. 5 suggests a model for the evolution of the ASD gene clusters, based on the assumption of a common ancestral cluster for Dothideomycetes and Eurotiomycetes. The model includes the concept of a ‘core module’ of basal ASD genes (i.e. HexA/B, PksA, Nor1, AflR/J), followed by recruitment of decorating genes, as previously suggested (Cary & Ehrlich, 2006; Ehrlich, 2008; Cary et al., 2012). In this scheme, Dothideomycetes and Eurotiomycetes are considered to have diverged after establishment of an ancient cluster capable of producing an unknown derivative of VA. Subsequent gene recruitment, duplication, subfunctionalization, rearrangement and loss led to the diversity of SMs made by present-day species. The ensuing fragmentation of clusters in Dothideomycetes likely occurred because of high levels of intrachromosomal recombination (Hane et al., 2011).

Figure 5.

A model for AF(aflatoxin)/ST(sterigmatocystin)/DOTH(dothistromin)-like (ASD) cluster evolution. A core cluster of basal genes (black arrows) required for biosynthesis of colored noranthrones is proposed to have arisen in a common ancestor of Eurotiomycetes and Dothideomycetes. This was followed by the addition of decorating genes (grey arrows) such as P450 monooxygenases and NAD(P) reductases. Duplication of AvnA (striped arrow) and AvfA (white arrow) provided paralogs VerB and OrdB (also shown in Fig. 1) that function at later pathway stages. The ancestral ASD cluster at this stage could have been a tight cluster or a loosely linked assemblage of ASD genes. Diversification of secondary metabolism occurred by recruitment, duplication, reassignment and even loss of existing genes. In the case of Dothistroma septosporum, and probably other dothistromin-producing Dothideomycetes, ASD pathway evolution was accompanied by fragmentation of the ancestral cluster, while in Eurotiomycetes further consolidation of the AF/ST clusters occurred following gene recruitment. Owing to the high levels of gene gain and loss the point of divergence of Eurotiomycetes and Dothideomycetes within this scheme is difficult to estimate with current data.

If we accept the argument of an ancestral cluster, it follows that gene loss must also have occurred in some species, as not all species with ASD orthologs carry the same set of genes. There are examples of gene loss in the evolution of other SM gene clusters, such as for ACE1 (Khaldi et al., 2008), bikaverin (Campbell et al., 2012) and lolines (Kutil et al., 2007). In the ASD cluster of L. maculans (Table S5) gene rearrangement bisected a gene (HexA); over time this nonfunctional gene will probably be lost.

While the enzymology facilitating the conversion of VA into DOTH is currently unknown, it is predicted to involve the ASD orthologs OrdB and Ver1 (Bradshaw et al., 2002), as well as an ortholog of NorB (chromosome 11), which have different biosynthetic roles compared with those in AF/ST clusters (Henry & Townsend, 2005; Chettri et al., 2013). This suggests that an ancient cluster could be retooled not only by gene recruitment, mutation or loss, but also by using common genes in different biosynthetic shunts to yield SMs with different characteristics. Indeed, several of the AF enzymes are known to operate in a ‘metabolic grid’ rather than a linear biosynthetic pathway, highlighting their versatility (Yabe et al., 2003; Sakuno et al., 2005). A similar phenomenon is reported in the biosynthesis of other SMs such as indole-diterpenes (Saikia et al., 2012), suggesting that metabolic versatility of enzymes is a commonly exploited feature during SM evolution in fungi and provides an important tool for niche adaptation by plant pathogens.

Given the success of D. septosporum as a needle blight pathogen and the virulence function of dothistromin, our results suggest that a biosynthetic pathway comprising genes located in separate regions of a chromosome does not negatively affect fitness compared with a fully clustered pathway. In terms of understanding the molecular basis of plant–pathogen interactions, this work suggests that highly effective SMs can evolve by gene dispersal and therefore some important virulence factors may be missed if pathogen genomes are only searched for clusters of SM genes.

Acknowledgements

This work was supported by the Tertiary Education Commission via Bio-Protection Research Center grants to R.E.B. and M.P.C., by the Royal Society of New Zealand via a Rutherford Fellowship to M.P.C. and by a Willie Commelin Scholten Foundation Fellowship to R.E.B. The D. septosporum genome was sequenced by the US Department of Energy Joint Genome Institute (http://www.jgi.doe.gov/), supported by the Office of Science of the US Department of Energy under Contract no. DE-AC02-05CH11231. Arne Schwelm, Justine Baker and James Wang (Massey University) are thanked for initial analysis on the dothistromin gene order. Joey Spatafora is thanked for sharing genome sequence data for R. rufulum before publication.
