The nonribosomal peptide and polyketide synthetic gene clusters in two strains of entomopathogenic fungi in Cordyceps


  • Wen-Jing Wang,

    1. State Key Laboratory of Mycology, Institute of Microbiology, Chinese Academy of Sciences, Beijing, China
    2. Department of Bioorganic Chemistry, Max-Planck-Institute for Chemical Ecology, Jena, Germany
    3. Graduate University of the Chinese Academy of Sciences, Beijing, China
  • Heiko Vogel,

    1. Department of Entomology, Max-Planck-Institute for Chemical Ecology, Jena, Germany
  • Yi-Jian Yao,

    Corresponding author
    • State Key Laboratory of Mycology, Institute of Microbiology, Chinese Academy of Sciences, Beijing, China
  • Liyan Ping

    Corresponding author
    • Department of Bioorganic Chemistry, Max-Planck-Institute for Chemical Ecology, Jena, Germany

Correspondence: Yi-Jian Yao, Institute of Microbiology, Chinese Academy of Sciences, PO Box 2714, Beijing 100101, China. Tel.: + 86 10 64807496; fax: + 86 10 64807518; e-mail:


Species of Cordyceps Fr. are entomopathogenic fungi that parasitize the larvae or pupae of lepidopteran insects. Nonribosomal peptides and polyketides, two classes of secondary metabolites, are well-known mediators of pathogenesis. The biosynthetic gene clusters of these compounds in two fungal strains (1630 and DSM 1153) formerly known as Cordyceps militaris were screened using polymerase chain reaction with degenerate primers. Two nonribosomal peptide synthetase genes, one polyketide synthetase gene and one hybrid gene cluster were identified, and certain characteristics of the structures of their potential products were predicted. All four genes were actively expressed under laboratory conditions but at markedly different levels. The gene clusters from the two fungal strains were structurally and functionally unrelated, suggesting different evolutionary origins and physiological functions. Phylogenetic and biochemical analyses confirmed that the two fungal strains are not conspecific as currently assigned.


Nonribosomal peptides (NRPs) and polyketides (PKs) are two large groups of secondary metabolites with remarkable diversity in both structure and biological function (Du & Lou, 2010; Parsley et al., 2011). NRPs refer to linear, cyclic or branched peptides that are synthesized nonribosomally by sequential condensation of proteinogenic or nonproteinogenic amino acids; PKs are produced by the sequential condensation of acetate or other short-chain carboxylic acids (Du et al., 2001; Ansari et al., 2004). In general, NRPs and PKs function as defensive compounds, metal-chelating agents, mediators of symbiosis, and sex hormones (Demain & Fang, 2000).

Modules of fungal nonribosomal peptide synthetase (NRPS) generally consist of an adenylation domain (A) for the recognition and activation of substrates, a thiolation domain (T) for the covalent binding and transfer of amino acids, and a condensation domain (C) for the peptide bond formation (von Döhren, 2004; Hoffmeister & Keller, 2007). Accessory domains of NRPSs, such as thioesterase (TE) and methyl transferase (MT) domains, are commonly found (Caboche et al., 2008). Fungal polyketide synthetase (PKS) modules also consist of three core domains: an acyltransferase domain (AT) for elongation unit selection, an acyl carrier protein (ACP) for shuttling biosynthetic intermediates, and a ketosynthetase domain (KS) for decarboxylative condensation (Hoffmeister & Keller, 2007). Accessory domains of PKSs include ketoreductase (KR), dehydratase (DH), enoyl reductase (ER), methyl transferase (MT), thioesterase (TE) and reductase (R) domains (Campbell & Vederas, 2010). The last two are known to mediate product release in both PKSs and NRPSs (Du & Lou, 2010).
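The module composition described above can be sketched as a small lookup. This is only an illustration of the core-versus-accessory distinction; the function names and the example domain lists are hypothetical and do not represent any of the proteins reported here.

```python
# Core domain sets, using the abbreviations from the paragraph above.
NRPS_CORE = {"A", "T", "C"}       # adenylation, thiolation, condensation
PKS_CORE = {"KS", "AT", "ACP"}    # ketosynthetase, acyltransferase, acyl carrier protein


def classify_module(domains):
    """Return 'PKS', 'NRPS' or 'incomplete' for one module's domain list."""
    present = set(domains)
    if PKS_CORE <= present:
        return "PKS"
    if NRPS_CORE <= present:
        return "NRPS"
    return "incomplete"


def accessory_domains(domains):
    """Return the domains that are not part of either core set, in order."""
    core = NRPS_CORE | PKS_CORE
    return [d for d in domains if d not in core]


# Hypothetical modules, for illustration only.
pks_module = ["KS", "AT", "DH", "MT", "ER", "KR", "ACP"]
nrps_module = ["C", "A", "T", "R"]

print(classify_module(pks_module), accessory_domains(pks_module))
print(classify_module(nrps_module), accessory_domains(nrps_module))
```

Treating a module as a set of domain labels keeps the check independent of domain order, which differs between synthetases.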

Cordyceps militaris (L.) Link, which parasitizes the larvae or pupae of lepidopteran insects, is the type species of the genus Cordyceps. This fungus has been widely used in oriental traditional medicine (Kim et al., 2009; Sakurai et al., 2010) and in the isolation of bioactive natural products (Yuan et al., 2007; Paterson, 2008; Molnar et al., 2010; Wong et al., 2011). Among the six anamorphic genera of Cordyceps (Sung et al., 2007), the biosynthesis of NRPs and PKs has been systematically studied only in Cordyceps bassiana (anamorph: Beauveria bassiana) (Eley et al., 2007; Xu et al., 2008, 2009; Heneghan et al., 2011). Such reports are rare for the great majority of species in Cordyceps. Polymerase chain reaction (PCR) using degenerate primers targeting the core sequences of the different NRPS and PKS domains has been applied successfully in the isolation of these types of genes in fungi (Nicholson et al., 2001; Vizcaino et al., 2005). In the present study, four NRPS and PKS gene clusters in two Cordyceps strains, originally assigned as C. militaris, were identified by degenerate primer PCR. A preliminary analysis of their potential products and the phylogenetic relationship of the two Cordyceps strains are reported.

Materials and methods

Strains, chemicals and cultivation conditions

Cordyceps militaris strain 1630 (voucher number: HMAS 132153) was from lab stock at the State Key Laboratory of Mycology, Institute of Microbiology, Chinese Academy of Sciences; strain DSM 1153 (named C. militaris, see the 'Results and discussion') was obtained from the German Collection of Microorganisms and Cell Cultures (DSMZ, Braunschweig, Germany). All of the chemicals and oligonucleotides were purchased from Sigma (Hamburg, Germany). Both of the strains were maintained at 4 °C on potato dextrose agar (PDA) slants in the dark. The fungus was transferred to fresh PDA plates and incubated at 20 °C for 7–14 days for further experiments (Zhan et al., 2006).

Degenerate primer PCR

Fungal genomic DNA was isolated from 8-day-old PDA liquid cultures according to a published procedure (Jiang & Yao, 2005). The NRPS and PKS genes were screened by PCR using the primers listed in Supporting Information, Table S1 in a 50-μL reaction containing 1.5 mM MgCl2, 0.2 mM each dNTP, 0.5 μM each primer, 2.5 units Taq DNA polymerase, and the buffer provided by the manufacturer (Invitrogen, Darmstadt, Germany). The thermal cycling conditions were as follows: initial denaturation at 94 °C for 3 min; 35 cycles of 94 °C for 45 s, 55 °C for 30 s, and 72 °C for 2 min; and a final extension at 72 °C for 10 min. The PCR products were separated on a 1.5% agarose gel, and the bands of the expected sizes were excised and purified using the Invisorb DNA Cleanup kit (Invitek GmbH, Berlin, Germany). The purified fragments were cloned using the TOPO TA Cloning kit (Invitrogen) and sequenced.
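As a quick check on the thermal cycling program described above, the following sketch adds up the programmed hold times. It ignores ramping between temperatures, which depends on the instrument, so the real run time would be somewhat longer; the function name is hypothetical.

```python
def cycling_seconds(initial, cycles, steps, final):
    """Total programmed hold time in seconds, excluding temperature ramps."""
    return initial + cycles * sum(steps) + final


# Values taken from the protocol: 94 °C for 3 min; 35 cycles of
# 45 s, 30 s and 2 min; final extension of 10 min.
total = cycling_seconds(initial=180, cycles=35, steps=(45, 30, 120), final=600)
print(total, "s =", total / 60, "min")  # 7605 s = 126.75 min
```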

Constructing and sequencing of the fosmid libraries

The libraries were constructed using the CopyControl Fosmid Library Production kit (Epicentre Biotechnologies, Madison, WI). The libraries were screened using colony PCR under the conditions described above but with gene-specific primers designed from the determined PCR products (Table S1). The fosmids were isolated from overnight cultures of Escherichia coli EPI300 clones using a Nucleobond Xtra Midi Kit, according to the manufacturer's instructions (Macherey-Nagel, Düren, Germany). The insert size was estimated by digestion with restriction enzymes HindIII and EcoRI. The fosmids were sheared using a HydroShear DNA Shearing Device (GeneMachines, San Carlos, CA) and were cloned into an SmaI-digested pUC19 vector (Fermentas, St. Leon-Rot, Germany) for shotgun sequencing. Plasmid preparation was performed using the 96-well Robot Plasmid Isolation kit (NextTec, Leverkusen, Germany) and a Tecan Evo Freedom 150 robotic platform (Tecan, Männedorf, Switzerland). Pair-end reads were obtained using an ABI 3730xl automatic DNA sequencer (PE Applied Biosystems, Foster City, CA).

Vector clipping, sequence trimming and assembly were performed using the lasergene (DNAStar Inc.) and staden software packages. The open reading frames (ORFs) were predicted using the SeqBuilder program of the lasergene package and confirmed with a blastp search using the encoded whole protein sequences at the National Center for Biotechnology Information (NCBI). The domain assignment was first performed by aligning the protein sequences with known sequences and was confirmed by identifying the signature sequences. The NRPS adenylation domain specificity was predicted using nrpspredictor2 (Rottig et al., 2011) and nrps prediction blast server (Bachmann & Ravel, 2009). The substrate specificity of the AT domains in the PKSs was predicted using the web server sbspks (Anand et al., 2010). The fosmid sequences were deposited at NCBI under the accession numbers JN121120–JN121124.

Quantitative real-time PCR

Fungal mycelia were harvested from an 8-day PDA liquid culture by centrifugation at 10 000 g for 15 min. The mycelia were kept at −80 °C before RNA extraction. The total RNA was isolated from 100 mg of frozen mycelia using the TRIzol reagent (Invitrogen) and was then treated with an RNeasy MinElute Cleanup kit (Qiagen GmbH, Hilden, Germany). The primers were designed on the exon regions in the fosmid sequences (Table S1). The quantitative real-time PCR (qPCR) was performed using the Mx3000P Real-Time PCR System (Stratagene, Waldbronn, Germany). The 25-μL qPCR reactions contained 5 ng RNA, 0.1 μM primers and 1× Verso 1-Step QPCR SYBR Green Mix (ABgene Ltd, Epsom, UK). The thermal cycling conditions were as follows: 50 °C for 15 min; 95 °C for 15 min; followed by 40 cycles of 15 s at 95 °C, 30 s at 55 °C and 30 s at 72 °C; and 95 °C for 30 s, 60 °C for 30 s, and 95 °C for 30 s for the dissociation curve analyses.

The elongation factor 1α genes (tef1) of C. militaris (Liu et al., 2009) and Cordyceps ninchukispora strain BCC 26678 obtained from NCBI (Table S2) were used for normalizing the gene expression in strains 1630 and DSM 1153, respectively. The expression level of the target genes (ER) was expressed as

$$\mathrm{ER} = 2^{\,(Ct_{tef1} - Ct_{target})}$$

where $Ct_{tef1}$ is the threshold cycle for the tef1 gene by qPCR and $Ct_{target}$ is the threshold cycle for the target gene.
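As a rough illustration of this normalization, the sketch below computes a relative expression value from two threshold cycles. It assumes the common 2^ΔCt form, i.e. ideal and equal amplification efficiency for both primer pairs, which is an assumption made here for illustration; the function name and the Ct values are hypothetical.

```python
def relative_expression(ct_tef1, ct_target):
    """Expression of a target gene relative to tef1.

    Assumes a doubling of product per cycle (2^dCt); a target that crosses
    the threshold later than tef1 gives a value below 1.
    """
    return 2.0 ** (ct_tef1 - ct_target)


# Hypothetical threshold cycles, not measured data.
print(relative_expression(20.0, 22.0))  # 0.25: target is 4-fold lower than tef1
print(relative_expression(20.0, 20.0))  # 1.0: equal expression
```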

Morphological, biochemical and phylogenetic analyses

A colony radial growth assay was performed by inoculating 3 μL spore suspension (1 × 10⁵ spores mL⁻¹) on a sterilized filter paper disk placed in the center of a PDA plate. Images were taken after a 15-day growth at 20 °C in the dark. For microscopic observation, cultures were prepared by inoculating a small amount of mycelia on a 1-cm³ PDA block placed on a microscopic slide (Stevens, 1981). The blocks were then covered with a coverslip and incubated at 20 °C. After removing the block, the mycelia on the coverslip were fixed with Carnoy's fixative and observed using a Zeiss Axioskop microscope (Carl Zeiss, Germany). To compare the biochemical signatures of the two strains, the growth medium and mycelia from 300 mL liquid culture were extracted with ethyl acetate and chloroform/acetone (1 : 1, v/v) and analyzed using high-pressure liquid chromatography (HPLC) coupled with mass spectrometry (MS). Details are provided in the electronic Supporting Information.

The internal transcribed spacer (ITS) of the nuclear ribosomal DNA sequences from the two Cordyceps strains was amplified by PCR using the primers listed in Table S1. The sequences were deposited at NCBI with accession numbers JN121119 and JN121122. The reference sequences were downloaded from NCBI (Table S2). A phylogenetic tree was constructed with Bayesian Inference using the beast v1.6.1 package (Drummond & Rambaut, 2007). The Bayesian tree was then compared with the trees generated with maximum likelihood using the phylip 3.67 package. The final tree was edited using dendroscope 2 (Huson et al., 2007).

Results and discussion

The NRPS and PKS coding genes

The A3 and A7 motifs of the NRPS adenylation domain are highly conserved and suitable for degenerate primer design (Tanaka et al., 2005; Wei et al., 2005; Johnson et al., 2007). A pair of degenerate primers was designed based on these sequences (Table S1). They amplified a 271-bp fragment from the genomic DNA of strain 1630 and an 858-bp fragment from strain DSM 1153. The primers targeting the KS domain of the PKS coding genes developed by Keller et al. (1995) amplified a 498-bp fragment from strain 1630 and a 760-bp fragment from strain DSM 1153.
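To make the idea of a degenerate primer concrete, the sketch below expands a primer written in standard IUPAC nucleotide codes into every sequence it represents. The example primer is hypothetical and is not one of the primers in Table S1.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences encoded by a degenerate primer."""
    n = 1
    for base in primer.upper():
        n *= len(IUPAC[base])
    return n


def expand(primer):
    """List every concrete sequence encoded by a degenerate primer."""
    pools = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*pools)]


example = "GCNTAY"  # hypothetical primer: N = any base, Y = C or T
print(degeneracy(example))  # 8
print(expand(example))
```

Higher degeneracy covers more sequence variants of a conserved motif but dilutes each individual primer in the reaction, which is one reason highly conserved motifs are preferred as primer sites.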

The 271-bp fragment was located on a 3.8-kb open reading frame designated as nrps1 (Fig. 1a). This sequence turned out to be on the T domain, whereas the expected fragment of 671 bp on the A domain was only weakly amplified under our PCR conditions. The putative 138-kDa NRPS1 protein showed 32% similarity to LPS2, a subunit of the ergopeptine synthetase enzyme complex in Claviceps purpurea (Correia et al., 2003), and 30% similarity to LpsB for ergovaline biosynthesis in Neotyphodium lolii (Fleetwood et al., 2007). The completely different genetic context surrounding nrps1, compared with that of the genes of these ergot alkaloid synthetases, indicates that NRPS1 most likely produces a molecule unrelated to ergot alkaloids (Fig. 1b). Cordyceps militaris belongs to the same Clavicipitaceae family as Claviceps purpurea and N. lolii, but C. militaris strain ATCC 26848 does not produce any ergopeptines, and a gene encoding LPS1, another protein in ergopeptine biosynthesis, was not detected in strain ATCC 26848 (Panaccione et al., 2001).
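The relationship between the ORF length and the predicted protein mass can be checked with a back-of-the-envelope estimate. The sketch assumes an average residue mass of about 110 Da, a common rule of thumb rather than a value from this study, and ignores the stop codon; the function name is hypothetical.

```python
AVG_RESIDUE_KDA = 0.110  # assumed average mass per amino acid residue, in kDa


def estimate_mass_kda(orf_bp):
    """Approximate protein mass from ORF length (three bases per residue)."""
    residues = orf_bp / 3
    return residues * AVG_RESIDUE_KDA


print(round(estimate_mass_kda(3800), 1))  # 139.3
```

A 3.8-kb ORF therefore corresponds to roughly 139 kDa under these assumptions, in line with the reported size of about 138 kDa.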

Figure 1.

The domain structure of NRPS1 from C. militaris strain 1630 and the gene cluster organization. (a) Comparison of the domain structure of NRPS1 with the NRPSs from other fungi. (b) Genetic syntenies surrounding the NRPS genes. Scale bar: 4 kb. Genes of similar function are represented by the same color; the corresponding functions are listed below the structures. The white arrow indicates a functionally unrelated gene.

The 858-bp fragment was located on a 6.9-kb NRPS coding gene in the strain DSM 1153 genome (Fig. 2a). The gene was named etplP for epipolythiodioxopiperazine (ETP)-like peptide synthetase because many of its surrounding genes showed similarities to genes in the ETP biosynthetic pathway in Leptosphaeria maculans and Aspergillus fumigatus (Fig. 2b). ETP biosynthetic gene clusters are common in Ascomycetes (Patron et al., 2007; Fox & Howlett, 2008) and at least 14 different ETPs from 15 different producing organisms have been predicted (Gardiner et al., 2005). EtplP showed 41% sequence similarity to SirP, which is involved in sirodesmin PL production in L. maculans (Gardiner et al., 2004), and 28% similarity to GliP, which is involved in gliotoxin production in A. fumigatus (Gardiner & Howlett, 2005).

Figure 2.

Modular structure of EtplP from strain DSM 1153 and the corresponding gene cluster. (a) The domain architecture of EtplP and those of the orthologous proteins. The dotted thiolation domain of SirP was not annotated in the original project but was confirmed by blastp. (b) The gene arrangements in the clusters of etplP and its orthologous genes. Scale bar: 4 kb. The genes are colored according to function and abbreviated as follows: T, thioredoxin reductase; I, amino cyclopropane carboxylate synthetase; J, dipeptidase; M and N, methyl transferase; G, glutathione S-transferase; A, transporter; Z, transcriptional regulator; D, prenyl transferase; O, oxidoreductase; Q, S and R, epimerases; C, F, B and E, cytochrome P450 mono-oxygenases; H, acetyl transferase; K, a hypothetical protein.

The 498-bp fragment from strain 1630 was on a 7.5-kb PKS coding gene that showed homology to two genes involved in lovastatin biosynthesis in Aspergillus terreus, i.e. lovB [encoding the lovastatin nonaketide synthetase (LNKS)] and lovF [encoding the lovastatin diketide synthetase (LDKS)] (Hendrickson et al., 1999; Kennedy et al., 1999) (Fig. 3a). The gene was named pks1, and the encoded protein showed 35% similarity to LNKS and 26% similarity to LDKS. Lovastatin was termed monacolin K when isolated from Monascus pilosus (Staunton & Weissman, 2001). A structurally related compound named compactin was isolated from Penicillium citrinum (Abe et al., 2002). Our PKS1 protein showed 36% similarity to both MokA in the monacolin K biosynthesis pathway (Chen et al., 2008) and compactin nonaketide synthase (CNKS) in the compactin biosynthesis pathway.

Figure 3.

Deduced protein structure and genomic context of pks1 from C. militaris 1630 and pks-nrps1 from DSM 1153. (a) The domain organization of PKS1, PKS-NRPS1 (P-N1), LNKS and LDKS from A. terreus, MokA from M. pilosus, CNKS from P. citrinum, EqiS from F. heterosporum, DmbS and TenS from B. bassiana, and FusS from F. moniliforme. The PKS modules and NRPS modules are denoted with different filling colors. Inactive domains are shown as ovals. (b) The genomic contexts of the pks1 gene and the pks-nrps1 gene and their orthologs. Scale bar: 4 kb. The functions of the encoded proteins are given at the bottom of the figure. The white arrows indicate functionally unrelated genes.

The PKS1 protein also showed 37% sequence similarity to the PKS-NRPS hybrid equisetin synthetase (EqiS) in Fusarium heterosporum (Sims et al., 2005). LNKS contains a truncated NRPS module, and the biosynthesis of lovastatin and equisetin shares a common pathway up to the Diels–Alder cyclization of hexaketide (Campbell & Vederas, 2010). Our PKS1 likely catalyzes a similar reaction, but the chain length of the polyketide cannot be predicted. The on-line software sbspks predicts that PKS1 accepts malonic or methylmalonic acid as a substrate, similar to LNKS and LDKS (Campbell & Vederas, 2010). There is a product template (PT) domain between the AT and ACP domains (Schuemann & Hertweck, 2009) controlling the chain length in non-reducing PKSs (Cox, 2007; Liu et al., 2011); however, the chain length determination in highly reduced PKSs, such as LNKS, LDKS and CNKS, is not well understood.

The 760-bp fragment was located on an 11-kb hybrid pks-nrps gene (Fig. 3a). Hybrid gene clusters are widely distributed in Ascomycetes (Collemare et al., 2008). The pks-nrps1 gene encodes a protein that displayed 36% similarity with three proteins: DmbS in the 2-pyridone desmethylbassianin (DMB) biosynthetic pathway (Heneghan et al., 2011), TenS in the tenellin biosynthetic pathway in B. bassiana (Eley et al., 2007), and FusS in the fusarin biosynthetic pathway in Fusarium moniliforme (teleomorph Gibberella moniliformis) (Song et al., 2004). sbspks predicts that malonic acid is the only accepted substrate for the AT domain of PKS-NRPS1. However, due to the highly variable signature sequences in the A domain binding pockets, we could not predict the substrates of all of the NRPSs reported here (Table S3). In the hybrid PKS-NRPS systems, the Dieckmann cyclase domain (also known as the R domain) often mediates product release (Halo et al., 2008; Du & Lou, 2010). Interestingly, the R domain of PKS-NRPS1 showed sequence similarity to the short-chain dehydrogenase/reductase superfamily proteins in TenS, EqiS and DmbS (Halo et al., 2008; Sims & Schmidt, 2008; Heneghan et al., 2011) and therefore potentially mediates product release.

Although PKS-NRPS1 contained an ER domain, it is likely to be inactive because there are three mutations in the reduced nicotinamide adenine dinucleotide phosphate (NADPH)-binding motif (Fig. S1). Although the ER domains of LNKS, TenS and DmbS are inactive, reduction was catalyzed via the trans-acting ERs encoded by lovC, tenC and dmbC, respectively (Eley et al., 2007; Ma et al., 2009; Heneghan et al., 2011). Genes encoding trans-acting ER domains were not detected in the pks1 and pks-nrps1 gene clusters (Fig. 3b). The fungal polyketide chemical structures are determined by the programming of their PKS proteins (Cox, 2007). The low sequence similarities and syntenies of the two gene clusters to known sequences do not allow any speculation on the structure of the product (Fig. 3b); however, the polyketides they produce would likely be unsaturated.

Expression of the genes

None of the cloned genes encoding NRPS and PKS produces a known product; however, all four genes were actively expressed under our experimental conditions (Fig. 4). The pks-nrps1 gene was most actively transcribed, suggesting that it may have an important function in strain DSM 1153 under the studied growth conditions. The two nrps genes were expressed at similar levels in the two different Cordyceps strains (P = 0.43805, paired t-test), and the pks1 gene in strain 1630 was expressed at a relatively low level, which was 19 176-fold lower than that of the tef1 gene. Whether these genes are inducible at other growth stages or under other environmental conditions is an interesting question to address.
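To put the reported fold difference in terms of qPCR cycles, the sketch below converts a fold change into a threshold-cycle difference. It assumes that product doubles each cycle (ideal efficiency), which is an assumption made for illustration; the function name is hypothetical.

```python
import math


def fold_to_cycle_difference(fold):
    """Number of cycles separating two genes, assuming doubling per cycle."""
    return math.log2(fold)


# pks1 was reported as 19 176-fold lower than tef1.
print(round(fold_to_cycle_difference(19176), 2))  # 14.23
```

Under this assumption, pks1 would cross the detection threshold roughly 14 cycles later than tef1.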

Figure 4.

Gene expression level revealed by quantitative real-time PCR. ER, the relative expression level of the target genes with respect to the expression level of tef1. The inset shows the amplification efficiency of the primer pairs under normal PCR conditions. M, 100 bp DNA ladder, only the bands corresponding to 300, 200 and 100 bp are shown; C1, the tef1 control in strain 1630; 1 and 2, the PCR products of nrps1 and pks1, respectively, using the genomic DNA of strain 1630 as the template; C2, the tef1 control in strain DSM 1153; 3 and 4, the PCR products of etplP and pks-nrps1, respectively, using the genomic DNA of strain DSM1153 as template.

Phylogenetic relationship of the two strains

Because the two fungal strains did not share any of the detected NRPS or PKS genes, the phylogenetic relationship of these two strains was then examined. The 1630 strain was originally isolated in China, and the ITS sequence cloned from this strain was identical to that of C. militaris IFO 30377 isolated in Japan and C. militaris CM01 isolated in China (Table S2). The DSM 1153 strain was originally isolated in Japan by Y. Kobayashi (strain K-400) (P. Hoffmann, DSMZ, personal communication), and the ITS sequence from this strain showed 99% similarity to that of C. ninchukispora. The two clades containing strains 1630 and DSM 1153 were well separated on the phylogenetic tree, and the inferred evolutionary distance between the two clades was even greater than that between some other genera (Fig. 5a). Furthermore, the colony morphology, growth rate and structure of the mature conidiophores of the two strains were very different (Fig. 5b). The conidia of strain DSM 1153 were, instead, morphologically indistinguishable from the conidia of C. ninchukispora (Su & Wang, 1986). The chemical compositions of the mycelial extracts (Fig. 5c) and the extracellular secretions from the two strains (Fig. S2) were also very different, supporting the results of the genetic study. Taken together, these results indicate that the two strains are not conspecific, contrary to their original assignment to C. militaris, and should be classified as two different species.

Figure 5.

The phylogeny, morphology and chemical profiles of the two Cordyceps strains. (a) The phylogenetic tree constructed with the ITS sequences from selected Cordyceps species and the fungi discussed in the text. The four decimal posterior probabilities calculated by Bayesian inference are displayed on each branch. The sequences of C. militaris strains 1630 and DSM 1153 are indicated with arrows. The accession numbers of sequences are listed in Table S2. (b) Morphological differences between C. militaris strains 1630 and DSM 1153. The colonies that developed on plates after 15 days are shown on top. Scale bar: 2 cm. The middle panels show the young conidia of the two strains (3 days old), and the bottom panels the mature conidia (6 days old). Scale bar: 10 μm. (c) Part of the chemical profiles of the mycelial extracts from the two strains revealed by HPLC-MS (m/z = 400–650). The base ions of the MS spectra are shown beside each peak.


Four PKS and NRPS coding genes from the two selected Cordyceps strains were identified but none of these genes accounts for the biosynthesis of the published cyclic peptides and polyketides from Cordyceps sensu lato (Paterson, 2008; Molnar et al., 2010; Asai et al., 2012). While preparing this manuscript, a whole-genome shotgun sequencing project of C. militaris CM01 was completed (Zheng et al., 2011); only pks1 was found in the available sequences (accession no. JH126405). Moreover, neither pks1 nor pks-nrps1 contains the previously reported 232-bp KS-domain fragment cloned from C. militaris 5050 (Lee et al., 2001). These findings indicate that C. militaris and related species represent a rich reservoir of novel secondary metabolites. Further exploration of these genes and yet undescribed genes may greatly improve our understanding of the life history of these fungi and the richness of their secondary metabolites.


This work was supported by the MPG-CAS Joint Doctoral Promotion Program (DPP) and the National Natural Science Foundation of China (31170017). We thank D. Spiteller for providing the DSM 1153 strain, laboratory assistance, and helpful discussions. We also thank G. Li, H. Guo and A. Jia for laboratory assistance and C. Wang for helpful discussions. Special thanks are owed to the anonymous reviewers for their valuable comments and suggestions.