Cloning and characterization of the NADPH cytochrome P450 reductase gene (CPR) from Candida bombicola


  • Inge N.A. Van Bogaert,

    1. Laboratory of Industrial Microbiology and Biocatalysis, Department of Biochemical and Microbial Technology, Faculty of Bioscience Engineering, Ghent University, Ghent, Belgium
  • Dirk Develter,

    1. Ecover Belgium NV, Malle, Belgium
  • Wim Soetaert,

    1. Laboratory of Industrial Microbiology and Biocatalysis, Department of Biochemical and Microbial Technology, Faculty of Bioscience Engineering, Ghent University, Ghent, Belgium
  • Erick J. Vandamme

    1. Laboratory of Industrial Microbiology and Biocatalysis, Department of Biochemical and Microbial Technology, Faculty of Bioscience Engineering, Ghent University, Ghent, Belgium

  • Editor: Barbara M. Bakker

Correspondence: Inge Van Bogaert, Laboratory of Industrial Microbiology and Biocatalysis, Department of Biochemical and Microbial Technology, Faculty of Bioscience Engineering, Ghent University, Coupure Links 653, B-9000 Ghent, Belgium. Tel.: +32 9 2646034; fax: +32 9 2646231; e-mail:


Candida bombicola is a yeast with at least two appealing features. The species can grow on alkanes when provided as the sole carbon source, and it produces glycolipids, which have several industrial, cosmetic and pharmaceutical applications. Both metabolic pathways require the activity of cytochrome P450 monooxygenase. This enzyme obtains its reducing equivalents from NADPH cytochrome P450 reductase (CPR). The CPR gene of Candida bombicola was isolated using degenerate PCR and genomic walking. The gene encodes an enzyme of 687 amino acids, which shows homology with known CPRs of other species. The functionality of the gene was proven by heterologous expression in Escherichia coli. The recombinant protein exhibited NADPH-dependent cytochrome c reducing activity. Cloning and characterization of this enzyme is an important step in the study of the cytochrome P450 monooxygenase system of Candida bombicola. The GenBank accession number of the sequence described in this article is EF050789.


Candida bombicola is a nonpathogenic yeast known to synthesize sophorolipids; these comprise a biosurfactant group of commercial interest, and consist of the disaccharide sophorose with a hydroxy fatty acid linked to it (Asmer et al., 1988). This hydroxy fatty acid is derived from a normal fatty acid by means of cytochrome P450 monooxygenase activity. The enzyme system has a strong preference for fatty acids with 16 or 18 carbon atoms, and influences in this way the sophorolipid structure and physicochemical properties. Furthermore, Ca. bombicola is also capable of growing on alkanes when provided as the sole carbon source. The first step in alkane assimilation is its terminal hydroxylation by – again – cytochrome P450 monooxygenase activity. Hence, the cytochrome P450 monooxygenase system plays a crucial role in both alkane degradation and sophorolipid synthesis.

Cytochrome P450 monooxygenases form a small electron transfer chain together with the NADPH cytochrome P450 reductase (CPR). Both enzymes are N-terminally anchored to the endoplasmic reticulum and derived structures (Edwards et al., 1991). CPR is a flavoprotein containing the flavin cofactors FAD and FMN. It transfers the hydride ion of NADPH to the lower redox potential FAD. FAD then transfers single electrons to FMN, which in turn reduces the cytochrome P450 monooxygenase heme center as required to activate molecular oxygen (Nebert & Gonzalez, 1987). Other final electron acceptor proteins are cytochrome b5 (Enoch & Strittmatter, 1979), heme oxygenase (Schacter et al., 1972), squalene epoxidase (Ono et al., 1977), and fatty acid elongase (Ilan et al., 1981). CPR is also capable of reducing cytochrome c in vitro.

The present article describes the cloning and sequence analysis of the CPR gene of Ca. bombicola ATCC 22214. The identity of the gene was demonstrated by its functional expression in Escherichia coli, and was supported by its homology with other eukaryotic CPRs, which was particularly high for the cofactor-binding regions.

CPR plays an important role in supporting cytochrome P450 monooxygenases, and can even be the limiting factor for monooxygenase activity (Pompon et al., 1996). Consequently, cloning and characterization of this enzyme is an important step in the study of the cytochrome P450 monooxygenase systems of Ca. bombicola involved in the initiation of alkane degradation and the biosynthesis of sophorolipids.

Materials and methods

Strains, plasmids and culture conditions

Candida bombicola ATCC 22214 was used for the preparation of genomic DNA. All PCR products intended for sequence analysis were cloned into the pGEM-T vector (Promega). Escherichia coli DH5α was used in all cloning experiments, and the pTrcHis TOPO and pTrcHis TOPO/lacZ vectors and E. coli TOP10 from Invitrogen were used for expression experiments.

Candida bombicola was cultured in medium containing 10% glucose, 1% yeast extract and 0.1% urea. Liquid yeast cultures were incubated at 30°C and 200 r.p.m. Escherichia coli was grown in Luria–Bertani medium (1% tryptone, 0.5% yeast extract and 0.5% sodium chloride) supplemented with 100 mg L−1 ampicillin and 40 mg L−1 5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside (X-gal) if necessary. Liquid E. coli cultures were incubated at 37°C and 200 r.p.m.

DNA isolation and sequencing

Yeast genomic DNA was isolated with the GenElute Bacterial Genomic DNA Kit (Sigma), except that cell lysis was performed by incubation with zymolyase (Sigma) at 30°C for 90 min. Plasmid DNA was isolated with the QIAprep Spin Miniprep Kit (Qiagen). All DNA sequences were determined at the VIB Genetic Service Facility (Belgium).

Degenerate PCR

Part of the CPR gene of Ca. bombicola was amplified using degenerate primers DegFor and DegRev (Table 1). PCR amplification was carried out with an initial denaturation of 94°C for 4 min, 40 cycles of 94°C for 30 s, 50°C for 1 min and 72°C for 2 min with 5 s time increment per cycle, and a final 7-min elongation at 72°C.

Table 1. Primers used for the isolation and cloning of the Candida bombicola CPR gene (all primers were obtained from Sigma Genosys)

Primer    Description                                  Sequence
DegFor    Degenerate primer                            5′-ACH GGW ACB GCH GAR GAY TAY GC-3′
DegRev    Degenerate primer                            5′-GAV GAM GAR ATV GAG TAG TAA CGW GG-3′
Up1       Primary GSP for upstream amplification       5′-GGC GTT GTC AGT AGG CTC TCC ATC ACC-3′
UpN       Nested GSP for upstream amplification        5′-CCA TAT GAT GCC ATG AGG AAG ACA GCA AGA T-3′
Down1     Primary GSP for downstream amplification     5′-CGA TAA GAC CTC GAC TGT GCG TAT ACC TTC-3′
DownN     Nested GSP for downstream amplification      5′-TTG TTC GCA AGC CAT GTG GCC GCG AAG A-3′
TotFor    Primer for isolation of coding sequence      5′-GCC GAT ATT AAT TTT ATC GCT TCG GTC GTT-3′
TotRev    Primer for isolation of coding sequence      5′-GCT ACC AAA CGT CCT CTT GGT ACT-3′

GSP, gene-specific primer.
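The degenerate primers in Table 1 are written with IUPAC ambiguity codes, so each stands for a pool of distinct oligonucleotides. As an illustration only, the following sketch counts the size of that pool for DegFor and DegRev. The function names `clean` and `degeneracy` are assumptions introduced here and are not part of the original work.

```python
from math import prod

# Standard IUPAC nucleotide codes mapped to the bases they represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def clean(primer: str) -> str:
    """Keep only IUPAC letters, dropping the spaces used in Table 1."""
    return "".join(ch for ch in primer.upper() if ch in IUPAC)


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    return prod(len(IUPAC[base]) for base in clean(primer))


# Sequences copied from Table 1 (5' to 3', without the end labels).
DEG_FOR = "ACH GGW ACB GCH GAR GAY TAY GC"
DEG_REV = "GAV GAM GAR ATV GAG TAG TAA CGW GG"

for name, seq in (("DegFor", DEG_FOR), ("DegRev", DEG_REV)):
    print(f"{name}: {len(clean(seq))} nt, degeneracy {degeneracy(seq)}")
```

A high degeneracy means that each individual sequence is present at a low concentration, which may be one reason for the long annealing step and the 40 cycles used in the degenerate PCR.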

Genome walking

The unknown genomic DNA sequences upstream and downstream of the degenerate PCR fragment were identified by genome walking, carried out according to the user's manual of the BD GenomeWalker Universal Kit (BD Biosciences). For both the upstream and downstream sequences, two PCR amplifications – a primary amplification followed by a nested PCR – were performed. Four gene-specific primers (GSPs) were designed, and are listed in Table 1. The PCR reaction mixture and cycles were optimized for use with the Expand Long Template PCR System (Roche Diagnostics), as described by De Maeseneire et al. (2006).

Cloning and expression

The isolation of the complete CPR gene with its upstream and downstream flanking regions was carried out with the two specific primers TotFor and TotRev (Table 1). The fragment was amplified from genomic DNA with the High Fidelity PCR Master kit (Roche Diagnostics), using the following temperature program: initial denaturation at 94°C for 2 min, 10 cycles of 94°C for 10 s, 50°C for 30 s, and 72°C for 2 min, 15 cycles of 94°C for 15 s, 50°C for 30 s and 72°C for 2 min with 5 s time increment per cycle, and a final 7-min elongation at 72°C. The obtained PCR fragment was cloned in pTrcHis TOPO, and a vector with the fragment in the proper orientation for expression was selected after restriction analysis. The resulting vector was called pTopoCPR and was used for CPR expression in E. coli TOP10 cells, following the guidelines of the pTrcHIS TOPO TA Expression Kit user's manual. The pTrcHis TOPO/lacZ control vector, constructed for the expression of a fragment of the lacZ gene, was used as a negative control in the CPR activity assay.

CPR activity assay

Expression in E. coli TOP10 was induced with 1 mM isopropyl thio-β-D-galactoside (IPTG), and from this stage cells were harvested every hour over a time period of 10 h. Cell-free extracts were obtained by enzymatic lysis with EasyLyse Bacterial Protein Extraction Solution (Epicentre). Protein concentrations were determined as described by Bradford (1976). The NADPH cytochrome c reductase activity was measured spectrophotometrically at room temperature with a Uvikom 922 spectrophotometer (BRS). The 1-mL reaction mixture contained an appropriate volume of cell-free extract containing 0.5 mg of protein, 1.3 mM NADP, 3.3 mM glucose 6-phosphate, 0.4 U mL−1 glucose-6-phosphate dehydrogenase, 3.3 mM MgCl2 and 0.95 mg of cytochrome c in 250 mM potassium phosphate buffer. The extinction coefficient for reduced cytochrome c at 550 nm under the described conditions is 21 mM−1 cm−1.
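The extinction coefficient given above allows an absorbance slope to be converted into a specific activity through the Beer–Lambert law. The sketch below shows this conversion under stated assumptions: a 1-cm light path, a linear initial rate, and a hypothetical slope of 0.105 absorbance units per minute that serves only as an example and is not a measured value from this study. The function name `specific_activity` is likewise illustrative.

```python
def specific_activity(delta_a550_per_min: float,
                      protein_mg: float = 0.5,
                      volume_ml: float = 1.0,
                      epsilon_mm: float = 21.0,
                      path_cm: float = 1.0) -> float:
    """Return µmol cytochrome c reduced per min per mg protein.

    epsilon_mm is in mM^-1 cm^-1; path_cm = 1.0 is an assumption,
    as the cuvette path length is not stated in the text.
    """
    # Beer-Lambert: dA/dt = epsilon * path * dc/dt, with c in mM (= µmol per mL).
    rate_mm_per_min = delta_a550_per_min / (epsilon_mm * path_cm)
    # Scale to the assay volume, then normalize to the protein in the assay.
    return rate_mm_per_min * volume_ml / protein_mg


# Hypothetical slope, for illustration only.
example_slope = 0.105
print(f"{specific_activity(example_slope):.4f} µmol min^-1 mg^-1")
```

With the 1-mL assay volume and 0.5 mg of protein described above, the specific activity is simply the absorbance slope divided by 10.5.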


Transformation

Escherichia coli DH5α cells were transformed as described by Sambrook & Russell (2001). Escherichia coli TOP10 was transformed following the guidelines of the pTrcHIS TOPO TA Expression Kit user's manual.

Sequence analysis

Sequences were analyzed with the Clone Manager Professional Suite software (Version 6.0). The BLAST program (Altschul et al., 1997) was used for similarity searches in databases available on the NCBI website. Multiple sequence alignments were made with the ClustalW program (Higgins et al., 1992). Subsequently, a phylogenetic tree was constructed in BioEdit using the protein maximum likelihood (ProML) algorithm.

The hydropathy of CPR protein was analyzed using the method of Kyte & Doolittle (1982). A window size of 19 amino acids was used. Peaks with scores greater than 1.8 indicate possible transmembrane regions.
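The sliding-window procedure described above can be sketched as follows. The hydropathy values are those of the published Kyte–Doolittle scale, and the window size (19) and threshold (1.8) are taken from the text. The function names and the toy sequence are assumptions for illustration; the toy sequence is not the Ca. bombicola CPR sequence.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq: str, window: int = 19):
    """Return (1-based centre position, mean hydropathy) for each window."""
    scores = [KD[aa] for aa in seq.upper()]
    half = window // 2
    profile = []
    for start in range(len(scores) - window + 1):
        mean = sum(scores[start:start + window]) / window
        profile.append((start + half + 1, mean))
    return profile


def candidate_tm_windows(seq: str, window: int = 19, threshold: float = 1.8):
    """Windows whose mean score exceeds the threshold used in the text."""
    return [(pos, score) for pos, score in hydropathy_profile(seq, window)
            if score > threshold]


# Toy sequence: a hydrophobic stretch flanked by polar residues.
toy = "MSDKE" + "LIVALLVFAILLGIVLALL" + "KRDEESNQ"
for pos, score in candidate_tm_windows(toy):
    print(pos, round(score, 2))
```

Windows centred on the hydrophobic stretch score well above 1.8, which is the kind of peak that would be read as a possible transmembrane region.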

Nucleotide sequence accession number

The nucleotide sequence described in this article has been deposited at the GenBank nucleotide database under the accession number EF050789.

Results and discussion

Isolation of the Ca. bombicola CPR gene

The CPR genes of several yeast species of the subphylum Saccharomycotina were compared in a multiple alignment, which revealed several conserved regions corresponding to conserved amino acid sequences previously described by Vandenbrink et al. (1995). Degenerate oligonucleotides DegFor and DegRev were designed on the basis of the highly conserved FMN-1 and FAD-2 binding regions, respectively (Fig. 1; amino acids 69–92 and 447–464 for Ca. bombicola).

Figure 1.

 Nucleotide and deduced amino acid sequence of the CPR gene of Candida bombicola (accession number EF050789). The first 21 amino acids of the putative and partial COX are also indicated. Possible promoter elements are underlined, and sequences presumed to be involved in polyadenylation are in bold. The conserved amino acid sequences of the CPR as described by Vandenbrink et al. (1995) are in bold and underlined.

Degenerate PCR conducted with the primers DegFor and DegRev resulted in the appearance of several amplified products, probably due to the ability of the primers to anneal with various genome fragments encoding proteins with similar cofactor-binding properties. However, on the basis of the CPR sequences of the Saccharomycotina, a fragment ranging from 1133 to 1199 bp was expected for the CPR gene of Ca. bombicola. A 1146-bp fragment was therefore isolated, and subsequent sequence analysis revealed 49% amino acid sequence identity with the CPR of Yarrowia lipolytica.

The regions adjacent to the obtained fragment were cloned using the BD GenomeWalker Universal Kit. As a result, a DNA sequence of 4362 bp was obtained, and was shown to contain the total coding sequence of 2064 bp and a 5′-untranslated region and a 3′-untranslated region of 461 and 1837 bp, respectively.

Characterization of the CPR sequence

No conventional TATA box was found in the 5′-untranslated region. This is, however, not unusual for yeast genes, as only 20% of the promoters are supposed to be regulated by TATA box elements (Basehoar et al., 2004). At position −358, however, the TATA-like structure TATAGTTT was observed. He & Chen (2005) reported a similar sequence 332 bp upstream of the CPR-a gene of Ca. tropicalis. Furthermore, a CAAT box was found at position −37, and the sequence just preceding the translation start site is extremely A-rich, revealing a possible role in transcription initiation. Kozak (1991) described two positions most critical for the function of the ATG initiator codon: a purine (usually A) in position −3, and a G in position +4. Both nucleotides are present in the Ca. bombicola CPR gene.
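The two positions highlighted by Kozak (1991) can be checked programmatically once the coordinate convention is fixed: the A of the ATG is position +1 and there is no position 0, so position −3 lies three nucleotides upstream of that A and position +4 directly follows the G of the ATG. The sketch below is a minimal illustration; the function name `kozak_context` and the test sequences are hypothetical and are not taken from EF050789.

```python
PURINES = {"A", "G"}


def kozak_context(seq: str, atg_start: int) -> dict:
    """Check positions -3 and +4 around an ATG.

    atg_start is the 0-based index of the A of the ATG (position +1).
    """
    seq = seq.upper()
    if seq[atg_start:atg_start + 3] != "ATG":
        raise ValueError("no ATG at the given index")
    if atg_start < 3 or atg_start + 3 >= len(seq):
        raise ValueError("sequence too short to inspect positions -3 and +4")
    return {
        "minus3_purine": seq[atg_start - 3] in PURINES,  # position -3
        "plus4_G": seq[atg_start + 3] == "G",            # position +4
    }


# Illustrative sequences only.
print(kozak_context("GCCACCATGG", 6))
print(kozak_context("TTTTTTATGC", 6))
```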

The complement of the more specific pentanucleotide sequence CACAT is present at position −163. This sequence or its complement ATGTG also occurs in the 5′-untranslated regions of the CPR genes of Ca. maltosa (Ohkuma et al., 1995) and Cunninghamella elegans (Yadav & Loper, 2000), and in the promoter region of cytochrome P450 monooxygenase genes of yeasts (Yadav & Loper, 1999). The consensus sequence of the polyadenylation signal found in most eukaryotic genes is AATAAA, although for yeasts several variations are possible (Guo & Sherman, 1996). In the 3′-untranslated region of the CPR gene of Ca. bombicola, a putative polyadenylation signal sequence (AAAATA) was observed 58 bp downstream of the TAG stop codon.
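Because CACAT and ATGTG are reverse complements of each other, a search for such an element has to consider both strands. The following sketch shows one way to do this; the function names and the toy sequence are assumptions for illustration, and the positions it reports are 0-based indices in the toy sequence rather than coordinates in the Ca. bombicola promoter.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_sites(seq: str, motif: str):
    """Return sorted (0-based start, strand) hits for a motif on either strand.

    '+' marks the motif itself, '-' marks its reverse complement as read
    on the given strand.
    """
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", motif.upper()),
                            ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)
    return sorted(hits)


toy_promoter = "GGATGTGCCCACATTT"  # illustrative sequence only
print(reverse_complement("CACAT"))
print(find_sites(toy_promoter, "CACAT"))
```

The same function can be used to locate a putative polyadenylation signal such as AAAATA downstream of a stop codon, in which case only the '+' hits are of interest.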

There were no indications of the presence of an intron. The CPR ORF is translated into a protein of 687 amino acids with a calculated size of 76.22 kDa and an estimated pI of 5.6.
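The reported lengths can be cross-checked with simple arithmetic: the three regions should add up to the total sequenced fragment, and an intron-free coding sequence that includes the stop codon should encode one residue fewer than its number of codons. The sketch below performs these checks with the values from the text; the variable and function names are illustrative.

```python
# Values reported in the text.
UPSTREAM_BP, CDS_BP, DOWNSTREAM_BP = 461, 2064, 1837
TOTAL_BP = 4362
PROTEIN_AA, MASS_KDA = 687, 76.22


def residues_from_cds(cds_bp: int) -> int:
    """Amino acids encoded by an intron-free CDS that includes the stop codon."""
    if cds_bp % 3:
        raise ValueError("CDS length is not a multiple of three")
    return cds_bp // 3 - 1  # the stop codon encodes no residue


print("regions sum to total:", UPSTREAM_BP + CDS_BP + DOWNSTREAM_BP == TOTAL_BP)
print("residues from CDS:", residues_from_cds(CDS_BP))

# Mean residue mass implied by the calculated protein size, in Da.
mean_residue_da = MASS_KDA * 1000 / PROTEIN_AA
print(f"mean residue mass: {mean_residue_da:.1f} Da")
```

The implied mean residue mass of about 111 Da is close to the commonly used approximation of 110 Da per amino acid, so the reported size is consistent with the ORF length.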

The Kyte–Doolittle hydropathy plot of the CPR amino acid sequence suggests the presence of an N-terminal transmembrane region (data not shown). CPR and its partner enzyme cytochrome P450 monooxygenase are, indeed, both considered to be N-terminally anchored to the endoplasmic reticulum, with the remainder of the enzyme facing the cytoplasmic side of the membrane (Sanglard et al., 1993). The hydrophobic fragments are, in addition, thought to be uncleavable signal sequences targeting the CPR towards the endoplasmic reticulum (Black, 1992). Furthermore, Kargel et al. (1996) stated that the CPR of Ca. maltosa triggers strong proliferation of the membrane system and that the N-terminal signal–anchor sequence is involved in this process.

The protein shows 40–49% amino acid sequence identity with all known and putative CPRs of the Ascomycota. Furthermore, the reductase of Ca. bombicola turns out to be more homologous to those of vertebrates (37–39%) and insects (36–37%) than to the CPRs of higher plants (31–34%). The same trend was also observed by Yadav & Loper (2000) for the fungi Cu. elegans and Cu. echinulata, and is confirmed by the phylogenetic tree for the eukaryotic CPRs (Fig. 2). The tree was constructed on the basis of the protein maximum likelihood principle, and comprises all the 65 CPRs found in the GenBank database. For all species, only one protein was incorporated, and putative or hypothetical proteins were excluded. The sequence of Plasmodium falciparum was used as outgroup, and the tree was rooted against it. The CPR of Ca. bombicola is retrieved on a separate branch in the yeast group. It does not cluster together with the other Candida CPRs. This result agrees with a previous phylogenetic analysis on Candida species (Suzuki et al., 1999), in which Ca. tropicalis, Ca. albicans and Ca. maltosa clustered together and were found in the same group as Saccharomyces cerevisiae, whereas Ca. bombicola belonged to a different group.
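Percentage identity values such as those quoted above depend on how the alignment columns are counted. The text does not state which definition was used, so the sketch below is only one common convention, assumed here for illustration: identical residues divided by the number of aligned columns in which neither sequence has a gap. The function name and the toy alignment are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Identity over columns where both aligned sequences have a residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Toy pairwise alignment, not real CPR sequences.
print(percent_identity("MKV-LA", "MRVQLA"))
```

Other conventions, for example dividing by the length of the shorter sequence or by the full alignment length, give somewhat different percentages for the same alignment.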

Figure 2.

 Phylogenetic tree for eukaryotic CPR rooted against the Plasmodium falciparum CPR. The ln likelihood of the tree is −49 566, and all knobs and branches were statistically significant (P<0.01). The marker bar below denotes the integer branch length.

Despite overall amino acid sequence similarities of <50% between the CPRs of Ca. bombicola and those of other organisms, the domains believed to be involved in the binding of NADPH, FAD and FMN exhibit a very high similarity (Vandenbrink et al., 1995). These highly conserved regions are indicated in Fig. 1.

Putative cytochrome c oxidase assembly factor gene (partial) flanking CPR

In the 1837-bp region downstream of the CPR coding sequence, an ORF of 1456 bp preceded by a possible TATA box was found. The hypothetical gene is nearly complete and encodes a putative partial protein of 485 amino acids that shows homology with the cytochrome c oxidase assembly proteins (COX) of several species (54% amino acid identity with COX15 of S. cerevisiae). Those proteins take part in the synthesis of heme A for cytochrome c oxidases. They display oxidoreductase activity, acting on NADH or NADPH, and are required for the hydroxylation of heme O to form heme A, which is an essential prosthetic group for cytochrome c oxidase.

Thus, like CPR, the putative protein should display oxidoreductase activity, and its putative function is also related to heme-containing proteins. It is not clear whether the colocalization of both genes is a coincidence or an indication of the clustering of related genes. However, in the genomes of other yeasts and higher eukaryotes, such colocalization is not observed.

Heterologous expression of CPR and activity testing

In order to verify the functionality of the proposed CPR gene product, the pTopoCPR expression vector harboring the coding region of the CPR sequence was constructed as described in Materials and methods. pTopoCPR was used to express the protein in E. coli, and heterologous expression was verified by sodium dodecyl sulfate polyacrylamide gel electrophoresis (data not shown). The activity was verified by measuring cytochrome c reductase activity in the cell-free extract (Fig. 3). Time course measurements revealed a clear reductase activity in the extracts of IPTG-induced cells, and the absorption spectra obtained correspond to that of cytochrome c in the reduced state. These results confirm the activity of the protein product of the isolated CPR gene of Ca. bombicola.

Figure 3.

 Activity measurement of the CPR from Candida bombicola by cytochrome c reduction to verify the functionality of the isolated gene. —, cell-free extracts of Escherichia coli expressing CPR 6 h after IPTG induction; …, blank, cell-free extracts of Escherichia coli expressing a fragment of the lacZ gene 6 h after IPTG induction; - - - -, blank, reaction mixture without cell-free extract. (a) Time course of cytochrome c reduction monitored at 550 nm. (b) Absorption spectra obtained after saturation of the reaction.


The authors wish to thank Ecover Belgium NV and the Bijzonder Onderzoeksfonds (BOF) of Ghent University for financial support (grants A05/003 and 01D18604).