Cloning, expression and characterization of CYP102D1, a self-sufficient P450 monooxygenase from Streptomyces avermitilis


Byung-Gee Kim, School of Chemical and Biological Engineering, Seoul National University, Seoul, South Korea
Fax: +82 2 874 1206
Tel: +82 2 880 6774
E-mail: Park, Korea Bio-Hub Center, Bio-MAX Institute, Seoul National University, South Korea
Fax: +82 2 887 2662
Tel: +82 2 87 2654


Among the 33 cytochrome P450s (CYPs) of Streptomyces avermitilis, CYP102D1, encoded by the sav575 gene, is a unique, naturally self-sufficient CYP. Since the native cyp102D1 gene could not be expressed well in Escherichia coli, its expression was attempted using codon-optimized synthetic DNA. The gene was successfully overexpressed and the recombinant CYP102D1 was functionally active, showing a Soret peak at 450 nm in the reduced CO difference spectrum. FMN and FAD isolated from the reductase domain showed the same fluorescence in thin layer chromatography separation as the authentic standards. Characterization of the substrate specificity of CYP102D1 based on NADPH oxidation rate revealed that it catalysed the NADPH-dependent oxidation of saturated and unsaturated fatty acids with very good regioselectivity, similar to other members of the CYP102A family. In particular, CYP102D1 catalysed the rapid oxidation of myristoleic acid with a kcat/Km value of 453.4 ± 181.5 μM−1·min−1. Homology models of CYP102D1 based on other members of the CYP102A family allowed us to alter substrate specificity towards aromatic compounds such as daidzein. Interestingly, the F96V/M246I replacement in the active site increased catalytic activity for daidzein 15-fold, to a kcat/Km value of 100.9 ± 10.4 μM−1·min−1.


Abbreviations: CYP, cytochrome P450; PAH, polycyclic aromatic hydrocarbon


Cytochrome P450 monooxygenases (CYPs) are a super-family of heme-thiolate-containing enzymes that catalyse a variety of chemical reactions in a regioselective and stereoselective manner [1]. These enzymes generally act as monooxygenases, introducing one atom of molecular oxygen into the substrate [2,3], and more than 40 different types of reactions catalysed by CYP enzymes have been identified. Enantio/regioselective monooxidation, epoxidation and hydroxylation reactions performed by CYPs are some of the most desired reactions in modern industry for the development of drugs and biorefinery products, and they are the most common P450-catalysed reactions [4]. In addition, P450-mediated enzymatic oxidation reactions are a good alternative to organic synthesis.

Among the various monooxygenases, non-heme monooxygenases such as Baeyer–Villiger monooxygenase have recently been successfully used in single-step enzyme reactions [5], whereas industrial application of CYPs is still very limited except for one or two CYP reactions in metabolic pathways involved in the synthesis of natural products such as antibiotics and hormones [2]. Major hurdles to the use of CYPs are their low kcat/Km values, poor stability, difficulty in supplying reducing power, and low coupling efficiency with reductases [2]. Generally, P450-catalysed oxidation reactions require sequential inputs of two electrons and two protons ([P450-RH] + 2e− + 2H+ + O2 → P450 + ROH + H2O) to catalyse monohydroxylation of RH and release the hydroxylated product ROH with water. Thus, CYPs require tight interactions with auxiliary proteins that donate and transfer electrons from the cofactor to P450; these systems can be classified into four classes depending on the electron transfer system between the heme domain of CYP and redox proteins [6,7]. In many cases, electron transfer limits their catalytic activity, such that the supply of NAD(P)H and the enhancement of the coupling efficiency of electron transfer from the reductase to P450 by metabolic engineering and protein engineering become major issues [8–11]. For example, Nagamune’s group reported that bacterial P450s fused with electron-transfer-related proteins form a stable hetero-trimeric complex, thereby enabling efficient electron transfer within self-assembled enzymes [12]. Another way to overcome this limitation is to supply electrons directly through modified electrodes on which CYP enzymes are immobilized in the presence of various mediators [13]. One of the strategies for enhancing the coupling efficiency of CYPs is to use a self-sufficient CYP system whose heme domain is fused with the reductase domain in a single polypeptide.
As a result, this type of CYP exhibits the highest turnover frequency (> 1000 min−1) compared with other CYPs [14]. For example, the fatty acid hydroxylase CYP102A1 (so called BM3) from Bacillus megaterium is known to show the highest catalytic activity among the P450s [15]. In addition to CYP102A1, other self-sufficient CYPs belonging to the CYP102A family from various Bacillus species, such as CYP102A2, CYP102A3 from Bacillus subtilis, CYP102A5 from Bacillus cereus and recently CYP102A7 from Bacillus licheniformis, have been extensively studied [14–18]. Generally, the fusion between the heme and reductase domains is known to enhance the efficiency of kcat and/or Km. CYP102A1 catalyses fatty acid hydroxylation at a rate of up to ∼ 17 000 min−1, which is at least two orders of magnitude faster than the observed rates of other fatty acid hydroxylases [19,20].

Streptomyces avermitilis MA4680, a Gram-positive bacterium known as a producer of the anthelmintic avermectin, possesses 33 CYPs in its genome [21]. Among these, CYP102D1 encoded by the sav575 gene is a unique self-sufficient P450. Until now, however, no information about its substrate specificity and kinetic parameters has been reported. Here, we report the cloning, expression and characterization of self-sufficient CYP102D1 from S. avermitilis. We found that it binds a range of long-chain fatty acids as well as some cyclic molecules, as expected, and catalyses their oxidation. Moreover, it catalyses fatty acid hydroxylation at a rate of up to ∼ 2000 min−1, which is at least two orders of magnitude faster than the observed rates of other bacterial class I CYPs [22]. However, one of the major limitations is its substrate specificity. CYP102D1 prefers fatty acids rather than polycyclic aromatic hydrocarbons (PAHs), which are industrially important as pharmaceuticals or as ingredients of anticancer agents, antioxidants and cosmetics [23,24]. Thus, many groups have attempted to obtain enzyme activity towards substrates other than long-chain fatty acids, such as PAHs, via protein engineering [15,25]. For this, the effects of various amino acid residues located in the substrate access channel and active site of CYP102A1 on the thermodynamics, kinetics and dynamics of fatty acid binding and oxidation have been investigated extensively by both site-directed and random mutagenesis [26]. For example, the activities of a triple mutant (A74G/F87V/L188Q) towards naphthalene, fluorene, acenaphthene, acenaphthylene and 9-methylanthracene were improved by up to four orders of magnitude [27]. Further, a triple mutant (R45L/Y51F/M237A) was reported to enhance the oxidation rate of phenanthrene and fluoranthene by up to 200-fold [28].

Interestingly, the substrate preference profiles and kinetic properties of CYP102D1 are similar to those of the CYP102 family enzymes. Moreover, key residues such as Phe87 and Met237, which were previously reported as responsible for controlling substrate specificities, are highly conserved in CYP102D1 [15,29,30]. Previous information and structural fine-tuning to use 4′,7-dihydroxyisoflavone (daidzein), a major phytooestrogen component of soybeans that is often used as an additive supplement in nutritional foods or cosmetics, as a substrate has allowed us to evaluate the relative contributions of these conserved residues to the catalytic process of CYP102D1, thus forming a clearer picture of how CYP102D1 mutants regulate daidzein hydroxylase function.


Sequence alignment of CYP102D1 with the CYP102A subfamily

A BLAST search with the amino acid sequence encoded by sav575 classified the protein into the flavocytochrome P450 (CYP102A) family. The sequence aligned with two distinct domains – a heme domain and a diflavin (FMN/FAD) domain. Most members of the CYP102A family (CYP102A1, CYP102A2, CYP102A3, CYP102A5 and CYP102A7 from Bacillus strains) are fatty acid hydroxylases [14–18]. These five CYP102As (Fig. 1) have a high amino acid sequence identity of 38.7%. In pairwise alignments, CYP102D1 also has a high amino acid sequence identity with CYP102A1, CYP102A2, CYP102A3, CYP102A5 and CYP102A7 of 38.3%, 40.5%, 38.1%, 39.7% and 36.8%, respectively. This high identity suggests that CYP102D1 is a member of the CYP102 family and catalyses oxidation of fatty acids.

Figure 1.

 Sequence alignment of select regions of CYP102D1 with CYP102A family members. (A) Critical amino acid residues in CYP102D1 in the binding pocket and the opening of the substrate access channel; (B) mechanistically important I-helix region; (C) heme-binding region. Distinct differences in the active sites are indicated as boxes.

The most distinct differences in amino acid sequence between CYP102D1 and the CYP102A family members are in the opening of the substrate channel and the substrate binding pocket. The CYP102A family contains a conserved sequence of AXEXGP-IF in the opening of the substrate access channel and FAGDGLFTS-T in the binding pocket of the I-helix. In contrast, CYP102D1 has SXQXPE-LY and YAGAGLFTA-Q at these positions (Fig. 1). Whereas Phe81 of CYP102A1, known as the recognition site of the substrate, is well conserved throughout the CYP102A family, CYP102D1 has Tyr90 at this position instead (Fig. 1). These regions in the sequence correspond to the substrate recognition sites in the 3D structure since the substrate is likely to pass through this region. The amino acid differences may cause the differing substrate specificity and substrate binding affinity of CYP102D1 discussed below.
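The comparison of conserved motifs above can be made concrete with a small identity calculation. The following is a minimal sketch, assuming that identity is counted over aligned, ungapped columns only (other conventions for the denominator exist); the function name is hypothetical and only the nine-residue binding-pocket motifs quoted above are used as input.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over aligned columns, ignoring columns with a gap ('-')."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where both sequences have a residue.
    columns = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Binding-pocket motifs quoted in the text (first nine residues of each).
cyp102a_motif = "FAGDGLFTS"
cyp102d1_motif = "YAGAGLFTA"

print(f"{percent_identity(cyp102a_motif, cyp102d1_motif):.1f}% identical")
```

Applied to full-length pairwise alignments, the same calculation would give figures of the kind reported above, although the exact percentages depend on the alignment program and its gap settings.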

Expression and purification of synthetic CYP102D1 gene in Escherichia coli and spectral features of the CYP102D1 enzyme

Since the original gene encoding CYP102D1 could not be successfully expressed in E. coli cells despite several induction optimization experiments, the gene was chemically synthesized following codon optimization complying with the codon preferences of E. coli (see Experimental procedures). This gene expressed well in E. coli, yielding 39.1 mg·L−1 of CYP102D1. The protein was purified with a Ni–nitrilotriacetic acid (Ni-NTA) column and finally concentrated to yield 0.18 mg·mL−1 (1.5 μM) (Fig. 2A).
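As a consistency check on the two concentrations quoted above, the mass and molar concentrations can be related through the molar mass of the protein. The sketch below only back-calculates the molar mass implied by 0.18 mg·mL−1 and 1.5 μM; the function name is hypothetical and the result is an inference from these two numbers, not a measured value.

```python
def implied_molar_mass(mass_conc_mg_per_ml: float, molar_conc_uM: float) -> float:
    """Molar mass (g/mol) implied by a mass concentration and a molar concentration."""
    grams_per_litre = mass_conc_mg_per_ml      # 1 mg/mL equals 1 g/L
    mol_per_litre = molar_conc_uM * 1e-6       # micromolar to molar
    return grams_per_litre / mol_per_litre


mass = implied_molar_mass(0.18, 1.5)
print(f"implied molar mass: {mass / 1000:.0f} kDa")
```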

Figure 2.

 Expression and UV–visible absorption spectra of CYP102D1. (A) SDS/PAGE analysis of total cell protein content as well as purified protein containing the His6 tag; the lanes, from the left, are molecular weight markers, total cell protein content before IPTG induction, total cell protein content, total soluble protein content and purified protein content, respectively. (B) Oxidized form (low spin) of CYP102D1 (dashed dotted line) was readily reduced by the addition of 10 mg of sodium dithionite (dotted line). Subsequently, carbon monoxide was added to the reduced CYP102D1 solution (0.5 mg·mL−1) followed by the UV–visible spectrum scanning from 350 to 500 nm in order to examine the absorbance Soret peak shift (at 450 nm, inset) (dashed lines) suggesting the proper expression and folding state of activated CYP102D1. The heme Soret band shifted to 392 nm (high spin) upon the addition of myristoleic acid (solid lines) and the broad absorbance shoulder between 450 and 510 nm were typical characteristics for a flavin-containing reductase domain.

UV–visible spectroscopy showed a characteristic absorbance at 450 nm for the carbon monoxide complex of the reduced ferrous state of CYP102D1. The cysteine thiolate ligand coordinated to the heme iron atom of CYP regulates all spectroscopic and catalytic properties. Reduction of the oxidized form (low spin) of purified CYP102D1 (Fig. 2B, dashed dotted line) by the addition of 10 mg of sodium dithionite yielded the reduced ferrous state (Fig. 2B, dotted line). Addition of carbon monoxide shifted the Soret peak to 450 nm (Fig. 2B, dashed lines), indicating that the cysteine thiolate ligand remains intact in activated CYP102D1. Upon the addition of myristoleic acid, the heme Soret band shifted substantially to 392 nm (high spin) (Fig. 2B, solid lines). The broad shoulder between 450 and 510 nm is typical of a flavin-containing reductase domain.

The presence of flavin cofactors was confirmed by releasing FAD and FMN from purified CYP102D1 by boiling for 30 min. After centrifugation, the supernatant showed two spots on TLC at the same Rf values as authentic standards of FAD and FMN. Catalytic activity of the diflavin reductase was determined based on the reduction of cytochrome c as an electron acceptor. CYP102D1 preferred NADPH as an electron donating coenzyme rather than NADH, and its oxidation rate was measured as ∼ 2000 μmol cyt c·min−1·μmol−1 CYP102D1.

The tight binding of saturated and unsaturated fatty acids to CYP102D1 was determined. Binding of fatty acids or cyclic substrates to CYP102D1 induced a shift in the equilibrium of the heme iron state towards a high-spin form, leading to changes in the absorption spectrum of the Soret region. Spectral binding titrations were examined to evaluate the enzyme affinity for each substrate (Fig. S3).

Substrate screening with NADPH oxidation rate

Substrate specificity of CYP102D1 was measured using three types of substrates: saturated fatty acids from C10 to C18, unsaturated fatty acids, and several cyclic compounds. Turnover number was measured by monitoring the substrate-dependent oxidation of NADPH at a saturating concentration (1 mm) of the pyridine nucleotide cofactor (Table 1). Saturated and unsaturated fatty acids were better substrates than cyclic compounds, which is consistent with the designation of CYP102 monooxygenases as fatty acid hydroxylases.
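The turnover numbers in this assay are derived from the rate of NADPH disappearance. A minimal sketch of that conversion is given below, assuming that NADPH is followed at 340 nm with the commonly used extinction coefficient of 6.22 mM−1·cm−1; the detection wavelength, the path length and the example slope are assumptions for illustration and are not taken from the text.

```python
NADPH_EPSILON_340 = 6.22  # mM^-1 cm^-1, standard value for NADPH at 340 nm


def nadph_turnover(slope_abs_per_min: float, enzyme_uM: float,
                   path_cm: float = 1.0) -> float:
    """Turnover in nmol NADPH per min per nmol enzyme (equivalently min^-1).

    slope_abs_per_min: change in A340 per minute (negative as NADPH is consumed).
    enzyme_uM: enzyme concentration in the well or cuvette.
    path_cm: optical path length; microplate wells are usually shorter than 1 cm.
    """
    rate_mM_per_min = abs(slope_abs_per_min) / (NADPH_EPSILON_340 * path_cm)
    rate_uM_per_min = rate_mM_per_min * 1000.0
    return rate_uM_per_min / enzyme_uM


# Hypothetical example: A340 falls by 0.0622 per minute with 0.1 uM enzyme.
print(f"{nadph_turnover(-0.0622, 0.1):.1f} min^-1")
```

Because the path length in a 96-well plate depends on the fill volume, the `path_cm` argument would need to be set to the measured or calculated value for the plate actually used.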

Table 1.   Substrate specificity of CYP102D1. The relative activities are represented by NADPH consumption rates and the substrate concentration was saturating in solution. ND, not determined.
Substrate                      NADPH consumption (nmol NADPH·min−1·nmol−1 CYP102D1)
Saturated fatty acids
  Capric acid (C10)            147.1 ± 4.1
  Lauric acid (C12)            80.3 ± 44.5
  Myristic acid (C14)          127.9 ± 38.6
  Palmitic acid (C16)          342.1 ± 127
  Stearic acid (C18)           626.1 ± 28.4
Unsaturated fatty acids
  Myristoleic acid (14 : 1)    229.5 ± 127
  Palmitoleic acid (16 : 1)    166.3 ± 48.1
  Oleic acid (18 : 1)          138.1 ± 40.6
  Linoleic acid (18 : 2)       228.7 ± 110.7
Cyclic compounds
  Dihydroxybenzoate            ND
  Camphor                      ND
  Indole                       338.9 ± 20.5
  Coumarin                     138.1 ± 4.1
  7-Ethoxycoumarin             ND
  7-Hydroxycoumarin            ND
  Anthracene                   150.9 ± 63.4
  Daidzein                     59.8 ± 11.6

In general, the CYP102A family favours saturated fatty acids with carbon lengths ranging from C14 to C18. Hydroxylation activity gradually decreased with longer or shorter chain lengths. Similarly, CYP102D1 showed the highest NADPH consumption rates towards stearic acid (C18) at 626.1 ± 28.4 min−1, and its activity decreased with longer or shorter chain length fatty acids except for capric acid (Fig. 3).

Figure 3.

 Relative activities comparison against saturated and unsaturated fatty acids based on the NADPH oxidation rates between CYP102A member enzymes and CYP102D1. NADPH oxidation rates were measured in nmol NADPH·min−1·nmol−1 of CYP102D1. A solution containing 160 μL of potassium phosphate buffer (pH 7.0), 10 μL of substrates in dimethylsulfoxide (10–200 μm) and 1.5 μm enzyme solution (0.18 mg·mL−1 enzyme stock) in a 96-well microplate was incubated for 5 min. The reaction was initiated by the addition of 10 μL of 10 mm NADPH.

Although CYP102A family enzymes often favoured unsaturated fatty acids more than threefold over saturated fatty acids (e.g. CYP102A3 catalysed linoleic acid (C18 : 2) oxidation at 1110 ± 60 min−1, but it catalysed oxidation of stearic acid (C18 : 0) at only 370 ± 50 min−1), CYP102D1 favoured saturated long chain fatty acids [18]. Compared with CYP102A7 which was able to catalyse the oxidation of cyclic and acyclic terpenes with high activity and coupling efficiency, however, CYP102D1 showed no activity for dihydroxybenzoic acid, 7-ethoxycoumarin or 7-hydroxycoumarin, suggesting that the active site pocket is not large or hydrophilic enough to accept hydroxylated molecules larger than coumarin [17].

Representative substrates were further characterized by enzymatic kinetics (Table 2). CYP102D1 catalysed the oxidation of stearic acid with a relatively high turnover rate. This kcat number (834.3 ± 75.6 min−1) was lower than that reported for CYP102A1 (∼ 1000 min−1). Although CYP102D1 preferred saturated fatty acids to unsaturated fatty acids in terms of turnover rate, the apparent dissociation constants (Km) for unsaturated acids (myristoleic acid, palmitoleic acid and linoleic acid) were lower than those for saturated fatty acids.

Table 2.   Kinetic parameters for the oxidation of fatty acids catalysed by CYP102D1 and CYP102D1 mutants. Each substrate concentration was from 100 nM to 1 mM depending on solubility. ND, not determined.
Substrate                    Km (μM)         kcat (min−1)      kcat/Km (μM−1·min−1)   Coupling (%)
Lauric acid (C12)            23.9 ± 5.6      576.2 ± 26.3      24.1 ± 5.8             70
Myristic acid (C14)          495.3 ± 41.2    2057.6 ± 103.5    4.2 ± 2.1              80
Palmitic acid (C16)          47.1 ± 12.1     587.9 ± 53.6      12.9 ± 2.24            80
Stearic acid (C18)           8.9 ± 3.6       834.3 ± 75.6      103.7 ± 36.5           > 95
Myristoleic acid (14 : 1)    2.4 ± 1.1       977.5 ± 99.3      453.4 ± 181.5          > 95
Palmitoleic acid (16 : 1)    4.6 ± 2.2       873.8 ± 46.1      224.1 ± 109.4          90
Oleic acid (18 : 1)          46.0 ± 12.5     1154.5 ± 89.6     25.1 ± 4.63            73
Linoleic acid (18 : 2)       16.9 ± 5.1      1194.7 ± 69.7     74.0 ± 19.1            65
Camphor                      101.5 ± 12.3    221.3 ± 10.1      2.2 ± 0.17             ND
Indole                       390.3 ± 22.6    644.0 ± 16.0      1.7 ± 0.54             10
Coumarin                     36.9 ± 5.6      101.4 ± 8.9       2.8 ± 0.18             ND
Anthracene                   48.4 ± 10.6     664.7 ± 10.4      14.2 ± 2.96            ND
Daidzein                     11.54 ± 4.2     71.5 ± 2.1        6.8 ± 2.44             ND
Daidzein by F96V/M246I       4.1 ± 1.2       409.9 ± 114.1     100.9 ± 10.4           40

Contrary to the primary substrate screening results based on NADPH oxidation rates, unsaturated fatty acids, especially myristoleic acid and palmitoleic acid, showed the highest kcat/Km values (453.4 ± 181.5 μm−1·min−1 and 224.1 ± 109.4 μm−1·min−1, respectively) (Fig. 4, Table 2). These high specificity constants were mainly the result of better binding affinity (low Km value) rather than catalytic activity. Among the saturated fatty acids, stearic acid (C18) showed the highest kcat/Km value of 103.7 ± 36.5 μm−1·min−1, which was higher than those of its corresponding unsaturated fatty acids oleic acid (C18 : 1) and linoleic acid (18 : 2), but much lower than those of shorter unsaturated fatty acids such as myristoleic acid (C14 : 1) and palmitoleic acid (C16 : 1). The major reason for these differences was due to differences in the binding affinity of the unsaturated fatty acids. In the case of saturated fatty acids, the longer ones tended to show higher specificity constants, whereas in the case of unsaturated fatty acids the shorter ones showed higher specificity constants.
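The point that a low Km, rather than a high kcat, drives the ranking can be illustrated by recomputing the specificity constants from the Km and kcat columns of Table 2. The following is a sketch under the assumption that the two parameters are independent, so that relative errors add in quadrature; the recomputed ratios need not match the tabulated kcat/Km values exactly, since those were presumably obtained from the original fits.

```python
import math

# (Km in uM, SD) and (kcat in min^-1, SD) for selected substrates from Table 2.
table2 = {
    "stearic acid":     ((8.9, 3.6),   (834.3, 75.6)),
    "myristoleic acid": ((2.4, 1.1),   (977.5, 99.3)),
    "palmitoleic acid": ((4.6, 2.2),   (873.8, 46.1)),
    "oleic acid":       ((46.0, 12.5), (1154.5, 89.6)),
    "linoleic acid":    ((16.9, 5.1),  (1194.7, 69.7)),
}


def specificity_constant(km, kcat):
    """Return (kcat/Km, propagated SD) assuming independent errors."""
    (km_val, km_sd), (kcat_val, kcat_sd) = km, kcat
    ratio = kcat_val / km_val
    relative_sd = math.hypot(km_sd / km_val, kcat_sd / kcat_val)
    return ratio, ratio * relative_sd


# Rank substrates from highest to lowest catalytic efficiency.
ranked = sorted(table2, key=lambda name: specificity_constant(*table2[name])[0],
                reverse=True)
for name in ranked:
    value, sd = specificity_constant(*table2[name])
    print(f"{name:18s} {value:7.1f} ± {sd:6.1f} uM^-1 min^-1")
```

In this recomputation the kcat values of the five substrates span less than a factor of 1.5, whereas the Km values span roughly a factor of 20, so the ordering follows Km almost entirely.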

Figure 4.

 Relative catalytic efficiencies of CYP102D1 towards saturated fatty acids, unsaturated fatty acids and cyclic compounds. Relative activity was based on the catalytic efficiencies, kcat/Km (μM−1·min−1), with myristoleic acid as 100%.

CYP102D1 reaction product identification

Products were identified based on GC/MS/MS fragmentation patterns. Similar to other characterized members of the CYP102A family, CYP102D1 is regioselective in fatty acid oxidation and favours the ω-1 position in the case of unsaturated fatty acids such as myristoleic acid or palmitoleic acid; > 50% of the oxidation occurred at the ω-1 position and < 50% at the ω-2 or ω-3 positions (Table 3). CYP102D1 also hydroxylated stearic acid at the ω-1 position but favoured the ω-2 position (ratio ω-1 : ω-2 : ω-3 ∼ 1 : 2 : 0.07).

Table 3.   Distribution of hydroxylated products of fatty acids for CYP102D1.
Position of fatty acid hydroxylation
Substrate            ω-1 (%)    ω-2 (%)    ω-3 (%)
Stearic acid         36.1       61.4       2.5
Myristoleic acid     51.6       18.1       30.3
Palmitoleic acid     55.8       26.5       17.7
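The ratio quoted for stearic acid can be reproduced from the percentages in Table 3 by normalizing each distribution to its ω-1 share. The short sketch below does this for all three substrates; the helper names are hypothetical.

```python
# Product distributions (omega-1, omega-2, omega-3) in percent, from Table 3.
table3 = {
    "stearic acid":     (36.1, 61.4, 2.5),
    "myristoleic acid": (51.6, 18.1, 30.3),
    "palmitoleic acid": (55.8, 26.5, 17.7),
}
POSITIONS = ("omega-1", "omega-2", "omega-3")


def ratio_to_omega1(distribution):
    """Express a product distribution relative to the omega-1 product."""
    omega1 = distribution[0]
    return tuple(round(share / omega1, 2) for share in distribution)


def major_position(distribution):
    """Name of the position carrying the largest share of product."""
    return POSITIONS[distribution.index(max(distribution))]


for substrate, dist in table3.items():
    print(substrate, ratio_to_omega1(dist), "major:", major_position(dist))
```

For stearic acid this gives 1 : 1.7 : 0.07, in line with the approximate 1 : 2 : 0.07 ratio stated in the text.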

The reaction products of CYP102D1-catalysed oxidation of the cyclic compounds indole, anthracene, coumarin and daidzein were also elucidated by GC/MS. Indole, which showed the highest NADPH oxidation rate among the cyclic compounds, gave two reaction product peaks in the chromatogram, each with mass values of 205 m/z as the parent ion and 190 m/z as a fragment ion (data not shown). The 205 m/z peak was assigned to a trimethylsilylated hydroxyl-indole. The 190 m/z peak corresponded to the [M+ − 15] ion (loss of CH3 radical) from the hydroxyl-indole peak. Since no indigo or indirubin pigment was formed, the hydroxylation positions were expected to be on the benzene ring of indole. Mass fragmentation using ion trap GC/MSn and comparison with authentic compounds identified the hydroxylated products as 5-hydroxyindole and 6-hydroxyindole.

Mutation of CYP102D1 for enhanced catalytic activity with daidzein

Based on previous studies, we next attempted to convert the CYP102D1 fatty acid hydroxylase into a PAH hydroxylase. The rationale was to improve the fit of the substrate by changing the substrate binding region and to expand the substrate range by enlarging the substrate access channel. As the side chains of R47 and Y51 at the entrance of the CYP102A1 active site are expected to anchor the carboxylate groups of fatty acids, both residues have been targeted for mutagenesis [26,28,31]. The double mutant of CYP102A1, R47A/Y51F, bound fatty acids more weakly than the single mutants. To accommodate aromatic substrates near the heme, F87 and M237 of CYP102A1 were replaced by valine and isoleucine, respectively, which have smaller side chains [29,32].

The same rationale was used to increase the activity of the CYP102D1 fatty acid hydroxylase toward aromatic substrates [26,30,31]. Sequence alignment with CYP102A1 identified the residues equivalent to F87 and M237 as F96 and M246 in CYP102D1. The R47/Y51 residues of CYP102A1 are replaced by I55/F59 in CYP102D1, whereas L86/F87/F158/M237 of CYP102A1 are conserved as L95/F96/F170/M246 in CYP102D1. To increase the binding affinity of the isoflavone daidzein for CYP102D1 and to stabilize its binding in the active site, the F96V and M246I mutations were introduced.

Wild-type CYP102D1 showed very low activity for daidzein, with a turnover number of 71.5 ± 2.1 min−1. The F96V mutation increased the turnover to 625.3 ± 25.3 min−1, nearly 10-fold. The F96V/M246I double mutation had an additive effect and increased the kcat/Km value by 15-fold to 100.9 ± 10.4 μm−1·min−1. Whole cell biotransformation of daidzein with the resting E. coli cells expressing CYP102D1 mutants was carried out. The CYP102D1 F96V/M246I mutant converted 100 μm of daidzein into 4.2 μm of 4′,6,7-trihydroxyisoflavone and 8.8 μm of 4′,7,8-trihydroxyisoflavone together (Table 4). Other mutants showed lower conversion than F96V/M246I. The F96V mutant oxidized daidzein at 8% with similar regioselectivity at the 6- and 8-positions. However, L95I/F96V favoured the 6-position (83%), while F96V/M246I favoured the 8-position (70% regioselectivity).
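The fold changes and conversion figures in this paragraph follow from simple ratios of the reported values. A brief sketch of the arithmetic is shown below, using the wild-type and F96V/M246I kcat/Km values from Table 2 and the product concentrations quoted above; variable names are illustrative only.

```python
def fold_change(mutant: float, wild_type: float) -> float:
    """Ratio of a mutant parameter to the wild-type parameter."""
    return mutant / wild_type


# kcat/Km for daidzein: F96V/M246I versus wild type (uM^-1 min^-1).
kcat_km_fold = fold_change(100.9, 6.8)
# Turnover for daidzein: F96V versus wild type (min^-1).
kcat_fold_f96v = fold_change(625.3, 71.5)

# Whole-cell biotransformation with F96V/M246I, starting from 100 uM daidzein.
substrate_uM = 100.0
product_6oh_uM = 4.2   # 4',6,7-trihydroxyisoflavone
product_8oh_uM = 8.8   # 4',7,8-trihydroxyisoflavone
total_product_uM = product_6oh_uM + product_8oh_uM
conversion_pct = 100.0 * total_product_uM / substrate_uM
share_8oh_pct = 100.0 * product_8oh_uM / total_product_uM

print(f"kcat/Km fold change: {kcat_km_fold:.1f}")
print(f"kcat fold change (F96V): {kcat_fold_f96v:.1f}")
print(f"conversion: {conversion_pct:.1f}%  8-position share: {share_8oh_pct:.1f}%")
```

The ratios come out at about 14.8-fold for kcat/Km and 8.7-fold for the F96V turnover, and the 8-hydroxylated product accounts for about 68% of the total product, consistent with the rounded values given in the text.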

Table 4.   Whole cell biotransformation of daidzein with resting E. coli cells expressing CYP102D1 mutants. 100 μm of daidzein was used as the initial concentration of substrate. ND, not determined.
SubstrateConversion (%)

The identification of CYP102D1 F96V/M246I products was performed with GC/MSn (Fig. 6). Mass analysis of daidzein showed a molecular ion mass of 398 m/z, and monohydroxylated products were observed at 486 m/z, resulting in an 88 m/z increase due to incorporation of molecular oxygen (+16 m/z) after BSTFA (N,O-bis(trimethylsilyl)trifluoroacetamide) derivatization (+72 m/z). These values agree with the mass analysis patterns of the corresponding reference chemicals. The major hydroxylated products were identified as 4′,6,7-trihydroxyisoflavone (Fig. 6A, peak 1) and 4′,7,8-trihydroxyisoflavone (Fig. 6A, peak 2), which were monohydroxylated forms of daidzein at ortho positions of the 7-hydroxyl group at the A-ring. Mass spectral analysis revealed that each had a molecular ion mass of 486 m/z after BSTFA derivatization, which was exactly the same as those of the reference compounds (Fig. 6B). As a result, the CYP102D1 mutants oxidized daidzein to either the 6- or 8-monohydroxylated product of daidzein with very high catalytic activity, whereas wild-type CYP102D1 did not show any visible catalytic activity with daidzein. This result demonstrates that our previous understanding of the structure of CYP102A1 can be similarly applied to the protein engineering of CYP102D1, such that changes in the amino acids blocking access of the substrate to the inlet portion of the active site can alter the substrate specificities of CYP102 enzymes and enhance catalytic activity for aromatic compounds such as ortho-hydroxylated daidzein.
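The mass assignments above can be checked with nominal-mass arithmetic: each trimethylsilyl (TMS) group introduced by BSTFA replaces one hydroxyl hydrogen with Si(CH3)3, a net gain of 72. The sketch below assumes the molecular formulas C15H10O4 for daidzein, C15H10O5 for its monohydroxylated products and C8H7NO for hydroxyindole, with every hydroxyl group of the isoflavones silylated and a single TMS group on hydroxyindole (as implied by the observed 205 m/z ion).

```python
NOMINAL_MASS = {"C": 12, "H": 1, "N": 14, "O": 16, "Si": 28}


def nominal_mass(formula: dict) -> int:
    """Nominal (integer) mass of a molecule given its element counts."""
    return sum(NOMINAL_MASS[element] * count for element, count in formula.items())


# Replacing H (1) by Si(CH3)3 (73) adds 72 per derivatized hydroxyl group.
TMS_SHIFT = nominal_mass({"Si": 1, "C": 3, "H": 9}) - NOMINAL_MASS["H"]


def tms_derivative_mass(formula: dict, n_tms: int) -> int:
    """Nominal mass after attaching n_tms trimethylsilyl groups."""
    return nominal_mass(formula) + n_tms * TMS_SHIFT


daidzein = {"C": 15, "H": 10, "O": 4}         # two hydroxyl groups
hydroxydaidzein = {"C": 15, "H": 10, "O": 5}  # three hydroxyl groups
hydroxyindole = {"C": 8, "H": 7, "N": 1, "O": 1}

print(tms_derivative_mass(daidzein, 2))         # substrate, bis-TMS
print(tms_derivative_mass(hydroxydaidzein, 3))  # product, tris-TMS
print(tms_derivative_mass(hydroxyindole, 1))    # hydroxyindole, mono-TMS
```

The 88 m/z difference between substrate and product therefore corresponds to one added oxygen atom (+16) plus one additional TMS group on the new hydroxyl (+72), and the 190 m/z fragment of derivatized hydroxyindole is the 205 m/z ion minus a methyl group (15).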

Figure 6.

 GC/MS analysis of oxidative metabolites of daidzein by CYP102D1 F96V/M246I. (A) GC separation of daidzein and products; two major hydroxylated products were detected at peaks 1 and 2. (B) Mass analysis of daidzein products showed a molecular ion mass of 486 m/z, resulting from an 88 m/z increase due to incorporation of oxygen (+16 m/z) after BSTFA derivatization (+72 m/z). These values agreed well with the mass analysis patterns of the corresponding reference chemicals. The major hydroxylated products were identified as 4′,6,7-trihydroxyisoflavone (A, peak 1) and 4′,7,8-trihydroxyisoflavone (A, peak 2), which were monohydroxylated forms of daidzein at ortho positions of the 7-hydroxyl group at the A-ring.


This is the first report of a self-sufficient P450 from Streptomyces species and its expression and characterization in E. coli using codon-optimized synthetic DNA. Previously, many groups have reported the cloning, expression and characterization of CYP102A family enzymes (self-sufficient P450s) and their various mutants for enhancement of enzyme activities or alteration of substrate specificities [33]. Generally, CYP102A family enzymes are the fastest and most active CYP enzymes [20]. Since they do not need separate redox proteins to deliver electrons to their heme domain, they have higher turnover numbers than class I type CYPs [34]. The CYP102A family enzymes also have the advantages of being highly soluble and readily expressed in E. coli. Although some self-sufficient P450 enzymes have been annotated in the Streptomyces database, many have been neither well characterized nor well studied because of difficulties in their heterologous expression in E. coli. This is mainly due to their high GC content or RNA secondary structure. Although we tried to express the native sav575 gene (CYP102D1) in E. coli, it could not be expressed. By using a synthetic gene with codons optimized for E. coli and an N-terminal His6 tag, the protein could be overexpressed in soluble form.

To prove the existence of the reductase domain, the cofactors in the reductase domain (diflavin molecules) were released from purified CYP102D1. The concentrations of the FAD and FMN solutions were determined spectrophotometrically by recording the absorbance spectrum. The relative quantities of the FAD and FMN cofactors showed that both are present in approximately stoichiometric amounts (0.87 nmol FAD·nmol−1 CYP102D1, 0.97 nmol FMN·nmol−1 CYP102D1) [35]. Since the rate-limiting step for CYP102A1 is electron transfer from FMN to the heme domain, we measured both the NADPH- and NADH-dependent cytochrome c reductase activities of CYP102D1 and found that CYP102D1 preferred NADPH (∼ 2000 nmol cyt c·min−1·nmol−1 CYP) over NADH (∼ 400 nmol cyt c·min−1·nmol−1 CYP) [7,9]. The NADPH-dependent activity involves the transfer of electrons from FAD to FMN to cytochrome c, and CYP102D1 showed an activity one-third that of CYP102A1 (∼ 6000 nmol cyt c·min−1·nmol−1 CYP) [36]. However, the specific cytochrome c reductase activity of CYP102A1 varies depending on pH, ionic strength and enzyme concentration [34].

CYP102D1 also catalysed the oxidation of fatty acids similarly to other CYP102A enzymes, and it closely resembled the CYP102A family enzymes in the regiospecificity of hydroxylation (near the terminus of the fatty acid). Further, as chain length increased, the NADPH oxidation rate with saturated fatty acids was enhanced. However, overall specific activity was somewhat lower in CYP102D1 for the substrates studied. The NADPH oxidation rate of CYP102D1 with saturated fatty acids was 1.5–3 times higher than with unsaturated fatty acids of the same carbon length. However, the overall catalytic efficiency for unsaturated fatty acids such as myristoleic acid (Km 2.44 μM) or palmitoleic acid (Km 4.59 μM) was relatively high owing to their very high affinity.

CYP102D1 had a similar active site structure to that of CYP102A1, and the residues reported as being important in determining substrate specificity were conserved in CYP102D1. Most interestingly, all of them were equally conserved both at the channel entrance and at the active site around the heme centre. For example, Arg47 and Tyr51 are located at the substrate access channel, where they act as anchors for the carboxylate group of fatty acids, and the phenyl side chain of Phe87 extends into the heme pocket and is positioned above the porphyrin plane. The Phe87 position controls reaction selectivity. For example, the F87V variants markedly increase activities with farnesol showing a 5.7-fold increase in catalytic activity compared with wild-type CYP102A1. F87V variants also accept PAHs, while wild-type does not [26,29].

Overlapping models of CYP102A1 and CYP102D1 revealed that they have structural similarities in the substrate binding site, but that CYP102D1 has a narrower and longer binding site. Sequence alignment and structural comparison identified Leu95, Phe96, Phe170 and Met246 as residues close to the heme that restrict the binding and positioning of the substrate (Fig. 5A). Therefore, these residues were substituted with amino acids with smaller side chains to allow the daidzein substrate better access to the heme centre. Among the single and double mutants, F96V/M246I showed the highest turnover rate, as was the case for Phe87/M237 in CYP102A1. The structural model of CYP102D1 revealed that the phenyl ring was located toward the heme centre and positioned to block the approach of aromatic compounds, thus restricting appropriate positioning of the substrate. F96 lies within a van der Waals radius of the heme and is hypothesized to control heme access and determine regioselectivity and substrate specificities by controlling substrate lateral mobility along the longitudinal axis of the channel. Replacement of F96 with valine yielded a mutant protein that catalyses daidzein oxidation.

Figure 5.

 Computer simulation of CYP102D1 structure model and CYP102D1–indole complex. (A) Important residues of substrate binding sites and the opening of the substrate access channel were identified and compared with those of CYP102A1 by superimposing their residues. (B) Orientation of CYP102D1 F96V/M246I–daidzein complex revealed that the average distances between the heme centre and the hydroxylation position were 2.33 and 2.71 Å. Replacement of F96V revealed the proper docking position of daidzein in the active site.

CYP102A family enzymes have often been used for protein engineering as they are self-sufficient, fast and easy to handle [9]. However, the substrates for wild-type CYP102A enzymes are mostly fatty acids. CYP102D1 has a broader substrate range than the CYP102A family enzymes since it catalyses the oxidation of bulky cyclic compounds. We selected cyclic compounds to show the oxidation capacities of CYP102D1, but other bulky compounds could be their oxidation targets. This new, fascinating self-sufficient enzyme has the potential to overcome limitations such as specific regioselectivity or narrow substrate specificity. Further research will focus on protein engineering and structural analysis based on comparative structure–function studies, as well as screening substrates that can be converted into more valuable molecules by CYP102D1.

Experimental procedures


All chemicals used in this study, including the substrates (fatty acids and cyclic compounds) purchased from Sigma Aldrich (St Louis, MO, USA), were of analytical grade or higher quality. BSTFA for derivatization prior to GC/MS analysis was purchased from Fluka (Buchs, Switzerland).

Phylogenetic analysis

Amino acid sequences were aligned with the clustalw2 program via the European Bioinformatics Institute website. Alignments were visualized with bio-edit software.

Codon optimization and gene synthesis for heterologous expression in E. coli

The CYP102D1 gene was chemically synthesized after codon optimization complying with the codon preferences of E. coli (Bioneer Inc., Taejeon, South Korea). Correctly assembled DNA was cloned into the pET-24ma(+) expression vector (Pasteur Institute, Paris, France) [37], and the plasmid was transformed into E. coli BL21(DE3). The transformant was grown in Terrific Broth medium containing 50 μg·mL−1 of kanamycin at 37 °C until an A600 nm of 0.6 was reached. Isopropyl-thio-β-d-galactopyranoside (IPTG) was then added to a final concentration of 0.01 mm along with 0.25 mm δ-aminolevulinic acid as a heme precursor. After IPTG induction, cells were grown at 30 °C for 12 h, harvested by centrifugation and washed twice with ice-cold NaCl/Pi buffer. The cells were resuspended in 50 mm potassium phosphate buffer (pH 7.2) and used for subsequent biotransformation.

Purification and concentration measurement of CYP102D1

E. coli cells containing CYP102D1 were grown as described above, resuspended in 5 mL of sonication buffer composed of 10 mm Tris/HCl (pH 7.0), 2 mm EDTA, 1 mm phenylmethylsulfonyl fluoride and 0.01% (v/v) 2-mercaptoethanol, and disrupted by sonication. The soluble fraction was collected by centrifugation and then purified on Ni-NTA resin using a Ni-NTA His-tag purification kit (Qiagen Korea Ltd, Seoul, Korea). The Ni-NTA bound enzymes were washed twice with 50 mm potassium phosphate buffer (pH 7.0) containing 500 mm NaCl and 50 mm imidazole, and enzymes were eluted with the same buffer containing 200 mm imidazole. Finally, imidazole and sodium chloride were removed by dialysis against 50 mm potassium phosphate buffer (pH 7.0). Purified proteins were subjected to SDS/PAGE and spectrophotometric analysis in order to measure CO binding activity [1]. UV absorption spectra of CO-bound recombinant CYP proteins after sodium dithionite reduction were measured by UV–visible spectrometry (Thermo Labsystems, Rochester, NY, USA) by scanning wavelengths from 400 to 500 nm. CYP102D1 protein concentration was estimated using reduced CO versus reduced difference spectra as developed by Omura and Sato [38]. The CYP102D1 protein content was determined using an extinction coefficient of 91.9 mm−1·cm−1 at 450 nm.
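The concentration estimate from the CO difference spectrum is a direct application of the Beer–Lambert law. The following is a minimal sketch, not the authors' own calculation: it uses the extinction coefficient quoted above (91.9 mm−1·cm−1, i.e. per millimolar per centimetre), while the function name, the optional 490 nm baseline reading, the 1 cm path length and the dilution factor are illustrative assumptions.

```python
# Sketch: P450 concentration from a reduced CO-minus-reduced difference spectrum.
# The extinction coefficient is the value quoted in the text; everything else
# (baseline wavelength, path length, dilution factor) is an assumption.

EPSILON_450 = 91.9  # mM^-1 cm^-1, from the text


def p450_concentration_um(delta_a450, delta_a490=0.0, path_cm=1.0, dilution=1.0):
    """Return the P450 concentration of the undiluted sample in micromolar.

    delta_a450 -- difference absorbance at the 450 nm peak
    delta_a490 -- difference absorbance at a baseline wavelength (assumed 490 nm)
    path_cm    -- optical path length in cm (assumed 1 cm cuvette)
    dilution   -- fold dilution applied before the measurement
    """
    if path_cm <= 0:
        raise ValueError("path length must be positive")
    delta = delta_a450 - delta_a490
    conc_mm = delta / (EPSILON_450 * path_cm)   # Beer-Lambert: c = A / (eps * l)
    return conc_mm * 1000.0 * dilution          # mM -> uM, corrected for dilution


if __name__ == "__main__":
    # Hypothetical reading, for illustration only.
    print(round(p450_concentration_um(0.1838), 3))
```

With the hypothetical reading of 0.1838 absorbance units, the sketch returns 2 μm, since 0.1838/91.9 = 0.002 mm.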

Chromatographic determination of flavin and reductase domain activity

NADPH cytochrome P450 reductase in the self-sufficient P450 binds two flavin cofactors, FMN and FAD, per molecule of enzyme. To release flavin compounds from the reductase, the protein was denatured at 60 °C for 60 min and isolated as described by Liu et al. [35]. Using both FMN (riboflavin 5′-monophosphate) and FAD (flavin adenine dinucleotide) as references, separation on a silica gel plate was carried out by TLC analysis. The TLC plate was visualized by exposure to UV light (254 nm). The concentrations of the FAD and FMN solutions were determined spectrophotometrically, by recording the absorbance spectrum. The extinction coefficients of 11.3 mm−1·cm−1 at 450 nm (FAD) and 12.2 mm−1·cm−1 at 446 nm (FMN) were used. Finally the relative quantity of each flavin molecule was determined.

Substrate screening and kinetic assays

NADPH oxidation was measured by the decrease in UV absorption at 340 nm using a UV–visible spectrometer (Thermo Labsystems, Rochester, NY, USA). A solution containing 160 μL of potassium phosphate buffer (pH 7.0), 10 μL of substrates in dimethylsulfoxide (10–200 μm) and 1.5 μm enzyme solution (0.18 mg·mL−1 enzyme stock) in a 96-well microplate was incubated for 5 min. The reaction was initiated by the addition of 10 μL of 10 mm NADPH. For the determination of kinetic parameters, all data were fitted to the Michaelis–Menten equation by linear regression. For the reference sample, NADPH consumption rates were measured without substrate.
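The assay above can be illustrated with a short calculation. This is a sketch under stated assumptions rather than the procedure actually used: the text does not say which linearization was applied, so a Lineweaver–Burk (double reciprocal) plot is assumed here, and the NADPH extinction coefficient of 6.22 mm−1·cm−1 at 340 nm, the optical path length and all function names are assumptions not given in the source.

```python
# Sketch: NADPH oxidation rates from A340 slopes and a linearized
# Michaelis-Menten fit. Assumptions: epsilon(NADPH, 340 nm) = 6.22 mM^-1 cm^-1,
# a user-supplied path length, and a Lineweaver-Burk linearization.

EPSILON_NADPH_340 = 6.22  # mM^-1 cm^-1 (assumed literature value)


def nadph_rate_um_per_min(slope_a340_per_min, path_cm):
    """Convert an A340 slope (negative while NADPH is consumed) to uM/min."""
    return -slope_a340_per_min / (EPSILON_NADPH_340 * path_cm) * 1000.0


def substrate_dependent_turnover(rate_with, rate_without, enzyme_um):
    """Subtract the substrate-free reference rate and normalize per enzyme."""
    return (rate_with - rate_without) / enzyme_um  # min^-1


def linear_fit(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def lineweaver_burk(substrate_um, rates):
    """Fit 1/v = (Km/Vmax)(1/[S]) + 1/Vmax; returns (Km, Vmax)."""
    slope, intercept = linear_fit([1.0 / s for s in substrate_um],
                                  [1.0 / v for v in rates])
    vmax = 1.0 / intercept
    return slope * vmax, vmax


if __name__ == "__main__":
    # Synthetic, noise-free data generated with Km = 20 uM, kcat = 100 min^-1.
    conc = [10.0, 25.0, 50.0, 100.0, 200.0]
    kcat_obs = [100.0 * s / (20.0 + s) for s in conc]
    km, kcat = lineweaver_burk(conc, kcat_obs)
    print(round(km, 3), round(kcat, 3), round(kcat / km, 3))
```

Double reciprocal plots weight the low-concentration points heavily, so with noisy data a direct nonlinear fit would usually be preferred; the linear form is shown only because the text specifies linear regression.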

Determination of the KD value of CYP102D1 with each substrate

Substrate binding spectra were measured as previously described by Ost et al. [39]. Purified CYP102D1 was diluted to 1.0 μm in 100 mm potassium phosphate buffer (pH 7.2), and binding spectra were recorded after each substrate was added to the enzyme solution. The final substrate concentrations were between 2.5 and 100 μm. To calculate the KD value of the enzyme–substrate complex, the experimental data were fitted to a hyperbolic curve (Vmax[S]/(KD + [S])) using a nonlinear regression procedure based on the Durbin–Watson algorithm in the sigma plot software (Systat Software Inc., San Jose, CA, USA).
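A hyperbolic binding curve of this kind can be fitted with very little code. The sketch below is not the sigma plot procedure used in the study: it assumes the model ΔA = ΔAmax[S]/(KD + [S]), solves for the amplitude ΔAmax in closed form at each trial KD, and locates the best KD by a golden-section search on a logarithmic scale. The function names, search bounds and example data are hypothetical.

```python
# Sketch: least-squares fit of dA = dAmax * [S] / (KD + [S]).
# For a fixed KD the best dAmax has a closed form, so only KD is searched.
import math


def best_amplitude(conc, resp, kd):
    """Least-squares dAmax for a given KD (model is linear in dAmax)."""
    frac = [s / (kd + s) for s in conc]
    return sum(y * f for y, f in zip(resp, frac)) / sum(f * f for f in frac)


def sum_sq_error(conc, resp, kd):
    """Residual sum of squares at a given KD with the optimal amplitude."""
    amp = best_amplitude(conc, resp, kd)
    return sum((y - amp * s / (kd + s)) ** 2 for s, y in zip(conc, resp))


def fit_kd(conc, resp, kd_lo=1e-3, kd_hi=1e4, tol=1e-10):
    """Golden-section search over log10(KD); returns (KD, dAmax).

    The bounds are assumptions and must bracket the true KD.
    """
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = math.log10(kd_lo), math.log10(kd_hi)
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc = sum_sq_error(conc, resp, 10.0 ** c)
    fd = sum_sq_error(conc, resp, 10.0 ** d)
    while b - a > tol:
        if fc < fd:            # minimum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = sum_sq_error(conc, resp, 10.0 ** c)
        else:                  # minimum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = sum_sq_error(conc, resp, 10.0 ** d)
    kd = 10.0 ** ((a + b) / 2.0)
    return kd, best_amplitude(conc, resp, kd)


if __name__ == "__main__":
    # Synthetic, noise-free data (KD = 15 uM, dAmax = 0.08) over the
    # 2.5-100 uM concentration range described in the text.
    conc = [2.5, 5.0, 10.0, 25.0, 50.0, 100.0]
    resp = [0.08 * s / (15.0 + s) for s in conc]
    kd, amp = fit_kd(conc, resp)
    print(round(kd, 3), round(amp, 4))
```

Eliminating the amplitude in closed form leaves a one-dimensional search, which avoids the need for starting guesses of both parameters.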

Resting cell assay for biotransformation of substrates

Cells were resuspended in 1 mL of potassium phosphate buffer (100 mm, pH 7.0) in deep well plates (dome type) containing daidzein dissolved in MeOH/dimethylsulfoxide [50 : 50 (v/v)] to a final concentration of 100 mm. Whole cell reactions were initiated by addition of the substrate and run at 37 °C in a high speed incubator (2.8 g, Bioneer) for 24 h. The reaction was quenched by the addition of 1 mL of ethyl acetate to the reaction plate, followed by vigorous vortexing. The mixtures were centrifuged at 10 000 g for 10 min, after which the upper (organic) layer containing any remaining substrate and products was collected and evaporated under vacuum (BioTron, Bucheon-Si, Kyungki-Do, South Korea). The sample was subsequently applied to GC/MS for structural and quantitative analyses.

Structural analysis of products using GC/MS

For structural analysis of fatty acids, samples extracted with chloroform were evaporated to dryness, dissolved in 100 μL of a methanol/toluene/sulfuric acid mixture (50 : 50 : 2 v/v/v) and incubated at 55 °C overnight for ester formation. The same volume of 0.5 m NH4CO3 and 2 m KCl solution was added to the samples, and the mixture was subjected to centrifugation at room temperature, after which the supernatant was transferred to a new vial for derivatization. To protect hydroxyl groups, the reaction products were derivatized using BSTFA by heating at 60 °C for 60 min. GC/MS was performed using a Finnigan MAT system (Gas chromatograph model GCQ, HP 19091j-433) connected to an ion trap mass detector. The BSTFA derivatives were separated on a non-polar capillary column (5% phenylmethylsiloxane capillary, 30 m × 250 μm inner diameter, 0.24 μm film thickness, HP-5) using a linear temperature gradient (for fatty acids, at 100 °C 0.5 min, 5 °C·min−1 to 200 °C, hold for 5 min, 10 °C·min−1 to 250 °C, hold for 3 min; for cyclic compounds, at 60 °C 1 min, 30 °C·min−1 to 250 °C, hold for 10 min, 1 °C·min−1 to 275 °C, hold for 3 min). The injector port temperature was 250 °C. Mass spectra were obtained by electron impact ionization at 70 eV, and scan spectra were obtained within the range 100–600 m/z. Selected ion mode was used for the detection and fragmentation analysis of major products.
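The two oven programmes are easier to compare when written out as data. The sketch below encodes the ramps and holds exactly as listed above and derives the total run time of each programme; the data layout and function name are illustrative assumptions.

```python
# Sketch: GC oven programmes from the text as data, with total run time.
# Each programme: (initial temperature in C, initial hold in min,
#                  [(ramp rate in C/min, target temperature in C, hold in min), ...])

FATTY_ACID_PROGRAMME = (100.0, 0.5, [(5.0, 200.0, 5.0), (10.0, 250.0, 3.0)])
CYCLIC_PROGRAMME = (60.0, 1.0, [(30.0, 250.0, 10.0), (1.0, 275.0, 3.0)])


def total_run_time_min(programme):
    """Sum the initial hold, every ramp duration and every subsequent hold."""
    temperature, total, ramps = programme[0], programme[1], programme[2]
    for rate, target, hold in ramps:
        if rate <= 0 or target < temperature:
            raise ValueError("ramps must be positive and heat upward")
        total += (target - temperature) / rate + hold
        temperature = target
    return total


if __name__ == "__main__":
    print(total_run_time_min(FATTY_ACID_PROGRAMME))          # 33.5
    print(round(total_run_time_min(CYCLIC_PROGRAMME), 2))    # 45.33
```

The fatty acid programme therefore takes 33.5 min (0.5 + 20 + 5 + 5 + 3) and the cyclic compound programme about 45.3 min (1 + 6.33 + 10 + 25 + 3) per injection, excluding oven cool-down.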

Homology modelling and docking simulation of CYP102D1

Homology modelling of CYP102D1 was performed using 3hf2a from the PDB database as a template. The predicted 3D structure was constructed based on a comparative homology modelling method using the genefold and composer programs from the sybyl software package (Tripos Associates, St Louis, MO, USA). The initial state model was energy minimized by the conjugate gradient method at an energy gradient norm of 0.01 kcal·mol−1. Docking simulations were performed using the flexx program. All ligand structures were minimized using a Tripos force field until the rms gradient decreased below 0.05.


This research was supported by a National Research Foundation of Korea grant funded by the Korean government (MEST) (No. 20090083035) and World Class University programme (R322009000102130).