Adaptation of intronic homing endonuclease for successful horizontal transmission

Authors


T. Ohama, Department of Environmental Systems Engineering, Kochi University of Technology (KUT), Tosayamada, Kochi 782–8502, Japan
Fax: +81 887 572520
Tel: +81 887 572512
E-mail: ohama.takeshi@kochi-tech.ac.jp

Abstract

Group I introns are thought to be self-propagating mobile elements, and are distributed over a wide range of organisms through horizontal transmission. Intron invasion is initiated through cleavage of a target DNA by a homing endonuclease encoded in an open reading frame (ORF) found within the intron. The intron is likely of no benefit to the host cell and is not maintained over time, leading to the accumulation of mutations after intron invasion. Therefore, regular invasional transmission of the intron to a new species at least once before its degeneration is likely essential for its evolutionary long-term existence. In many cases, the target is in a protein-coding region which is well conserved among organisms, but contains ambiguity at the third nucleotide position of the codon. Consequently, the homing endonuclease might be adapted to overcome sequence polymorphisms at the target site. To address whether codon degeneracy affects horizontal transmission, we investigated the recognition properties of a homing enzyme, I-CsmI, that is encoded in the intronic ORF of a group I intron located in the mitochondrial COB gene of the unicellular green alga Chlamydomonas smithii. We successfully expressed and purified three types of N-terminally truncated I-CsmI polypeptides, and assayed the efficiency of cleavage for 81 substrates containing single nucleotide substitutions. We found a slight but significant tendency that I-CsmI cleaves substrates containing a silent or tolerated amino acid change more efficiently than nonsilent or nontolerated ones. The published recognition properties of I-SpomI, I-ScaI, and I-SceII were reconsidered from this point of view, and we detected proficient adaptation of I-SpomI, I-ScaI, and I-SceII for target site sequence degeneracy. 
Based on the results described above, we propose that intronic homing enzymes are adapted to cleave sequences that might appear at the target region in various species; however, such adaptation becomes less prominent in proportion to the time elapsed after intron invasion into a new host.

Abbreviations
cob, apocytochrome b; nt, nucleotide(s); ORF, open reading frame

Various molecular phylogenetic analyses suggest that group I introns in fungi and terrestrial/nonaquatic plants were horizontally transmitted multiple times in the course of evolution among distantly related species [1–3]. We have shown this is also the case for algal mitochondrial introns [4,5]. For reasons yet unknown, the distribution of group I introns is strongly biased, most commonly found in fungi (e.g. the cox-1 of Podospora anserina contains 15 group I introns [6]). About half of group I introns contain an open reading frame (ORF) that encodes a DNA sequence specific endonuclease (intronic homing enzyme). These intronic homing enzymes cleave a target sequence that is usually 16–30 base pairs (bp) long and nonpalindromic (reviewed in [7]). Cleavage of the chromosome initiates repair of the damaged DNA through homologous recombination. Consequently, after the repair, the donor intronic DNA is copied into the recipient chromosome. Thus, homing endonucleases are essential for horizontal transmission of group I introns. Organelle introns are highly likely of no benefit to the host, i.e. they are thought to be selfish and parasitic elements that spread in populations. Therefore, when they integrate into the host genome, there is little or no selection for maintaining endonuclease function. Moreover, if there is any cost to the host cell for producing a functional endonuclease, then selection will work to fix the nonfunctional element. Therefore, regular horizontal transmission of an intron to a new species before its functional deterioration seems essential for its evolutionary long-term persistence. As an example, comprehensive analyses of the group I intron omega (also known as Sc LSU.1), which was first found in the Saccharomyces cerevisiae mitochondrial large subunit rRNA gene, clearly showed repeated horizontal transmissions, and the interval between the complete loss and reinvasion of the intron is estimated to be about 5.7 million years [8]. 
This leads to the hypothesis that intronic homing enzymes might be adapted to recognize variously degenerated target sequences among a wide range of organisms.

In addition to intronic homing enzymes, highly specific endonuclease activity is also detected among inteins, which are thought to be parasitic elements that exhibit horizontal transmission. Regular invasional transmission is likely essential for both homing introns and inteins. In fact, for the target site of the intein homing endonuclease PI-SceI, which is found in the Saccharomyces cerevisiae vacuolar membrane H+-ATPase, all nine nucleotides essential for cleavage were mapped to the conserved first and second codon positions, and target sequence variations at third codon positions were tolerated by the endonuclease [9]. On the other hand, the adaptations that permit efficient horizontal transfer of intronic homing enzymes have not been analyzed. To date, only three intronic homing enzymes that target a sequence within protein-coding genes have been investigated systematically for their recognition sequence ambiguity: I-SpomI, which is encoded in an intronic ORF of a group I intron in the Schizosaccharomyces pombe COXI gene [10,11]; I-ScaI, in the COB gene of Saccharomyces capensis [12,13]; and I-SceII, in the COXI gene of Saccharomyces cerevisiae [14–16]. To address this question, we investigated the recognition sequence of I-CsmI, including its degeneracy. I-CsmI is a homing enzyme encoded in the group I intron (named alpha or Cs cob.1) located in the apocytochrome b (COB) gene of the unicellular alga C. smithii [17]. This enzyme has the typical two LAGLIDADG motifs. The intronic ORF is probably translated as a fusion protein with the preceding exon, and the N-terminal peptide may be proteolytically removed to yield an active form, as seen in I-SpomI [18]. Endonuclease activity of I-CsmI has been observed through artificial interspecific cell fusion between intron-bearing C. smithii and C. reinhardtii, which lacks the intron in its COB gene [19].
However, systematic analysis of the target sequences and of the homing endonuclease's enzymatic properties has not previously been attempted. We overproduced several N-terminally truncated I-CsmI polypeptides in Escherichia coli, and determined cleavable target sequences through an in vitro assay of substrates containing 81 different point mutations.

Based on the analyses of I-CsmI and these three intronic homing enzymes, we discuss the adaptation for successful horizontal transfer. Investigations performed for the intronic homing enzymes that have a recognition sequence in ribosomal RNA genes are less informative to answer our questions and are not considered in this paper.

Results

Activity of the N-terminal truncated I-CsmI polypeptides

Three N-terminally truncated I-CsmI polypeptides [I-CsmI(200), I-CsmI(217), and I-CsmI(237); the number in parentheses indicates the number of amino acids encoded in the ORF] were purified, yielding about 6 mg of protein per 1 g wet weight of E. coli, while the entire I-CsmI ORF [i.e. I-CsmI(374)], which contains the upstream COB exon, was not expressed even after several modified conditions were tested (Fig. 1). We assayed the endonuclease activity of recombinant I-CsmI(200), I-CsmI(217), and I-CsmI(237) using linearized pCOB1.8Kb as a substrate. I-CsmI(200) is the smallest homing endonuclease containing two LAGLIDADG motifs analyzed to date. It is even smaller than the type II restriction enzyme EcoRI [20], which is a 277 amino acid homodimer that cleaves a symmetric six base restriction site. Recombinant proteins I-CsmI(200) and I-CsmI(237) cleaved the substrate at the expected target site, yielding two fragments of 1.2 kb and 3.7 kb in size. For I-CsmI(217), the quantity of protein was reduced from 1.5 to 1.0 µg and the incubation period was shortened from 24 to 6 h to reduce the amount of insoluble reaction products. Under these modified conditions, I-CsmI(217) exhibited sequence-specific endonuclease activity.

Figure 1.

Schematic of open reading frames that encode whole I-CsmI or N-terminally truncated I-CsmI polypeptides. I-CsmI is denoted as a fusion protein with the preceding apocytochrome b gene exon encoded polypeptide. The target sequence of I-CsmI and the bordering intron sequences are shown in upright and italicized characters, respectively. Asterisks show the position of the LAGLIDADG motifs. a.a, amino acid residues.

To determine the optimal conditions for endonuclease activity, we tested the effect of Na+ and Mg2+ concentration, pH, and temperature. The optimal pH for all three proteins was around 7.0 (Fig. 2A). The optimal Na+ and Mg2+ concentrations were 25 mM and 5 mM, respectively, for both I-CsmI(237) and I-CsmI(217) (Fig. 2B,C). In contrast, 75 mM Na+ and 10 mM Mg2+ were optimal for I-CsmI(200). The optimal reaction temperature was 35 °C for both I-CsmI(200) and I-CsmI(237), and 30 °C for I-CsmI(217) (Fig. 2D). A higher concentration of Mg2+ was progressively detrimental to all I-CsmI polypeptides. The presence of Mg2+ was essential for the endonuclease activity as a cofactor, while the same concentration of Mn2+ (5 mM) reduced the enzyme activity to 15%, and no activity was observed with 5 mM of Zn2+, Ca2+, or Co2+ (data not shown).

Figure 2.

Effects of pH, Mg2+, Na+ and temperature on the substrate cleavage reaction using recombinant homing enzyme I-CsmI polypeptides. The conditions used to assay enzyme cleavage were as described in Experimental procedures. ◆, reaction with recombinant protein I-CsmI(237); ▲, I-CsmI(217); ■, I-CsmI(200). Vertical axis of each graph (A–D) shows relative activity. The electrophoresis patterns of substrate cleavage by I-CsmI(200) are shown in (A′–D′). Each lane in the agarose gel corresponds to a specific condition denoted on the abscissa shown above the graph. An arrowhead denotes the position of the original substrate, while arrows show the cleaved substrates.

Kinetic parameters of I-CsmI(200)

We determined the kinetic parameters of I-CsmI(200) based on the data obtained by time course monitoring of the cleaved products in various concentrations of the linearized substrate pCOB1.8Kb. The Km, Vmax, and kcat were 2.5 × 10^−9 M, 1.8 × 10^−12 M·s^−1, and 4.7 × 10^−4 s^−1, respectively. These parameters were similar to those of other representative intronic LAGLIDADG homing endonucleases (e.g. I-CeuI [21], I-SceIV [22], I-DmoI [23,24]), which show high affinity for the substrate DNA and slow turnover (Table 1).

Table 1.  Kinetic properties of intronic LAGLIDADG endonucleases. n.d., Not determined.
                              I-CsmI                I-CeuI             I-SceIV                    I-DmoI
Km                            2.5 × 10^−9 M         0.9 × 10^−9 M      0.14–0.77 × 10^−9 M        4 × 10^−9 M
Vmax                          1.8 × 10^−12 M·s^−1   n.d.               0.9–1.5 × 10^−10 M·s^−1    n.d.
kcat                          4.7 × 10^−4 s^−1      3.7 × 10^−5 s^−1   3–6 × 10^−4 s^−1           8.3 × 10^−3 s^−1
Number of motifs per peptide  Two                   One                Two                        Two

Essential target region

Digestion was not observed using pC-18nt and pC-20nt, while almost complete cleavage was observed for pC-24nt by I-CsmI(200). This suggests that the recognition region of I-CsmI(200) resides between 12 nt upstream (−) and 12 nt downstream (+) of the intron insertion site, while 10 nt upstream and 10 nt downstream are insufficient for recognition.

Cleavage point and mutational analysis of cleavable sequences

The precise cleavage site on each strand was determined through DNA sequencing of the substrate whose termini were blunt-ended by T4 DNA polymerase treatment. It became clear that cleavage occurs five nt downstream of the intron insertion site on the coding strand and one nt downstream of the insertion site on the noncoding strand, creating 3′ overhangs of four nt (Fig. 3). This terminal overhang is typical for DNA cleaved by LAGLIDADG homing enzymes.
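The overhang arithmetic can be checked with a minimal sketch. It assumes that "n nt downstream" means the break lies immediately after position +n, and it rebuilds the downstream sequence from the codons quoted elsewhere in the text (CAA, ATG, TCT, TTC); the function name and the coordinate convention are illustrative, not taken from the original methods.

```python
# Codons quoted in the text, in reading order: Trp-Gly | Gln-Met-Ser-Phe.
# "|" marks the intron insertion site; position +1 is the first base after it.
DOWNSTREAM = "CAAATGTCTTTC"  # positions +1 .. +12 of the coding strand


def overhang(coding_cut: int, noncoding_cut: int) -> tuple[str, str]:
    """Return (overhang type, protruding bases read on the coding strand).

    Each cut is the number of nt downstream of the insertion site, taken here
    to mean that the break lies immediately after that position (assumption).
    """
    if coding_cut == noncoding_cut:
        return "blunt", ""
    low, high = sorted((coding_cut, noncoding_cut))
    bases = DOWNSTREAM[low:high]
    # A coding strand cut further downstream leaves its 3' end protruding.
    kind = "3' overhang" if coding_cut > noncoding_cut else "5' overhang"
    return kind, bases


kind, bases = overhang(coding_cut=5, noncoding_cut=1)
print(kind, len(bases), bases)
```

Under these assumptions the two cuts are four positions apart, which matches the four nt 3′ overhang reported above.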

Figure 3.

Mutational analyses of the recognition efficiency by recombinant homing enzymes I-CsmI(200) and I-CsmI(217). The coding sequence of C. reinhardtii cob and the assigned amino acids are shown on top. Bases corresponding to the codon third position are shown with underline. The three possible base substitutions for each position are indicated to the left side. An arrowhead indicates the intron insertion site. An arrow with a dotted line shows the cleavage site of the noncoding strand, while an arrow with solid line denotes the cleavage site for the coding strand. The numbering is in relation to the intron insertion site. ‘+ + +’, substrate cleavage above the wild-type levels (more than 150%); ‘+ +’, cleavage almost the same or slightly less than the wild-type levels (120–80%); ‘+’, cleavage below the wild-type levels (50–20%); ‘–’ almost no cleavage (less than 10%); ‘/’, position of the wild-type nucleotide. N/D; not determined.

Eighty-one variants (104 bp each) containing single nt substitutions between −12 and +15 were assayed to discern the critical nucleotides involved in recognition. Positions −5 through −3, +2, and +6 through +8 are strictly recognized by I-CsmI(200) and I-CsmI(217), as the original bases are essential for cleavage, while any substitution was permitted at positions −12 through −6 and +12 through +15 (some examples of cleavage patterns are shown in Fig. 4). The majority of substitutions that blocked substrate cleavage were between −5 and +11 in relation to the intron insertion site. Therefore, the span of critical bases is not centered at the intron insertion site, but is spread almost symmetrically with respect to the cleavage points of the coding and noncoding strands. Substrate cleavability was classified into four groups (+++, ++, +, and –; see Experimental procedures for details) and is summarized in Fig. 3. With I-CsmI(200), 26% (8%), 21% (26%), 15% (14%), and 38% (51%) of the substrates fell into the classes +++, ++, +, and –, respectively [the results for I-CsmI(217) are shown in parentheses]. I-CsmI(200) and I-CsmI(217) showed almost identical sequence recognition properties (Fig. 3). A prominent difference in cleavage efficiency was observed for only two substitutions, those replacing the original G at position −2 with A or T. I-CsmI(217) did not cleave these mutated substrates, whereas I-CsmI(200) cleaved both, the G to A mutant more efficiently than the G to T mutant (Fig. 3).

Figure 4.

Cleavage pattern of linearized substrates containing single base substitutions by I-CsmI(200). The numbering is in relation to the intron insertion site, with ‘+’ indicating downstream, followed by the nucleotide that is the original base at the given position, while the nucleotide denoted below shows the base after substitution. M.W., 20 bp molecular mass marker ladder. An arrowhead indicates the position of substrate DNA (104 bp), while arrows indicate the positions of cleaved substrate (60 and 44 bp). W, substrate DNA containing the Chlamydomonas reinhardtii wild-type cob sequence.

Correlation between the type of amino acid substitution and cleavage efficiency

We analyzed whether there is any correlation between the type of amino acid substitution induced by a single nt substitution (silent/tolerated change, or nonsilent/nontolerated change) and how efficiently the substrates are cleaved by the two kinds of N-terminally truncated I-CsmI polypeptides. A survey of GenBank registered sequences of various organisms showed that the target DNA sequences of I-CsmI, I-SpomI, I-SceII, and I-ScaI correspond to the amino acid sequences WGQMS(F/H), TGWT(A/V)PPL, FGHPEV, and W(G/A)TVI, respectively. Therefore, F/H, A/V, and G/A amino acid changes at the specific sites were regarded as functionally tolerated in this investigation.

Forty-eight substrates containing single nt substitutions at the core recognition region (between −5 and +11) were analyzed from this point of view.

Substrates containing a silent or tolerated amino acid change

Seven of 48 substrates contained a silent amino acid change. However, two of these seven substrates [containing TCT(Ser) changed to TCA or TCG(Ser), mutation position +9 in Fig. 3] were not cleaved at all by I-CsmI(217) or I-CsmI(200), and additionally the substrate containing the change GGC(Gly) to GGG(Gly) (position −1) was not cut by I-CsmI(217), even though these silent changes must be tolerated in nature. On the other hand, three silent substrates [TCT(Ser) to TCC(Ser), position +9; GGC(Gly) to GGT/GGA(Gly), position −1] were cut efficiently by the two I-CsmI polypeptides. Additionally, CAA(Gln) to CAG(Gln) (position +3) was efficiently cut by I-CsmI(200).
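The count of silent substitutions in the core region can be reproduced with a short sketch. It assumes the standard genetic code applies to this gene, that the target codons are those quoted in the text (TGG, GGC, CAA, ATG, TCT, TTC), and that position numbering skips zero at the intron insertion site; the helper names are hypothetical.

```python
from itertools import product

BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
# Standard genetic code (assumed to apply to this mitochondrial gene).
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Codons quoted in the text: Trp Gly | Gln Met Ser Phe ("|" = insertion site).
CODONS = ["TGG", "GGC", "CAA", "ATG", "TCT", "TTC"]
FIRST_POSITION = -6  # position of the first base of TGG; there is no position 0


def positions():
    """Yield (position, codon index, offset within codon) for every base."""
    pos = FIRST_POSITION
    for i in range(len(CODONS)):
        for j in range(3):
            yield pos, i, j
            pos += 1
            if pos == 0:  # numbering jumps from -1 to +1
                pos = 1


def classify_core(start=-5, end=11):
    """Split all single-nt substitutions in [start, end] into silent/nonsilent."""
    silent, nonsilent = [], []
    for pos, i, j in positions():
        if not start <= pos <= end:
            continue
        codon = CODONS[i]
        for base in BASES:
            if base == codon[j]:
                continue
            mutant = codon[:j] + base + codon[j + 1:]
            same = CODON_TABLE[mutant] == CODON_TABLE[codon]
            (silent if same else nonsilent).append((pos, codon, mutant))
    return silent, nonsilent


silent, nonsilent = classify_core()
print(len(silent), len(nonsilent))
for pos, codon, mutant in silent:
    print(pos, codon, "->", mutant)
```

The tolerated Phe/His exchange does not enter this count because no single nt substitution converts TTC into a His codon.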

Substrates containing a nonsilent or nontolerated amino acid change

Forty-one of 48 substitutions caused nonsilent/nontolerated amino acid changes. Showing an adaptation to the possible target DNA sequences, the I-CsmI polypeptides cleaved most of them only slightly (Table 2). This property is also prominent in I-SpomI and I-ScaI. However, the substrates carrying TAA(Stop) instead of CAA(Gln) (position +1), or TGC(Cys) or TCC(Ser) instead of TTC(Phe) (position +11), were efficiently cleaved by both I-CsmI enzymes, even though these codons are not observed at these positions in nature. In contrast, none of the nonsilent/nontolerated substitutions was cleaved efficiently by I-ScaI (Table 2).

Table 2.  Type of amino acid substitution contained in the substrate and the cleavage efficiency. Efficiently cleaved: more than 80% of the wild-type substrate for I-SpomI and I-CsmI, and more than 78% for I-SceII; for I-ScaI, substrates originally described as ‘mutant cleaved as well as the wild type’. Moderately cleaved: 80–30% of the wild-type substrate for I-SpomI and I-CsmI, and 60–42% for I-SceII; for I-ScaI, substrates originally described as ‘reduced cleavage’. Not or scarcely cleaved: less than 30% of the wild-type substrate for I-SpomI and I-CsmI, and 33% for I-SceII; for I-ScaI, substrates originally described as ‘no cleavage’.
Type of substitution / Homing endonuclease        Efficiently cleaved %   Moderately cleaved %   Not or scarcely cleaved %
Silent or tolerated amino acid changes
  I-SpomI                                         67 (4/6)                33 (2/6)               0 (0/6)
  I-ScaI                                          13 (1/8)                88 (7/8)               0 (0/8)
  I-SceII                                         100 (7/7)               0 (0/7)                0 (0/7)
  I-CsmI(217)                                     43 (3/7)                14 (1/7)               43 (3/7)
  I-CsmI(200)                                     57 (4/7)                14 (1/7)               29 (2/7)
Non-silent or non-tolerated amino acid changes
  I-SpomI                                         16 (3/19)               21 (4/19)              63 (12/19)
  I-ScaI                                          0 (0/22)                32 (7/22)              68 (15/22)
  I-SceII                                         28 (9/32)               44 (14/32)             28 (9/32)
  I-CsmI(217)                                     7 (3/41)                10 (4/41)              83 (34/41)
  I-CsmI(200)                                     15 (6/41)               15 (6/41)              71 (29/41)
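The counts in Table 2 can be compared across the two substitution types with a small sketch. The text does not state which statistical test was applied, so the one-sided Fisher exact test below is an illustrative choice and not the authors' method; the dictionary layout and function names are hypothetical.

```python
from math import comb

# (efficient, moderate, not/scarcely cleaved) counts copied from Table 2.
TABLE2 = {
    "I-SpomI":     {"silent": (4, 2, 0), "nonsilent": (3, 4, 12)},
    "I-ScaI":      {"silent": (1, 7, 0), "nonsilent": (0, 7, 15)},
    "I-SceII":     {"silent": (7, 0, 0), "nonsilent": (9, 14, 9)},
    "I-CsmI(217)": {"silent": (3, 1, 3), "nonsilent": (3, 4, 34)},
    "I-CsmI(200)": {"silent": (4, 1, 2), "nonsilent": (6, 6, 29)},
}


def fisher_one_sided(a, b, c, d):
    """P(X >= a) for the 2x2 table [[a, b], [c, d]] with fixed margins."""
    row1, col1, total = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    favourable = sum(
        comb(col1, k) * comb(total - col1, row1 - k)
        for k in range(a, upper + 1)
    )
    return favourable / comb(total, row1)


def efficient_vs_rest(enzyme):
    """Test whether silent changes are enriched among efficiently cleaved ones."""
    s = TABLE2[enzyme]["silent"]
    n = TABLE2[enzyme]["nonsilent"]
    return fisher_one_sided(s[0], sum(s[1:]), n[0], sum(n[1:]))


for enzyme in TABLE2:
    s, n = TABLE2[enzyme]["silent"], TABLE2[enzyme]["nonsilent"]
    print(f"{enzyme}: silent {s[0]}/{sum(s)}, nonsilent {n[0]}/{sum(n)}, "
          f"p = {efficient_vs_rest(enzyme):.3g}")
```

Collapsing the moderate and not-cleaved columns into one group is a simplification made only for this sketch.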

Discussion

The original I-CsmI ORF is fused with the preceding exon, which is not rare for group I intronic ORFs. The entire ORF of I-SpomI also extends into the upstream exon of the COXI gene, and it has been reported that the N-terminally truncated polypeptide, including the two LAGLIDADG motifs, has a sequence specificity similar to that detected using mitochondrial extracts [11]. Considering the above, we tried to overproduce three kinds of N-terminally truncated recombinant I-CsmI polypeptides that retain the two LAGLIDADG motifs instead of the entire I-CsmI (374 amino acids) (Fig. 1), because we failed to express the whole I-CsmI ORF for reasons that are unclear. We found that all of the N-terminally truncated I-CsmI polypeptides retain the specificity to cleave the target site, and the kinetic parameters of I-CsmI(200) are very similar to those reported for representative intronic homing enzymes with LAGLIDADG motifs (Table 1). The optimal conditions of selected factors were also very similar to those of other homing enzymes, with the exception of the preferred pH. I-CsmI displayed its highest activity at pH 7.0, which is very close to the reported physiological pH value of 7.5 in yeast mitochondria [25], while most of the LAGLIDADG enzymes show their highest activity at an alkaline pH between 8.5 and 9.5 (e.g. I-AniI [26], and between pH 8.5 and 9.0 for the recombinant I-ScaI [13]). A host pH that is lower than the optimum pH observed for many homing enzymes may act to reduce endonuclease activity and prevent overdigestion of the genomic DNA.

The optimal Na+ and Mg2+ concentrations for I-CsmI(200) are clearly shifted higher than those for I-CsmI(217) and I-CsmI(237) (Fig. 2B,C). This suggests that the three-dimensional conformation of this enzyme differs from that of the others, possibly because of its shorter N-terminal region, and may explain the differences in cleavage activity between I-CsmI(200) and I-CsmI(217). I-CsmI(200) seems to tolerate a higher degree of sequence ambiguity than I-CsmI(217) at position −2, because I-CsmI(200) can efficiently cleave the mutated substrates −2A and −2T (instead of the original −2G), while I-CsmI(217) only tolerates the original base −2G (Fig. 3).

Cleavage of a target DNA is an essential step for lateral transfer of an intron. Therefore, if a homing enzyme shows very stringent recognition of the target core sequence, this step could be a bottleneck for horizontal transmission of an intron. The target site of I-CsmI corresponds to the amino acid sequence Trp-Gly-Gln-Met-Ser-(Phe/His). This is a highly conserved region in COB genes among a wide range of organisms. Our systematic introduction of point mutations and the cleavage assay showed a clear tendency for I-CsmI polypeptides to cleave substrates containing a silent change more efficiently than those containing a nonsynonymous/nontolerated change (Table 2).

It is obvious that stop codons are never tolerated in the internal regions of a gene. However, our systematic introduction of point mutations created stop codons, i.e. the TGA and TAG stop codons from TGG(Trp), and the TAA stop codon from CAA(Gln). The substrate DNA that contains TGA or TAG was not cleaved, while the substrate containing a TAA stop codon was efficiently cleaved by both I-CsmI polypeptides (Fig. 3). Moreover, substrates including a codon that is highly likely to appear in nature were not cleaved [e.g. TCA/TCG(Ser) from TCT(Ser), and three Ile codons AT(T/C/A) from ATG(Met)]. The above instances indicate that the recognition property of I-CsmI is not well adapted to recognize target sequences that are highly likely to appear in nature.

It is possible that the recognition properties of I-SpomI, I-ScaI, and I-SceII are adapted to recognize multiple possible target sequences, because these homing enzymes cleaved substrates containing various kinds of silent/tolerated amino acid changes efficiently, and none of these substrates remained uncleaved (Table 2).

Considering the above, we propose that homing enzymes are adapted to recognize diverse target sequences to facilitate horizontal transmission to a new species, as evidently seen with I-SpomI, I-ScaI, and I-SceII. However, immediately after a successful invasion, mutations begin to accumulate that lead to a loss of further adaptation, because homing endonuclease activity is only essential for intron invasion and thereafter it is useless to the cell. Invasion of I-CsmI might be evolutionarily older than the other three homing enzymes compared in this study, because I-CsmI showed the least adapted properties among the four. Actually, remnants of homing endonuclease ORFs that include frame shifts or stop codons within the ORF are frequently found (e.g. [4]). Comprehensive analysis of omega homing endonuclease and its associated group I intron revealed that it is more common to find an inactive intron/ORF combination than it is to find an active intron/ORF combination or an intron-less allele [8].

It has been shown that some intronic homing enzymes are bifunctional. They work not only as an endonuclease but also as a maturase to preserve splicing. The bifunctional activity of I-SpomI [18], I-ScaI [12], and I-AniI [27] has been observed. I-CsmI could also be a bifunctional protein that acts as a maturase, which may also preserve its endonuclease activity for horizontal transmission. These bifunctional enzymes are regarded as intermediates, and are likely to lose their endonuclease activity over time, retaining only their maturase activity [4,26].

Experimental procedures

Cloning and expression of wild type and N-terminally truncated I-CsmI ORFs

The entire COB gene and the alpha intron were amplified by PCR using total C. smithii (CC-1373) DNA as a template. We also used PCR to isolate the wild-type 374 amino acid I-CsmI ORF [i.e. ORF(374)] and three N-terminally truncated ORFs, ORF(200), ORF(217) and ORF(237) (the number in parentheses indicates the number of amino acids encoded in the ORF). These four ORFs have different N-termini but share the common wild-type stop codon. The two sets of primers used to amplify the original I-CsmI ORF(374) and ORF(237) contained XhoI sites at their tails. Forward primers containing an NdeI site and reverse primers containing an FbaI site were used to amplify ORF(200) and ORF(217). After restriction enzyme digestion, the ORF(374) and ORF(237) PCR products were cloned into the XhoI site of pET19b (Novagen, CA, USA) in frame with a sequence encoding the 10-histidine tag, while ORF(200) and ORF(217) were cloned into the NdeI/BamHI site of pET15b (Novagen, CA, USA) in frame with a His6 tag. The resulting plasmids were amplified in E. coli DH5α, and E. coli BL21 CodonPlus (DE3) RIL (Stratagene, CA, USA) was used for protein expression.

Expression and purification of whole or truncated I-CsmI polypeptides

Cultures of cells carrying the whole or truncated ORFs were grown at 37 °C in 2.0 L of LB broth containing 100 µg·mL^−1 ampicillin and 34 µg·mL^−1 chloramphenicol until D600 = 0.6. Protein expression was induced by addition of isopropyl thio-β-d-galactoside (0.1 mM final). The cells were incubated at 30 °C for an additional 4 h, collected by centrifugation, resuspended in 40 mL of sonication buffer [50 mM Hepes (pH 7.0), 400 mM NaCl, 6 mM 2-mercaptoethanol, and 20 µg·mL^−1 lysozyme] and sonicated on ice. The lysate was centrifuged for 2 h at 10 000 g and the supernatant was loaded onto a Ni-NTA column (5 mL bed volume) (Qiagen, CA, USA) that had been equilibrated with the wash buffer [50 mM Hepes (pH 7.0), 400 mM NaCl, 6 mM 2-mercaptoethanol, and 10 mM imidazole]. The column was washed with 50 mL of the wash buffer, and the protein was eluted with 100 mL of the elution buffer [50 mM Hepes (pH 7.0), 400 mM NaCl, 6 mM 2-mercaptoethanol, and 200 mM imidazole]. Homogeneity was assessed by SDS/PAGE followed by staining with Coomassie brilliant blue R-250. The products of ORF(200), ORF(217), ORF(237), and ORF(374) were named I-CsmI(200), I-CsmI(217), I-CsmI(237) and I-CsmI(374), respectively.

Reaction conditions to estimate the minimum target-site length

Substrate DNA

Chemically synthesized DNA fragments, which consist of 18, 20, or 24 nt symmetrically spanning the alpha intron insertion point of the C. reinhardtii COB gene, were cloned into the EcoRV site of pCITE-4a(+) (Novagen, CA, USA). These plasmids were named pC-18nt, pC-20nt, and pC-24nt (the number indicates the length of the inserted DNA fragment). The plasmids were first linearized by ScaI digestion, and then used as a substrate to determine the region encompassing the recognition sequence.

Reaction conditions

The linearized substrate (1.5 µg) described above was added to 50 µL of a reaction mixture containing 50 mM Hepes (pH 7.0), 0.01% bovine serum albumin, 1 mM dithiothreitol, 25 mM NaCl, 5 mM MgCl2, and about 1 µg of recombinant homing enzyme I-CsmI(237). The reaction was carried out at 25 °C for 24 h and 10 µL was loaded onto a 0.8% agarose gel to resolve the products.

Reaction to determine the cleavage point and its terminal shape

We determined the terminal shape of the substrate following the T4 DNA polymerase method of Nishioka et al. [28]. pC-24nt (2.0 µg) digested with I-CsmI(237) was recovered from a 0.8% agarose gel by electro-elution and then treated with T4 DNA polymerase (Takara Bio, Kyoto, Japan) in the presence of 0.2 mM dNTPs. The DNA mixture was then treated with T4 DNA ligase (Takara Bio) for self-ligation and transformed into E. coli. Nucleotide sequence analysis of the plasmid was performed to determine the nature of cohesive termini generated by I-CsmI(200).

Reaction conditions used to investigate the effect of Na+, divalent cations, pH, and temperature

Substrate DNA fragment

A 1.8 kb DNA fragment, containing the entire COB gene of C. reinhardtii (CC-124) and its flanking regions, was cloned into pT7-Blue2 vector (Novagen, CA, USA) and named pCOB1.8Kb. After linearization by NotI, the plasmid was used as a substrate for the reaction described below.

Reaction mixture

A 50 µL reaction mixture [25 mM NaCl, 5 mM MgCl2, 1 mM dithiothreitol, 0.01% (v/v) bovine serum albumin, 50 mM Tris/HCl (pH 7.0)] was used, which contained 0.5 µg of linearized pCOB1.8Kb and 1.0 µg of I-CsmI(217), or 1.5 µg of I-CsmI(200) or I-CsmI(237). One of the parameters [i.e. pH, NaCl concentration, species of divalent cations (5 mM), MgCl2 concentration, or the temperature] in the reaction was altered to determine optimal conditions. The reagents used to make the buffers of specific pH values were as follows: Mes for pH 6.0, Hepes for pH 7.0, Tris for pH 8.0 and 9.0, TAPS for pH 10.0. The reaction was incubated for 24 h with I-CsmI(237) and I-CsmI(200), and for 6 h with I-CsmI(217), which reduced the formation of aggregates observed with this protein. The reaction products were resolved in a 0.8% agarose gel, and stained with ethidium bromide. The relative quantities of the digested fragments were calculated using the NIH Image program version 1.61.

Assay of cleavable DNA sequences

A limited part of the C. reinhardtii COB gene, which is 104 nt long and contains the I-CsmI target sequence, was chemically synthesized and converted to double-stranded DNA. This double-stranded DNA fragment was used as a control to compare the cleavage efficiency of various substrates containing single mutations. Each of the 27 nucleotides composing the target site was changed to the other three possible nucleotides utilizing PCR primers containing a specific mutation. These 81 DNA fragments, each containing a single point mutation, were used for a detailed analysis of substrate cleavage. One hundred and fifty nanograms of each substrate was digested with 1 µg of I-CsmI(200) in the reaction mixture [50 mM Hepes (pH 7.0), 0.01% (v/v) bovine serum albumin, 1 mM dithiothreitol, 25 mM NaCl, 5 mM MgCl2] at 30 °C for 8 h. Electrophoresis of the samples was performed on a 3% agarose gel, which was stained with 10 000-fold diluted SYBR Green I dye (Molecular Probes, OR, USA) for 40 min (SDS/heat-treatment of samples before electrophoresis, described below, was omitted for a clearer image, without affecting the results). The image was developed using a LAS-1000 image analyzer (Fuji Film Co., Tokyo, Japan). The cleavage ratio, i.e. cleaved vs. uncleaved fragments, was quantified by NIH Image and compared to wild-type substrate cleavage (i.e. the substrate carrying the native C. reinhardtii cob sequence). The 81 substrates were grouped into four classes based on the following: (a) a substrate cleaved much better than the control (the cleavage ratio of mutated substrate vs. control is more than 1.5) is denoted as +++; (b) a substrate cleaved as well as the control (i.e. the ratio is between 1.2 and 0.8) is denoted as ++; (c) a substrate cleaved less efficiently (i.e. the ratio is between 0.5 and 0.2) is denoted as +; (d) a scarcely cleaved substrate (i.e. the ratio is below 0.1) is denoted as –.
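The four-class grouping can be written as a small function. This is a minimal sketch: the thresholds are those given above, the function name is hypothetical, an ASCII hyphen stands in for the '–' class, and ratios that fall between the stated ranges (for example 0.6 or 1.3) are returned as "unassigned" because the text does not say how they were handled.

```python
def cleavage_class(ratio: float) -> str:
    """Map a mutant/control cleavage ratio to the classes used in Fig. 3."""
    if ratio < 0:
        raise ValueError("cleavage ratio cannot be negative")
    if ratio > 1.5:
        return "+++"  # cleaved much better than the control
    if 0.8 <= ratio <= 1.2:
        return "++"   # cleaved about as well as the control
    if 0.2 <= ratio <= 0.5:
        return "+"    # cleaved less efficiently
    if ratio < 0.1:
        return "-"    # scarcely cleaved
    return "unassigned"  # ratio lies in a gap the text leaves undefined


for r in (1.8, 1.0, 0.3, 0.05, 0.6):
    print(r, cleavage_class(r))
```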

Reaction conditions to measure the kinetic parameters

Linearized pCOB1.8Kb and the N-terminally truncated homing endonuclease I-CsmI(200) were used to measure the kinetic parameters. Two hundred and fifty microliters of reaction buffer [50 mM Hepes (pH 7.0), 0.01% (v/v) bovine serum albumin, 1 mM dithiothreitol, 25 mM NaCl, and 5 mM MgCl2] contained 1 µg of the recombinant protein and between 0.5 ng·µL^−1 and 10 ng·µL^−1 of substrate. Twenty-microliter aliquots were removed from the reaction mixture at different time points, and the reaction was terminated by the addition of 1 µL of 0.5 M EDTA and 1.25 µL of 10% sodium dodecyl sulfate, followed by heating the mixture to 50 °C for 5 min to completely denature the protein. Samples were electrophoresed on a 0.8% agarose gel, then visualized with 10 000-fold diluted SYBR Green I dye. Relative intensities of the digested fragment were quantified using the LAS-1000 and NIH Image. Km, Vmax and kcat were determined through a Hanes–Woolf plot [29].
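A Hanes–Woolf analysis fits a straight line to [S]/v against [S], whose slope is 1/Vmax and whose intercept is Km/Vmax. The sketch below illustrates that fit with ordinary least squares; the substrate concentrations are hypothetical, and the rates are noise-free synthetic values generated from the Km and Vmax reported in Results, not the measured data.

```python
def hanes_woolf(substrate, velocity):
    """Estimate (Km, Vmax) by least squares on [S]/v versus [S]."""
    x = list(substrate)
    y = [s / v for s, v in zip(substrate, velocity)]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx                    # equals 1 / Vmax
    intercept = mean_y - slope * mean_x  # equals Km / Vmax
    vmax = 1.0 / slope
    km = intercept * vmax
    return km, vmax


KM, VMAX = 2.5e-9, 1.8e-12  # values reported in the text (M and M/s)
S = [0.5e-9, 1e-9, 2e-9, 4e-9, 8e-9]   # hypothetical substrate concentrations (M)
V = [VMAX * s / (KM + s) for s in S]   # synthetic Michaelis-Menten rates

km, vmax = hanes_woolf(S, V)
print(f"Km = {km:.3g} M, Vmax = {vmax:.3g} M/s")
```

With real time-course data the initial rates would replace the synthetic list, and scatter in the measurements would make the fit approximate rather than exact.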

Acknowledgements

We thank Professors Yoshihiro Matsuda (Kobe University) and Tatsuaki Saito (Okayama University of Science) for advice, and B.Eng. Yoshihiro Adachi (Kochi University Tech) for his technical support in determining the I-CsmI cleavage points and Ms. Mariya Takeuchi for her encouragement. This work was supported by the Sasagawa Scientific Research Grant, and the Regional Science Promotion Program.
