Keywords:

  • PCB;
  • bphK;
  • Burkholderia LB400;
  • Site-directed mutagenesis;
  • Glutathione S-transferase

Abstract

The bphK gene encoding glutathione S-transferase (GST) is located in the bph operon (PCB co-metabolism) in Burkholderia sp. strain LB400 and the enzyme has recently been shown to have dechlorination activity in relation to 4-chlorobenzoate (4-CBA). Alignments using other glutathione S-transferase sequences found in PCB degradation operons identified a highly conserved region in the C-terminal domain of these enzymes that included a conserved motif implicated in protein folding in eukaryotic GSTs. Site-directed mutagenesis indicated that the region is indirectly involved in the catalytic activity and substrate specificity of BphK. Predicted hydrogen bond interactions involving Asp155 play an important role in the enzymatic properties of this glutathione S-transferase.


1 Introduction

The glutathione transferases (GSTs; EC 2.5.1.18) are a family of multifunctional dimeric proteins that catalyse the conjugation of the sulphur atom of glutathione (GSH) with a large variety of electrophilic compounds of both endobiotic and xenobiotic origin [1]. GSTs are ubiquitous and have been purified from humans, animals, plants, fish, insects, fungi, yeasts and, more recently, bacteria [2–5]. A number of site-directed mutagenesis studies have been carried out on bacterial GSTs. The majority of these studies focused on the N-terminal domain of the GST, which is involved in glutathione binding and is more strongly conserved than the C-terminal domain. Mutagenesis studies using Proteus mirabilis, Ochrobactrum anthropi, Methylophilus sp. strain DM11 and Escherichia coli GSTs have identified the Ser12 residue as responsible for catalytic activity in theta class GSTs [6–9]. The Pro53 residue was found to participate in maintaining the proper conformation of the enzyme fold and to have a possible role in antibiotic binding in P. mirabilis GST [10]. In contrast to the highly conserved N-terminal domain, the sequences of the C-terminal domain are often too different (below 20% identity) to be detected as similar in automated searches of sequence databases. This lends support to the idea that the C-terminal domain of GST enzymes plays a crucial role in determining their functional specificity [11]. Site-directed mutagenesis studies on pi class GSTs identified a conserved motif (Ser/Thr-Xaa-Xaa-Asp), found at amino acids 150–153 in the α6 helix of human GST, that has a role in protein folding and stability [12]. The α6 helix is in contact with an element of the active site, and changes to residues in this helix have been shown to affect the active site and substrate specificity [13,14].

Burkholderia sp. strain LB400 is a well-characterised polychlorobiphenyl (PCB) degrading microorganism. The genes in the bph operon are responsible for this PCB degrading ability [15–17]. The bphK gene present in the bph operon from Burkholderia sp. LB400 encodes a protein with significant sequence similarity to both prokaryotic and eukaryotic GSTs [18]. Recent studies suggested an additional dechlorination function for this enzyme with the substrate 4-chlorobenzoate [19], an end product of PCB degradation. However, its precise function in PCB co-metabolism is unknown. Alignments using glutathione S-transferase sequences found in PCB degradation operons identified a highly conserved region in the C-terminal domain of these enzymes, which included the conserved motif (Ser/Thr-Xaa-Xaa-Asp). Site-directed mutagenesis studies were performed on this motif and the surrounding amino acids to establish its role in the catalytic activity or substrate specificity of the enzyme. To our knowledge this is the first detailed analysis of this region in a prokaryotic GST implicated in PCB degradation.

2 Materials and methods

2.1 Site-directed mutagenesis

The bphK gene was cloned in pNG4 in E. coli XL1-blue and expressed using 1 mM isopropyl thiogalactoside (IPTG) for induction of enzyme activity as previously described [19]. Site-directed mutagenesis was carried out on the pNG4 plasmid using the QuikChange kit (Stratagene) as per the manufacturer's instructions. The kit uses pairs of user-defined primers (Table 1), each containing the mutation of interest, which anneal to the complementary sequence in the plasmid. A PCR reaction was then carried out to generate a mutated plasmid containing staggered nicks. Following PCR, the plasmid was treated with Dpn I, which is specific for methylated and hemimethylated DNA and therefore digests the parental DNA template. DNA isolated from almost all E. coli strains is dam methylated and therefore susceptible to Dpn I digestion. The mutated plasmids were then transformed into competent E. coli cells.

Table 1. Primers used for site-directed mutagenesis at the target sites in bphK using the QuikChange kit (Stratagene)

Mutant name   Primer sequence
Ser152Gly     5′C GAC CAA CTG GGT GTG GCC GAC ATC TAT C3′
              5′G ATA GAT GTC GGC CAC ACC CAG TTG GTC G3′
Ser152Ile     5′GAC CAA CTG ATT GTG GCC GAC ATC TAT C3′
              5′G ATA GAT GTC GGC CAC AAT CAG TTG GTC3′
Val153Arg     5′GAC CAA CTG AGT AGG GCC GAC ATC TAT C3′
              5′G ATA GAT GTC GGC CCT ACT CAG TTG GTC3′
Val153Ile     5′C GAC CAA CTG AGT ATC GCC GAC ATC TAT C3′
              5′G ATA GAT GTC GGC GAT ACT CAG TTG GTC G3′
Ala154Arg     5′GAC CAA CTG AGT GTG CGC GAC ATC TAT C3′
              5′G ATA GAT GTC GCG CAC ACT CAG TTG GTC3′
Ala154Ser     5′C GAC CAA CTG AGT GTG TCC GAC ATC TAT C3′
              5′G ATA GAT GTC GGA CAC ACT CAG TTG GTC G3′
Asp155Tyr     5′G AGT GTG GCC TAC ATC TAT CTG TTC GTC G3′
              5′C GAC GAA CAG ATA GAT GTA GGC CAC ACT C3′
Tyr157Trp     5′G AGT GTG GCC GAC ATC TGG CTG TTC GTC3′
              5′GAC GAA CAG CCA GAT GTC GGC CAC ACT C3′
Leu158Pro     5′CC GAC ATC TAT CCG TTC GTC GTG CTC GG3′
              5′CC GAG CAC GAC GAA CGG ATA GAT GTC GG3′

Underlined letters indicate nucleotides that were changed during mutagenesis studies.
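The primer pairs in Table 1 can be checked for internal consistency with a short script. The following is a minimal sketch, not part of the published method: it confirms that each reverse primer is the exact reverse complement of its forward primer, and translates the triplets of a forward primer on the assumption that the spacing used in Table 1 marks codon boundaries in the bphK reading frame. The function names and the reduced codon table (only codons occurring in these primers) are illustrative.

```python
# Sketch only: consistency checks on the Table 1 primer pairs.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Reduced codon table: only the codons that occur in the forward primers.
CODONS = {
    "GAC": "D", "CAA": "Q", "CTG": "L", "CTC": "L", "GGT": "G",
    "GTG": "V", "GTC": "V", "GCC": "A", "ATC": "I", "ATT": "I",
    "TAT": "Y", "TAC": "Y", "AGT": "S", "TCC": "S", "AGG": "R",
    "CGC": "R", "TGG": "W", "TTC": "F", "CCG": "P",
}

# (forward, reverse) primers copied from Table 1, both written 5' to 3'.
PRIMERS = {
    "Ser152Gly": ("C GAC CAA CTG GGT GTG GCC GAC ATC TAT C",
                  "G ATA GAT GTC GGC CAC ACC CAG TTG GTC G"),
    "Ser152Ile": ("GAC CAA CTG ATT GTG GCC GAC ATC TAT C",
                  "G ATA GAT GTC GGC CAC AAT CAG TTG GTC"),
    "Val153Arg": ("GAC CAA CTG AGT AGG GCC GAC ATC TAT C",
                  "G ATA GAT GTC GGC CCT ACT CAG TTG GTC"),
    "Val153Ile": ("C GAC CAA CTG AGT ATC GCC GAC ATC TAT C",
                  "G ATA GAT GTC GGC GAT ACT CAG TTG GTC G"),
    "Ala154Arg": ("GAC CAA CTG AGT GTG CGC GAC ATC TAT C",
                  "G ATA GAT GTC GCG CAC ACT CAG TTG GTC"),
    "Ala154Ser": ("C GAC CAA CTG AGT GTG TCC GAC ATC TAT C",
                  "G ATA GAT GTC GGA CAC ACT CAG TTG GTC G"),
    "Asp155Tyr": ("G AGT GTG GCC TAC ATC TAT CTG TTC GTC G",
                  "C GAC GAA CAG ATA GAT GTA GGC CAC ACT C"),
    "Tyr157Trp": ("G AGT GTG GCC GAC ATC TGG CTG TTC GTC",
                  "GAC GAA CAG CCA GAT GTC GGC CAC ACT C"),
    "Leu158Pro": ("CC GAC ATC TAT CCG TTC GTC GTG CTC GG",
                  "CC GAG CAC GAC GAA CGG ATA GAT GTC GG"),
}


def reverse_complement(primer):
    """Reverse complement of a primer, ignoring the codon-marking spaces."""
    return primer.replace(" ", "").translate(COMPLEMENT)[::-1]


def is_complementary_pair(forward, reverse):
    """True if the reverse primer is the exact reverse complement of the forward."""
    return reverse_complement(forward) == reverse.replace(" ", "")


def translate_codons(forward):
    """Translate the complete triplets of a forward primer (assumes spacing = frame)."""
    return "".join(CODONS[group] for group in forward.split() if len(group) == 3)


if __name__ == "__main__":
    for name, (fwd, rev) in PRIMERS.items():
        print(name, is_complementary_pair(fwd, rev), translate_codons(fwd))
```

Under the stated assumption about the reading frame, the fourth full codon of the Ser152Gly forward primer (GGT) is the one that encodes the substituted glycine.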

2.2 Enzyme assays and thermostability

Cellular extracts were prepared using sonication as described previously [19]. GST activity was measured using the GST assay of Habig and Jakoby [20]. 4-CBA concentrations were analysed by high pressure liquid chromatography (HPLC) with a C18 polar end capped column (250 × 4.6 mm; Phenomenex, UK) as described by Van den Tweel et al. [21], and inorganic chloride ion concentration was measured using the spectrophotometric assay described by Bergmann and Sanik [22].

For thermostability measurements, cellular extracts were incubated at each temperature (ranging from 25 to 75°C) for 15 min in 0.1 M potassium phosphate buffer (pH 6.8) containing 1 mM EDTA. The extracts were then cooled and the enzyme activity was measured using the GST assay performed at 25°C.
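Thermostability data of this kind are usually reported as residual activity, with the activity of the sample incubated at 25°C taken as 100% (as in Fig. 4). The sketch below shows that normalisation only; the function name is illustrative and the activity values are hypothetical placeholders, not measurements from this study.

```python
def residual_activity(activity_by_temp, reference_temp=25):
    """Express each activity as a percentage of the reference-temperature value."""
    reference = activity_by_temp[reference_temp]
    if reference <= 0:
        raise ValueError("reference activity must be positive")
    return {
        temp: 100.0 * activity / reference
        for temp, activity in sorted(activity_by_temp.items())
    }


if __name__ == "__main__":
    # HYPOTHETICAL specific activities (umol/min/mg) after 15 min pre-incubation.
    example = {25: 2.0, 35: 1.9, 45: 1.0, 55: 0.5}
    for temp, percent in residual_activity(example).items():
        print(f"{temp} C: {percent:.0f}% of the 25 C activity")
```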

2.3 Bioinformatic techniques

A hypothetical 3D structure of BphK was constructed using SWISS-MODEL [23] with the PDB files 1PMT (P. mirabilis), 1A0F (E. coli) and 1F2E (Sphingomonas paucimobilis) as templates. PDB files were viewed using the Swiss PDB Viewer program. Alignments were carried out using ClustalW [24].

3 Results

3.1 Conserved residues of GSTs associated with the bph operon in biphenyl degraders

Alignment of GSTs implicated in biphenyl degradation, or showing high sequence similarity to GSTs implicated in biphenyl degradation, showed sequence identities ranging from 48% to 83% (Fig. 1). However, 79 fully conserved residues were observed in the sequences. A highly conserved region (152–158) was observed in the C-terminal domain of the GSTs and this region was subjected to site-directed mutagenesis studies. This region contains the conserved motif (Ser/Thr-Xaa-Xaa-Asp) implicated in substrate specificity in the GST of the insect Anopheles dirus [14] and in protein folding and stability in human GST [12].

Figure 1. Clustal W alignment of GST sequences implicated in biphenyl degradation or PAH degradation. The sequences were: LB400, Burkholderia sp. LB400 (Q59721), RB1, Cycloclasticus oligotrophus RB1 (Q46153), EPA505, Sphingomonas paucimobilis EPA505 (O33705), F199, Novosphingobium aromaticivorans F199 (O85984). The box indicates a region of high/identical residue conservation.

Computer-aided site-directed mutagenesis was performed using the computer-generated 3D model of BphK. The sequences used as templates were the GSTs most closely related to BphK whose 3D structure had been elucidated experimentally, namely the GSTs of S. paucimobilis (1F2E), P. mirabilis (1PMT) and E. coli (1A0F). These amino acid sequences showed 54%, 48% and 48% identity, respectively, to the BphK sequence (Fig. 2). The Pro78 residue of the GSTs used as templates to construct the BphK model can be superimposed onto the model of BphK with a root mean square (RMS) deviation of 0.39 Å (S. paucimobilis), 1.23 Å (P. mirabilis) and 1.29 Å (E. coli). The RMS deviation of Cα positions on superposition is 0.45, 0.76 and 0.88 Å, respectively, with no deviations greater than 1.3 Å. The Swiss PDB Viewer program allowed point mutations to be made to the model; the residues selected for mutation in BphK were substituted with all twenty amino acids and the effects were analysed. For example, when Ala154 was replaced with arginine using the Swiss PDB Viewer program, there were no hydrogen bond changes between residue 154 and the surrounding amino acids. In contrast, when serine was inserted in place of alanine, a new hydrogen bond was formed with Ser152 (Fig. 3). By analysing the effect of each amino acid on interactions with the surrounding residues, amino acids were selected for experimental site-directed mutagenesis on the basis of giving the most or least favourable disruptional “score” using SWISS-MODEL. The mutations were introduced on the pNG4 plasmid using the QuikChange site-directed mutagenesis kit (Stratagene). The mutated bphK genes were sequenced to confirm the presence of the expected mutation.
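The RMS deviation quoted for the superposed Cα positions is the square root of the mean squared distance between equivalent atoms. The sketch below illustrates only that quantity: it assumes the two coordinate sets are already superposed and matched atom for atom, the function names are illustrative, and the coordinates are hypothetical rather than taken from the BphK model. Swiss PDB Viewer performs the superposition itself, which is not reproduced here.

```python
import math


def deviations(coords_a, coords_b):
    """Per-atom distances between two pre-superposed, equally ordered coordinate sets."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    return [math.dist(a, b) for a, b in zip(coords_a, coords_b)]


def rmsd(coords_a, coords_b):
    """Root mean square deviation over all matched atoms (same units as the input)."""
    dists = deviations(coords_a, coords_b)
    return math.sqrt(sum(d * d for d in dists) / len(dists))


if __name__ == "__main__":
    # HYPOTHETICAL C-alpha coordinates in angstroms, for illustration only.
    model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
    template = [(0.0, 0.3, 0.0), (3.8, 0.0, 0.4), (7.6, 0.5, 0.0)]
    print(f"RMSD = {rmsd(model, template):.2f} A")
    print(f"largest deviation = {max(deviations(model, template)):.2f} A")
```

Reporting the largest single deviation alongside the RMSD, as done in the text, guards against a low average hiding one badly placed residue.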

Figure 2. Clustal W alignment of GST sequences used to generate the 3D model. The sequences were: LB400, Burkholderia sp. LB400 (Q59721), S. paucimobilis (1F2E), P. mirabilis (1PMT), E. coli (1A0F).

Figure 3. 3D computer models showing changes from wild type to mutant BphK using Swiss PDB Viewer [23]. Pink indicates the mutated amino acid and green dashed lines indicate hydrogen bonds. The other residues shown are within 6 Å of the residue in pink. Distances shown are measured in Å. Pink dashed lines are contact differences, which indicate unresolved steric clashes introduced by the amino acid substitution.

3.2 Conserved region in BphK has a role in catalytic activity

Both wild-type (WT) and mutant enzymes were analysed for catalytic activity with two substrates, 1-chloro-2,4-dinitrobenzene (CDNB) and 4-chlorobenzoate (4-CBA). In all samples, the amount of 4-CBA removed was statistically similar to the amount of chloride produced. Of the nine mutations analysed, most resulted in catalytic activity of less than 60% of WT activity with CDNB and with 4-CBA (Table 2). Mutations in this conserved region therefore have an effect on catalytic activity. In general there was little difference between the effects on catalytic activity with 4-CBA and with CDNB. An exception was the Ser152Ile enzyme, whose activity with CDNB was 53% of WT whereas its activity with 4-CBA was 156% of WT. Some of the residues in this region, i.e., Ser152 and Ala154, may therefore be involved in the substrate specificity of the enzyme.

Table 2. Relative GST activity of site-directed mutants

Mutant       Relative activity (%) CDNB (a)   Relative activity (%) 4-CBA (b)
Ser152Gly    4 ± 2.8                          3 ± 6.5
Ser152Ile    53 ± 7                           156 ± 2.4
Val153Arg    4 ± 2.4                          39 ± 4.5
Val153Ile    56 ± 6.5                         52 ± 6.4
Ala154Arg    84 ± 4.5                         57 ± 7.3
Ala154Ser    86 ± 5.4                         47 ± 8.7
Asp155Tyr    96 ± 9.7                         95 ± 9.4
Tyr157Trp    49 ± 5.7                         59 ± 5.7
Leu158Pro    56 ± 2.4                         50 ± 2.2

(a) GST activity using CDNB as a substrate was measured as μmol of product formed/min/mg of protein and is expressed relative to GST activity using CDNB of pNG4 in XL1-blue.
(b) GST activity using 4-CBA as a substrate was measured as removal of 4-CBA μmol/min/mg of protein and is expressed relative to GST activity using 4-CBA of pNG4 in XL1-blue.
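The substrate dependence visible in Table 2 can be summarised with two simple quantities: which mutants fall below 60% of WT activity with both substrates, and the ratio of relative 4-CBA activity to relative CDNB activity for each mutant. The sketch below uses the mean values from Table 2 only and ignores the reported errors, so ratios for mutants with very low activity (for example Ser152Gly and Val153Arg with CDNB) should be read with caution. The function names and the use of a simple ratio are illustrative choices, not an analysis performed in the study.

```python
# Mean relative activities (% of WT) from Table 2, as (CDNB, 4-CBA).
TABLE_2 = {
    "Ser152Gly": (4, 3),
    "Ser152Ile": (53, 156),
    "Val153Arg": (4, 39),
    "Val153Ile": (56, 52),
    "Ala154Arg": (84, 57),
    "Ala154Ser": (86, 47),
    "Asp155Tyr": (96, 95),
    "Tyr157Trp": (49, 59),
    "Leu158Pro": (56, 50),
}


def below_threshold(threshold=60):
    """Mutants whose mean activity is below `threshold` % of WT with both substrates."""
    return [
        mutant
        for mutant, (cdnb, cba) in TABLE_2.items()
        if cdnb < threshold and cba < threshold
    ]


def substrate_ratio():
    """Relative 4-CBA activity divided by relative CDNB activity, per mutant."""
    return {mutant: cba / cdnb for mutant, (cdnb, cba) in TABLE_2.items()}


if __name__ == "__main__":
    print("Below 60% of WT with both substrates:", below_threshold())
    for mutant, ratio in substrate_ratio().items():
        print(f"{mutant}: 4-CBA/CDNB = {ratio:.2f}")
```

A ratio close to 1 (as for Asp155Tyr) indicates that both substrates were affected equally, whereas a ratio well above 1 (as for Ser152Ile) points to a shift in substrate preference.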

3.3 Asp155 is important for thermostability

Asp152 in human GST P1 has previously been shown to be important for thermostability [12], and Asp155 in BphK corresponds to this residue. The Asp155Tyr substitution caused no decrease in GST activity with CDNB or 4-CBA, but it did affect the thermostability of BphK, indicating that this residue may be involved in the thermostability of the enzyme (Fig. 4). The other mutants tested showed much smaller differences in thermostability compared with wild type.

Figure 4. Effect of temperature on the stability of BphK in E. coli JM109 (pNG4) cells (□) and E. coli (pAsp155Tyr) cells (▄), E. coli (Ser152Gly) cells (▵), E. coli (Val153Ile) cells (▴), E. coli (Tyr157Trp) cells (♦) using CDNB as a substrate. The enzyme activity following incubation at 25°C was taken as 100%. Results are the mean of three replicates.

4 Discussion

The alignment of the GSTs from biphenyl degrading strains showed 79 conserved residues, suggesting that they evolved from a common ancestral gene; however, horizontal gene transfer cannot be ruled out. The alignment also showed a large number of conserved residues in the C-terminal domain of the sequences. The C-terminal domain has been implicated in xenobiotic substrate binding and normally varies considerably between GSTs. The large number of conserved residues in this domain suggests a common substrate for these GSTs. One highly conserved region, from position 152 to 158, which contained a conserved motif implicated in protein stability, was selected for site-directed mutagenesis studies. Computer-aided site-directed mutagenesis predicted amino acid substitutions that would have a small effect and substitutions that would have a greater effect on enzyme structure. In fact, of the nine substitutions made on BphK, only one had no effect on enzyme activity, indicating that computer-aided predictions need to be treated with some caution.

The GST activity of the enzyme was influenced by all mutations in the conserved region (152–158) with the exception of the mutation at Asp155. Unlike the site-directed mutagenesis of the conserved motif in human GST [12], site-directed mutagenesis of this motif (Ser152–Asp155) in BphK affected GST activity. This region is located in the mainly hydrophobic core of the α6 helix, so changes to a non-hydrophobic amino acid could dramatically affect this core. This is reflected in the Val153Arg mutant (a polar amino acid substitution), which shows a relative GST activity of 4% with CDNB and 39% with 4-CBA. The hydrophobic core is disrupted by the addition of this polar amino acid. However, a drop in GST activity of approximately 50% was also observed with both substrates when Ile, which has similar properties to Val, was inserted. No predicted bonds were disturbed in the 3D model that would explain this lower activity.

The greatest effect on activity was observed in the mutant Ser152Gly. In the 3D model there is a hydrogen bond between Ser152 and Asp155 (Fig. 3), which is removed by the substitution with glycine. This may be due to the ability of glycine to introduce kinks into the chain or to the absence of the side chain carrying the OH group. Hydrogen bonds involving this amino acid have been shown to play an important role in the stability of human GST [12,13]. The breaking of this bond may therefore result in instability and thus a drop in activity. However, the substitution of Ser152 with Ile is also predicted to break this hydrogen bond, and the GST activity drops by approximately 50% with the substrate CDNB but increases to 156% with 4-CBA. Mutations in this region in A. dirus GST resulted in a change in substrate specificity at the active site, possibly due to a change in interactions with Arg96 in the α4 helix, which is linked to the active site residue Arg66 [14]. In the BphK 3D model there was no evidence of an interaction of any of the residues (152–158) with any residues in the α4 helix, but enzyme activities indicate a change in substrate specificity.

Asp155 makes a hydrogen bond with Leu146, which in human GST has been implicated in stability. This bond was broken in the Asp155Tyr mutant but, as for the human GST, the enzyme activity did not change. However, the thermostability of the mutated enzyme was decreased at 45°C and 55°C compared to wild type. This bond may therefore be important in the thermostability of BphK and of GSTs in general. It should be noted, however, that the substitution of Asp155 with Tyr also resulted in the loss of two H-bonds with Leu151, while two new H-bonds with Leu147 were formed.

In E. coli GST, Tyr157 has been shown to have a hydrogen bond with a water molecule, which in turn is bonded to His106. His106 is a catalytic residue and it is suggested to be involved in deprotonation of the GSH thiol. It has also been suggested that Tyr157 may assist His106 in its role [6]. GST activity with both the substrates tested was affected by the Tyr157Trp mutation. The drop in activity may be due to effects on the interaction with this catalytic residue.

The reduction in GST activity of the mutants Ala154Arg and Ala154Ser is very similar, with a decrease of approximately 15% with CDNB and of 43% and 53%, respectively, with 4-CBA. Ala and Ser have similar properties, i.e., they are small, whereas Arg is polar and charged, yet both mutations had similar effects on activity. A Leu158Pro mutation resulted in an approximately 50% decrease in GST activity with both substrates. This could be because proline prefers to be located in turns, with a resultant effect on the conformation of the protein.

In conclusion, the 152–158 region of BphK may not be directly involved in catalytic activity but may interact with residues which are involved in catalytic activity, as changes in this region affect enzyme activity and substrate specificity. It is of interest that activity following certain substitutions was substrate dependent, and future work will address this in more detail.

Acknowledgements

This work was in part funded by the Higher Education Authority of Ireland PRTLI programme, TSR Strand 3, EU Contracts QLK3-CT2000-00164, QLK3-CT-2001-00101 and Science Foundation Ireland (SFI) BRGP.

References

  • [1]
    Habig, W.H., Pabst, M.J., Jakoby, W.B. (1974) Glutathione S-transferases. The first enzymatic step in mercapturic acid formation. J. Biol. Chem. 249, 7130–7139.
  • [2]
    Clark, A.G., Shamaan, N.A., Sinclair, M.D., Dauterman, W.C. (1986) Insecticide metabolism by multiple glutathione S-transferases in two strains of the house-fly Musca domestica. Pestic. Biochem. Physiol. 25, 169–175.
  • [3]
    Ramage, P.I.N., Rae, G.H., Nimmo, I.A. (1986) Purification and properties of hepatic glutathione S-transferases Atlantic salmon (Salmo salar). Comp. Biochem. Physiol. 83, 23–29.
  • [4]
    Di Ilio, C., Aceto, A., Piccolomini, R., Allocati, N., Faraone, A., Cellini, L., Ravagnan, G., Federici, G. (1988) Purification and characterization of three forms of glutathione transferase from Proteus mirabilis. Biochem. J. 255, 971–975.
  • [5]
    Tamaki, H., Kumagai, H., Tochikura, T. (1991) Nucleotide sequence of the yeast glutathione S-transferase cDNA. Biochim. Biophys. Acta 1089, 276–279.
  • [6]
    Nishida, M., Kong, K.H., Inoue, H., Takahashi, K. (1994) Molecular cloning and site-directed mutagenesis of glutathione S-transferase from Escherichia coli. The conserved tyrosyl residue near the N terminus is not essential for catalysis. J. Biol. Chem. 269, 32536–32541.
  • [7]
    Vuilleumier, S., Leisinger, T. (1996) Protein engineering studies of dichloromethane dehalogenase/glutathione S-transferase from Methylophilus sp. strain DM11. Ser12 but not Tyr6 is required for enzyme activity. Eur. J. Biochem. 239, 410–417.
  • [8]
    Casalone, E., Allocati, N., Ceccarelli, I., Masulli, M., Rossjohn, J., Parker, M.W., Di Ilio, C. (1998) Site-directed mutagenesis of the Proteus mirabilis glutathione transferase B1–1 G-site. FEBS Lett. 423, 122–124.
  • [9]
    Favaloro, B., Tamburro, A., Angelucci, S., Luca, A.D., Melino, S., Di Ilio, C., Rotilio, D. (1998) Molecular cloning, expression and site-directed mutagenesis of glutathione S-transferase from Ochrobactrum anthropi. Biochem. J. 335 (Pt 3), 573–579.
  • [10]
    Allocati, N., Masulli, M., Casalone, E., Santucci, S., Favaloro, B., Parker, M.W., Di Ilio, C. (2002) Glutamic acid-65 is an essential residue for catalysis in Proteus mirabilis glutathione S-transferase B1–1. Biochem. J. 363, 189–193.
  • [11]
    Vuilleumier, S. (1997) Bacterial glutathione S-transferases: What are they good for. J. Bacteriol. 179, 1431–1441.
  • [12]
    Dragani, B., Stenberg, G., Melino, S., Petruzzelli, R., Mannervik, B., Aceto, A. (1997) The conserved N-capping box in the hydrophobic core of glutathione S-transferase P1–1 is essential for refolding: identification of a buried and conserved hydrogen bond important for protein stability. J. Biol. Chem. 272, 25518–25523.
  • [13]
    Aceto, A., Dragani, B., Melino, S., Allocati, N., Masulli, M., Di Ilio, C., Petruzzelli, R. (1997) Identification of an N-capping box that affects the α6-helix propensity in glutathione S-transferase superfamily proteins: a role for an invariant aspartic residue. Biochem. J. 322, 229–234.
  • [14]
    Wongtrakul, J., Udomsinprasert, R., Ketterman, A.J. (2003) Non-active site residues Cys69 and Asp150 affected the enzymatic properties of glutathione S-transferase AdGSTD3–3. Insect Biochem. Molec. Biol. 33, 971–979.
  • [15]
    Bopp, L.H. (1986) Degradation of highly chlorinated PCBs by Pseudomonas strain LB400. J. Ind. Microbiol. 1, 23–29.
  • [16]
    Hofer, B., Eltis, L.D., Dowling, D.N., Timmis, K.N. (1993) Genetic analysis of a Pseudomonas locus encoding a pathway for biphenyl/polychlorinated biphenyl degradation. Gene 130, 47–55.
  • [17]
    Hofer, B., Backhaus, S., Timmis, K.N. (1994) The biphenyl/polychlorinated biphenyl-degradation locus (bph) of Pseudomonas sp. LB400 encodes four additional metabolic enzymes. Gene 144, 9–16.
  • [18]
    Bartels, F., Backhaus, S., Moore, E.R.B., Timmis, K.N., Hofer, B. (1999) Occurrence and expression of glutathione-S-transferase-encoding bphK genes in Burkholderia sp. strain LB400 and other biphenyl-utilizing bacteria. Microbiology 145, 2821–2834.
  • [19]
    Gilmartin, N., Ryan, D., Sherlock, O., Dowling, D. (2003) BphK shows dechlorination activity against 4-chlorobenzoate, an end product of bph-promoted degradation of PCBs. FEMS Microbiol. Lett. 222, 251–255.
  • [20]
    Habig, W.H., Jakoby, W.B. (1981) Assays for differentiation of glutathione S-transferases. Methods Enzymol. 77, 398–405.
  • [21]
    Van den Tweel, W.J., Kok, J.B., de Bont, J.A. (1987) Reductive dechlorination of 2,4-dichlorobenzoate to 4-chlorobenzoate and hydrolytic dehalogenation of 4-chloro-, 4-bromo-, and 4-iodobenzoate by Alcaligenes denitrificans NTB-1. Appl. Environ. Microbiol. 53, 810–815.
  • [22]
    Bergmann, J.G., Sanik, J. (1957) Determination of trace amounts of chlorine in naphtha. Anal. Chem. 29, 241–243.
  • [23]
    Guex, N., Peitsch, M.C. (1997) SWISS-MODEL and the Swiss-PdbViewer: an environment for comparative protein modeling. Electrophoresis 18, 2714–2723.
  • [24]
    Thompson, J.D., Higgins, D.G., Gibson, T.J. (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22, 4673–4680.