Discovery of two novel families of proteins that are proposed to interact with prokaryotic SMC proteins, and characterization of the Bacillus subtilis family members ScpA and ScpB


  • Jörg Soppa,

    1. J. W. Goethe-Universität, Biozentrum, Institut für Mikrobiologie, D-60439 Frankfurt, Germany.
    • For correspondence. *For characterization of the protein families. E-mail; Tel./Fax (+49) 69 798 29564.

    • Both authors contributed equally to this work.

  • Kazuo Kobayashi,

    1. Graduate School of Biological Sciences, Nara Institute of Science and Technology, Ikoma, Nara 630–0101, Japan.
    • Both authors contributed equally to this work.

  • Marie-Françoise Noirot-Gros,

    1. Laboratoire de Génétique Microbienne, INRA, Domaine de Vilvert, 78352 Jouy-en-Josas cedex, France.
  • Dieter Oesterhelt,

    1. Max-Planck-Institut für Biochemie, D-82152 Martinsried, Germany.
  • S. Dusko Ehrlich,

    1. Laboratoire de Génétique Microbienne, INRA, Domaine de Vilvert, 78352 Jouy-en-Josas cedex, France.
  • Etienne Dervyn,

    1. Laboratoire de Génétique Microbienne, INRA, Domaine de Vilvert, 78352 Jouy-en-Josas cedex, France.
  • Naotake Ogasawara,

    1. Graduate School of Biological Sciences, Nara Institute of Science and Technology, Ikoma, Nara 630–0101, Japan.
  • Shigeki Moriya

    1. Graduate School of Biological Sciences, Nara Institute of Science and Technology, Ikoma, Nara 630–0101, Japan.
    • **For characterization of B. subtilis mutants. E-mail; Tel. (+81) 743 72 5432; Fax (+81) 743 72 5439.


Structural maintenance of chromosomes (SMC) proteins are present in all eukaryotes and in many prokaryotes. Eukaryotic SMC proteins form complexes with various non-SMC subunits, which affect their function, whereas the prokaryotic homologues had no known non-SMC partners and were thought to act as simple homodimers. Here we describe two novel families of proteins, widespread in archaea and (Gram-positive) bacteria, which we denote ‘segregation and condensation proteins’ (Scps). The scpA genes are localized next to smc genes in nearly all SMC-containing archaea, suggesting that they belong to the same operon and are thus involved in a common process in the cell. The function of ScpA was studied in Bacillus subtilis, which also harbours a well-characterized smc gene. Here we show that scpA mutants display characteristic phenotypes nearly identical to those of smc mutants, including temperature-sensitive growth, production of anucleate cells, formation of aberrant nucleoids, and chromosome splitting by the so-called guillotine effect. Thus, both SMC and ScpA are required for chromosome segregation and condensation. Interestingly, mutants of another B. subtilis gene, scpB, which is localized downstream from scpA, display the same phenotypes, which indicates that ScpB is also involved in these functions. ScpB is generally present in species that also encode ScpA. The physical interaction of ScpA and SMC was proven (i) by the use of the yeast two-hybrid system and (ii) by the isolation of a complex containing both proteins from cell extracts of B. subtilis. By extension, we speculate that interaction of orthologues of the two proteins is important for chromosome segregation in many archaea and bacteria, and propose that SMC proteins generally have non-SMC protein partners that affect their function not only in eukaryotes but also in prokaryotes.


The structural maintenance of chromosomes (SMC) proteins were first discovered in yeast several years ago and are now known to be involved in chromosome segregation in many prokaryotes and in all eukaryotes. SMC proteins consist of more than 1000 amino acids and comprise five structural domains: (i) an N-terminal globular domain, including a P-loop nucleotide-binding motif; (ii) a first coiled-coil domain of 200–400 amino acids; (iii) a hinge domain of variable length; (iv) a second coiled-coil domain of 200–400 amino acids; and (v) a C-terminal globular domain, which includes a conserved DA-motif. Coiled-coil domains generally mediate protein–protein interactions, and dimerization of SMC proteins has in fact been observed. A few recent reviews summarize the current knowledge about SMC protein structure, function and distribution (Cobbe and Heck, 2000; Hirano, 2000; Holmes and Cozzarelli, 2000; Graumann, 2001; Soppa, 2001; Hirano, 2002).

Eukaryotes contain four different SMC proteins, SMC1 to SMC4. Despite their similarity, SMC proteins cannot functionally replace one another and the genes for members of all four subfamilies are essential in yeast. Eukaryotic SMC proteins form specific heterodimers along their coiled-coil domains, which involve either the members of the subfamilies SMC1 and SMC3, or, alternatively, of the subfamilies SMC2 and SMC4. The heterodimers interact with additional proteins, which influence the function of the resulting complexes. SMC1 + 3-containing complexes are involved in cohesion of sister chromatids and in recombinational repair, whereas SMC2 + 4-containing complexes are involved in condensation of DNA and in chromosome copy number-dependent transcriptional regulation (dosage compensation) in Caenorhabditis elegans.

Structural maintenance of chromosomes (SMC) homologues have been found in nearly all archaea and Gram-positive bacteria and in about 40% of Gram-negative bacteria with (partially) sequenced genomes (Soppa, 2001), and were shown to be essential for chromosome segregation in Bacillus subtilis (Britton et al., 1998; Moriya et al., 1998) and Caulobacter crescentus (Jensen and Shapiro, 1999). B. subtilis smc mutants are not viable in rich media at 37°C but can grow at 23°C. They display several typical phenotypes even at the permissive temperature, such as: (i) very frequent formation of anucleate cells; (ii) splitting of the nucleoid by the septum in dividing cells (guillotine effect); (iii) frequent cell elongation; and (iv) extension and anomalous positioning of the nucleoid in elongated cells (Moriya et al., 1998). These phenotypes indicate that SMC is involved in chromosome segregation in bacteria. It was reported that B. subtilis SMC is essential for segregation of the replication termini but dispensable for that of the replication origins (Graumann, 2000). It should be noted that smc mutants are viable in a synthetic medium not only at low but also at high temperature, up to 45°C, indicating that the strict dependence upon SMC is confined to complex media (Britton et al., 1998).

Interestingly, more than half of Gram-negative bacterial species lack a smc gene. Instead, a variety of γ-proteobacteria contain a protein denoted MukB, which has a five-domain structure similar to that of SMC. The Escherichia coli protein is known to be involved in chromosome segregation (Niki et al., 1991). Electron microscopic analysis of B. subtilis SMC and E. coli MukB has indicated for both that two thin rods (coiled-coils) emerge from a central domain (hinge) and end in globular domains (Melby et al., 1998), as predicted by in silico analysis. The two monomers dimerize in antiparallel fashion, thereby generating a symmetrical, two-armed complex. Both ends of the complex contain an N-terminal domain, which is important for ATP-binding, and a C-terminal domain, which carries the DNA-binding activity. The hinge domain allows movement of the two arms, from a stretched conformation, in which the DNA-binding domains are separated by 100 nm, to a closed conformation, in which they are close together (Melby et al., 1998). These data have been generalized to the eukaryotic SMCs, which, however, form heterodimers and are therefore not fully symmetrical.

Although MukB of E. coli was found to interact with two other proteins, MukE and MukF (Yamazoe et al., 1999), no interaction partners have been described for prokaryotic SMC proteins despite their similarity to their eukaryotic orthologues. Prokaryotic SMCs are thus thought to act as simple homodimers.

Here we describe the discovery and characterization of two novel protein families, widespread in archaea and bacteria, which we propose to be involved, together with SMC, in chromosome segregation and condensation. Mutants of the B. subtilis members of these protein families were characterized and their phenotypes strongly support this hypothesis. In addition, direct evidence for the physical interaction of one of the proteins with SMC is presented. We suggest that bacterial SMC proteins interact with non-SMC partners and that the complexes are essential for chromosome segregation, as is the case in eukaryotes.


A novel protein family of segregation and condensation proteins

Analysis of the Halobacterium salinarum genome sequence (D. Oesterhelt et al., unpublished; Ng et al., 2000) revealed a gene encoding an SMC protein. The deduced protein has a length of 1192 amino acids, exhibits the typical five-domain structure, as deduced from secondary structure predictions, contains the conserved P-loop motif in the N-terminal region and includes signature peptide sequences conserved in the SMC protein family. An open reading frame (ORF) encoding a 280-amino-acid long protein, designated ‘segregation and condensation protein A’ (ScpA) for reasons detailed below, was found only 14 nucleotides downstream of the haloarchaeal smc gene. The short distance between the two genes suggests that they form an operon, as the intergenic region is too short to contain the basal archaeal transcription initiation elements, encompassing the TATA box, centred around –27/–28, and the BRE sequence, centred around –33/–34 (Soppa, 1999). Supporting this view, no potential TATA box and BRE sequence were found in the region of more than 200 nucleotides upstream of the scpA gene.

To determine whether the haloarchaeal protein is conserved in other archaea and in bacteria, all available (partial) genome sequences were searched for additional orthologues (compare with Experimental procedures). In total, 35 ScpAs were discovered. No genome contained more than one scpA gene. ScpAs represent a novel protein family, present in about 40% of the microbial species with (partially) characterized genomes. They are absent from all eukaryotes characterized until now and thus seem to be confined to archaea and bacteria. The fraction of species possessing a ScpA is much higher in archaea (nine out of 11) and Gram-positive bacteria (15 out of 22) than in Gram-negative bacteria (11 out of 49). If the analysis is restricted to prokaryotes with fully sequenced genomes, only two archaea (Aeropyrum pernix, Methanobacterium thermoautotrophicum) and two Gram-positive bacteria (Mycoplasma genitalium and Mycoplasma pneumoniae) lack a ScpA, as opposed to 18 Gram-negative bacteria. These data show that ScpAs are widespread in prokaryotes and are generally present in archaea and Gram-positive bacteria.
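The group-wise frequencies quoted above can be recomputed directly from the counts given in the text. The following is a minimal sketch in Python; the dictionary and function names are illustrative and not part of the original analysis.

```python
# Counts taken from the text: (species with ScpA, species examined) per group.
scpa_counts = {
    "archaea": (9, 11),
    "Gram-positive bacteria": (15, 22),
    "Gram-negative bacteria": (11, 49),
}


def scpa_fractions(counts):
    """Return the fraction of ScpA-containing species for each group."""
    return {group: with_scpa / total for group, (with_scpa, total) in counts.items()}


fractions = scpa_fractions(scpa_counts)
total_with = sum(with_scpa for with_scpa, _ in scpa_counts.values())
total_species = sum(total for _, total in scpa_counts.values())
overall = total_with / total_species

for group, fraction in fractions.items():
    print(f"{group}: {fraction:.0%}")
print(f"all groups: {total_with}/{total_species} = {overall:.0%}")
```

The three groups sum to the 35 ScpAs reported, and the overall fraction is consistent with the "about 40%" stated in the text.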

Properties of ScpAs

The members of this protein family typically contain between 220 and 280 amino acids. All but two have acidic isoelectric points, typically around pH 4.7. Many contain more than 25.7% of charged residues, which is the average value for proteins of enteric bacteria (Soppa et al., 1993); the value is close to or above 30% for more than half of ScpAs and is near or above 40% for seven of them. Secondary structure calculations suggest that they have a high alpha-helical content, but do not contain any membrane-spanning hydrophobic stretches. Therefore, ScpAs seem to be highly charged small soluble proteins.
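As an illustration of how a charged-residue fraction of this kind can be obtained from a sequence, a minimal Python sketch follows. It assumes that "charged" means Asp, Glu, Lys and Arg (histidine is not counted); the text does not state which residues were included, and the example peptide is hypothetical.

```python
# Assumption: Asp (D), Glu (E), Lys (K) and Arg (R) are counted as charged.
CHARGED_RESIDUES = set("DEKR")


def charged_fraction(sequence):
    """Return the fraction of charged residues in a one-letter protein sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(residue in CHARGED_RESIDUES for residue in seq) / len(seq)


# Hypothetical 15-residue peptide, used only to exercise the function.
example_peptide = "MKDELLRKAEELLDK"
print(f"{charged_fraction(example_peptide):.1%}")
```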

A multiple sequence alignment of the 29 full-length sequences shows that the ScpA proteins are well conserved in the N-terminal region, typically around amino acids 10–120. The following highly conserved motif is found around position 80: (L/I)(L/V)XA(A/S)XL(L/V)XXK SXXLLP. The positions shown underlined are conserved in at least 90% of the sequences; the lysine residue is universally conserved, and it seems safe to assume that it is crucial for the biological function. The degree of conservation in the region of about 100 amino acids following the conserved N-terminal region is very low. About half of the ScpAs have a predicted short coiled-coil domain in this region, but neither the location nor the length is well conserved, and thus its relevance is unclear. The C-terminal region is again fairly well conserved, including the motif FLA(L/I)LEL about 40 amino acids from the C-terminus, where the amino acids shown underlined are found in at least 80% of the sequences.
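The two motifs can be written as regular expressions for scanning candidate sequences. The sketch below is an illustration only: it assumes that X stands for any residue and that the space in the N-terminal motif as printed is a typesetting artefact rather than a gap, and it treats every listed position as required because the underlining that marks the most conserved positions is not available here. The toy sequence is hypothetical.

```python
import re

# Assumptions: X = any residue; the printed space in the N-terminal motif is ignored.
N_TERMINAL_MOTIF = re.compile(r"[LI][LV].A[AS].L[LV]..KS..LLP")
C_TERMINAL_MOTIF = re.compile(r"FLA[LI]LEL")


def find_motifs(sequence):
    """Return 1-based start positions of both motifs in a protein sequence."""
    seq = sequence.upper()
    return {
        "n_terminal": [m.start() + 1 for m in N_TERMINAL_MOTIF.finditer(seq)],
        "c_terminal": [m.start() + 1 for m in C_TERMINAL_MOTIF.finditer(seq)],
    }


# Hypothetical toy sequence containing one copy of each motif.
toy_sequence = "MSTT" + "ILGAAELLQEKSPELLP" + "GGSG" + "FLALLEL" + "KK"
motif_hits = find_motifs(toy_sequence)
print(motif_hits)
```

A strict pattern of this kind will miss family members that deviate at the less conserved positions, so a profile-based search would be the more sensitive choice in practice.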

Close proximity of smc and scpA genes in archaea

smc and scpA genes are localized very close to each other in H. salinarum. Interestingly, a similar gene arrangement is found in four fully sequenced and two partially sequenced archaea (Table 1). Therefore, Methanococcus jannaschii is the only archaeal exception known to date. Close proximity of the two genes suggests that they are co-transcribed, as argued for H. salinarum. Analysis of microbial genomes has shown that the order of genes is much less conserved than the gene content or the primary sequences of proteins, and that the conserved order of genes is in fact rare (Huynen and Bork, 1998). The gene order was found to be often conserved when the gene products interact physically to fulfil their biological function (Huynen and Bork, 1998). Considerations of the evolutionary advantage of operons compared with separated genes led to a similar conclusion (Lawrence and Roth, 1996). Therefore, we regard the close proximity of the smc and scpA genes in seven different species, belonging to five diverse genera of archaea, as strongly indicative that archaeal ScpAs might interact with SMC proteins. Although the smc and scpA genes are not linked in bacteria, we argue, by extrapolation, that their products might also interact.

Table 1. Close linkage of smc genes and scpA genes.
1   Archaeoglobus fulgidus      A   Overlap
2   Ferroplasma acidarmanus     A   Overlap
3   Halobacterium salinarum     A   14 bp
4   Methanosarcina barkeri      A   Overlap
5   Pyrococcus abyssi           A   Overlap
6   Pyrococcus horikoshii       A   Overlap
7   Thermoplasma acidophilum    A   32 bp

Phylogenetic analysis of ScpAs and comparison to the SMC protein phylogeny

To carry out phylogenetic analyses, the non-conserved C- and N-terminal extensions were removed from the multiple sequence alignment of all 29 full-length ScpAs, and indels present in only a few species were shortened (data not shown; the alignments are available upon request). The resulting alignment had 268 positions and was used to calculate phylogenetic trees with parsimony, distance matrix and maximum likelihood algorithms (Adachi et al., 1993; Felsenstein, 1996). Bootstrap analyses were performed and consensus trees were calculated. As an example, a parsimony consensus tree is shown in Fig. 1. In general, the trees were similar to the 16S rRNA tree. All three methods showed that archaeal ScpAs form one monophyletic group, which includes the proteins from Aquifex aeolicus and the cyanobacteria. The same result was seen in SMC protein trees (Soppa, 2001), indicating lateral transfer of both the smc and the scpA genes.
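The resampling step that underlies a bootstrap analysis can be sketched as follows. This Python fragment only illustrates how alignment columns are drawn with replacement to build pseudo-replicate alignments; it does not reproduce the tree inference or consensus calculation performed with the programs cited above, and the toy alignment and function name are hypothetical.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Build one bootstrap replicate by sampling alignment columns with replacement."""
    length = len(next(iter(alignment.values())))
    if any(len(seq) != length for seq in alignment.values()):
        raise ValueError("all aligned sequences must have the same length")
    # The same sampled column indices are applied to every sequence,
    # so each replicate column is an intact column of the original alignment.
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in columns) for name, seq in alignment.items()}


# Hypothetical toy alignment of three sequences and seven positions.
toy_alignment = {"sp1": "MKLAV-E", "sp2": "MKIAVGE", "sp3": "MRLSV-D"}
rng = random.Random(0)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_alignment(toy_alignment, rng) for _ in range(200)]
print(len(replicates), "replicates of length", len(replicates[0]["sp1"]))
```

Each replicate would then be passed to the tree-building program, and the bootstrap value of a branch is the fraction of replicate trees in which that grouping appears.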

Figure 1. A consensus parsimony tree of ScpAs. The abbreviations of species names are summarized in Table 3. The names of archaea are boxed, Gram-positive bacteria are denoted by boxes with round corners, and Gram-negative bacteria are shown in bold. The groups of Gram-positive bacteria with low and with high genomic GC content are indicated. A consensus tree is shown derived after a bootstrap analysis (200 replications) using the program PROTPARS. The bootstrap values are included.

Table 3. Abbreviations of species names.
Abbreviation        Species name
Agrob. rad.         Agrobacterium radiobacter
Agrob. rhiz.        Agrobacterium rhizogenes
Aqu. aeo.           Aquifex aeolicus
Arch. fulg.         Archaeoglobus fulgidus
Bac. halod.         Bacillus halodurans
Bac. stear.         Bacillus stearothermophilus
Bac. subt.          Bacillus subtilis
Burk. cep.          Burkholderia cepacia
Chl. tep.           Chlorobium tepidum
Cor. diph.          Corynebacterium diphtheriae
Cyto. hut.          Cytophaga hutchinsonii
Deinoc. rad.        Deinococcus radiodurans
Ente. fae.          Enterococcus faecalis
Ferropl. aci.       Ferroplasma acidarmanus
Hal. sal.           Halobacterium salinarum
Lac. lact.          Lactococcus lactis
M. jann.            Methanococcus jannaschii
mar. Synech.        marine Synechocystis
Meth. bark.         Methanosarcina barkeri
Mycob. lepr.        Mycobacterium leprae
Mycob. tuberc.      Mycobacterium tuberculosis
Mycopl. gen.        Mycoplasma genitalium
Neiss. men.         Neisseria meningitidis
Pseud. aer.         Pseudomonas aeruginosa
Pyr. abys.          Pyrococcus abyssi
Pyroc. fur.         Pyrococcus furiosus
Pyroc. hor.         Pyrococcus horikoshii
Ralst. eut.         Ralstonia eutropha
Staph. aur.         Staphylococcus aureus
Strept. mit.        Streptococcus mitis
Strept. pneu.       Streptococcus pneumoniae
Streptom. coel.     Streptomyces coelicolor
Synech.             Synechocystis sp.
Therm. mar.         Thermotoga maritima
Thermom. fus.       Thermomonospora fusca
Thermopl. aci.      Thermoplasma acidophilum
Trep. pal.          Treponema pallidum
Ureopl. ure.        Ureaplasma urealyticum
Xyl. fast.          Xylella fastidiosa

It is informative to consider the distribution of SMC and ScpAs in prokaryote genomes. Both are present in nine out of 11 archaeal genomes and absent in two (A. pernix and M. thermoautotrophicum). A similar simultaneous presence of the two proteins is observed in Gram-positive bacteria. However, this does not hold true for Gram-negative bacteria, as most species do not contain an SMC protein, and among the some 40% that do, only about half also contain an ScpA. Therefore, if ScpAs were indeed involved in a common function with SMC proteins, this function is not universally required in SMC-containing microorganisms. The unequal distribution of SMC proteins and ScpAs in Gram-negative bacteria might even cast doubt on the proposal that the two proteins functionally interact. This prompted us to investigate the function of ScpAs experimentally, using the microorganism with the best characterized prokaryotic SMC protein, B. subtilis.

Construction of B. subtilis mutants and characterization of their temperature sensitivity

A smc null mutant of B. subtilis exhibits several very characteristic phenotypes (see above; Britton et al., 1998; Moriya et al., 1998). It can be expected that mutants in genes encoding proteins essential for the same function would show similar phenotypes. For instance, E. coli mutants lacking MukE or MukF, two proteins that interact with MukB (Yamazoe et al., 1999), exhibit phenotypes similar to those of mukB mutants (Yamanaka et al., 1996). We therefore decided to disrupt the B. subtilis scpA, which until now was a gene with unknown function called ypuG. As scpA is located in an operon with scpB (until now called ypuH) and ypuI (Fig. 2A), the latter two genes were disrupted as well.

Figure 2. Construction of scpA, scpB and ypuI disruption mutants of Bacillus subtilis and characterization of their temperature-dependent growth.

A. Schematic overview of the genomic organization of the scpA operon. Arrows indicate genes and their transcriptional direction. Bars under the genes indicate the DNA regions cloned into the three plasmids pMUTΔscpA, pMUTΔscpB and pMUTΔypuI for disruption of the respective genes. [Symbol], ρ-independent transcriptional terminator.

B. Colony formation of the three null mutants. When scpA and scpB mutants were tested, 1 mM IPTG was added to LB medium to ensure transcription of downstream genes using an IPTG-inducible spac promoter in the vector plasmid as explained previously (Vagner et al., 1998). The same result was obtained with and without addition of IPTG.

It was shown previously that scpA (ypuG) and scpB (ypuH) are essential at 37°C in complex medium (Vagner et al., 1998). We cloned internal DNA fragments of the two genes and of ypuI into the vector pMutinT3 (Fig. 2A) and used the resulting plasmids to transform B. subtilis 168 cells to erythromycin resistance at 23°C. Transformants were obtained in each case and the proper integration of the plasmids into the chromosome was confirmed by polymerase chain reaction (PCR) for three independent clones (data not shown). To exclude the possibility that cell growth was due to putative suppressor mutations at distant chromosomal loci, we extracted chromosomal DNA from all clones and used it to re-transform B. subtilis, selecting at 23°C and 37°C. As shown in Fig. 2B, scpA and scpB mutants formed colonies solely at 23°C, but not at 37°C. The same result was obtained with two other DNA samples (data not shown). Clearly, ScpA and ScpB are essential at 37°C but not at 23°C. However, they also fulfil one or more important functions at 23°C, as their absence severely impairs the growth rate (Table 2), as observed previously for the smc mutant (Moriya et al., 1998), and confers additional phenotypes, described below. In contrast, the ypuI mutant formed colonies at both temperatures (Fig. 2B). We concluded that YpuI is not essential at 37°C and excluded the ypuI mutant from further studies.

Table 2. Characteristic properties of scpA and scpB null mutants growing exponentially at 23°C.
Strain       Doubling time (h) (a)   Anucleate cells (b) [%] (anucl./total)   DNA–protein ratio (c)
Wild-type    1.0                     0 (0/427)                                0.051 ± 0.003
scpA null    1.7                     23 (102/448)                             0.046 ± 0.001
scpB null    2.2                     28 (141/511)                             0.046 ± 0.003

  a. As the mutants do not grow at high temperatures, their growth rates are highly temperature-dependent and the exact values at 23°C are influenced by the temperature measurement/control equipment. In repetitive experiments in the groups of S.D.E. and S.M. the mutants always grew slower than the wild type; however, values of up to 4 h have also been obtained (data not shown).

  b. Cells with very low amounts of DNA were counted as anucleate cells.

  c. Average values and standard deviations from two to three independent experiments are shown.
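The anucleate-cell percentages in Table 2 follow from the cell counts given in parentheses. A short Python check is sketched below, reading the flattened table rows as (anucleate cells, total cells) pairs; the variable and function names are illustrative.

```python
# (anucleate cells, total cells counted) per strain, as listed in Table 2.
cell_counts = {
    "wild type": (0, 427),
    "scpA null": (102, 448),
    "scpB null": (141, 511),
}


def percent_anucleate(anucleate, total):
    """Return the percentage of anucleate cells among all cells counted."""
    return 100.0 * anucleate / total


anucleate_percent = {
    strain: round(percent_anucleate(anucleate, total))
    for strain, (anucleate, total) in cell_counts.items()
}
print(anucleate_percent)
```

Rounded to whole percentages, the counts give the 23% and 28% quoted for the scpA and scpB mutants.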

It had been shown that the temperature-sensitive phenotype of the smc null mutant is confined to complex media, and that in synthetic medium with glucose the mutant is viable up to 45°C (Britton et al., 1998). Similarly, scpA and scpB null mutants formed colonies on solid synthetic medium (Spizizen's minimal, supplemented with 0.5% glucose) at 37°C (data not shown) and grew in liquid synthetic medium with rates only slightly lower than that of the wild-type strain (doubling times of 54, 52 and 48 min for scpA, scpB and the wild-type strains respectively). However, the mutants were producing anucleate cells at high frequencies (≈ 5%), compared with a very low frequency (<0.4%) in the wild-type strain.

The similar temperature-sensitive phenotypes of the null mutants of the three genes prompted us to characterize the scpA and scpB mutants further.

Abnormal nucleoid distribution and anucleate cell production in scpA and scpB null mutants

A smc mutant of B. subtilis is impaired in chromosome segregation, resulting in abnormal nucleoid morphology and in the production of a high fraction of anucleate cells (Britton et al., 1998; Moriya et al., 1998). To test whether scpA and scpB mutants share these phenotypes, both mutants were grown at 23°C, stained with DAPI to visualize chromosomal DNA, and characterized using fluorescence microscopy. Nucleoid distribution was abnormal in both mutants (Fig. 3) and the following phenotypes were observed: (i) a high fraction of anucleate cells was produced. In exponentially growing cultures the frequencies were 23% and 28% for scpA and scpB mutants, respectively, in the same range as observed for a smc mutant (Moriya et al., 1998); (ii) in some cells, the nucleoid was distributed in either half of the cytoplasm and was apparently undergoing splitting by the newly forming septum. This so-called guillotine effect had been described previously for a smc mutant of B. subtilis (Moriya et al., 1998) and a mukB mutant of E. coli (Yamanaka et al., 1996); (iii) some cells were elongated and contained very low amounts of DNA; (iv) a considerable fraction of cells had tangled and extended nucleoids.

Figure 3. DAPI-stained images of the wild-type and two null mutant (scpA and scpB) cells grown in LB medium at 23°C. White arrows indicate anucleate cells. Arrowheads indicate nucleoids being guillotined by septation. Scale bars = 2 μm.

These results strongly indicate that ScpA and ScpB are involved in chromosome segregation and condensation. However, it has been shown that defects in DNA replication can also lead to the formation of anucleate cells (Sharpe and Errington, 1995; Moriya et al., 1997). Therefore, we wanted to exclude the possibility that scpA and scpB mutations primarily affect DNA replication at 23°C. A replication defect would lead to a pronounced decrease in the DNA–protein ratio (Moriya et al., 1997). However, we found essentially no difference in the DNA–protein ratios in exponentially growing wild-type and mutant cells (Table 2), excluding a primary replication defect in the mutants at 23°C.

Taken together, the shared phenotypes of smc, scpA and scpB mutants underscore that all three are involved in chromosome segregation and that they might even interact.

The ScpB protein family

To clarify whether ScpB is confined to B. subtilis or whether it belongs to a second novel protein family, the ScpB sequence was used to search protein sequence databases of the European Bioinformatics Institute, and several homologues were found. Furthermore, all fully sequenced genomes were searched for ScpBs, using sequences of related species. In total, 24 ScpBs were found, and additional copies can be expected in partially sequenced genomes. The proteins are about 200 amino acids long with molecular masses of about 22 kDa. Most have isoelectric points around pH 5, with a few exceptions. They do not contain hydrophobic regions and thus seem to be soluble. Secondary structure predictions indicate that they have a high alpha-helical content. No indications for the presence of coiled-coil domains, DNA-binding helix–turn–helix motifs or any signature sequence of known protein families (as deposited in the PROSITE database) could be found. The average fraction of charged amino acids is 27.8%, only slightly above the average value for enteric bacteria (25.6%; Soppa et al., 1993). The only exceptional feature of the amino acid composition is the fraction of leucines (12.8%), which is rather high for soluble proteins.

A multiple sequence alignment was constructed, and phylogenetic trees were calculated with parsimony, distance matrix and maximum likelihood methods. As an example, a parsimony consensus tree is shown in Fig. 4. In general, the trees were similar to the 16S rRNA tree. All three methods revealed that Synechocystis ScpB groups with the archaeal proteins. Thus, apparently cyanobacteria have acquired their genes for SMC, ScpA and ScpB by lateral transfer from an archaeon. The ScpBs of two species of agrobacteria group with or close to the high GC Gram-positive bacteria, but not with proteobacterial ScpBs, indicating a further gene transfer event.

Figure 4. A consensus parsimony tree of ScpB homologues. The abbreviations of species names are included in Table 3. The names of archaea are boxed, Gram-positive bacteria are denoted by boxes with round corners, and Gram-negative bacteria are shown in bold. The groups of Gram-positive bacteria with low and with high genomic GC content are indicated. A consensus tree is shown derived after a bootstrap analysis (200 replications) using the program PROTPARS. The bootstrap values are included.

The distribution of ScpBs in prokaryotes is similar to but not identical to that of ScpAs. Among the 37 prokaryotic species with fully sequenced genomes, three contain ScpA but not ScpB (H. salinarum, M. jannaschii and A. aeolicus), and six contain only ScpB but no ScpA (Deinococcus radiodurans, Neisseria meningitidis, Pseudomonas aeruginosa, Xylella fastidiosa, M. genitalium and M. pneumoniae). Nevertheless, the common occurrence of ScpA, ScpB and SMC is widespread in Gram-positive bacteria and in archaea, whereas their common absence is widespread in Gram-negative bacteria, suggesting that they act together in chromosome segregation not only in B. subtilis but also in many other prokaryotes.

Physical interaction between SMC and ScpA

Conservation of gene order in many archaea and involvement in chromosome segregation in B. subtilis indicated a functional interaction between the SMC protein and the newly discovered proteins. Using the yeast two-hybrid system (Fields and Song, 1989), we tested whether a direct physical interaction between SMC and ScpA and/or ScpB from B. subtilis could be detected. Full-size SMC fused to the GAL4 activation domain (AD-SMC) interacted with ScpA fused to the GAL4 DNA binding domain (BD-ScpA), as judged by the capacity of the diploid cells to grow on medium lacking adenine or histidine (Fig. 5). This interaction was also detected in the reverse orientation (BD-SMC/AD-ScpA) but appeared weak, at least in yeast, because in this case only the expression of the HIS3 reporter and not of the more stringent ADE2 reporter was triggered. In contrast, BD-SMC did not interact with the ScpB, DnaB, DnaC, DnaD, CcpA and RsbW proteins fused to the GAL4 AD (data not shown). From these results, we conclude that SMC interacts specifically with ScpA.

Figure 5. Specific protein interactions detected with the yeast two-hybrid system. The GAL4 BD fusion proteins (BD) and GAL4 AD fusion proteins (AD) expressed in each strain are indicated at the top of each column and at the left of each row respectively. An interaction was detected as the capacity to grow on –LUH or –LUA medium, and diploid colonies were scored on –LU medium. This experiment was repeated twice independently. SMC-Nter (N-terminal globular domain, amino acids 1–422) and SMC-Cter (C-terminal globular domain, amino acids 735–1186).

In addition, two subfragments of SMC corresponding to the two globular domains of the protein, SMC-Nter (amino acids 1–422) and SMC-Cter (amino acids 735–1186) were found to interact in the BD-SMC-Nter/AD-SMC-Cter orientation. This interaction is consistent with the proposed antiparallel structure of the SMC dimer (Melby et al., 1998; for a review, see Graumann, 2001). However, as dimerization was thought to be driven by the long coiled-coil domains, this experiment shows for the first time a specific interaction of the N- and C-terminal domains without their covalent attachment to a dimerization domain.

Interestingly, although the AD or BD fusion did not alter the capacity of the SMC domains to interact with each other, indicating that both domains are fully folded, we did not detect any interaction with ScpA. This suggests that other or additional structural features of SMC are required to interact with ScpA, for example the coiled-coil domains or the hinge domain, or that dimerization of SMC is a prerequisite for its interaction with ScpA.

To further prove the specific interaction between SMC and ScpA in their homologous host in vivo, we investigated complex formation in cytoplasmic extracts of Bacillus subtilis. Both proteins (ScpA and SMC) were marked at the carboxyl termini by addition of GFP and T7 tag respectively. Neither fusion affected cell growth at 37°C, indicating that both fusion proteins function normally in B. subtilis. T7-tag antibody agarose was used for affinity purification of SMC from soluble extracts of cell lysates, and an anti-GFP antibody was used to analyse whether ScpA could be co-purified, which would prove their specific interaction in a protein complex. As shown in Fig. 6, ScpA-GFP was identified in the SMC-T7 complex by immunoblotting (sample 2 of ‘bound’ in the figure), whereas no signal was detected in the bound fraction prepared from scpA-gfp cells containing native SMC and lacking the SMC-T7 fusion protein (sample 1). These results show that B. subtilis SMC forms a complex with at least one non-SMC subunit, ScpA, and confirm the results obtained by the yeast two-hybrid assays.

Figure 6. Detection of the ScpA-GFP protein in the SMC complex in vivo. Soluble fractions were prepared from two kinds of cells (scpA-gfp, 1; smc-T7 scpA-gfp, 2) as described in Experimental procedures. The SMC-T7 complex was isolated from the soluble fraction by using the T7-tag antibody agarose. After unbound proteins were separated by centrifugation, the complex was eluted from the beads with citric acid. Each 1 μl of soluble and unbound fractions (6 ml each) and 10 μl of bound fraction (575 μl) were loaded into lanes of a SDS–polyacrylamide gel. After separation of proteins by electrophoresis, ScpA-GFP was detected by immunoblotting using an anti-GFP antibody followed by a chemiluminescence detection system.
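The loading volumes given in the legend of Fig. 6 imply that the 'bound' lanes represent a much larger share of their fraction than the 'soluble' and 'unbound' lanes. The following minimal sketch works through that arithmetic using only the volumes stated in the legend; the function name `loaded_share` is hypothetical and introduced for illustration.

```python
from fractions import Fraction

def loaded_share(loaded_ul, total_ul):
    """Share of a fraction that was loaded onto the gel (both volumes in microlitres)."""
    return Fraction(loaded_ul, total_ul)

# Volumes taken from the legend of Fig. 6.
soluble = loaded_share(1, 6000)   # 1 ul out of 6 ml
bound = loaded_share(10, 575)     # 10 ul out of 575 ul

# How many times more of the bound fraction is represented per lane.
ratio = bound / soluble
print(f"bound lane represents about {float(ratio):.0f}-fold more of its fraction")
```

Under these figures a bound lane corresponds to roughly 100 times more of the starting material than a soluble or unbound lane, which should be kept in mind when comparing band intensities between lanes.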


We describe here two novel protein families widespread in archaea and in Gram-positive bacteria, which until now were regarded as unrelated ORFs with unknown function. The phenotypes of B. subtilis mutants lacking either ScpA (until now called YpuG) or ScpB (until now called YpuH) imply that both are involved in chromosome segregation and condensation, and act in concert with the SMC protein.

Complex formation of SMC, ScpA and ScpB

A specific physical interaction between B. subtilis ScpA and SMC has been detected using two independent methods, i.e. the usage of the yeast two-hybrid system in both orientations and the co-isolation of ScpA upon affinity purification of SMC. The failure to detect an interaction of SMC with ScpB and a variety of other proteins underscores the specificity of the SMC–ScpA interaction. The interaction seems to be rather weak, as (i) in one orientation only the sensitive His3 reporter was triggered in the yeast two-hybrid system, and (ii) an earlier attempt to isolate an SMC complex including non-SMC subunits by immunoaffinity chromatography and heparin–sepharose chromatography was not successful and gave rise to the belief that SMC acts as a simple homodimer (Hirano and Hirano, 1998). The SMC–ScpA interaction could be verified in an independent study, which focussed on the intracellular localization of ScpA and ScpB. It could be shown that a ScpA–GFP fusion forms fluorescent foci exclusively in SMC-containing cells but not in an smc mutant (Mascarenhas et al., 2002).

At present, it is unclear whether B. subtilis ScpB also interacts directly with the SMC protein. The yeast two-hybrid system failed to indicate any such complex formation (see above). In E. coli, the SMC homologue MukB was found to form a ternary complex with MukE and MukF. However, direct physical interactions were only found between MukB and MukF and between MukF and MukE, but not between MukB and MukE (Yamazoe et al., 1999). This could be similar in B. subtilis, as an interaction between ScpA and ScpB was found using fluorescence energy transfer (Mascarenhas et al., 2002). Taken together, it seems likely that a ternary SMC–ScpA–ScpB complex is formed, but that the interaction between SMC and ScpB is either indirect or very weak. Although SMC and MukB belong to two subfamilies of the SMC protein superfamily (Soppa, 2001), no convincing primary sequence similarity between ScpA and MukF or ScpB and MukE could be found. However, all four are small soluble proteins with a high alpha-helical content.

The gene order SMC→ScpA is conserved in nearly all archaea, as is the very short distance between the two, indicating co-transcription of both genes in different archaeal genera and arguing that the interaction of gene products is not confined to the homologues in B. subtilis. These observations, together with the phylogenetic profiles of SMC, ScpA and ScpB protein families, lead us to propose that the three proteins act together in chromosome segregation in all prokaryotes that contain them.

Growth rate-dependent lethal phenotypes of smc, scpA and scpB mutants

A striking similarity of smc, scpA and scpB mutants is that all three are lethal at 37°C in complex medium, but can grow at the same temperature in synthetic medium. This shows that the three proteins, probably acting in a complex (see above), are absolutely essential for chromosome segregation at a high growth rate, but that they are not the only factors involved in this process, as at the low growth rate in synthetic medium chromosome segregation is possible without them, albeit with low fidelity (5% anucleate cells). Survival is also possible in complex medium when the growth rate is decreased by lowering the temperature, but again, mutants have a decreased segregation fidelity (20% to 30% anucleate cells). A candidate protein for the partial rescue of survival at low growth rates is SpoIIIE, as it was shown that an smc spoIIIE double mutant has a synthetic lethal phenotype. The combination of a spoIIIE mutation with a conditional smc mutation revealed that at the selective temperature the nucleoid is split by the newly forming septum, resulting in cell death (Britton and Grossman, 1999). Therefore, it seems that if normal segregation fails, for example because of a mutation in smc, scpA or scpB, SpoIIIE can move the replicated chromosomes out of the way of the invaginating septum at low growth rates, but not at high growth rates.

Possible roles of prokaryotic SMC-containing complexes in condensation and cohesion

In eukaryotes, SMC heterodimers form complexes with non-SMC subunits, which are thought to influence the function of the complex, e.g. cohesion, condensation, recombination, or gene silencing. Therefore, non-SMC subunits of prokaryotic SMC complexes might also participate in one or more of the subfunctions of the SMC repertoire.

Prokaryotic SMC-containing complexes are probably involved in a ‘condensing’ function. Evidence is accumulating that prokaryotic SMCs are required for chromosome folding. For instance, smc mutants of B. subtilis and C. crescentus and a mukB mutant of E. coli often show less condensed nucleoids (Niki et al., 1991; Britton et al., 1998; Moriya et al., 1998; Jensen and Shapiro, 1999), and isolated nucleoids of a mukB mutant seemed to be unfolded (Weitao et al., 1999). Furthermore, a mukB mutation can be suppressed by a concomitant mutation in a topoisomerase gene (topA), indicating that TopA and SMC have opposing effects on the superhelicity of chromosomal DNA (Sawitzke and Austin, 2000). An SMC-containing prokaryotic ‘condensing’ complex could introduce large positive supercoils into DNA and thereby fold it effectively (Hirano, 2000; Holmes and Cozzarelli, 2000). As the origins of replication seem to be transported towards the cell poles after replication independent of SMC (Graumann, 2000), the subsequent folding of chromosomal DNA by a SMC could be the driving force for moving the bulk of the nucleoid into the daughter cells (Webb et al., 1997; Sawitzke and Austin, 2000). This function might well require non-SMC subunits.

Indications for a possible role of prokaryotic SMCs in a ‘cohesion’ function are less clear. In B. subtilis, the oriC regions show a bipolar localization for most of the cell cycle (Glaser et al., 1997; Webb et al., 1997), and after replication initiation they move rapidly towards the opposite poles (Sharpe and Errington, 1998; Sharpe et al., 1998; Webb et al., 1998). It has been proposed that newly replicated origins pair (Lin and Grossman, 1998), but this period is rather short and a SMC requirement at this stage has not yet been proven. This situation seems to be very different in E. coli, in which it was found that newly replicated origins remain linked with each other for about two-thirds of the replication period, and thus a cohesion-like function seems to be involved (Hiraga et al., 2000). In addition, it was found that MukB is involved in this sister chromatid cohesion as it does not take place in a mukB mutant (Sunako et al., 2001). In archaea, it has been shown that the cells are in the G2 phase for most of the cell cycle, which could require a cohesion function of a SMC protein after replication of the chromosome until the daughter chromosomes are segregated (Bernander, 2000). The relative importance of condensation and cohesion subfunctions in the course of the cell cycle could be different in different prokaryotic species for SMC- and MukB-containing complexes.

Species-specific variability in the prokaryotic cell cycle

In addition to the accumulating experimental evidence that the mechanisms of cell cycle regulation and chromosome segregation may vary among prokaryotes, this is also indicated by the distributions of the SMC, ScpA and ScpB protein families. SMC proteins are not universally conserved, but prokaryotes contain either an SMC, or a MukB, or are devoid of a protein with long coiled-coil domains (Soppa, 2001). With a single exception, ScpAs and ScpBs are found exclusively in SMC-containing species. The occurrence of all three proteins is common in archaea and in Gram-positive bacteria, whereas their absence is common in Gram-negative bacteria. However, some SMC-containing species, mostly Gram-negative bacteria, harbour only ScpA or only ScpB or lack both of them, indicating variability of non-SMC proteins involved in chromosome segregation.

The identification of the two novel protein families involved in chromosome segregation and the analysis of their distribution among prokaryotes open a number of research avenues. Examples are: (i) the verification that SMC proteins and members of the ScpA family interact in different species of archaea and bacteria; (ii) the isolation and analysis of SMC-containing high molecular weight complexes of several prokaryotic species; (iii) unravelling of the details of the interaction of B. subtilis ScpA and ScpB with SMC; (iv) investigation of which activities of SMC are influenced by members of the two protein families in vitro and in vivo. The discovery of the two new protein families has thus the potential to lead to a much deeper understanding of chromosome segregation in archaea and bacteria and its relation to that of eukarya.

Experimental procedures

Sequence retrieval and primary sequence analyses

Protein sequence databases at the European Bioinformatics Institute were searched using their Sequence Retrieval System (SRS6) and FASTA (Pearson, 1990), and protein sequence databases of the National Center for Biotechnology Information with BLAST (Madden et al., 1996).

As a prerequisite for a systematic search for the distribution of ScpA among prokaryotes, the haloarchaeal sequence was used to search protein sequence databases, and several similar proteins were detected. The proteins from three archaea and three bacteria of phylogenetically diverse genera were retrieved and used to construct a multiple sequence alignment including the haloarchaeal protein. The following consensus sequence was derived from a conserved part of the protein, including ‘X’ (any amino acid) at sites where fewer than five species shared the same amino acid: PVD/LIERGEIDPWDIDIVDVTDXYLXXLXELXXLDLRXSG RALLXASILLRMKSEALL. The consensus sequence was used to search for the presence of scpA genes in each of the fully or partially sequenced genomes individually using the PEDANT genome analysis tool (Frishman and Mewes, 1997). In finished genomes, the BLAST function was used to search the annotated ORF sequences with consensus peptide sequences. In partially sequenced genomes an option of the BLAST function was used, which allowed us to compare the consensus peptide sequences with the available DNA sequences, translated in all six frames. In addition, partially sequenced genomes were searched with whole ScpA protein sequences of related species.
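The consensus rule described above (keep a residue when at least five of the seven aligned sequences share it, otherwise write ‘X’) can be illustrated with a minimal sketch. The function name `consensus`, the parameter `min_shared` and the toy alignment are hypothetical and are not the real ScpA data; the sketch also does not reproduce two-residue sites such as ‘D/L’ in the published consensus.

```python
from collections import Counter

def consensus(aligned, min_shared=5):
    """Return a consensus string for equal-length aligned sequences.

    A column contributes its most common residue when at least
    `min_shared` sequences share it; otherwise it contributes 'X'.
    Columns dominated by gaps ('-') are also reported as 'X'.
    """
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    out = []
    for column in zip(*aligned):
        residue, count = Counter(column).most_common(1)[0]
        out.append(residue if count >= min_shared and residue != "-" else "X")
    return "".join(out)

# Hypothetical toy alignment of seven sequences (illustration only).
toy = ["PVDLI", "PVDLI", "PVDMI", "PVELI", "PVDLV", "PIDAI", "PVDKL"]
print(consensus(toy))  # PVDXI
```

In the toy alignment the fourth column has only four identical residues and is therefore written as ‘X’, whereas the fifth column reaches the threshold of five.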

For protein sequence analysis, the Expert Protein Analysis System (Expasy) of the Swiss Institute of Bioinformatics was used; coiled-coil regions were predicted using the program COILS (Lupas, 1996), helix–turn–helix motifs with the HTH-Predictor (Dodd and Egan, 1990), and hydropathy and secondary structure elements with the program MACVECTOR.

Multiple sequence alignments and phylogenetic analyses

Sequences were formatted for multiple sequence alignments with the text editor WORD98, and multiple sequence alignments were generated using CLUSTALW (Thompson et al., 1994) and analysed with JALVIEW, which includes different colouring schemes, for example colouring based upon the degree of conservation. The alignments were edited with WORD98 before phylogenetic analyses, i.e. non-conserved N- and C-terminal extensions were removed, and indels present in only a few species were shortened.

All phylogenetic calculations were performed at the Institut Pasteur where the web-based usage of phylogenetic programs was kindly made available ( seqanal/phylogeny/phylip-uk.html). Phylogenetic trees were calculated with the parsimony program PROTPARS, distance matrices with the program PROTDIST, and distance matrix-based trees with the program FITCH; all programs are included in the Felsenstein program package PHYLIP (Felsenstein, 1996). Maximum likelihood trees were calculated with the program PROTML (Adachi et al., 1993). A total of 200 bootstrap replications were performed whenever possible. Consensus trees were calculated using the program CONSENSE, and the trees were visualized using the Macintosh program TREEVIEW (Page, 2000).
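The general idea behind bootstrap replication is to resample the columns of an alignment with replacement and to recalculate a tree for every resampled data set. The following minimal sketch shows only the resampling step; it is not the PHYLIP implementation, and the function name `bootstrap_alignment`, the fixed seed and the toy alignment are assumptions made for illustration.

```python
import random

def bootstrap_alignment(aligned, rng):
    """Return one bootstrap replicate: alignment columns resampled with replacement."""
    n_cols = len(aligned[0])
    # Draw as many column indices as the alignment has columns.
    picks = [rng.randrange(n_cols) for _ in range(n_cols)]
    return ["".join(seq[i] for i in picks) for seq in aligned]

rng = random.Random(0)  # fixed seed, only to make the sketch reproducible
toy = ["PVDLI", "PVDMI", "PIDAI"]  # hypothetical toy alignment
replicates = [bootstrap_alignment(toy, rng) for _ in range(200)]
print(len(replicates), replicates[0])
```

Each replicate has the same dimensions as the original alignment, and every column of a replicate is a complete column of the original, so the relationship between sequences within a column is preserved.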

Plasmids and strains

Three DNA fragments in the scpA operon were amplified by polymerase chain reaction (PCR) to construct three plasmids shown in Fig. 2A. The upstream and downstream primers of each primer set contain artificial HindIII and BamHI sites, respectively, so that the amplified fragments are cloned between these sites of a plasmid, pMutinT3 (Moriya et al., 1998; Vagner et al., 1998). The amplified regions are as follows: nucleotide nos. 2425805–2425578, 2425060–2424831 and 2424424–2424156 at Bacillus subtilis (BS) ORF Data Base for pMUTΔscpA, pMUTΔscpB and pMUTΔypuI respectively.
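The sizes of the three amplified regions follow directly from the coordinates given above. The following minimal sketch assumes that both end coordinates are inclusive; the function name `fragment_length` and the dictionary keys (the targeted genes) are introduced only for illustration.

```python
def fragment_length(start, end):
    """Length in bp of an inclusive coordinate range, in either orientation."""
    return abs(start - end) + 1

# Coordinates as given in the text, keyed by the targeted gene.
regions = {
    "scpA": (2425805, 2425578),
    "scpB": (2425060, 2424831),
    "ypuI": (2424424, 2424156),
}

lengths = {gene: fragment_length(*coords) for gene, coords in regions.items()}
for gene, bp in lengths.items():
    print(f"{gene}: {bp} bp")
```

Under the inclusive-coordinate assumption the fragments are 228 bp, 230 bp and 269 bp long, respectively.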

To construct smc-T7 and scpA-gfp strains, two plasmids (pMUT′smc-T7 and pMm2′scpA-gfp) were prepared. A 3′-end of the smc gene (207 bp; 1669310–1669516 at BS ORF Data Base) was amplified by PCR. A 51 bp tag sequence was added to the reverse primer to fuse His and T7 tags (H6MASMTG2Q2MG) to the carboxyl terminus of SMC in frame. As artificial HindIII and BamHI sites were also added to 5′-ends of the forward and reverse primers, respectively, the PCR product was digested with both enzymes and cloned in between these sites of pMutinT3. A 3′-end of scpA (224 bp; 2425291–2425068 at BS ORF Data Base) was amplified by PCR with primers containing artificial SalI and ClaI sites respectively. After the scpA fragment was digested with the two enzymes, it was inserted in between these sites of pMm2 (its construction is described by Imai and colleagues; Imai et al., 2000).
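The stated length of the tag sequence can be checked against the peptide it encodes. The following minimal sketch assumes that the digits in ‘H6MASMTG2Q2MG’ are repeat counts for the preceding residue (i.e. flattened subscripts); the function name `expand_compact` is hypothetical.

```python
import re

def expand_compact(seq):
    """Expand compact notation such as 'H6MASMTG2Q2MG' (residue plus optional repeat count)."""
    parts = re.findall(r"([A-Z])(\d*)", seq)
    return "".join(residue * int(count or 1) for residue, count in parts)

tag = expand_compact("H6MASMTG2Q2MG")
codons_bp = len(tag) * 3  # three nucleotides per amino acid
print(tag, len(tag), codons_bp)  # HHHHHHMASMTGGQQMG 17 51
```

Read in this way, the tag comprises 17 amino acids, which corresponds to the 51 bp added to the reverse primer.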

Competent cells of B. subtilis 168 (trpC2) were prepared as described previously (Kunst et al., 1994) and transformed with the five plasmids described above. Transformants generated by integration of the plasmids by single crossing-over were selected on Luria–Bertani (LB) plates containing erythromycin (0.5 μg ml−1) at 23°C for scpA, scpB and ypuI disruption mutants or at 37°C for smc-T7 and scpA–gfp fusion strains. When two disruption mutants (scpA and scpB) and the two fusion strains were selected, 1 mM IPTG (isopropyl β-D-thiogalactopyranoside) was added to the plates to ensure expression of genes located downstream of the two genes by an IPTG-inducible promoter, spac. To construct a strain (smc-T7 scpA-gfp), the erythromycin-resistant gene in the smc-T7 strain was replaced with a tetracycline-resistant gene by transformation with a plasmid pEm::Tc (Ogura et al., 2001) and the scpA–gfp fusion was then introduced by transformation with pMm2′scpA-gfp.

Examination of cell viability at 23°C and 37°C

Chromosomal DNA was extracted from the disruption mutants and was used to transform B. subtilis 168 cells again. Then, 100 ng of each chromosomal DNA was mixed with 0.5 ml of the competent cell suspension. After shaking at 37°C for 60 min, 20 μl of the cell suspension was spread on each of two LB plates containing erythromycin and IPTG. One was incubated at 37°C overnight and the other at 23°C for 2.5 d.

Microscopic observation of DAPI-stained cells and measurement of the DNA–protein ratio

Cells were grown in LB medium with erythromycin and 1 mM IPTG (for disruption mutants) or without them (for 168) at 23°C. They were harvested at OD600 = 0.4, stained with DAPI (4′,6-diamidino-2-phenylindole) and observed under fluorescence microscopy as described previously (Hiraga et al., 1989; Hassan et al., 1997).

DNA and protein content in exponentially growing cell population were determined as described (Moriya et al., 1997).

Yeast two-hybrid assay

This assay was performed using a two-hybrid system previously described (James et al., 1996). Genes encoding prey and bait proteins were amplified by PCR using B. subtilis strain 168 chromosomal DNA as a template, and cloned into the bait vector, pGBDU-C1, or into the prey vector, pGAD-C1. Sequence integrity of all the cloned genes was verified by DNA sequencing. The bait- and prey-producing plasmids were introduced by transformation in yeast PJ69–4a and PJ69–4α respectively. Interactions were tested by mating the bait- and prey-containing cells (Finley and Brent, 1994). Cells were mixed and grown for 24 h on rich medium at 30°C. Diploids were selected by transferring the cells with a replicating tool onto –LU medium and incubating 2–3 d at 30°C. Diploids were transferred to –LUA, –LUH supplemented with 0.5 mM 3-AT (3-amino-1,2,4-triazole) and –LU plates, and interaction phenotypes were scored after 15 d of growth at 30°C.

Analysis of complex formation of SMC and ScpA in vivo

Cells were grown in 1.4 l of LB medium containing erythromycin (0.5 μg ml−1) and IPTG (0.5 mM) at 37°C, and were collected by centrifugation when cell density (optical density at 600 nm) reached 0.7. The cells were suspended in 140 ml of sodium phosphate buffer (pH 7.5) with 150 mM sodium chloride. A cross-linking reagent, DSP [Dithiobis(succinimidylpropionate)] (Pierce), was added to the cell suspension at 400 μM and the suspension was incubated at 37°C for 30 min. The cells treated with the cross-linker were collected by centrifugation and washed with 20 ml of 20 mM Tris-HCl (pH 7.6). After the cells were suspended in 8 ml of T7-tag bind/wash buffer (Novagen), they were broken by sonication. Cell debris was removed by low-speed centrifugation and a soluble fraction was then obtained by ultracentrifugation (147 000 g for 30 min at 4°C) using a 70.1Ti rotor (Beckman). The soluble fraction was mixed with 1 ml of 50% T7-tag antibody agarose (Novagen) and further incubated for 40 min at room temperature with gentle shaking. Proteins bound and unbound to the T7-tag antibody agarose beads were separated by centrifugation. After the agarose beads were washed with 10 ml of the bind/wash buffer three times, proteins bound to the beads were eluted by addition of 0.5 ml of T7-tag elute buffer (100 mM citric acid, pH 2.2; Novagen) followed by neutralization with 2 M Tris-HCl (pH 10.4). Cross-links between protein molecules were cleaved by heating in sample-loading buffer containing 100 mM dithiothreitol.

Proteins were separated in a sodium dodecyl sulphate (SDS)-polyacrylamide gradient gel (10–20%) by electrophoresis and blotted on a Hybond-P polyvinylidene difluoride membrane (Amersham Pharmacia Biotech). The GFP protein was detected by a mouse monoclonal anti-GFP antibody (Boehringer Mannheim) and a goat anti-mouse IgG-horseradish peroxidase conjugate (Bio-Rad Laboratories) as first and second antibodies, respectively, and by the ECL Plus chemiluminescence detection system (Amersham).


This work was supported by Grants-in-Aid for Scientific Research (B) and for Scientific Research on Priority Area (C) from Japan Society for the Promotion of Science. It was also supported by the Deutsche Forschungsgemeinschaft through Grant So 264/7.