Biological significance of a family of regularly spaced repeats in the genomes of Archaea, Bacteria and mitochondria


  • Francisco J. M. Mojica,

    Corresponding author
    1. División de Microbiología, Departamento de Fisiología, Genética y Microbiología, Universidad de Alicante, Apartado 99, 03080 Alicante, Spain.
    • *For correspondence. E-mail; Tel. (+ 34) 965 903 870; Fax (+ 34) 965 909 494.

  • Cesar Díez-Villaseñor,

    1. División de Microbiología, Departamento de Fisiología, Genética y Microbiología, Universidad de Alicante, Apartado 99, 03080 Alicante, Spain.
  • Elena Soria,

    1. División de Microbiología, Departamento de Fisiología, Genética y Microbiología, Universidad de Alicante, Apartado 99, 03080 Alicante, Spain.
  • Guadalupe Juez

    1. División de Microbiología, Campus de San Juan, Universidad Miguel Hernández, 03550 Alicante, Spain.


A peculiar type of repeated element has been detected in different prokaryotes, and similar elements are being reported in phylogenetically very distant groups as genome sequencing proceeds. A comparative study of these elements, aimed at determining their common structural and sequence features as well as their phylogenetic distribution, will help to elucidate their biological relevance.

These sequences share multiple features that, taken together, are unique: they are easily distinguishable from any other recurrent motif and emerge as a new family of prokaryotic repeats. They are short repeated elements generally occurring in clusters, but their main peculiarity is the layout: they are always regularly spaced by unique intervening sequences of constant length. For the sake of clarity, and following from these characteristics, we will refer to the members of this family of repeats as Short Regularly Spaced Repeats (SRSRs).

Using a specific computer program, we have searched for SRSRs in the completed microbial genomes and in the available partial sequences of genomes close to completion. The organisms in which SRSRs have been found so far are listed in Table 1. In summary, the SRSRs are widespread among the various physiological and phylogenetic groups, probably being present in all the Archaea and hyperthermophilic Bacteria, in at least some members of the cyanobacteria and proteobacteria lineages, as well as in the two subgroups of Gram-positive bacteria (the low and high GC content groups). They thus represent the most widely distributed family of repeats among prokaryotic genomes.
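The program used for the search is not described here. As an illustration only, the defining layout (identical short units separated by spacers of bounded length) can be sketched as follows. The function name `find_spaced_repeats`, its parameters, and the restriction to exact matches of a fixed unit length are assumptions made for this sketch, not the authors' method; the default spacer bounds are taken from the 20–58 bp range quoted in the text.

```python
from collections import defaultdict


def find_spaced_repeats(seq, unit_len, min_spacer=20, max_spacer=58, min_units=3):
    """Hypothetical helper: list exact repeats of length unit_len whose
    consecutive copies are separated by min_spacer..max_spacer bases.

    Returns a list of (unit, [start positions]) tuples, one per cluster.
    """
    # Index every window of length unit_len by its start position.
    positions = defaultdict(list)
    for i in range(len(seq) - unit_len + 1):
        positions[seq[i:i + unit_len]].append(i)

    clusters = []
    for unit, starts in positions.items():
        run = [starts[0]]
        for pos in starts[1:]:
            gap = pos - (run[-1] + unit_len)  # spacer length between two copies
            if min_spacer <= gap <= max_spacer:
                run.append(pos)
            else:
                # Spacing broken: keep the run if it is long enough, then restart.
                if len(run) >= min_units:
                    clusters.append((unit, run))
                run = [pos]
        if len(run) >= min_units:
            clusters.append((unit, run))
    return clusters


# Toy sequence (not real genomic data): four copies of an 8-base unit
# separated by three different 5-base spacers.
toy_unit = "GATTACAG"
toy_seq = ("TTGCA" + toy_unit + "CCCCC" + toy_unit + "TTTTT"
           + toy_unit + "ACACA" + toy_unit + "GGGTT")
for unit, starts in find_spaced_repeats(toy_seq, 8, min_spacer=4, max_spacer=6):
    print(unit, starts)
```

A practical search would also have to tolerate mismatches between units and would not know the unit length in advance; both are deliberately left out of this sketch.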

Table 1. Main features of the SRSRs.

Organism | SRSR size (bp) | Spacer size (bp) | Number of clusters | Number of SRSR units per cluster | Reference
H. volcanii | 30 | ND | ≥2 | ND | Mojica et al. (1995) Mol Microbiol 9: 13–21
H. mediterranei | 30 | 33–39 | 3 | 21/ND/ND | Mojica et al. (1995) Mol Microbiol 9: 13–21
M. jannaschii | 28–30 | 31–51 | 7A + 6B + 1C | 4–25 | Bult et al. (1996) Science 273: 1058–1073 and this work
M. thermoautotrophicum | 30 | 34–38 | 2 | 124/47 | This work
A. fulgidus | 37A/30B | ≈37 | 1A + 2B | 42A/48B/60B | This work
S. solfataricus | 25 | ≈40 | ≥2 | 94/102 | Sensen et al. (1998) Extremophiles 2: 305–312
P. abyssi | 29A/30B | 26–43 | 1A + 2B | 7A/22B/27B | This work
P. horikoshii | 29 | 34–58 | 3 | 18/26/66 | Kawarabayasi et al. (1998) DNA Res 5: 55–76
A. pernix | 24A/23B | 37–52 | 2A + 1B | 19A/27A/42B | Kawarabayasi et al. (1999) DNA Res 6: 83–101
T. maritima | 30 | 39–40 | 8 | 2–40 | Nelson et al. (1999) Nature 399: 323–329
A. aeolicus | 29 | 36–38 | 1 | 6 | This work
E. coli | 29 | 32–33 | 3 | 2/7/13 | Nakata et al. (1989) J Bacteriol 171: 3553–3556 and this work
S. typhi | 29 | 32 | ≥1 | 6 | This work
C. jejuni | 36 | 30 | 1 | 5 | This work
Y. pestis | 28 | 32–33 | 2 | 6/9 | This work
C. difficile | 29 | 36–38 | 4A + 2B | 5–17 | This work
M. tuberculosis | 36 | 38–40 | 1 | Variable | Hermans et al. (1991) Infect Immun 59: 2695–705
Calothrix sp. | 37 | 35–41 | >1 | 5 | Masepohl et al. (1996) Biochim Biophys Acta 1307: 20–36
Anabaena sp. | 37 | 32–43 | >1 | 17 | Masepohl et al. (1996) Biochim Biophys Acta 1307: 20–36
V. faba | 40 | 20–35 | 1 | 6 | Flamand et al. (1992) Plant Mol Biol 19: 913–923

A, B: types of SRSRs that are distinct (more than 3 bp differences) within the same microorganism. ND, not determined.
V. faba 4020–3516Flamand et al. (1992) Plant Mol Biol19: 913–923

The main features of the SRSRs are summarized in Table 1. They are typically short partially palindromic sequences of 24–40 bp, containing inner and terminal inverted repeats of up to 11 bp (see Fig. 1). Although isolated elements have been detected, the SRSR elements are generally arranged in clusters (up to 14 per genome) of repeated units spaced by unique intervening 20–58 bp sequences. The extent of the clusters is particularly noteworthy in the Archaea.
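To make the notion of a terminal inverted repeat concrete, the following minimal sketch measures how many bases at the start of a unit are the reverse complement of the bases at its end. The function names and the example sequence are hypothetical and chosen for illustration; inner inverted repeats and mismatches are not handled.

```python
# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def terminal_inverted_repeat_length(unit):
    """Hypothetical helper: number of bases at the 5' end of `unit` that
    pair, as a perfect inverted repeat, with the bases at its 3' end."""
    n = 0
    # Walk inwards from both ends while the bases remain complementary.
    while n < len(unit) // 2 and unit[n] == reverse_complement(unit[-1 - n]):
        n += 1
    return n


# Toy unit (not a real SRSR): "GTTCC" at the start pairs with "GGAAC" at the end.
toy_unit = "GTTCCAAATTTGCAGGAAC"
print(terminal_inverted_repeat_length(toy_unit))
```

A unit is fully palindromic in this sense when the returned length equals half its size, and partially palindromic when only the outermost bases pair.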

Figure 1.

Alignment of the SRSRs. Highlighted blocks indicate positions occupied by the most frequent base in the aligned sequence. Only the most abundant type of SRSR element has been considered for M. jannaschii and Clostridium difficile. Two types of SRSR (A and B) present in P. abyssi, A. pernix and A. fulgidus have been aligned. A consensus sequence with the most frequent base at each position in the alignment is included. Arrows indicate the palindromic character of the SRSRs.
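A consensus of this kind, taking the most frequent base at each position, can be computed as in the sketch below. The function name `consensus`, the use of "-" for alignment gaps, and the toy sequences are assumptions for illustration; how ties were resolved in the figure is not stated, so here a tie simply goes to the base seen first in the column.

```python
from collections import Counter


def consensus(aligned):
    """Hypothetical helper: most frequent base per column of an alignment.

    `aligned` is a list of equal-length strings; "-" marks a gap and is
    ignored when counting. A column made only of gaps yields "-".
    """
    result = []
    for column in zip(*aligned):
        counts = Counter(base for base in column if base != "-")
        # most_common(1) returns the top (base, count) pair for the column.
        result.append(counts.most_common(1)[0][0] if counts else "-")
    return "".join(result)


# Toy alignment (not the sequences of Fig. 1).
toy_alignment = ["GTTCCA", "GTTGCA", "ATTCCA"]
print(consensus(toy_alignment))
```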

The SRSRs are very homogeneous within a genome, most of them being identical. However, there are examples of heterogeneity, especially in Archaea. Various SRSR sequences with less than 85% similarity can be distinguished in Pyrococcus abyssi, Archaeoglobus fulgidus, Aeropyrum pernix and Methanococcus jannaschii. In the latter, two clusters with 25 and five units of the same element were initially reported (Bult et al., 1996, Science 273: 1058–1073). We have found 12 additional loci and three different SRSR elements, differing by more than 5 bp.

The sequence is conserved among members of the same phylogenetic group, and there is a high percentage of similarity even between domains (see Fig. 1), indicative of a common origin. The degree of sequence conservation closely tracks phylogenetic distance. Haloferax volcanii differs from Haloferax mediterranei in 3 out of 30 bp, and Pyrococcus horikoshii differs from Pyrococcus abyssi in 2 out of 29 bp. The high degree of similarity between Escherichia coli and Salmonella typhi is remarkable, with one difference out of 29 bp.
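The pairwise differences quoted above translate directly into percent identity. The short sketch below does the arithmetic; the function names are hypothetical, and `count_differences` assumes two ungapped sequences of equal length.

```python
def count_differences(a, b):
    """Hypothetical helper: number of mismatched positions between two
    ungapped sequences of equal length."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def percent_identity(differences, length):
    """Percent of positions that are identical, given the mismatch count."""
    return 100 * (length - differences) / length


# Difference counts and unit lengths as stated in the text.
pairs = {
    "H. volcanii / H. mediterranei": (3, 30),
    "P. horikoshii / P. abyssi": (2, 29),
    "E. coli / S. typhi": (1, 29),
}
for name, (diff, length) in pairs.items():
    print(f"{name}: {percent_identity(diff, length):.1f}% identical")
```

These pairs work out to 90.0%, 93.1% and 96.6% identity respectively.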

The terminal and inner-inverted repeats of each element are the most conserved regions of the SRSRs (Fig. 1), suggesting that they must be playing an essential role.

In M. jannaschii, Methanobacterium thermoautotrophicum, A. fulgidus, Thermotoga maritima, A. pernix and Mycobacterium tuberculosis, some SRSR clusters are followed by larger (> 300 bp) repeated elements. This association is not detectable in other microorganisms, nor is its possible relevance known.

No general pattern is recognizable in the location of the SRSR loci. There is, however, a remarkable coincidence. Possible chromosomal origins of replication have recently been proposed for the Archaea M. thermoautotrophicum and P. horikoshii (Lopez et al. 1999, Mol Microbiol 32: 883–886). In both cases, two clusters of SRSRs are located one on each side of the proposed origin of replication. The distance to the origin is similar, and relatively short, for both clusters (200 and 270 kb in M. thermoautotrophicum, 40 and 78 kb in P. horikoshii). The early and simultaneous appearance of the SRSR clusters in the nascent molecules can be interpreted as indicative of their relevance.

Besides sequence conservation, other remarkable features of this family of tandem repeats are the palindromic nature and regular spacing of the SRSR elements. The size of the repeated unit and the presence of short inner inverted repeats are consistent with the characteristics of recognition sites for certain DNA-binding proteins. The regular spacing of the SRSR elements places the inverted repeats on the same side of the DNA chain. Although cooperative binding to free proteins cannot be excluded, this peculiar arrangement, with such a long series of regularly positioned sites, rather suggests the need for a solid attachment to a cellular structure that is organized accordingly. This would agree with the role in replicon partitioning previously proposed for the SRSRs of haloarchaea (Mojica et al. 1995, Mol Microbiol 9: 13–21).

The question arises as to whether the SRSRs have a common function in prokaryotes, or whether they are a vestige of ancient sequences whose role has diverged during evolution. The universality, phylogeny and biological significance of this peculiar family of repeats remain to be elucidated.


This work was financed by a research grant from the Conselleria de Cultura Educació i Ciència, Generalitat Valenciana (GV97-VS-25–82). E.S. holds a graduate fellowship from the Conselleria de Cultura Educació i Ciència, Generalitat Valenciana.

The sequence data of unfinished genomes were produced by the S.typhi (Salmonella typhi), the C.jejuni (Campylobacter jejuni), the Y.pestis (Yersinia pestis), and the C.difficile (Clostridium difficile) Sequencing Groups at the Sanger Centre and can be obtained from,, and respectively.