A genetic and bioinformatic analysis of Streptomyces coelicolor genes containing TTA codons, possible targets for regulation by a developmentally significant tRNA


  • Editor: Jose Gil

Correspondence: Meifeng Tao, College of Life Science and Technology, Huazhong Agricultural University, Wuhan 430070, China. Tel.:+86 027 87283702; fax:+86 027 87280670; e-mail: tao_meifeng@yahoo.com


The rarest codon in the high G+C genome of Streptomyces coelicolor is TTA, corresponding in mRNA to the UUA codon that is recognized by a developmentally important tRNA encoded by the bldA gene. There are 145 TTA-containing genes in the chromosome of S. coelicolor. Only 42 of these are represented in the genome of Streptomyces avermitilis, among which only 12 have a TTA codon in both species. The TTA codon is under-represented in housekeeping genes and orthologous genes, and over-represented in function-unknown, extrachromosomal or weakly expressed genes. Twenty-one TTA-containing chromosomal genes in S. coelicolor were disrupted, including 12 of the 42 genes that are common to both S. avermitilis and S. coelicolor. None of the mutant strains showed any obvious phenotypic differences from the wild-type strain under the conditions tested. Possible reasons for this, and the role and evolution of the observed distribution of TTA codons among Streptomyces genes, are discussed.


Streptomycetes are Gram-positive mycelial bacteria that have an extensive secondary metabolism and undergo complex morphological differentiation to form a sporulating aerial mycelium. In the model organism Streptomyces coelicolor A3(2), and in other species tested, development and many aspects of secondary metabolism are pleiotropically defective in mutants of bldA, which encodes the only tRNA that can efficiently translate the rare leucine codon UUA (Leskiw et al., 1991b; Chater, 2006; Chater & Chandra, 2006). Thus, although bldA mutants show apparently normal vegetative growth, they are defective in the production of at least four known antibiotics and in the formation of aerial mycelium on most media (Merrick, 1976; Champness, 1988). The pleiotropic effects of bldA mutations in S. coelicolor are at least partially attributable to the presence of UUA codons in the mRNA of critical regulatory genes (Chater, 2006). For example, actII-4, which encodes the pathway-specific regulator of actinorhodin production, contains a TTA codon (Fernandez-Moreno et al., 1991), as does redZ, a regulatory gene required for undecylprodigiosin production (White & Bibb, 1997; Guthrie et al., 1998). Another TTA-containing transcriptional regulatory gene, adpA (also termed bldH), is the main route by which bldA affects morphological differentiation (Nguyen et al., 2003; Takano et al., 2003). Recent proteomic analyses showed that a bldA-deleted mutant had impaired production of several extracellular proteins, including a potentially developmentally significant trypsin-like protease inhibitor, SCO0762 (Kim et al., 2005b), and that two hypothetical proteins, SCO4244 and SCO4252, were absent (Kim et al., 2005a). SCO0762 does not contain a TTA codon, and its disruption mutant differentiates normally, but transcription of SCO0762 depends on the TTA-containing gene adpA (Kim et al., 2005b). SCO4244 and SCO4252 are in two operons that are located close to each other, and transcription of both operons was abolished by disruption of the nearby TTA-containing regulatory gene, SCO4263, although disruption of SCO4263 had no obvious phenotype with respect to antibiotic production or morphological differentiation (Kim et al., 2005a; Hesketh et al., in preparation). Expression analysis indicated that the abundance of the bldA-encoded tRNA is at its highest in stationary phase, in contrast to what is expected for most tRNA species (Trepanier et al., 1997). In agreement with this, expression of TTA-containing genes was found to be delayed during early differentiation (Kataoka et al., 1999).

All Streptomyces spp. have a very high G+C content (typically more than 70%), making the TTA codon rare. Many known TTA-containing genes have been found to be associated with morphological and physiological differentiation, and expression of these genes may be limited even in wild-type streptomycetes (Leskiw et al., 1991b; Chater, 2006). The genome sequences of two Streptomyces species, S. coelicolor (Bentley et al., 2002) and Streptomyces avermitilis (Ikeda et al., 2003), are now available. Analysis of these genomes has shown that the position of UUA codons in mRNAs is biased towards the start of coding sequences, implying that translational selection of codon usage occurs in streptomycetes (Fuglsang, 2005; Chater & Chandra, 2006). In S. coelicolor, knowledge of the roles of TTA-containing genes has mostly resulted from investigations of mutants with obvious phenotypic defects. Here, we have approached the problem differently, combining a bioinformatic analysis with targeted mutagenesis of 21 of the 145 TTA-containing chromosomal genes.

Materials and methods


Genome sequences of S. coelicolor (NC_003888 for the chromosome, NC_003903 for plasmid SCP1 and NC_003904 for plasmid SCP2) and S. avermitilis (NC_003155) were downloaded from the NCBI ftp site (ftp://ftp.ncbi.nih.gov/genomes/Bacteria/). Protein functional classification was taken from the S. coelicolor genome project at the Sanger Institute (http://www.sanger.ac.uk/Projects/S_coelicolor/scheme.shtml). Proteins identified by proteomic approaches, which were taken broadly to represent highly expressed genes, were downloaded from the S. coelicolor 2D Gel Protein Database (http://dbk.ch.umist.ac.uk/s_coeli/referencegel/) (Hesketh et al., 2002).

Finding orthologues of TTA-containing genes of S. coelicolor in S. avermitilis

Stand-alone blast (Altschul et al., 1990) (ncbi blast package, ftp://ftp.ncbi.nih.gov) was used to search for orthologues of TTA-containing genes. Each TTA-containing gene of S. coelicolor or S. avermitilis was used to search against all genes of S. avermitilis or S. coelicolor, respectively, at translated protein levels. For a TTA-containing gene of S. coelicolor (gc), an orthologue in S. avermitilis (ga) was defined as (i) ga is the best hit of gc in the blast; (ii) the E-value is below 1e-10; (iii) the alignable region of the two sequences is at least 50% of the longer sequence; (iv) there is at least 50% amino-acid identity.
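The four criteria above can be expressed as a simple filter over parsed BLAST hits. The following is a minimal sketch, not the script used in this study: the BlastHit record, its field names and the two example hits are hypothetical, and it assumes that percentage identity and aligned length have already been extracted from the BLAST output.

```python
from dataclasses import dataclass


@dataclass
class BlastHit:
    """One protein-level BLAST hit of a query gene (gc) in the other genome (hypothetical record)."""
    query: str
    subject: str
    evalue: float
    percent_identity: float   # amino-acid identity over the aligned region
    aligned_length: int       # aligned region, in residues
    query_length: int
    subject_length: int
    is_best_hit: bool         # True if the subject is the top-ranked hit for the query


def is_orthologue(hit: BlastHit) -> bool:
    """Apply criteria (i) to (iv) from the text to a single hit."""
    longer = max(hit.query_length, hit.subject_length)
    return (
        hit.is_best_hit                          # (i) best hit
        and hit.evalue < 1e-10                   # (ii) E-value below 1e-10
        and hit.aligned_length >= 0.5 * longer   # (iii) >= 50% of the longer sequence
        and hit.percent_identity >= 50.0         # (iv) >= 50% identity
    )


# Invented hits for illustration only.
good = BlastHit("gc1", "ga1", 1e-60, 72.0, 280, 300, 310, True)
too_short = BlastHit("gc2", "ga2", 1e-20, 80.0, 100, 300, 310, True)
print(is_orthologue(good), is_orthologue(too_short))
```

In this sketch the second hit fails only on criterion (iii), because its alignment covers less than half of the longer protein.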

Expression prediction

The codon adaptation index (CAI) is a measure of codon usage in a gene relative to that in a reference set of genes (Sharp & Li, 1987). CAI has been used to predict gene expression levels in S. coelicolor and S. avermitilis (Wu et al., 2005). Here, we used CodonW 1.4.2 (written by John Peden and available from http://codonw.sourceforge.net/) to calculate the CAI value for each TTA-containing gene, using ribosomal protein genes as the reference gene set. The CAI values of TTA-containing genes were compared with those of ribosomal protein genes and of all genes using a t-test.
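For readers unfamiliar with the index, the calculation can be sketched as follows: each codon receives a weight equal to its count in the reference set divided by the count of the most frequent synonymous codon, and the CAI of a gene is the geometric mean of the weights of its codons. This is an illustrative sketch only and is not CodonW. The toy reference sequence is invented, and the pseudocount of 0.5 for codons absent from the reference set is an assumption of this sketch that may differ from the handling in CodonW.

```python
import math
from collections import Counter

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def split_codons(seq):
    """Split a coding sequence into in-frame codons, dropping any trailing partial codon."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def relative_adaptiveness(reference_seqs, pseudocount=0.5):
    """Weight of each codon: its reference count over that of the most used synonym."""
    counts = Counter(c for s in reference_seqs for c in split_codons(s))
    weights = {}
    # Met, Trp and stop codons have no synonymous alternative and are excluded.
    for aa in sorted(set(AMINO) - {"*", "M", "W"}):
        family = [c for c, a in CODON_TABLE.items() if a == aa]
        top = max(counts[c] for c in family)
        if top == 0:
            continue  # amino acid absent from the reference set
        for c in family:
            # Assumption: unseen codons get a small pseudocount so that log() is defined.
            weights[c] = max(counts[c], pseudocount) / top
    return weights


def cai(seq, weights):
    """Geometric mean of the codon weights over the codons of one gene."""
    logs = [math.log(weights[c]) for c in split_codons(seq) if c in weights]
    return math.exp(sum(logs) / len(logs)) if logs else float("nan")


# Invented reference "gene" in which CTG is the preferred leucine codon.
w = relative_adaptiveness(["CTGCTGCTGCTC"])
print(cai("CTGCTG", w), cai("CTGCTC", w), cai("CTGTTA", w))
```

With this toy reference, a sequence made only of the preferred codon scores 1, and replacing one codon by the unseen TTA lowers the score more than replacing it by the less frequent but observed CTC.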

Disruption of some TTA-containing genes in S. coelicolor

Genes were disrupted in Escherichia coli hosts by the insertion of antibiotic resistance determinants, either by the use of restriction fragments (Kieser et al., 2000) or by PCR targeting (Gust et al., 2003), and then passaged through the non-methylating E. coli ET12567 before their reintroduction into S. coelicolor M145 by conjugation or protoplast transformation (Kieser et al., 2000). Exconjugants or transformants were screened for double cross-over replacements by loss of vector-encoded antibiotic resistance, and confirmed by PCR or Southern blotting. Phenotypes were observed by culturing the disruption mutant strains on three solid media (MM, MS and R2YE; see Kieser et al., 2000) at 30 °C for up to 10 days. Further details are given in the appropriate section of the Results.


TTA codons in Streptomyces genes are rarer than expected

The expected random frequency of TTA codons in the genome of S. coelicolor was estimated as 0.095% by multiplying together the overall frequencies of the relevant nucleotides at the three positions of codons (T1, 10.8%; T2, 25.8%; A3, 3.4%). Strikingly, the observed frequency of TTA codons, at 0.006%, is only 6% of the expected frequency. Only 145 of the 7825 chromosomal genes contain TTA codons (Table 1; three of these are duplicates because they lie in the terminal inverted repeats of the chromosome). A further 17 (of 356 genes) are on the large linear plasmid SCP1 and one (of 34) is on the circular plasmid SCP2. Most of these genes contain only one TTA codon, except for ten in the chromosome that contain two. As shown in Table 1, 31 of the genes fall within groups of genes considered to have been laterally acquired in the relatively recent evolutionary past (Bentley et al., 2002).
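The arithmetic behind these figures, together with an in-frame count of TTA codons, can be sketched as follows. The positional nucleotide frequencies and the observed frequency are those quoted above; the function name and the example sequences are hypothetical.

```python
# Positional nucleotide frequencies quoted in the text (as fractions).
t1, t2, a3 = 0.108, 0.258, 0.034

expected = t1 * t2 * a3      # expected TTA frequency if positions were independent
observed = 0.00006           # observed TTA frequency (0.006%)
ratio = observed / expected  # observed/expected

print(f"expected = {expected * 100:.3f}%, obs/exp = {ratio:.2f}")


def count_tta(cds):
    """Count in-frame TTA codons in a coding sequence (reading frame starts at position 0)."""
    cds = cds.upper()
    return sum(cds[i:i + 3] == "TTA" for i in range(0, len(cds) - len(cds) % 3, 3))


# Invented sequences: the first has two in-frame TTA codons; the second
# contains the string "TTA" only out of frame, so it is not counted.
print(count_tta("ATGTTAGCCTTATAA"), count_tta("ATGCTTAGC"))
```

Counting codon by codon rather than searching for the substring matters here, because an out-of-frame TTA is not read by the bldA tRNA.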

Table 1.   TTA-containing genes in the Streptomyces coelicolor genome
Genome segment (by gene number) | TTA-containing genes*
SCO0001 to 1000 | (0010, 0014, 0020), 0075, 0101, 0124, 0145, 0182, 0239, 0308, 0383, 0399, 0588, 0797, 0856, 0992
SCO1001 to 2000 | 1004, 1093, 1187, 1227, 1242, 1273, 1331, 1420, 1434, 1592, 1604, 1980, 1983
SCO2001 to 3000 | 2320, 2426, 2524, 2603, 2604, 2706, 2792
SCO3001 to 4000 | 3257, 3262, 3265, 3268, 3294, 3423, 3468, 3469, 3487, 3490, 3496, 3498, 3570, 3682, 3693, 3770, 3776, 3897, 3929, 3930, 3934, 3955, 3982, 3983
SCO4001 to 5000 | 4015, 4060, 4063, 4114, 4144, 4213, 4262, 4263, 4301, 4312, 4346, 4349, 4395, 4431, 4464, 4481, 4493, 4615, 4636, 4642, 4671, 4794, 4823
SCO5001 to 6000 | 5007, 5017, 5040, 5083, 5085, 5203, 5222, 5276, 5345, 5350, 5411, 5460, 5495, 5606, 5633, 5786, 5799, 5881, 5913, 5968, 5970, 5995
SCO6001 to 7000 | 6034, 6075, 6209, 6255, 6315, 6324, 6384, 6386, 6387, 6401, 6476, 6595, 6623, 6638, 6717, 6741, 6925, 6930, 6936
SCO7001 to 7845 | 7070, 7080, 7091, 7092, 7137, 7212, 7233, 7251, 7273, 7351, 7465, 7614, 7798, 7801, 7802, 7807, 7812, 7814, (7827, 7833, 7837)

  • * Bold-face type indicates that there is an orthologue in S. avermitilis; italics indicates that the annotated gene has an inappropriate start or stop codon; underlining indicates that the genes fall within putative laterally acquired gene islands; and (brackets) indicates that the genes are part of the repeated ends of the chromosome.

TTA codons were also rarer than expected (15–52% of the expected frequency) in other high G+C-content organisms (Table 2), but S. coelicolor and S. avermitilis, which had the highest G+C content of the organisms analysed here, had the lowest ratios of observed/expected frequency (6% and 9%, respectively). There are, altogether, 260 TTA-containing genes in S. avermitilis. TTA codons were slightly over-represented in E. coli and Bacillus subtilis, which have chromosomes of medium and low G+C content, respectively. The G+C content of the genome was negatively correlated with the ratio of observed/expected frequency of the TTA codon (Pearson's correlation, r=−0.93). A phylogenetic tree was drawn based on the 16S rRNA sequences of these genomes (Fig. 1). We suspect that (a) some differences in the obs/exp values among high G+C-content organisms reflect their taxonomic positions: for example, Deinococcus radiodurans and Halobacterium sp., two high G+C-content organisms that group together, have relatively high obs/exp values compared with other high G+C-content organisms; and (b) the relatively high obs/exp value of Mycobacterium tuberculosis, which has a high G+C content and groups with Streptomyces, might be caused by its very slow growth rate compared with S. coelicolor and S. avermitilis.

Table 2.   The frequency of TTA codons in some bacterial genomes
Organism* | Accession number | G+C content (%) | Observed frequency (%) | Expected frequency (%) | Ratio of observed/expected
Bacillus subtilis | NC_000964 | 43.
Escherichia coli | NC_000913 | 50.8 | 1.4 | 0.84 | 1.66
Mycobacterium tuberculosis | NC_002755 | 65.6 | 0.16 | 0.32 | 0.51
Pseudomonas aeruginosa | NC_002516 | 66.6 | 0.029 | 0.20 | 0.14
Deinococcus radiodurans | NC_001263, NC_001264 | 67 | 0.070 | 0.19 | 0.36
Ralstonia solanacearum | NC_003295 | 67 | 0.025 | 0.17 | 0.15
Caulobacter crescentus | NC_002696 | 67.2 | 0.035 | 0.14 | 0.25
Bordetella pertussis | NC_002929 | 67.7 | 0.022 | 0.17 | 0.13
Halobacterium sp. | NC_002607 | 67.9 | 0.068 | 0.18 | 0.39
S. avermitilis | NC_003155 | 70.7 | 0.011 | 0.12 | 0.09
S. coelicolor | NC_003888 | 72.1 | 0.006 | 0.095 | 0.06

  • * Some genomes with G+C content >65% are included. Bacillus subtilis and Escherichia coli genomes are also chosen to represent well-studied bacteria with low or medium G+C content, respectively.
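The correlation between G+C content and the observed/expected ratio can be recomputed from the rows of Table 2 with a plain implementation of Pearson's r, as in the sketch below. The Bacillus subtilis row is incomplete in this copy of the table and is therefore omitted, so the value obtained is not expected to match the reported r=−0.93 exactly; the variable and function names are hypothetical.

```python
import math

# (G+C content %, observed/expected ratio) for the complete rows of Table 2.
table2 = {
    "Escherichia coli": (50.8, 1.66),
    "Mycobacterium tuberculosis": (65.6, 0.51),
    "Pseudomonas aeruginosa": (66.6, 0.14),
    "Deinococcus radiodurans": (67.0, 0.36),
    "Ralstonia solanacearum": (67.0, 0.15),
    "Caulobacter crescentus": (67.2, 0.25),
    "Bordetella pertussis": (67.7, 0.13),
    "Halobacterium sp.": (67.9, 0.39),
    "S. avermitilis": (70.7, 0.09),
    "S. coelicolor": (72.1, 0.06),
}


def pearson(xs, ys):
    """Pearson's product-moment correlation coefficient."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


gc_content = [gc for gc, _ in table2.values()]
obs_exp = [ratio for _, ratio in table2.values()]
r = pearson(gc_content, obs_exp)
print(f"r = {r:.2f} (n = {len(table2)})")
```

Because E. coli is the only remaining genome of moderate G+C content, it contributes heavily to the coefficient in this reduced data set.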
Figure 1.

 Phylogenetic tree of microorganisms in Table 2. Their G+C content and the ratio of observed/expected frequency of TTA codon (obs/exp) are also shown.

The functional classification of TTA-containing genes in S. coelicolor

The distribution of putative functions among these TTA-containing genes is skewed in comparison with the whole genome (Table 3). Few of them are likely to function in cell processes, whereas function-unknown genes and genes usually associated with mobile genetic elements are over-represented. Ten (7%) of the TTA-containing chromosomal genes and three of those on SCP1 are involved in secondary metabolism, including polyketide synthesis or non-ribosomal peptide synthesis, compared with only 3.5% of all chromosomal genes. Nine TTA-containing chromosomal genes lie in the 22 gene clusters for secondary metabolism proposed by Bentley et al. (2002): actII-2 and actII-4 in the gene cluster for actinorhodin; redZ in the gene cluster for prodiginines; SCO0124 in the gene cluster for eicosapentaenoic acid production; SCO0383 and SCO0399 in the gene cluster for deoxysugar synthases/glycosyl transferases; SCO1273 in the gene cluster for type II fatty acid synthase; SCO5222 in the gene cluster for sesquiterpene cyclase; and SCO5799 in the gene cluster for siderophore synthetase. The fraction of putative regulatory genes is nearly the same in TTA-containing genes as in all other genes.

Table 3.   Function classification of all chromosomal genes and of TTA-containing chromosomal genes in Streptomyces coelicolor
Function classification | n (% in all genes) | n (% in TTA-containing genes)
Unknown function | 2371 (30.3) | 59 (40.7)
Cell processes | 802 (10.2) | 3 (2.1)
Macromolecule metabolism | 496 (6.3) | 6 (4.1)
Metabolism of small molecules | 1104 (14.1) | 15 (10.3)
Cell envelope | 1383 (17.7) | 21 (14.5)
Extrachromosomal* | 139 (1.8) | 14 (9.7)
Regulation | 965 (12.3) | 17 (11.7)
Not classified | 565 (7.2) | 10 (6.9)
Total | 7825 (100.0) | 145 (100.0)

  • * ‘Extrachromosomal’ includes laterally acquired elements, phage-related genes, plasmid-related genes, transposon/insertion element-related genes.
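The degree of over- or under-representation in each class can be read from Table 3 as the ratio of the two percentages. The following sketch recomputes these ratios from the gene counts in the table; the dictionary and variable names are hypothetical, and no significance test is implied.

```python
# Gene counts from Table 3: (all chromosomal genes, TTA-containing genes).
table3 = {
    "Unknown function": (2371, 59),
    "Cell processes": (802, 3),
    "Macromolecule metabolism": (496, 6),
    "Metabolism of small molecules": (1104, 15),
    "Cell envelope": (1383, 21),
    "Extrachromosomal": (139, 14),
    "Regulation": (965, 17),
    "Not classified": (565, 10),
}

total_all = sum(n_all for n_all, _ in table3.values())
total_tta = sum(n_tta for _, n_tta in table3.values())

# Ratio > 1: class over-represented among TTA-containing genes; < 1: under-represented.
enrichment = {
    name: (n_tta / total_tta) / (n_all / total_all)
    for name, (n_all, n_tta) in table3.items()
}

for name, value in sorted(enrichment.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:32s} {value:5.2f}")
```

On these counts the extrachromosomal class shows the largest ratio and cell processes the smallest, while regulation is close to 1, in line with the description above.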

S. avermitilis orthologues of TTA-containing genes of S. coelicolor

To find out how many of the 145 TTA-containing genes of the chromosome of S. coelicolor were present in S. avermitilis, we made a gene-by-gene search, based on reciprocal blast hits, at the translated protein level. In all, 30% (42) of the TTA-containing genes had an orthologue in S. avermitilis (Table 4), compared with 55% of TTA-free genes.
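A reciprocal best-hit comparison of this kind can be sketched as follows, assuming that the best hit of each gene in the other genome has already been determined (for example, from BLAST output that passed the thresholds given in Materials and methods). The function name and the gene identifiers in the example are hypothetical.

```python
def reciprocal_best_hits(best_a_to_b, best_b_to_a):
    """Return (a, b) pairs in which a's best hit is b and b's best hit is a."""
    return sorted(
        (a, b) for a, b in best_a_to_b.items() if best_b_to_a.get(b) == a
    )


# Invented best-hit maps: gc1/ga1 are reciprocal; gc2's best hit (ga2)
# points back to a different gene, so that pair is rejected.
coelicolor_to_avermitilis = {"gc1": "ga1", "gc2": "ga2"}
avermitilis_to_coelicolor = {"ga1": "gc1", "ga2": "gc3"}

pairs = reciprocal_best_hits(coelicolor_to_avermitilis, avermitilis_to_coelicolor)
print(pairs)
```

Requiring reciprocity reduces the chance of pairing a gene with a paralogue of its true counterpart when that counterpart is missing from the other genome.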

Table 4.   Possible products of 42 TTA-containing chromosomal genes in S. coelicolor with an orthologue in Streptomyces avermitilis
Protein in S. coelicolor | Orthologue in S. avermitilis | Annotation in S. coelicolor
SCO0020 | SAV7545 | putative transposase
SCO0239 | SAV818 | hypothetical protein
SCO0797 | SAV7430 | putative integral membrane protein
SCO1093 | SAV1495 | putative hydroxylase
SCO1187 | SAV555 (CelA1) | putative secreted cellulase B precursor
SCO1242* | SAV7096 | putative DNA-binding protein
SCO1420 | SAV6926 | putative integral membrane protein
SCO1434* | SAV6911 | putative CbxX/CfqX family protein
SCO1592 | SAV6746 | hypothetical protein
SCO1980 | SAV6252 | hypothetical protein
SCO2706 | SAV5359 | putative transferase
SCO2792 | SAV5261 | AraC-family transcriptional regulator (AdpA)
SCO3257 | SAV3734 (traSA1) | plasmid transfer protein
SCO3423* | SAV4648 | putative regulator
SCO3770 | SAV1987 (Cyp8) | putative cytochrome P450 oxidoreductase
SCO3955 | SAV4251 | conserved hypothetical protein SCD78.22c
SCO4015 | SAV4201 | hypothetical protein 2SC10A7.19
SCO4114* | SAV4113 (Sap) | sporulation associated protein
SCO4144 | SAV4070 | conserved hypothetical protein SCD84.12c
SCO4312* | SAV3919 | conserved hypothetical protein
SCO4395* | SAV3854 | putative hydrolase
SCO4493* | SAV4812 | putative AsnC-family transcriptional regulator
SCO4636 | SAV4901 | hypothetical protein SCD82.07
SCO4794 | SAV3466 | putative integral membrane protein
SCO5040* | SAV3223 | conserved hypothetical protein
SCO5203 | SAV3056 | hypothetical protein 2SC3B6.27c
SCO5222 | SAV3032 (Tpc2) | putative lyase
SCO5460 | SAV2785 | putative AbaA-like regulatory protein
SCO5495* | SAV2747 | putative phosphodiesterase
SCO5968 | SAV2328 | putative bldA-regulated nucleotide binding protein
SCO5970 | SAV2326 | hypothetical protein
SCO6209 | SAV2020 | hypothetical protein SC2G5.30
SCO6255 | SAV1985 | putative dehydrogenase
SCO6384 | SAV6029 | putative integral membrane lysyl-tRNA synthetase
SCO6476 | SAV1908 | hypothetical protein SC9C7.12
SCO6623* | SAV1814 | putative ATP/GTP binding protein
SCO6717* | SAV1691 | putative acyl-[acyl-carrier protein] desaturase
SCO6741 | SAV1671 | putative oxidoreductase
SCO7233 | SAV2604 | putative secreted protein
SCO7251* | SAV1237 | conserved hypothetical protein
SCO7273 | SAV1160 | hypothetical protein
SCO7351 | SAV653 | putative AraC-family transcriptional regulator

  • * The gene has been disrupted in this study.
  • The S. avermitilis gene also contains at least one TTA codon.
  • Genes in bold-face type show both overall and local synteny between the two chromosomes.

As indicated by their numbering, many of the TTA-containing genes are in the same order in the chromosomes of the two species, and closer inspection showed that this is invariably associated with substantial local similarity of gene arrangement (synteny). Of the 42 TTA-containing genes also represented in S. avermitilis, only one (SCO4636) has an apparent orthologue in another sequenced actinobacterial genome (Thermobifida fusca, which is fairly closely related to streptomycetes: Chater & Chandra, 2006). Only 12 genes have a TTA codon in both S. coelicolor and S. avermitilis (Table 4).

Predicted expression levels of TTA-containing genes

To dissect the relation of predicted gene expression level and codon usage, the CAI value of genes was plotted against the G+C content at the third position (degenerate site) of the codon (GC3s) in S. coelicolor (Fig. 2). Ribosomal protein genes, having relatively high CAI values (from 0.60 to 0.88, mean value=0.76), were clustered at the upper end of the plot, as were genes previously identified by proteomic approaches, which had CAI values from 0.40 to 0.89 (mean value=0.69). In contrast, TTA-containing genes were spread over the lower end of the plot with generally low CAI values (from 0.28 to 0.74, mean value=0.53). Only three of the 646 proteins listed in the S. coelicolor 2D Gel Protein Database are encoded by TTA-containing genes (SCO4636, SCO6401 and SCO6638, with CAI values of 0.67, 0.40 and 0.58 respectively). The differences in CAI values between TTA-containing genes and ribosomal protein genes or all genes are highly significant (t-test; P=6E-27 and 5E-47, respectively). The CAI values of TTA-containing genes for S. avermitilis are also significantly lower than those of all genes or ribosomal genes (data not shown).

Figure 2.

 The codon adaptation index (CAI) values of genes in S. coelicolor. (a) CAI plotted against GC3s (G+C content at the third position of codons) for each gene in S. coelicolor longer than 300 bases. Ribosomal genes (ribo), highly expressed genes identified by proteomic approaches (HEG) and TTA-containing genes (TTA) are represented by the blue squares, green triangles and pink triangles, respectively. All other genes are represented by the black circles. TTA-containing genes are clustered at the lower end of the plot, having relatively low CAI values. (b) CAI values of ribosomal genes (ribo), HEG identified by proteomic approaches, TTA-containing genes (TTA) and all genes (all). Error bars show the standard deviation of the mean.

A preference for C or T at the third codon position (C3s or T3s) in highly expressed genes of Streptomyces has been characterized (Wright & Bibb, 1992). In agreement with this, we found that TTA-containing genes have low C3s and T3s and high A3s compared with ribosomal protein genes (t-test; P=2E-15, 0.0007 and 1E-37, respectively). Note that HEG (highly expressed genes identified by proteomic approaches) have C3s as high as those of ribosomal protein genes (difference not significant in t-test; P=0.08), but lower T3s than ribosomal protein genes (t-test; P=8E-13), indicating that HEG might have a slightly different codon usage pattern from ribosomal protein genes, although they also have high CAI values.
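Third-position base composition of the kind discussed here can be illustrated with the sketch below, which tallies the third base of every codon that has synonymous alternatives (excluding ATG, TGG and stop codons). It is a simplified illustration: the indices reported by CodonW may be normalized differently, and the function name, the dictionary keys and the example sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
# Codons without synonymous alternatives carry no information on third-position choice.
EXCLUDED = STOP_CODONS | {"ATG", "TGG"}


def third_position_composition(cds):
    """Fractions of A, C, G and T at the third position of synonymous codons, plus GC3."""
    cds = cds.upper()
    counts = {base: 0 for base in "ACGT"}
    for i in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[i:i + 3]
        if codon in EXCLUDED or codon[2] not in counts:
            continue  # skip Met, Trp, stops and codons ending in an ambiguous base
        counts[codon[2]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    composition = {base + "3": n / total for base, n in counts.items()}
    composition["GC3"] = composition["G3"] + composition["C3"]
    return composition


# Invented sequence: ATG and TAA are skipped; CTG, GCC and TTA are counted.
comp = third_position_composition("ATGCTGGCCTTATAA")
print(comp)
```

In a genome with more than 70% G+C, a codon ending in A, such as TTA, shifts this composition away from the pattern typical of highly expressed genes.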

Disruption of 21 TTA-containing genes in S. coelicolor

Since bldA mutants of S. coelicolor grow well, no TTA-containing gene was expected to be essential for vegetative growth; but at least some of these genes (such as adpA and some TTA-containing pathway-specific regulatory genes) must be involved in morphological differentiation or secondary metabolism to account for the defects of bldA mutants in development and secondary metabolism. A TTA-free version of the adpA gene could only partially restore aerial mycelium formation to a bldA mutant (Nguyen et al., 2003; Takano et al., 2003), indicating that other, unknown TTA-containing genes might have a role in morphological differentiation. To investigate the roles of other TTA-containing genes, a further 21 were disrupted in S. coelicolor M145. We chose these genes mainly on the basis of their annotations, selecting those (including regulatory genes and enzymes) that we considered possibly related to differentiation, and avoiding laterally acquired genes. The genes targeted include 12 of the 42 that are also represented in S. avermitilis, and nine of these 12 contain TTA in both organisms (Table 5). The disruption mutant strains were cultured on minimal defined medium (MM) and on two rich, undefined media (MS, R2YE). None of the mutant strains showed any obvious phenotypic differences from the wild-type strain. Thus, if these genes are functional, their roles are cryptic under these experimental conditions.

Table 5.   Disrupted TTA-containing genes in S. coelicolor M145
Gene | Annotation | Gene size (bp) | Replaced region or inserted site | Disruption cassette | Methods
SCO0399 | possible membrane protein | 1224 | 196 | aac(3)IV (Tao et al., 2002) | traditional
SCO1242 | probable DNA-binding protein | 861 | 40–861 | aac(3)IV+oriT (Gust et al., 2003) | PCR-targeting
SCO1434 | possible CbxX/CfqX family protein | 1857 | 72–122 | aac(3)IV (Tao et al., 2002) | traditional
SCO3423 | possible regulator | 465 | 1–465 | vph (Blondelet-Rouault et al., 1997) | traditional
SCO3496 | possible lyase precursor | 1482 | 470 | aadA (Kieser & Melton, 1988) | traditional
SCO3682 | possible delta fatty acid desaturase | 1038 | 6–548 | aac(3)IV (Tao et al., 2002) | traditional
SCO3930 | hypothetical protein | 567 | 1–167 | aac(3)IV (Blondelet-Rouault et al., 1997) | traditional
SCO3934 | FtsK/SpoIIIE family protein | 1302 | 986 | eryE (Bibb et al., 1985) | traditional
SCO4114 | sporulation associated protein | 1422 | 1–1422 | aadA (Gust et al., 2003) | PCR-targeting
SCO4301 | possible DNA-binding protein | 840 | 2–823 | aac(3)IV+oriT (Gust et al., 2003) | PCR-targeting
SCO4312 | hypothetical protein | 789 | 214–789 | aac(3)IV+oriT (Gust et al., 2003) | PCR-targeting
SCO4395 | possible hydrolase | 1059 | 133–1026 | aac(3)IV+oriT (Gust et al., 2003) | PCR-targeting
SCO4493 | probable AsnC-family transcriptional regulator | 504 | 40–465 | aac(3)IV+oriT (Gust et al., 2003) | PCR-targeting
SCO5040 | conserved hypothetical protein | 2214 | 472–2205 | aac(3)IV+oriT (Gust et al., 2003) | PCR-targeting
SCO5495 | possible membrane associated phosphodiesterase | 2241 | 1095 | hyg (Blondelet-Rouault et al., 1997) | traditional
SCO5633 | probable fusion protein partially within putative integrated plasmid | 2307 | 808 | aadA (Blondelet-Rouault et al., 1997) | traditional
SCO5913 | probable secreted protease | 1236 | 1–599 | hyg (Blondelet-Rouault et al., 1997) | traditional
SCO6034 | unknown | 1329 | 20–1246 | aac(3)IV+oriT (Gust et al., 2003) | PCR-targeting
SCO6623 | probable ATP/GTP binding protein | 2196 | 1–472 | aac(3)IV (Blondelet-Rouault et al., 1997) | traditional
SCO6717 | probable acyl-[acyl-carrier protein] desaturase | 987 | 416–706 | aadA (Kieser & Melton, 1988) | traditional
SCO7251 | hypothetical protein | 1038 | 334 | hyg (Blondelet-Rouault et al., 1997) | traditional

  1. Genes with orthologues in S. avermitilis are given in bold-face type (see also Tables 1 and 4).


Biological roles of TTA-containing genes in S. coelicolor and other streptomycetes

Some TTA-containing genes have been shown in previous studies to mediate the bldA-dependence of aerial growth and production of certain secondary metabolites. If any other TTA-containing genes are important in the life of S. coelicolor, the most likely candidates should be those also present in other species. Just 42 of the 145 genes have orthologues in S. avermitilis, a species believed to have shared its last common ancestor with S. coelicolor some 250 million years ago, fairly early in the evolutionary history of the genus (A. M. Ward, personal communication cited in Chater & Chandra, 2006). Most (33) of these 42 genes occupy essentially the same positions on the chromosome in both organisms, making it very likely that they were part of the chromosome of the last common ancestor. This synteny is particularly marked for the 12 orthologues having a TTA in both organisms. One of the 12 orthologues is adpA, which is a major target for bldA regulation of morphological differentiation in S. coelicolor (where it is also known as bldH) (Nguyen et al., 2003; Takano et al., 2003). Whether adpA provides the same function in S. avermitilis remains to be elucidated, but the S. griseus adpA orthologue, which also has a TTA, is well characterized as a regulator of both morphological differentiation and secondary metabolism (Chater & Horinouchi, 2003). The confinement of most of the 42 genes to streptomycetes, coupled with their apparent evolutionary conservation within the genus, implies that they should have genus-specific adaptive significance. Our choice of genes for disruption was therefore strongly biased towards this gene set: of the 21 genes disrupted, nine had a TTA-containing orthologue in S. avermitilis, and three had a TTA-free orthologue. However, the mutations had no obvious phenotypic effects. Two of the previously studied TTA-containing genes of S. coelicolor to which significant roles could be ascribed are absent from S. avermitilis, along with the gene sets that they control (i.e. actII-4 and redZ, both antibiotic pathway-specific regulatory genes). We disrupted nine more TTA-containing genes that were absent from S. avermitilis. None of the mutants constructed had an obvious phenotype.

A simple explanation for these results might be that these 21 genes are all unimportant to the growth and development of Streptomyces. Other possibilities are as follows:

  1. Perhaps these genes are important for processes that are not seen under normal laboratory conditions, such as responses to a biofilm environment or interactions with the phytosphere. Probably, many unique TTA-containing genes were laterally acquired in the comparatively recent evolutionary past, and have adaptive significance only in specialized ecological or stressed circumstances that are subject only to intermittent selection over evolutionary time, and which are difficult or impossible to detect under normal laboratory conditions.
  2. Disruption of some TTA-containing genes may have a molecular phenotype that we did not detect. It is noteworthy that proteomic and transcriptomic analyses have shown that a phenotypically ‘silent’ mutation in another unique TTA-containing gene, SCO4263 (a regulatory gene), does have a molecular phenotype: genes in a nearby ‘function-unknown’ operon are inactive in the mutant (Kim et al., 2005a; Hesketh et al., in preparation).
  3. Some of the disrupted TTA-containing genes have paralogues. We used the blastp program to search for paralogues with overlap≥50% and identity≥30%, and found that the products of 12 of the 21 disrupted TTA-containing genes (SCO1242, SCO3423, SCO3682, SCO3934, SCO4114, SCO4301, SCO4493, SCO5040, SCO5495, SCO5633, SCO5913 and SCO6623) have protein paralogues. However, only two of them (SCO5633 and SCO3682) have paralogues with identity>50%. Finding these paralogues is not surprising, as many paralogous proteins were found in the S. coelicolor genome (Bentley et al., 2002). Although orthologues typically occupy the same functional niche in different species, whereas paralogues tend to evolve toward functional diversification (Tatusov et al., 2003), it is still possible that paralogues have very similar functions, so that disruption of a single member of a paralogous family may have no phenotype.

How might the present distribution of TTA-containing genes have arisen?

The analysis of codon frequency, function classification and orthologous pairs presented here allows us to extend a simple hypothesis of the evolutionary pathway originally expounded by Leskiw et al. (1991a) for the evolution of TTA-containing genes in Streptomyces.

  1. TTA codons became progressively rarer during evolution as a result of mutation bias towards increased G+C content (Wright & Bibb, 1992), and the abundance of the bldA-encoded tRNA became correspondingly reduced.
  2. TTA codons were selectively excluded from housekeeping and highly expressed genes by the force of translational selection, while they were retained by some genes that were subject only to intermittent selection over evolutionary time and/or were weakly expressed, or were functionally unimportant [selection pressure acting to improve translation efficiency is stronger for highly expressed genes than for weakly expressed genes (Duret, 2002), so an absence of such strong selection in weakly expressed genes may have allowed them preferentially to retain the TTA codon].
  3. Only a limited number of TTA-containing genes acquired a role in morphological and physiological differentiation, and the expression of bldA became adapted to be maximized when these developmental genes were expressed, i.e. in severely growth-rate-limited or stressed cells.
  4. Some other genes (‘fellow travellers’) might be ‘useless’ for the growth and development of Streptomyces, or be ‘useful’ only in certain physiological conditions.
  5. Because some ‘fellow travelling’ genes are likely to have adaptive benefits only intermittently over evolutionary time, they are frequently represented in gene sets subject to lateral transfers, as represented by plasmids and chromosomal islands with atypical base composition.

Chater & Chandra (2006) discussed the possibility that the interactions of streptomycetes with bacteriophages might have provided some of the selective pressure for the evolution of the specialized role of TTA codons in streptomycetes.

Not unexpectedly, the fraction of putative regulatory genes is about the same (about 12%) in TTA-containing genes as in TTA-free genes, as the average number of genes regulated per regulatory gene is likely to be independent of the physiological circumstances to which the regulatory gene responds.

There is a relatively high frequency of TTA-containing genes in plasmids. These genes may either have undergone selection for developmentally associated expression, or have been acquired relatively recently from bacteria other than streptomycetes, in which the TTA codon does not have the same significance.

Author contributions

W.L. and J.W. contributed equally to this study.


This work was initiated during a Joint Project award to Z.D. and K.F.C. by the National Natural Science Foundation of China and the Royal Society, and was supported by grant NSFC, No. 30200005 from the National Natural Science Foundation of China.