DNA packaging bias and differential expression of gene transfer agent genes within a population during production and release of the Rhodobacter capsulatus gene transfer agent, RcGTA


E-mail aslang@mun.ca; Tel. (+1) 709 864 7517; Fax (+1) 709 864 3018.


Rhodobacter capsulatus produces a gene transfer agent (GTA) called RcGTA. RcGTA is a phage-like particle that packages R. capsulatus DNA and transfers it to other R. capsulatus cells. We quantified the relative frequency of packaging for each gene in the genome by hybridization of DNA from RcGTA particles to an R. capsulatus microarray. All genes were found within the RcGTA particles. However, the genes encoding the RcGTA particle were under-packaged compared with other regions. Gene transfer bioassays confirmed that the transfer of genes within the RcGTA structural cluster is reduced relative to those of other genes. Single-cell expression analysis, by flow cytometry analysis of cells containing RcGTA-reporter gene fusion constructs, demonstrated that RcGTA gene expression is not uniform within a culture. This phenomenon was accentuated when the constructs were placed in a strain lacking a putative lysis gene involved in RcGTA release; a small subpopulation was found to be responsible for ∼ 95% of RcGTA activity. We propose a mechanism whereby high levels of RcGTA gene transcription in the most active RcGTA-producing cells cause a reduction in their packaging frequency. This subpopulation's role in producing and releasing the RcGTA particles explains the lack of observed cell lysis in cultures.


Gene transfer agents (GTAs) are phage-like particles encoded in prokaryotic genomes that transfer fragments of a producing cell's genome to recipient cells (Stanton, 2007). The process is analogous to transduction, but GTAs are differentiated from transducing phages by two main features: GTAs appear to always contain DNA from the cell's genome, and they always package smaller amounts of DNA than are known or predicted to encode the particles (Lang and Beatty, 2007; Stanton, 2007). The first GTA identified was RcGTA, discovered as a DNase-resistant and protease-sensitive mediator of gene transfer in cell-free filtrates of Rhodobacter capsulatus cultures (Marrs, 1974). RcGTA particles look like small tailed phage (Yen et al., 1979), but cultures producing RcGTA do not exhibit detectable lysis (Solioz et al., 1975). We know of no reported instance where a tailed phage particle has been found to exit a cell without lysis. Screening of a transposon library for mutants that had lost the ability to make functional RcGTA identified an approximately 15 kb gene cluster on the R. capsulatus chromosome encoding the RcGTA particle (Lang and Beatty, 2000), genes rcc01682–1699 (GenBank Accession No. AF181080 and NC_014034). This RcGTA structural gene cluster has an organization conserved in tailed phages (Casjens et al., 1992; Lang and Beatty, 2000).

Previous work with RcGTA showed that the particles contain ∼ 4 kb of linear double-stranded DNA (dsDNA) (Solioz and Marrs, 1977; Yen et al., 1979), similar to the R. capsulatus genome in both GC content, as quantified using CsCl/Cs2SO4 gradients (Solioz and Marrs, 1977), and complexity, as determined by hybridization kinetics and restriction analyses (Yen et al., 1979). RcGTA packages DNA from all replicons in donor cells including introduced plasmids (Scolnik and Haselkorn, 1984). These findings have led to the assumption that RcGTA packages R. capsulatus DNA at random.

DNA packaging mechanisms in dsDNA phages are dependent on the action of an endonuclease complex known as the terminase. Terminases are responsible for initiation of packaging, translocation of the DNA into the particle, and the cutting of the DNA to complete packaging (Black, 1989). The activity of terminases can be categorized according to the nature of the ends of the packaged DNA, a grouping that corresponds closely with terminase phylogeny (Casjens et al., 2005). The RcGTA gene cluster contains a recognizable large terminase-encoding gene, orfg2 (Lang and Beatty, 2000). If RcGTA does indeed package DNA at random, its packaging must be by a ‘headful’ packaging mechanism assisted by a non-sequence-specific terminase. The best-studied example of such packaging is that of phage T4, where each head is packed full with 1.02 genome lengths of T4 DNA (Streisinger et al., 1967) and the large terminase appears to have no sequence specificity (Bhattacharyya and Rao, 1994).

We have performed quantitative analyses of the DNA packaged within RcGTA particles and single-cell RcGTA gene expression levels. These experiments and the identification of a putative lysis gene involved in release of RcGTA particles from cells lead us to propose a mechanism by which the variation in RcGTA gene expression among cells in a population explains an observed packaging bias and the lack of observable lysis within an RcGTA-producing culture.


Genome-wide quantification of DNA packaged in RcGTA particles

To quantify the packaging of each gene from the R. capsulatus genome, DNA extracted from RcGTA particles harvested from DE442, an RcGTA overproducer of uncertain ancestry (Table 1), was hybridized to a whole-genome microarray [NCBI Gene Expression Omnibus (GEO) database Accession No. GSE33176]. All 3645 chromosomal ORFs present on the microarray were present in the particles. The raw signal intensities varied from 265.6 to 838.9, and the average signal intensity was 545.7 ± 72.2. The microarrays are based on the genome-sequenced strain, SB1003 (Strnad et al., 2010), which contains a ∼ 130 kb plasmid. The data for the plasmid showed that 156 of 157 ORFs had very low signals, with an average signal intensity of 6.1 ± 7.8. There was one exception, rcp00051, which had an intensity of 671.7. We examined the genome sequence and found that a 99% identical paralogue of rcp00051 is located on the chromosome, rcc01445. Therefore, other than the single ORF that is duplicated on the chromosome, no plasmid genes were detected in the RcGTA DNA and gel electrophoresis confirmed strain DE442 lacks the plasmid (data not shown).

Table 1. R. capsulatus strains and experimental plasmids used in this study.
Strains and plasmids | Description | References
R. capsulatus strain
SB1003 | Genome-sequenced strain | Yen and Marrs (1976); Strnad et al. (2010)
DE442 | RcGTA overproducer, crtD (provenance uncertain; believed to be derived from RcGTA overproducer Y262) | Yen et al. (1979); Fogg et al. (2011)
SB1685 | SB1003 with KIXX insertion in rcc01685 | This study
DE1685 | DE442 with KIXX insertion in rcc01685 | This study
SB2539 | SB1003 with KIXX insertion in rcc02539 | This study
DE2539 | DE442 with KIXX insertion in rcc02539 | This study
SB555 | SB1003 with KIXX insertion in rcc00555 | This study
DW5 | SB1003 ΔpuhA | Wong et al. (1996)
Plasmids
pXPB | pucB′::′lacZ fusion | This study
pX2 | RcGTA orfg2′::′lacZ fusion | This study
pX2NP | Promoterless RcGTA orfg2′::′lacZ fusion | This study
pX3 | RcGTA orfg3′::′lacZ fusion | This study
pX3NP | Promoterless RcGTA orfg3′::′lacZ fusion | This study
pR555 | rcc00555 and 193 bp of 5′ sequence in KpnI site of pRK767 | This study

Signal intensities for the ORFs on the array are shown as a histogram (Fig. 1, bottom). The frequency distribution of the signal intensities was unimodal. A probability plot of the data (Fig. 1, top) showed that variation from normal occurs in the top and bottom 1% of signal intensities. Too many genes were packaged infrequently (signal intensities < ∼ 378) and too few genes were packaged most frequently (signal intensities > ∼ 714) for a strictly normal distribution. We examined a variety of properties of the genes that were most and least often packaged: predicted gene function, orientation, location in the genome, GC content and transcript levels. Such examinations of the 100 most frequently packaged and 100 least frequently packaged ORFs revealed no obvious patterns or trends. Plotting of the RcGTA packaging signal, GC content and transcript levels against genome position (Fig. 2) identified a region with a pronounced drop in packaging frequency that corresponded with a spike in transcript levels (Fig. 2A and C; approximately position 1700). This corresponded to the RcGTA gene cluster, and this region had the lowest average packaging in the genome with the moving average window of 20 ORFs. The average packaging intensity of these genes was only 433.8 ± 66.8. This region also showed an obvious differential expression in the RcGTA overproducer strain relative to the wild type (Fig. 2D).

Figure 1.

Distribution of signal intensities from hybridization of DNA packaged in RcGTA particles to an R. capsulatus microarray. A quantile–quantile plot is shown on the top and a frequency histogram on the bottom. On the bottom, genes are grouped within signal intensity ranges of 10 (e.g. 119 ORFs had a signal intensity between 470.0 and 479.9). The highest signal intensities represent genes packaged most often, the lowest those packaged least often. On the top, the percentages represent the fraction of total signal intensities that fall within a range (e.g. 98% of the signals were between 380 and 720). The top and bottom 1% of signal intensities are demarcated with dashed lines. The solid black lines represent normal distributions and points of departure are seen where the actual data in grey deviate from these lines.
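The binning and tail cut-offs described in this caption can be sketched as follows. This is a minimal illustration only: the function names are hypothetical, the percentile method is whatever the Python standard library uses by default, and nothing here is taken from the authors' actual analysis pipeline.

```python
from collections import Counter
from statistics import quantiles


def bin_intensities(signals, width=10):
    """Count ORFs per signal-intensity range (e.g. 470.0-479.9 -> key 470)."""
    return Counter(int(s // width) * width for s in signals)


def tail_cutoffs(signals):
    """Return the 1st and 99th percentile cut points of the intensities.

    Assumption: the default 'exclusive' interpolation of
    statistics.quantiles; the paper does not state which method was used.
    """
    cuts = quantiles(signals, n=100)  # 99 cut points between 100 groups
    return cuts[0], cuts[-1]
```

Applied to the per-ORF intensities, `bin_intensities` gives the counts plotted in the histogram, and `tail_cutoffs` gives the values marked by the dashed lines in the quantile-quantile plot.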

Figure 2.

Relationships between chromosomal location and RcGTA packaging frequency, transcript levels and GC content. The packaging signal intensity (A), GC content (B), normalized transcript signal intensity of the RcGTA overproducer (C) and ratio of transcript levels in the RcGTA overproducer relative to wild type (D) are plotted versus Rhodobacter capsulatus genome position. In (A), (C) and (D) genome position represents the relative position of each ORF (i.e. position 500 is the 500th ORF from base 1) and each point represents one ORF; all trend lines represent a moving average with a window of 20 ORFs. In (B), genome position represents the moving average over 20 of the 1026 bp windows used to calculate GC content.
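The moving average used for the trend lines can be sketched as below. The paper does not say whether the window was centred or trailing, so this sketch simply averages each run of consecutive ORFs and indexes the result by the first ORF of the window; the function names are hypothetical.

```python
from itertools import accumulate


def moving_average(values, window=20):
    """Mean of every run of `window` consecutive values, in genome order."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    # Prefix sums make each window mean a constant-time subtraction.
    prefix = [0.0] + list(accumulate(values))
    return [
        (prefix[i + window] - prefix[i]) / window
        for i in range(len(values) - window + 1)
    ]


def lowest_window(values, window=20):
    """Return (start index, mean) of the window with the lowest average."""
    means = moving_average(values, window)
    start = min(range(len(means)), key=means.__getitem__)
    return start, means[start]
```

Run over the per-ORF packaging intensities, `lowest_window` would locate the region of lowest average packaging, which the text identifies as the RcGTA gene cluster.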

Transfer rates of genes from different locations in the R. capsulatus genome

To confirm the microarray-based observation of under-packaging of the RcGTA genes, we compared the transfer frequencies of a kanamycin resistance marker inserted in the chromosome inside and outside the RcGTA gene cluster. The insertion inside the RcGTA gene cluster is located in a putative ORF (rcc01685) between the RcGTA genes encoding the predicted portal (orfg3, rcc01684) and protease (orfg4, rcc01686) proteins. This ORF had a packaging intensity of 409.5 on the RcGTA DNA array. The insertion outside of the cluster is located in gene rcc02539, predicted to encode a c-di-GMP signalling protein. This ORF had a packaging intensity of 582.8 on the RcGTA DNA array. Transfer rates were normalized to the transfer of puhA (rcc00659, packaging intensity of 674.6 on the RcGTA DNA array), the photosynthetic reaction centre H protein-encoding gene deleted in the transfer assay recipient strain, DW5. This approach was taken to normalize the kanamycin resistance transfer frequencies to the transfer of an independent marker representing the total RcGTA production by a strain in a given experiment. Neither kanamycin resistance marker insertion significantly affected RcGTA production, as measured by comparison of the puhA transfer rates. In both wild type (SB1003) and RcGTA overproducer (DE442) backgrounds, transfer of the marker outside of the RcGTA gene cluster occurred at ∼ 40% of that of puhA, whereas transfer of the marker inside the RcGTA gene cluster occurred at ∼ 20% (Fig. 3). These differences were statistically significant (ANOVA, Tukey HSD test, P < 0.01), whereas the rates of transfer from equivalent SB1003- and DE442-derived strains did not differ significantly (P > 0.05).

Figure 3.

Frequency of RcGTA-mediated transfer of the kanamycin resistance marker when located inside (rcc01685::KIXX) or outside (rcc02539::KIXX) the RcGTA gene cluster. Assays were performed with the marker inserted into the genome of both the wild type (SB1003) and RcGTA overproducer (DE442) strains. The transfer of the kanamycin resistance marker in these locations was normalized to the transfer of the puhA gene by the same strains in the same assays. The data are shown as averages from four replicate gene transfer bioassay experiments and the bars represent the standard deviation. Each letter (a, b) indicates a group whose members are not statistically different from one another, but are different from the other group.
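The normalization described for these assays amounts to dividing the kanamycin-resistance transfer count by the puhA transfer count from the same donor in the same assay, then summarizing across replicates. A minimal sketch, with hypothetical function names and no data from the study:

```python
from statistics import mean, stdev


def normalized_transfer(marker_count, puha_count):
    """Marker transfer expressed as a fraction of puhA transfer."""
    if puha_count <= 0:
        raise ValueError("puhA transfer count must be positive")
    return marker_count / puha_count


def summarize(replicates):
    """Mean and standard deviation of normalized transfer.

    `replicates` is a list of (marker_count, puhA_count) pairs, one pair
    per independent bioassay (at least two pairs are needed for stdev).
    """
    ratios = [normalized_transfer(m, p) for m, p in replicates]
    return mean(ratios), stdev(ratios)
```

Normalizing within each assay before averaging keeps day-to-day variation in total RcGTA production from inflating the apparent difference between marker locations.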

RcGTA gene expression in single cells

We hypothesized that the bias against packaging the RcGTA gene cluster might be linked to the coincident high level of transcription of these genes (Fig. 2C). However, other regions showing similar localized high transcript levels (e.g. region at position ∼ 700 on Fig. 2A), or high ratios of transcript levels in DE442 versus SB1003 (e.g. region at position ∼ 200 on Fig. 2D) did not have a corresponding decrease in packaging. We therefore hypothesized that the RcGTA genes might be differentially expressed within the population, and that perhaps only a subset of the population is responsible for all of the RcGTA expression. If so, population-wide expression assays would underestimate transcript levels in the cells actually expressing the RcGTA genes. In these cells, high occupancy of the RcGTA genes by the transcriptional machinery might limit access by the RcGTA packaging machinery, thereby causing a decrease in the packaging of these genes relative to other regions.

To test this hypothesis, we analysed RcGTA gene expression at the single-cell level using plasmid-borne translational fusions to a lacZ reporter gene. Fusions were constructed to two different RcGTA genes, orfg2 encoding the terminase protein and orfg3 encoding the portal protein, because of previously observed differences in the transcript patterns of these genes by microarray analyses (Mercer et al., 2010). Both fusions contained the same sequences upstream of the RcGTA gene cluster, and negative controls lacked the predicted promoter regions. An independent fusion to a photosynthesis gene, pucB encoding the light-harvesting complex 2 β protein, was constructed as a control. The β-galactosidase activities of the gene fusions were assayed in RcGTA overproducer (DE442) and wild type (SB1003) cells by flow cytometry (Fig. 4). The pucB (Fig. 4A) and orfg2 (Fig. 4B) fusions produced similar unimodal patterns in both SB1003 and DE442. The orfg3 fusion, however, had considerably higher signals in some cells and showed a clearly multimodal distribution, with extended tails of cell counts with increased fluorescence (Fig. 4C) not observed for the other fusions. Both SB1003 and DE442 had subsets of cells with higher orfg3 expression, but the expression levels and numbers of highly expressing cells were greater in the overproducer.

Figure 4.

Population patterns in gene expression measured with reporter gene fusions. Gene expression was quantified for the different gene fusions within populations of cells using flow cytometry, recording 100 000 events. The assays were repeated independently three times and a representative experiment is shown for each set of strains. A. The control fusion of the photosynthesis gene pucB (pXPB) in DE442 and SB1003 (black and grey lines respectively). The mean fluorescence values were 7.77 and 7.70 for DE442 and SB1003 respectively. B. The experimental fusion of the terminase-encoding orfg2 (pX2) in DE442 and SB1003 (thick black and grey lines respectively), and the promoterless control fusion (pX2NP) in DE442 (thin black line). The mean fluorescence values were 3.75, 10.18 and 10.55 for the promoterless, and experimental DE442 and SB1003 respectively. C. The experimental fusion of the portal-encoding orfg3 (pX3) in DE442 and SB1003 (thick black and grey lines respectively), and the promoterless control fusion (pX3NP) in DE442 (thin black line). The mean fluorescence values were 3.46, 50.3 and 16.11 for the promoterless, and experimental DE442 and SB1003 respectively.

Identification of the putative RcGTA lysis gene

The subpopulation expression phenotype identified above might explain the lack of observed lysis in RcGTA-producing cultures. If only a small subset of cells is responsible for the majority of RcGTA production, these cells could lyse and release the particles. One of the genes upregulated in the RcGTA overproducer relative to wild type (Fig. 2D), rcc00555, encodes a putative N-acetylmuramidase lysozyme protein that contains a variation (E-X8-D-X4-T) on the conserved catalytic residues present in the N-terminus of many phage endolysins, E-X8-D/C-X5-T (Sun et al., 2009). The downstream and overlapping gene, rcc00556, is similarly upregulated in DE442 and may encode a holin protein, which would be required for such an endolysin to access the peptidoglycan. It is predicted to have three trans-membrane domains and a topology consistent with a lambda S-type holin (Young, 2002) but lacks the dual-start codons separated by a positively charged amino acid common to such proteins and has no homology to proteins of known function. There are no apparent phage-related genes in the genome near these two genes.

Insertional disruption of the putative endolysin resulted in a ∼ 95% reduction in RcGTA gene transfer activity (Fig. 5A). This decrease was associated with a decrease of RcGTA major capsid protein in the culture supernatants (Fig. 5B) and an increase in intracellular capsid protein. To determine whether the intracellular capsid protein represented functional particles, the cells were artificially lysed and the lysates showed RcGTA gene transfer activity equivalent to the lysed wild-type strain (Fig. 5C) confirming that the only RcGTA-related phenotype of the mutant is the inability to release functional particles trapped within the cells. Assay of the orfg3 reporter construct in the rcc00555 mutant revealed an accentuated subpopulation of cells, 2.76% (± 0.53) of the population, expressing orfg3 at a much higher level (9.02-fold ± 3.31) than the remainder of the population (Fig. 5D). We interpret this accentuation relative to the wild type to represent cells that otherwise would have lysed, but are now detectable in this assay and showing the highest expression.

Figure 5.

Identification of a putative endolysin required for release of RcGTA from cells. A. The frequency of gene transfer by the rcc00555 mutant, SB555, and the plasmid-complemented strain. The gene transfer activity was determined as an average relative to SB1003 in five replicate bioassays and the bars represent the standard deviation. An asterisk (*) denotes RcGTA gene transfer levels that differed significantly from the wild type (P < 0.001) determined by analysis of variance (ANOVA). The complemented strain did not differ from the wild type (P = 0.67). B. The relative abundance, as measured by Western blot, of RcGTA capsid protein in the cells (top) and culture supernatants (bottom) for SB1003 and SB555. Blots were performed on three replicate cultures and one representative set of blots is shown. C. Gene transfer activities in samples from artificially lysed SB1003 and SB555. The gene transfer activity was determined as an average relative to unlysed SB1003 in three replicate experiments and the bars represent the standard deviation. ANOVA indicated these activities do not differ from one another (P = 0.78). D. Expression of the orfg3 fusion construct (pX3) in SB1003 and SB555 (black and grey lines respectively). A black line above the peak in the SB555 strain demarcates the subpopulation discussed in the text. Mean fluorescence values were 138.6 and 145.3 for the entire SB1003 and SB555 populations, respectively, and 1847 for the cells in the indicated subpopulation. The assays were repeated independently three times and a representative is shown.

Bioinformatic analysis of the RcGTA terminase

A BLAST search with the RcGTA large terminase protein sequence returned many high-scoring (score > 500) sequences, all of which were from RcGTA-like elements in α-proteobacterial genomes and prophages. The top matches from phages were much weaker (E > 10⁻⁵), and these were all from γ-proteobacterial phages classified as ‘T4-like’. Therefore, at the present time, the phage sequences most closely related to the RcGTA terminase are in the T4-like group, although the recognizable homology to these T4-like phage proteins is over only ∼ 31% of the protein. There is no recognizable homology between any other RcGTA protein and sequences from the T4-like group. An alignment of the homologous region for the RcGTA, phage Acj61 (the top phage BLAST match), and two other phages from the T4-like group whose packaging has been characterized (T4 and IME08), is shown (Fig. 6). The RcGTA and Acj61 sequences are 28% identical (47% similar) over the aligned region, supporting an evolutionary connection between these sequences. We presume that the RcGTA orfg1 encodes the small terminase subunit because of its location directly upstream of the large terminase, but it lacks recognizable sequence homology to any known phage sequence.

Figure 6.

Alignment of a portion of the large terminase proteins from RcGTA and phages T4, IME08 and Acj61. The numbers indicate the amino acid residue positions in the original proteins. The presence of positively scoring positions is indicated above the aligned sequences as defined in CLUSTAL: the asterisk ‘*’ indicates a fully conserved residue, the colon ‘:’ indicates full conservation of a strong group and the dot ‘.’ indicates full conservation of a weak group. GenBank Accession Nos. AAF13179 (RcGTA), YP_004009792 (Acj61), NP_049776 (T4) and ADI55485 (IME08).
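A percent-identity figure such as the 28% reported for the RcGTA and Acj61 sequences can be computed from a pairwise alignment along the following lines. This is a sketch under a stated assumption: columns containing a gap are excluded from the denominator. Alignment tools differ on this convention, and the text does not say which one was used, so the function (a hypothetical name) may not reproduce the published value exactly.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of ungapped alignment columns with identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are not counted
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared
```

Percent similarity would be computed the same way, but counting columns whose residue pair scores positively in a substitution matrix rather than only exact matches.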

Characterization of the ends of the DNA molecules in RcGTA particles

We conducted ligation experiments to determine the structure of the ends of the DNA within RcGTA particles. Untreated RcGTA DNA did not ligate with itself efficiently, and most of the DNA remained at the ∼ 4 kb size (Fig. 7). Treatment of the DNA with T4 DNA polymerase prior to ligation improved the efficiency considerably, while treatment with M-MuLV reverse transcriptase did not (Fig. 7). As T4 DNA polymerase will convert both 5′ and 3′ overhangs into blunt ends (Kucera and Nichols, 2008), the improvement in ligation efficiency indicates the ends of the DNA within the particles are neither blunt nor complementary overhangs. M-MuLV reverse transcriptase will fill 5′ overhangs to make blunt ends but does not possess the 3′–5′ exonuclease activity that would be required to make 3′ overhangs blunt (Verma, 1975). Therefore, the ends of the DNA in RcGTA particles are 3′ overhangs that are not consistent from particle to particle. Some ligation did occur in the absence of any end treatments (Fig. 7).

Figure 7.

Ligation of DNA from RcGTA particles. Purified RcGTA DNA was treated with DNA ligase only (lane 1), M-MuLV reverse transcriptase followed by DNA ligase (lane 2) and T4 DNA polymerase followed by DNA ligase (lane 3). A DNA ladder is shown on the left with the 4 kb and 8 kb bands indicated.
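The inference drawn from the three lanes can be written out as a small decision function. This is only a simplified restatement of the reasoning in the text, with each gel result reduced to a yes/no value; the function name and return strings are hypothetical.

```python
def infer_end_structure(untreated_ligates, t4_pol_improves, mmulv_rt_improves):
    """Map coarse ligation outcomes to the DNA end structure they suggest."""
    if untreated_ligates:
        # Efficient self-ligation needs blunt or matching cohesive ends.
        return "blunt or complementary ends"
    if mmulv_rt_improves:
        # Reverse transcriptase fill-in can only blunt 5' overhangs.
        return "non-matching 5' overhangs"
    if t4_pol_improves:
        # Only the 3'-5' exonuclease of T4 DNA polymerase helped.
        return "non-matching 3' overhangs"
    return "inconclusive"
```

The pattern reported here (inefficient ligation untreated, improvement with T4 DNA polymerase, none with M-MuLV reverse transcriptase) corresponds to `infer_end_structure(False, True, False)`.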


Previous studies of the DNA inside RcGTA particles (Solioz and Marrs, 1977; Yen et al., 1979) using low-resolution techniques suggested that RcGTA packages DNA from within the producing cell at random. Our data from hybridization of DNA from RcGTA particles to an R. capsulatus whole-genome microarray show that DNA packaging by RcGTA is essentially random (Fig. 1). The RcGTA particles contain every gene in the donor cell (Fig. 2), and the terminase protein shows homology to known sequence-independent enzymes from phages in the T4-like group (Fig. 6). Phage T4 is a well-characterized example of a phage that uses a non-sequence-specific headful packaging mechanism (Rao and Black, 2005). The limited, but recognizable, sequence homology between these terminases indicates a distant evolutionary connection between RcGTA and the T4-like phage proteins.

Sequence-independent headful packaging is thought to always result in blunt ends, as there is no requirement for cohesive end structures (Casjens and Gilcrease, 2009). Blunt ends have been demonstrated for phages P22 (Schmieger et al., 1990), Mu (Morgan et al., 2002) and T4 (Louie and Serwer, 1990). We expected to find similarly blunt-ended DNA within the RcGTA particles. However, ligation experiments with DNA from within RcGTA particles indicate the presence of 3′ non-sequence-specific overhangs on the packaged DNA. The observation of a small amount of ligation in the absence of end-modifying treatments (Fig. 7) most likely indicates that some matching cohesive ends are present at a low frequency and that the 3′ overhangs may be only several nucleotides in length. The discovery of non-matching end sequences and the random packaging data together support a model where the RcGTA terminase has no sequence specificity and the DNA molecules present inside producing cells act as ‘concatemer’ substrates for headful packaging.

A putative Bartonella GTA capable of packaging all genes in the genome was recently identified and was found to preferentially package a chromosomal ‘high-plasticity zone’ (Berglund et al., 2009). This region was associated with run-off replication and therefore the packaging bias likely results from the increased relative copy number of certain sequences and not a packaging specificity. It might have been expected to find an increased packaging by RcGTA of the genes nearer to the origin of chromosome replication (ori), due to overall higher copy number in cells undergoing replication and division, but this was not observed (Fig. 2, where ori is at position 0). However, RcGTA production is highest in the stationary phase of growth (Solioz et al., 1975), which is also when the RcGTA particles were purified, and so there would be little replication and division happening at this time in the culture. Our data from R. capsulatus fail to support the hypothesis that GTAs might preferentially package ‘cloud genes’ (poorly conserved genes) (Kristensen et al., 2010).

An examination of packaging frequency and a variety of other factors did not yield any obvious correlations. There was one notable exception, where the RcGTA gene cluster was the least frequently packaged region of the chromosome, at ∼ 75% of the average. Therefore, RcGTA DNA packaging is not selective for RcGTA genes with occasional packaging of ‘host’ DNA, as one might expect of a transducing prophage. The relative rate of transfer of markers inserted inside and outside of the RcGTA gene cluster confirmed the array-based quantification and extended this observation from the overproducer strain to the wild type (Fig. 3). The higher transfer rates of the control marker puhA are most likely due to the smaller size of this marker, which requires transfer of 591 bp of non-homologous sequence to the recipient while the kanamycin resistance marker requires transfer of 1368 bp of non-homologous sequence to the recipient. This size difference would result in an increase in the frequency with which an intact copy of the puhA marker would be packaged with sufficient flanking sequence to allow for homologous recombination in the recipient cell, and the size difference may also affect the efficiency of recombination into the recipient chromosome.
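The size argument can be illustrated with a rough geometric sketch. It assumes a fixed packaged fragment length, uniformly random fragment start points, and a minimum stretch of flanking homology on each side of the marker. The flank length is a hypothetical parameter not given in the text, and the function names are illustrative. The sketch shows only the direction of the effect and is not a fit to the measured transfer rates.

```python
def usable_start_positions(fragment_len, insert_len, flank_len):
    """Number of fragment start points that capture insert plus both flanks."""
    needed = insert_len + 2 * flank_len
    return max(0, fragment_len - needed)


def relative_packaging_advantage(fragment_len, small_insert, large_insert,
                                 flank_len):
    """Ratio of usable fragments for the smaller versus the larger marker."""
    small = usable_start_positions(fragment_len, small_insert, flank_len)
    large = usable_start_positions(fragment_len, large_insert, flank_len)
    if large == 0:
        raise ValueError("larger marker never fits with the required flanks")
    return small / large
```

With a 4000 bp fragment and an assumed 500 bp flank on each side, the 591 bp puhA sequence leaves 2409 usable start positions and the 1368 bp kanamycin resistance marker leaves 1632. Under these assumptions the smaller marker is favoured by roughly 1.5-fold from packaging geometry alone.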

The finding that under-packaging of RcGTA genes was correlated with localized high transcript levels (Fig. 2) suggested there could be a link between gene expression and RcGTA packaging, and perhaps high occupancy of these genes by the cell's transcription machinery could protect them from packaging. However, there was no correlation between transcript levels and packaging intensity over the remainder of the chromosomal ORFs (Fig. 2). The RNA expression microarray measures total transcript levels from the entire population and the results represent an average for each cell. Therefore, if only a subset of the population were transcribing the RcGTA genes at a high level, the arrays would yield an artificially low estimate of the transcript levels in those cells. Such subpopulation expression patterns have been reported in other species (Avery, 2006; Lopez et al., 2009). Analysis of translational reporter fusions to RcGTA orfg3 by flow cytometry validated this hypothesis, as there was a multimodal distribution of gene expression levels in the population (Fig. 4). The control fusion to the photosynthesis gene pucB, which would be expressed by all cells in these phototrophic growth conditions, showed a unimodal distribution. A fusion to RcGTA orfg2 also was unimodal. It has previously been observed that orfg2 and orfg3 differ in their transcription patterns. Loss of the response regulator CtrA leads to loss of RcGTA production, but whereas no transcripts are detected for orfg3, some transcripts of orfg2 are still detected in a ctrA mutant (Mercer et al., 2010). This indicates that control of transcription and protein expression for different genes in the RcGTA gene cluster is more complex than previously realized.

The documentation of unequal RcGTA gene expression within a population may help explain the lack of observable cell lysis in cultures producing RcGTA. A tailed phage particle escaping from cells without lysis has never been reported, and it is presumably only the small subset of the population that is expressing the RcGTA genes at a higher level (Fig. 4C) that are producing RcGTA and lysing to release the particles. In order to validate this hypothesis, we examined a list of genes co-regulated with RcGTA in the overproducer strain and identified a putative lysis gene with sequence homology to lysozyme proteins. Disruption of this gene resulted in a ∼ 95% reduction in gene transfer activity (Fig. 5A). This reduction is the result of lower levels of RcGTA in culture supernatants and an accumulation within the cells (Fig. 5B). Manual lysis of this mutant released equivalent functional RcGTA to that from the lysed wild type strain (Fig. 5C). These findings support the role of rcc00555 as a gene involved in release of RcGTA. Furthermore, the presence of the orfg3 reporter construct in the rcc00555 mutant resulted in the appearance of a more pronounced subpopulation of cells highly expressing orfg3 (Fig. 5D). This subpopulation of ∼ 3% of the total cells showed approximately ninefold higher expression and is presumably responsible for almost all of the RcGTA activity in the culture (Fig. 5A). The accentuation of this population in the rcc00555 mutant must reflect lack of lysis in the highly expressing cells that would normally have lysed to release RcGTA particles. The lack of observed lysis in RcGTA-producing cultures is easily explained, given the small size of this subpopulation responsible for release of the majority of the RcGTA particles.

All genes inside the producing cell are packaged inside RcGTA particles, although there is a slight but significant reduction in packaging of the RcGTA-encoding structural gene cluster. The higher RcGTA gene expression in the subset of cells responsible for producing RcGTA particles could result in decreased access to these genes by the packaging machinery. There could be a selective advantage to this protection of the RcGTA genes in these cells, favouring their prolonged expression to maximize RcGTA production. Our results confirm that cells lyse to release RcGTA, which is an important cost of production, but this cost is mitigated by the finding that only ∼ 3% of the cells in RcGTA-producing cultures are responsible for release of the majority of the particles.

Experimental procedures

Bacterial strains, growth conditions and plasmids

The strains of R. capsulatus used in this study are listed in Table 1. Strains carrying the kanamycin resistance marker were created as follows. The ORFs of interest were amplified by PCR and cloned in pGEM-T Easy (Promega, Madison, WI). The 1368 bp SmaI fragment of the KIXX cartridge (Barany, 1985), encoding resistance to kanamycin, was then ligated into a restriction enzyme cut site within the cloned PCR product. The primers used for the rcc02539 construct were 5′-TTCCATGCCGAAATAGGCCGC-3′ and 5′-GGCGCCGTCGTCGATCTGAAT-3′, and the KIXX fragment was ligated into a SmaI site. The primers used for the rcc01685 construct were 5′-AACGGGATGGGACTGAATTT-3′ and 5′-ATGTCACCAGCGACACTTCC-3′, and the KIXX fragment was ligated into an Eco47III site. The primers used for the rcc00555 construct were 5′-AACGAGGTTTTCCTGGAGGT-3′ and 5′-AACCTGTTCCGCAAGATCAC-3′, and the KIXX fragment was ligated into a SmaI site. These plasmids were independently transferred into R. capsulatus by conjugation from Escherichia coli C600 (pDPT51) (Taylor et al., 1983), and the kanamycin resistance genes transferred to the chromosome of recipient R. capsulatus cells by RcGTA transfer (Scolnik and Haselkorn, 1984). Successful transfers of the gene were confirmed by PCR using the same primer pairs and template DNA from the resultant kanamycin-resistant RcGTA recipients, which showed 1.4 kb larger products than the non-disrupted versions.

Rhodobacter capsulatus cells were grown under anaerobic photoheterotrophic conditions in complex YPS medium (Wall et al., 1975) at 35°C for RcGTA production bioassays, purification of particles for DNA isolations, and purification of RNA for microarray analysis. For all other purposes, they were grown aerobically at 30°C in RCV medium (Beatty and Gest, 1981).

The experimental plasmids are listed in Table 1. They consist of an in-frame fusion of the chosen ORF to lacZ in the promoter probe vector pXCA601 (Adams et al., 1989), created using the BamHI and PstI sites in the vector and adding corresponding sites to the amplification primers for cloning. pXB was created using the primers 5′-TGCCTGCAGAAAGATGCGTCTGGAACACC-3′ and 5′-GGGGATCCCCATCGATCAGGTAGCTGTG-3′; pX3 and pX3NP were created using the forward primers 5′-CGGCTGCAGACCGATCCGG-3′ and 5′-ATACTGCAGCATGGACATGGGGTTCAA-3′, respectively, and the reverse primer, 5′-AGGATCCCCCGTGCGCATCAGACTGAC-3′; pX2 and pX2NP were created using the same forward primers used for pX3 and pX3NP, respectively, and the reverse primer, 5′-AGGATCCACGTCGCGCACCTGAT-3′. The restriction sites added for cloning are CTGCAG (PstI) in the forward primers and GGATCC (BamHI) in the reverse primers. All constructs were created from sequences amplified from the genome-sequenced strain SB1003, and the fusions were confirmed as in-frame by sequencing. We sequenced the same RcGTA upstream region amplified from the RcGTA overproducer DE442, confirming it was identical to the SB1003 sequence.
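As a quick check on the primer design described above, the following minimal sketch locates the PstI and BamHI recognition sequences in each primer. The primer sequences are copied from the text; the dictionary labels and function names are invented for this illustration and are not part of the original work.

```python
# Recognition sequences of the two enzymes named in the text.
SITES = {"PstI": "CTGCAG", "BamHI": "GGATCC"}

# Primer sequences as given in the text (5' to 3').
# The keys are hypothetical labels chosen for this sketch only.
PRIMERS = {
    "pXB_forward": "TGCCTGCAGAAAGATGCGTCTGGAACACC",
    "pXB_reverse": "GGGGATCCCCATCGATCAGGTAGCTGTG",
    "pX3_pX2_forward": "CGGCTGCAGACCGATCCGG",
    "pX3NP_pX2NP_forward": "ATACTGCAGCATGGACATGGGGTTCAA",
    "pX3_reverse": "AGGATCCCCCGTGCGCATCAGACTGAC",
    "pX2_reverse": "AGGATCCACGTCGCGCACCTGAT",
}


def find_sites(seq, sites=SITES):
    """Return {enzyme: 0-based start index} for every site present in seq."""
    seq = seq.upper()
    return {name: seq.find(motif) for name, motif in sites.items() if motif in seq}


def site_report(primers=PRIMERS):
    """Map each primer label to the restriction sites it carries."""
    return {label: find_sites(seq) for label, seq in primers.items()}


if __name__ == "__main__":
    for label, hits in site_report().items():
        print(label, hits)
```

Each forward primer should report only a PstI site and each reverse primer only a BamHI site, which matches the orientation needed for directional cloning into the two vector sites.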

Complementation of the rcc00555 mutant was carried out with rcc00555 and its native promoter, as amplified by the primers 5′-ATGGTACCATGGTCGAGGGCACCTTT-3′ and 5′-ATGGTACCCCAGGATCGTCCCGATC-3′, ligated into the broad host-range vector pRK767 (Gill and Warren, 1988) using KpnI sites (GGTACC in the primer sequences).

RcGTA DNA isolation

Cultures of strain DE442 were grown for 48 h. The cells were then centrifuged at 5855 g, the supernatant filtered through a 0.45 µm polyvinylidene fluoride (PVDF) filter (Millipore, Bedford, MA), and the filtrate ultracentrifuged at 184 000 g for 5 h. The resulting pellet was resuspended by shaking at 100 RPM in G buffer (Solioz and Marrs, 1977) overnight at 4°C. The resuspensions were treated with 2 units RNase-free DNase I (New England Biolabs, Pickering, Canada) and 1.2 units RNase A (Sigma-Aldrich, Oakville, Canada) in 1× DNase buffer (New England Biolabs) at 37°C for 30 min to remove any free nucleic acids, and then incubated at 75°C in the presence of 5 mM EDTA (pH 8). DNA was purified by phenol : chloroform : isoamyl alcohol (25:24:1) extraction and ethanol precipitation. The sample was subjected to agarose gel electrophoresis and the ∼ 4 kb RcGTA DNA band extracted using the QIAEX II Gel Extraction Kit (Qiagen, Mississauga, Canada).

Microarray analyses

The R. capsulatus microarrays are Affymetrix whole-genome expression arrays (Affymetrix, Santa Clara, CA) that contain oligonucleotide probes for 3635 ORFs (Mercer et al., 2010). For the RNA analysis cells were harvested 16 h after reaching stationary phase, as determined by monitoring culture turbidity, and RNA was extracted using the RNeasy Kit (Qiagen) as described (Mercer et al., 2010). The RNA and DNA samples were processed for cDNA synthesis and fragmentation, respectively, and subsequent labelling and array hybridization at the Michael Smith Genome Science Center (Vancouver, Canada) as described in the Affymetrix Expression Analysis Technical Manual for prokaryotic samples.

Raw data from the RcGTA DNA array were extracted using the MAS5 algorithm with detection calls (Pepper et al., 2007) to generate signal intensity. Statistical analyses of the raw data were carried out using Minitab 15 (Minitab, State College, PA). Raw data from the RNA arrays were Robust Multi-Array normalized (Irizarry et al., 2003) and normalized to the 50th percentile using GeneSpring 7.2 (Agilent Technologies, Santa Clara, CA).
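The scaling to the 50th percentile mentioned above can be illustrated with a minimal sketch. It assumes that the step divides every signal on an array by that array's median; the exact procedure applied by GeneSpring may differ in detail, and the function name is hypothetical.

```python
from statistics import median


def normalize_to_median(signals):
    """Divide every probe-set signal on one array by that array's median.

    Assumption for this sketch: 'normalized to the 50th percentile' means
    per-array scaling so that the median signal becomes 1.0.
    """
    m = median(signals)
    if m <= 0:
        raise ValueError("median signal must be positive")
    return [s / m for s in signals]


if __name__ == "__main__":
    # Placeholder intensities, not data from this study.
    print(normalize_to_median([2.0, 4.0, 8.0]))
```

After this scaling, a value of 1.0 corresponds to the median signal on the array, so values from different arrays can be compared on a common scale.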

The microarray data from this study have been deposited in the NCBI GEO database (Accession No. GSE33176).

RcGTA activity bioassays

Cultures of the test strains were grown aerobically overnight, and then normalized for density and used to inoculate RcGTA-production bioassay cultures. These cultures were then grown for 48 h and filtrates were collected using 0.45 µm PVDF syringe filters (Millipore). Filtrates were assayed for RcGTA activity using strain DW5 as the recipient as follows. An overnight aerobic culture of DW5 was centrifuged and the cells resuspended in an equal volume of G buffer (Solioz et al., 1975). Equal volumes of donor filtrate and recipient cells were mixed with 4 volumes of G buffer and incubated for 1 h at 35°C with shaking. Nine volumes of RCV medium were then added and the mixtures incubated for a further 3 h before plating. Each bioassay was plated in equal parts on YPS and YPS with kanamycin sulphate (10 µg ml−1). The YPS plates were grown under anaerobic phototrophic conditions to select for transfer of the puhA gene while the kanamycin-containing plates were incubated aerobically in the dark to select for transfer of the resistance marker. Colonies on the plates were counted after 2 days, and the ratios of transfer of kanamycin resistance to puhA in four independent assays were calculated. The transfer rates were compared by one-way ANOVA and Tukey HSD test (Chambers et al., 1993). In lysis assays, a 1 ml portion of each culture was centrifuged and the cells resuspended in 30 µl of 20 mM Tris-HCl, 5 mM EDTA, 250 mM sucrose (pH 7.8) containing 0.5 mg ml−1 lysozyme (Sigma-Aldrich). After three freeze–thaw cycles in dry ice-ethanol, 1 ml of 20 mM Tris-HCl, 0.5 mM MgCl2 (pH 7.8) containing 0.1 mg ml−1 DNase (Sigma-Aldrich) was added to the cells and the mixtures incubated for 5 min before filtration using a 0.45 µm PVDF filter (Millipore). The filtrates were then used for gene transfer bioassays as described above.
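The ratio calculation and the one-way ANOVA step described for the bioassays can be sketched as follows. This is an illustration only: the function names are hypothetical, the numbers in the usage example are placeholders rather than data from this study, and the sketch stops at the F statistic. Obtaining a P value and running the Tukey HSD comparisons would require a statistics package.

```python
def transfer_ratio(kan_colonies, puha_colonies):
    """Ratio of kanamycin-resistant colonies to puhA-transfer colonies."""
    if puha_colonies <= 0:
        raise ValueError("puhA colony count must be positive")
    return kan_colonies / puha_colonies


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual values around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    # Placeholder values standing in for per-assay ratios of three markers.
    example = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
    print(one_way_anova_f(example))
```

Using the ratio to puhA transfer as the response variable normalizes each marker against the overall RcGTA activity of the same filtrate, so differences between markers reflect packaging or transfer frequency rather than differences in particle yield between cultures.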

Western blots targeting the RcGTA major capsid protein

Cells and culture filtrates from the same cultures used in RcGTA activity bioassays were assayed for RcGTA capsid protein by Western blotting. Cultures were centrifuged at 17 000 g, the supernatant was removed and the cells resuspended in an equal volume of TE buffer. For the different samples, 5 µl of the cell suspensions, 10 µl of culture filtrates and 10 µl of the cell lysates were run. SDS-PAGE, blotting and detection of the RcGTA major capsid protein were done as described (Mercer et al., 2012) with the primary antibody AS08 365 (Agrisera, Sweden). Images were captured on a gel documentation system and subsequently inverted and adjusted for brightness and contrast.

RcGTA gene expression in single cells

Rhodobacter capsulatus cultures containing the fusion constructs (Table 1) were grown until 4 h after reaching stationary phase, and analysed for β-galactosidase activity. Cells were permeabilized by exposure to 15% (v/v) isopropyl alcohol for 15 min and then washed with Z buffer (60 mM Na2HPO4, 40 mM NaH2PO4, 10 mM KCl, 1 mM MgSO4, 50 mM β-mercaptoethanol; pH 7) (Miller, 1992). Fluorescein di-β-D-galactopyranoside (FDG) (Sigma-Aldrich) in H2O : DMSO : ethanol (8:1:1) was added to a final concentration of 0.1 mg ml−1. The cells were incubated for 1 h and subsequently diluted 1:200 in Z buffer and analysed by flow cytometry recording 100 000 events. These events were gated, according to forward and side scatter, to identify > 90% of events as ‘cells’. These assays were repeated three times with independently grown cultures.
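The gating step and the subsequent estimate of the highly expressing subpopulation can be illustrated with a minimal sketch. It assumes a simple rectangular gate on forward and side scatter and a single fluorescence threshold; the gate shape, the threshold and all names here are hypothetical and are not taken from the cytometry software used in this study.

```python
def gate_events(events, fsc_range, ssc_range):
    """Keep events whose forward and side scatter both lie inside the gate.

    Each event is a tuple (forward_scatter, side_scatter, fluorescence).
    """
    fsc_lo, fsc_hi = fsc_range
    ssc_lo, ssc_hi = ssc_range
    return [
        e for e in events
        if fsc_lo <= e[0] <= fsc_hi and ssc_lo <= e[1] <= ssc_hi
    ]


def fraction_above(cells, threshold):
    """Fraction of gated cells whose fluorescence exceeds the threshold."""
    if not cells:
        return 0.0
    return sum(1 for e in cells if e[2] > threshold) / len(cells)


if __name__ == "__main__":
    # Placeholder events, not data from this study.
    events = [(100, 100, 5), (120, 90, 500), (10, 10, 900), (110, 105, 7)]
    cells = gate_events(events, fsc_range=(50, 200), ssc_range=(50, 200))
    print(len(cells), fraction_above(cells, threshold=100))
```

Gating on scatter first removes debris and other non-cell events, so that the fluorescent fraction is expressed relative to cells rather than to all recorded events.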

Bioinformatic analyses

The RcGTA large terminase protein sequence was used to search the nr database (Wheeler et al., 2007) with both PSI-BLAST and BLASTP (Altschul et al., 1990; 1997). The selected terminase protein sequences were aligned using Clustal X (Larkin et al., 2007).

Treatments of RcGTA DNA for ligation experiments

RcGTA DNA (1 µg) was treated with 3 units T4 DNA polymerase (New England Biolabs) at 12°C for 15 min as per the manufacturer's recommendations. An equivalent sample was incubated with 2 µl of M-MuLV RNase H+ RT solution from the Phusion RT-PCR kit (Finnzymes, Espoo, Finland) supplemented with 0.5 mM dNTPs according to the manufacturer's recommendation at 40°C for 30 min, and the enzyme then heat-inactivated by incubation at 85°C for 5 min. Both reactions, alongside a negative control that contained the same amount of RcGTA DNA in TE buffer in the same total volume, were cleaned using the QIAquick cleanup kit (Qiagen), and the DNA was eluted in 30 µl of elution buffer. Twenty-six microlitres of these eluates were independently treated with 800 cohesive end units of T4 DNA Ligase (New England Biolabs) according to the manufacturer's recommendations at 16°C for 18 h. The samples were then subjected to agarose gel electrophoresis.


We thank J.T. Beatty for assistance with the microarray experiments and M. Leung for help with the lysis protocol. We also thank the anonymous reviewers for helpful suggestions for improvement of the manuscript, especially with regard to the lysis component. A.P.H. and R.G.M. were supported by fellowships from Memorial University School of Graduate Studies (SGS) and the Natural Sciences and Engineering Research Council (NSERC) of Canada. D.E.W. was supported in part by funds from the Memorial University Department of Biology Honours programme and an Undergraduate Student Research Award from NSERC. C.B.B. is supported by a SGS fellowship and the Newfoundland and Labrador Research & Development Corporation (NL RDC). This research in A.S.L.'s lab was supported by grants from NSERC, the Canada Foundation for Innovation and the NL RDC.