Synthetic mammalian transgene negative autoregulation

Biological networks contain overrepresented small-scale topologies, typically called motifs. A frequently appearing motif is the transcriptional negative-feedback loop, where a gene product represses its own transcription. Here, using synthetic circuits stably integrated in human kidney cells, we study the effect of negative-feedback regulation on cell-wide (extrinsic) and gene-specific (intrinsic) sources of uncertainty. We develop a theoretical approach to extract the two noise components from experiments and show that negative feedback results in significant total noise reduction by reducing extrinsic noise while marginally increasing intrinsic noise. We compare the results to simple negative regulation, where a constitutively transcribed transcription factor represses a reporter protein. We observe that the control architecture also reduces the extrinsic noise but results in substantially higher intrinsic fluctuations. We conclude that negative feedback is the most efficient way to mitigate the effects of extrinsic fluctuations by a sole regulatory wiring.

Real-time quantitative PCR has been used as an alternative to Southern blot or fluorescence in situ hybridization for detection of gene copy numbers 1 . Various studies have demonstrated that its accuracy is comparable to that of Southern blotting. For example, in Table 2 of "Determination of Cytochrome P450 2D6 (CYP2D6) gene copy number by real-time quantitative PCR" 2 , the estimates of CYP2D6 gene copy number from real-time quantitative PCR match those from Southern blotting. The average copy numbers of DsRED of all stable clones were estimated by the delta delta Ct method as follows: $2^{-\Delta\Delta C_t} = (1 + E_{\mathrm{DsRED}})^{-\Delta C_{t,\mathrm{DsRED}}} / (1 + E_{\mathrm{BRCA1}})^{-\Delta C_{t,\mathrm{BRCA1}}}$, where $E_{\mathrm{DsRED}}$ is the PCR amplification efficiency for DsRED and $E_{\mathrm{BRCA1}}$ that for BRCA1 (endogenous reference gene) 3 .
The PCR primers are: DsRED forward primer: 5'-ctccaccacggtgtagtcct-3'; DsRED reverse primer: 5'-agaccgtgtacaaggccaag-3'; BRCA1 forward primer: 5'-gagcgtcccctcacaaataa-3'; and BRCA1 reverse primer: 5'-tgctccgtttggttagttcc-3'. The control stable HEK293 cell line was generated with the Flp-In system (Invitrogen) and contains one copy of the DsRED transgene 4 . All genomic DNA samples were extracted using the DNeasy Blood and Tissue kit (Qiagen). To determine the PCR amplification efficiency, genomic DNA from the control cell line was used to generate the dilution curve of $\log_2$(DNA amount, ng) vs. Ct. $E_{\mathrm{DsRED}}$ was calculated as 1.07, and $E_{\mathrm{BRCA1}}$ as 0.98. The PCR conditions were 95 °C for 3 minutes, followed by 40 cycles of 95 °C for 15 seconds and 60 °C for 30 seconds. Each stable clone was measured in triplicate (50 ng of genomic DNA per reaction), and the average copy numbers were calculated as the mean ± SD. For statistical analysis, z scores were calculated against estimated integer copy numbers, and -1.96 < z < 1.96 was taken to indicate no statistical difference (corresponding to a 95% confidence interval).
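As an illustration of the copy-number calculation described above, the following minimal Python sketch evaluates the efficiency-corrected ratio. The function names, the sign convention for ΔCt (clone minus single-copy control), the conversion from dilution-curve slope to efficiency, and the form of the z score (replicate mean against the nearest integer, divided by the replicate SD) are assumptions made for the example and are not taken from the protocol. The ΔCt values in the usage lines are hypothetical.

```python
import statistics


def efficiency_from_slope(slope):
    """Amplification efficiency E from the slope of Ct vs. log2(DNA amount).

    Assumes Ct = intercept - log2(amount) / log2(1 + E), so that a perfect
    doubling per cycle gives slope = -1 and E = 1.
    """
    return 2.0 ** (-1.0 / slope) - 1.0


def copy_number(dct_target, dct_ref, e_target, e_ref):
    """Efficiency-corrected 2^-ddCt ratio.

    dct_target and dct_ref are Ct(clone) - Ct(single-copy control) for the
    transgene and the reference gene (assumed convention). Because the
    control carries one copy, the ratio is read as a copy number.
    """
    return (1.0 + e_target) ** (-dct_target) / (1.0 + e_ref) ** (-dct_ref)


def z_score(replicates):
    """z of the replicate mean against the nearest integer (assumed form)."""
    mean = statistics.mean(replicates)
    sd = statistics.stdev(replicates)
    return (mean - round(mean)) / sd


# Hypothetical delta-Ct values for three replicates of one clone,
# combined with the efficiencies reported in the text.
E_DSRED, E_BRCA1 = 1.07, 0.98
replicates = [
    copy_number(dct, 0.0, E_DSRED, E_BRCA1) for dct in (-0.95, -1.00, -0.90)
]
print(replicates, z_score(replicates))
```

A clone would be assigned the nearest integer copy number when the printed z score lies between -1.96 and 1.96.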

Theory
Stochastic events that govern the concentration of a single protein, such as the synthesis and degradation of that protein, are referred to as "intrinsic" or "local" noise. Such random fluctuations can propagate along regulation pathways, with the consequence that protein distributions along a pathway appear correlated 5, 6 . However, even proteins from different regulation pathways show correlation 5,7 . This arises from stochastic variations in quantities that affect the regulation of all genes 5, 6 , such as polymerase copy number or cell cycle phase. A strongly expressing constitutive promoter is expected to have little intrinsic noise, while a weak promoter will have high intrinsic noise 8,9 . In addition, two identical, independently regulated promoters are expected to have the same extrinsic noise, which arises through global effects 5,7 .
The total noise observed in a fluorescent reporter distribution arises through the combination of these "global" or "extrinsic" fluctuations together with the fluctuations in that protein's local regulation machinery ("intrinsic" noise) 5 . The squared intrinsic noise and the squared extrinsic noise sum to the CV-squared of the fluorescent reporter 5 . Following the notation of 5 , let angle brackets indicate that an average is taken with extrinsic variables held fixed, and let an overbar indicate an average where intrinsic variables are fixed.
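Then the three noises, intrinsic, extrinsic, and total, can be written in terms of P, the observed distribution of reporter protein:
$$\eta_{int}^2 = \frac{\overline{\langle P^2\rangle - \langle P\rangle^2}}{\overline{\langle P\rangle}^2}, \qquad \eta_{ext}^2 = \frac{\overline{\langle P\rangle^2} - \overline{\langle P\rangle}^2}{\overline{\langle P\rangle}^2}, \qquad \eta_{tot}^2 = \frac{\overline{\langle P^2\rangle} - \overline{\langle P\rangle}^2}{\overline{\langle P\rangle}^2}.$$
For intrinsic noise, the authors of 5 take the variance over the intrinsic variables, $\langle P^2\rangle - \langle P\rangle^2$, then estimate the expected value of this variance, denoted by the overbar, and subsequently divide by the mean squared of P. For extrinsic noise, the authors take the expected value of P with respect to intrinsic variables, then the variance of $\langle P\rangle$, and finally divide by the mean squared of P. For the total noise (CV-squared), the variance of P is divided by the mean squared.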
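Note that with a single reporter, the noises cannot be estimated unless both the intrinsic and extrinsic variables are observed. However, in the standard two-reporter experiment, the extrinsic noise becomes the normalized covariance of two reporters, $P_1$ and $P_2$, that are independently regulated and identically distributed. The reason 5 is that in a single cell the extrinsic variable is fixed, so the quantity $\overline{\langle P\rangle^2}$ can be calculated as the average product of the two reporters, $\overline{\langle P_1 P_2\rangle}$. Then, since $\overline{\langle P_1\rangle} = \overline{\langle P_2\rangle}$, the extrinsic noise becomes the normalized covariance of the two reporters:
$$\eta_{ext}^2 = \frac{\overline{\langle P_1 P_2\rangle} - \overline{\langle P_1\rangle}\,\overline{\langle P_2\rangle}}{\overline{\langle P_1\rangle}\,\overline{\langle P_2\rangle}},$$
and the intrinsic noise becomes the normalized mean-square difference between the two reporters:
$$\eta_{int}^2 = \frac{\overline{\langle (P_1 - P_2)^2\rangle}}{2\,\overline{\langle P_1\rangle}\,\overline{\langle P_2\rangle}},$$
so that the squared intrinsic and extrinsic noises sum to the CV-squared of one reporter.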
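The two-reporter estimators can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `dual_reporter_noise` and the simulated data (one shared lognormal factor multiplied by independent per-reporter factors) are assumptions made for the example, not part of the original analysis.

```python
import random
from statistics import fmean


def dual_reporter_noise(p1, p2):
    """Return (intrinsic^2, extrinsic^2, total^2) for paired reporter values.

    Uses the standard two-reporter estimators: normalized mean-square
    difference (intrinsic), normalized covariance (extrinsic), and the
    pooled normalized variance (total).
    """
    norm = fmean(p1) * fmean(p2)
    mean_prod = fmean(a * b for a, b in zip(p1, p2))
    mean_sq_diff = fmean((a - b) ** 2 for a, b in zip(p1, p2))
    mean_sq_sum = fmean(a * a + b * b for a, b in zip(p1, p2))
    int2 = mean_sq_diff / (2.0 * norm)
    ext2 = (mean_prod - norm) / norm
    tot2 = (mean_sq_sum - 2.0 * norm) / (2.0 * norm)
    return int2, ext2, tot2


# Simulated cells: a shared (extrinsic) factor times independent
# (intrinsic) factors, one per reporter. All parameters are hypothetical.
random.seed(0)
shared = [random.lognormvariate(0.0, 0.3) for _ in range(5000)]
p1 = [s * random.lognormvariate(0.0, 0.1) for s in shared]
p2 = [s * random.lognormvariate(0.0, 0.1) for s in shared]
print(dual_reporter_noise(p1, p2))
```

With these estimators the intrinsic and extrinsic terms add up to the total term by construction, which is the property the generalized decomposition is required to preserve.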
In this paper we examine more complicated regulatory mechanisms where it is not feasible to construct two identically regulated reporters (or where it is impossible to obtain identical reporter statistics). We define a new formulation that recovers the previous results as a special case, where the extrinsic noise is the normalized covariance and the components sum to the total noise. We generalize the multiplicative model and assume the observed random variable X is a function of its independent component sources (A and B, the intrinsic and extrinsic variables) of the following general form:
$$X = A^{a} B^{b},$$
where a and b are not necessarily both equal to 1. These sensitivity coefficients must appear as powers because multiplied coefficients fall out as a single constant in the next step. It is convenient to convert this to a linear model (for ease of calculation) by taking the logarithm:
$$\log X = a \log A + b \log B.$$
For ease of notation, we drop the log functions and just use the original variable names.
Here we need to calculate the contributions of A (intrinsic) and B (extrinsic) to the total observed noise of X. In general, summing two independent random variables A and B with variances $\sigma_A^2$ and $\sigma_B^2$ results in the following variance:
$$\sigma_{A+B}^2 = \sigma_A^2 + \sigma_B^2.$$
Intuitively, the relative sensitivity of the two reporters to the extrinsic variable sets the slope of the data on a log-log plot: the data lie along the 45-degree diagonal in the special case of identically regulated reporters, but not if the two reporters experience different fluctuation magnitudes from extrinsic sources due to the presence of noise-changing regulatory components.
We take the logarithm to convert to a linear model to find the components.
Once more dropping the log notation for simplicity, we take the covariance of the logarithms of the reporters. This can be done directly with cytometry data: gate as in Figure S1, take the log of the raw reporter values, and calculate the covariance. Because A, B, and C are defined as uncorrelated, all cross terms vanish and only the shared extrinsic variable contributes to this covariance. We also take the variances of the logarithms of X and Y, which are the experimentally determined total noises. All of these terms are estimated with the unbiased sample covariance, computed as follows for cytometry data where each individual well has N cells recorded with reporter measurements $(x_i, y_i)$ indexed $i = 1, \dots, N$:
$$\mathrm{cov}(X, Y) = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y}),$$
where $\bar{x}$ and $\bar{y}$ are the sample means.
We note that in some experiments the two reporters in the fully induced well do not have the identical statistics required by the dual-reporter theory of Elowitz et al. We attribute this to the difficulty of finding statistically identical genes and promoters, and also to measurement-related issues. For these experiments, instead of assuming that both reporters have the same intrinsic and extrinsic noise, we may assume that they are proportionally the same. For example, if one reporter's CV-squared is 1.2 times the other's, we can assume its intrinsic and extrinsic CV-squares are also 1.2 times as large. Hence for the computation, instead of assuming a sensitivity coefficient of 1 for the fully induced well, we would assume that the coefficient is the ratio of the CVs of the reporters. Otherwise the calculation proceeds the same way.
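The covariance-based calculation can be sketched as follows. This is a minimal illustration under an assumed form of the log-linear model, X = A + b·B and Y = C + B after taking logarithms, with the coefficient b supplied by the user (1 for identically regulated reporters). The function names `log_cov` and `decompose`, the dictionary keys, and this exact model form are assumptions made for the example; the source equations are not reproduced here.

```python
import math
from statistics import fmean


def log_cov(x, y):
    """Unbiased sample covariance of log(x) and log(y), with 1/(N-1)."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    mx, my = fmean(lx), fmean(ly)
    return sum((a - mx) * (b - my) for a, b in zip(lx, ly)) / (len(lx) - 1)


def decompose(x, y, b=1.0):
    """Split the log-variances of X and Y into intrinsic and extrinsic parts.

    Assumed model (logs already taken): X = A + b*B, Y = C + B, with
    A, B, C uncorrelated. Then cov(X, Y) = b * var(B).
    """
    var_x, var_y, cov_xy = log_cov(x, x), log_cov(y, y), log_cov(x, y)
    var_b = cov_xy / b          # variance of the shared extrinsic variable
    ext_x = b * b * var_b       # extrinsic contribution seen by X
    ext_y = var_b               # extrinsic contribution seen by Y
    return {
        "ext_x": ext_x, "int_x": var_x - ext_x,
        "ext_y": ext_y, "int_y": var_y - ext_y,
    }


# Hypothetical raw reporter values from one gated well.
x = [120.0, 95.0, 210.0, 160.0, 88.0, 140.0]
y = [300.0, 260.0, 520.0, 410.0, 230.0, 330.0]
print(decompose(x, y, b=1.0))
```

With b = 1 both reporters receive the same extrinsic term, the covariance of the logarithms, and the intrinsic terms are the remainders of the two log-variances.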
In summary, we define the noise components using as data the experimentally determined CV-squares of reporters X and Y (which, as described below, we approximate with the variances of the logs of X and Y) and the covariance of the logarithms of X and Y.
We can verify the approximation by calculating the standard deviation of the logarithm of the data and comparing it to the relative standard deviation (RSD) of the original data. We improve the approximation by trimming extreme values, dropping all values more than 2.5 standard deviations from the mean of the log of the data (these points are not dropped from the direct RSD verification, and the cutoff was obtained by trying several values).
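The verification step can be illustrated with a short Python sketch that compares the standard deviation of the log-transformed data with the RSD of the raw data, with and without the 2.5 SD trimming. The function names, the two-sided form of the trimming, and the simulated lognormal data are assumptions made for the example.

```python
import math
import random
from statistics import fmean, stdev


def rsd(values):
    """Relative standard deviation (CV) of the raw data, untrimmed."""
    return stdev(values) / fmean(values)


def log_sd(values, cutoff=None):
    """Standard deviation of log(values).

    If cutoff is given, points farther than cutoff standard deviations from
    the mean of the logs are dropped first (two-sided trimming, assumed).
    """
    logs = [math.log(v) for v in values]
    if cutoff is not None:
        m, s = fmean(logs), stdev(logs)
        logs = [v for v in logs if abs(v - m) <= cutoff * s]
    return stdev(logs)


# Simulated fluorescence values (hypothetical): lognormal with log-SD 0.2.
random.seed(0)
data = [random.lognormvariate(0.0, 0.2) for _ in range(20000)]

print("RSD of raw data:       ", rsd(data))
print("SD of logs:            ", log_sd(data))
print("SD of logs, 2.5 SD cut:", log_sd(data, cutoff=2.5))
```

For narrow distributions the first two numbers are close, which is what justifies replacing the CV-squared by the variance of the logarithm.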

Verification and decomposition of simulated noise
In our noise decomposition, we expect random quantities that affect the expression of both genes to show up as extrinsic noise, while we expect random quantities that affect only a single gene to show up as intrinsic noise. We also address the case where one reporter may be less sensitive to extrinsic noise sources due to noise-reducing regulatory pathways. To see this, first take the simplest case of a two-color experiment: suppose we have a plasmid with a constitutive bidirectional promoter P coding for reporters X and Y, and let the only source of uncertainty be the plasmid copy number. The production rate of each reporter then scales with the copy number, so at steady state both reporter levels are proportional to the same random copy number.
We want to find the extrinsic noise, the normalized covariance, which we have defined approximately as the covariance of the logarithms of the data. Because the logarithms of X and Y differ from the logarithm of the copy number only by constants, this covariance equals the variance of the log copy number. To calculate intrinsic noise we need the total noise of each reporter, which is that same variance. This shows that in this example there is no intrinsic noise; hence a common promoter for two reporters is an extrinsic source of noise.
Suppose instead that there were two different plasmids with promoters P1 and P2 coding for reporters X and Y, and let their copy numbers be independent random variables. Setting up the problem the same way, each steady-state reporter level depends only on the copy number of its own plasmid, so the covariance of the logarithms vanishes while the total noise of each reporter does not. Hence in this case, where the random variables affect the two reporters independently, the extrinsic noise is zero, making these intrinsic noise sources.
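The two cases can be written out explicitly. The following is a sketch under stated assumptions: production proportional to copy number with first-order degradation, and the symbols $N$, $N_1$, $N_2$ (copy numbers), $k_X$, $k_Y$ (production rates), and $\delta_X$, $\delta_Y$ (degradation rates) are introduced here for illustration and are not taken from the original equations.

```latex
% Case 1: one plasmid, shared copy number N (assumed symbols).
\begin{align}
  \dot{X} &= k_X N - \delta_X X, &
  \dot{Y} &= k_Y N - \delta_Y Y, \\
  X &= \frac{k_X}{\delta_X}\, N, &
  Y &= \frac{k_Y}{\delta_Y}\, N, \\
  \operatorname{cov}(\log X, \log Y) &= \operatorname{var}(\log N), &
  \operatorname{var}(\log X) &= \operatorname{var}(\log Y) = \operatorname{var}(\log N).
\end{align}
% Total minus extrinsic is zero for both reporters: no intrinsic noise.

% Case 2: two plasmids with independent copy numbers N_1 and N_2.
\begin{align}
  X &= \frac{k_X}{\delta_X}\, N_1, &
  Y &= \frac{k_Y}{\delta_Y}\, N_2, \\
  \operatorname{cov}(\log X, \log Y) &= 0, &
  \operatorname{var}(\log X) &= \operatorname{var}(\log N_1), \quad
  \operatorname{var}(\log Y) = \operatorname{var}(\log N_2).
\end{align}
% The extrinsic noise is zero, so all of the noise is intrinsic.
```

In both cases the rate constants enter the logarithms only as additive constants, which is why they drop out of every variance and covariance.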
Note that the strength of our approach lies in cases where the two reporters are not identically regulated with identical statistics. We extend the applicability by assigning different extrinsic noise quantities to each reporter, so that instead of there being a single extrinsic noise, each reporter has its own set of intrinsic and extrinsic contributions. The following example shows what can happen to extrinsic noise in the case of negative feedback. Suppose we have the extrinsic promoter case, but reporter X has negative feedback (and ). At steady state, X then depends more weakly on the plasmid copy number than Y does. If we calculate the intrinsic noise using the Elowitz et al. approach, we find that the extrinsic noise exceeds the total noise for reporter X, which would force the intrinsic noise to be negative. However, for this simplified example, we know the only noise source is an extrinsic variable, and thus the intrinsic noise should turn out to be zero. This allows us to infer that the sensitivity coefficient of X to the extrinsic variable, as described in the previous section, has a value of ½, representing the fact that reporter X experiences half as much noise from the variable plasmid copy number as Y does.
Recall that in the experiments, we must first estimate in a case where the Elowitz et al. assumptions hold, i.e., the reporters are identically regulated.
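The negative-feedback example can be made concrete with a worked sketch. The feedback term below, in which the production rate of X is divided by X itself (a strong-repression limit), is an assumption chosen because it reproduces the coefficient of ½ quoted in the text; the symbols $N$, $k_X$, $k_Y$, $\delta_X$, $\delta_Y$ and $b$ are likewise introduced for illustration.

```latex
% Shared copy number N; X represses its own production (assumed form).
\begin{align}
  \dot{X} &= \frac{k_X N}{X} - \delta_X X, &
  \dot{Y} &= k_Y N - \delta_Y Y, \\
  X &= \sqrt{\frac{k_X N}{\delta_X}}, &
  Y &= \frac{k_Y}{\delta_Y}\, N, \\
  \log X &= \tfrac{1}{2} \log N + \text{const}, &
  \log Y &= \log N + \text{const}.
\end{align}
% Two-reporter (Elowitz-style) reading: extrinsic noise is the covariance.
\begin{align}
  \operatorname{cov}(\log X, \log Y) &= \tfrac{1}{2} \operatorname{var}(\log N)
  \;>\; \operatorname{var}(\log X) = \tfrac{1}{4} \operatorname{var}(\log N).
\end{align}
% Generalized reading with sensitivity coefficient b = 1/2 for X:
\begin{align}
  \text{extrinsic}_X &= b^2 \operatorname{var}(\log N)
  = \tfrac{1}{4} \operatorname{var}(\log N) = \operatorname{var}(\log X),
  & \text{intrinsic}_X &= 0.
\end{align}
```

Under these assumptions the covariance alone overstates the extrinsic noise of X by a factor of two, while the coefficient-weighted decomposition returns zero intrinsic noise, as the example requires.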
To confirm our analysis we used simulations to test the noise decomposition in two extreme cases, where we control the levels of intrinsic and extrinsic noise. As illustrated in Supplementary Fig. 13a, we first vary the strength of transcription of a single bidirectional promoter coding for two fluorescent proteins, leading to perfectly correlated fluorescence quantities, which our decomposition shows to have only extrinsic noise and no intrinsic noise. Next, in Supplementary Fig. 13b, we vary the strength of transcription of two fluorescent genes independently, which leads to uncorrelated fluorescence quantities; our method returns only intrinsic noise and no extrinsic noise.
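The two extreme cases can be reproduced with a toy simulation. The sketch below is not the simulation behind Supplementary Fig. 13; it draws a lognormal transcription strength (an assumed distribution), scales it by hypothetical reporter-specific constants, and applies the log-covariance decomposition with identical sensitivities. All function names and parameters are illustrative.

```python
import math
import random
from statistics import fmean


def cov(u, v):
    """Unbiased sample covariance of two equally long sequences."""
    mu, mv = fmean(u), fmean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


def noise_components(x, y):
    """Extrinsic noise as cov of logs; intrinsic as log-variance minus it."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    ext = cov(lx, ly)
    return {"ext": ext, "int_x": cov(lx, lx) - ext, "int_y": cov(ly, ly) - ext}


random.seed(1)
n = 5000

# Case a: one bidirectional promoter, one shared transcription strength.
shared = [random.lognormvariate(0.0, 0.3) for _ in range(n)]
case_a = noise_components([2.0 * s for s in shared], [5.0 * s for s in shared])

# Case b: two genes whose transcription strengths vary independently.
s1 = [random.lognormvariate(0.0, 0.3) for _ in range(n)]
s2 = [random.lognormvariate(0.0, 0.3) for _ in range(n)]
case_b = noise_components([2.0 * s for s in s1], [5.0 * s for s in s2])

print("shared promoter:     ", case_a)
print("independent promoters:", case_b)
```

In case a the intrinsic terms vanish up to rounding and all noise is extrinsic; in case b the extrinsic term is near zero (limited only by sampling error) and the noise is intrinsic.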