Silencing and variegated transgene expression are poorly understood problems that can interfere with gene function studies in human embryonic stem cells (hESCs). We show that transgene expression (enhanced green fluorescent protein [EGFP]) from random integration sites in hESCs is affected by variegation and silencing, with only half of hESCs expressing the transgene, which is gradually lost after withdrawal of selection and differentiation. We tested the hypothesis that a transgene integrated into the adeno-associated virus type 2 (AAV2) target region on chromosome 19, known as the AAVS1 locus, would maintain transgene expression in hESCs. When we used AAV2 technology to target the AAVS1 locus, 4.16% of hESC clones achieved AAVS1-targeted integration. Targeted clones expressed Oct-4, stage-specific embryonic antigen-3 (SSEA3), and Tra-1–60 and differentiated into all three primary germ layers. EGFP expression from the AAVS1 locus showed significantly reduced variegated expression when in selection, with 90% ± 4% of cells expressing EGFP compared with 57% ± 32% for randomly integrated controls, and reduced tendency to undergo silencing, with 86% ± 7% hESCs expressing EGFP 25 days after withdrawal of selection compared with 39% ± 31% for randomly integrated clones. In addition, quantitative polymerase chain reaction analysis of hESCs also indicated significantly higher levels of EGFP mRNA in AAVS1-targeted clones as compared with randomly integrated clones. Transgene expression from the AAVS1 locus was shown to be stable during hESC differentiation, with more than 90% of cells expressing EGFP after 15 days of differentiation, as compared with ∼30% for randomly integrated clones. These results demonstrate the utility of transgene integration at the AAVS1 locus in hESCs and its potential clinical application.
Disclosure of potential conflicts of interest is found at the end of this article.
In addition to their potential applications in biotechnology and drug discovery, human embryonic stem cells (hESCs) provide a model system for studying events in early postimplantation human development and cell-type specification. However, unlike mouse embryonic stem cells (mESCs), for which in vivo developmental potential can be monitored in chimeras and homozygosity of gene expression can be achieved through germ-line transmission in vivo, studies of hESCs are limited by ethical constraints, thus compelling greater reliance upon in vitro analysis. Therefore, to achieve both basic and translational research goals involving hESCs, it is necessary to use approaches that rely on stable and robust transgene expression in the absence of gene silencing and other integration site-related influences.
Methods that have previously been used to generate genetically altered ESCs include random integration of cDNAs [1–3], retrovirus-mediated transduction [4–7], and homologous recombination [8, 9]. Although these methods have generally been successful with other mammalian cell systems, limitations to their use in embryonic stem cell research have emerged. For instance, random integration of transgenes using either cDNAs or retroviruses is subject to silencing, leading to variegated transgene expression in clonally isolated cells [3–6, 10, 11]. Such silencing of transgene expression is thought to be mediated by methylation-dependent and -independent mechanisms acting on the integrated expression cassettes. Silencing in hESCs has also been shown to be dependent upon the promoter used to drive transgene expression. In addition, a complete cessation of transgene expression is often observed during differentiation of retrovirally transduced mouse embryonic stem cells, although initial reports suggest that lentiviral vectors do not suffer from the same fate in human ESCs [12, 13]. Nevertheless, retroviruses tend to integrate within transcriptional units of the genome, thus greatly increasing the potential for detrimental effects due to insertional mutagenesis [14, 15] and potentially limiting their use in a clinical setting.
Homologous recombination-mediated targeting (knockout/knock-in) has been used with great success in mESC research, inserting transgenes in the context of the native gene and thereby avoiding the silencing often observed with random integration. However, only two examples of homologous recombination in hESC research have been documented [8, 9]. This suggests that there are as yet poorly understood limitations to homologous recombination in hESCs.
In view of the limited information on the efficiencies of transfection and stability of transgene expression in hESCs, we assessed the extent of transgene silencing in hESCs. We found that random transgene integration was characterized by substantial silencing and variegation in reporter gene activity. We then developed an approach for transgene integration in hESCs that not only avoids disruption of endogenous gene function but also protects transgene expression from integration site-related influences.
In doing so, we took advantage of a natural locus for transgene insertion and expression, using adeno-associated virus type 2 (AAV2) technology. During its evolution, AAV2 has developed the unique ability to integrate into a region on chromosome 19, termed the AAVS1 locus. The essential requirements for integration at this locus include the AAV2 viral inverted terminal repeats (ITRs), viral REP78/68 integrase, and the AAV2 p5 promoter [18–20]. AAV2-mediated site-specific integration in human cell lines has been shown to be highly efficient and, more importantly, resistant to transgene silencing. Interestingly, the ITRs that flank the integration cassette have been shown to possess insulating activity themselves, protecting transgene integrants from integration site-specific influences, thereby enabling nonvariegated transgene expression in Xenopus and zebrafish [22, 23]. Furthermore, the AAVS1 locus has been shown to contain native insulators that protect this site from silencing.
We found that targeting of the AAVS1 locus in hESCs could be achieved with AAV2 technology and transiently transfected vectors with standard lipofection techniques, thereby providing the essential viral targeting elements. Once targeted, transgene expression from the AAVS1 locus in hESCs was nonvariegated, and it showed a strongly reduced tendency for silencing during passaging and differentiation of hESCs, as compared with randomly integrated controls.
Materials and Methods
Human ESC Culturing and Transfection
All products were purchased from Invitrogen (Carlsbad, CA, http://www.invitrogen.com) unless otherwise stated. The hESC lines H9 and H1 were obtained from WiCell Research Institute (Madison, WI, http://www.wicell.org), and HSF-6 was obtained from the University of California-San Francisco (San Francisco). These cells were cultured in feeder-containing conditions as previously described. Human ESCs were transfected with targeting vector (20 μg) and pREP78 vector (10 μg), targeting vector only (20 μg), or pTP6-enhanced green fluorescent protein (EGFP) only (20 μg), using either electroporation or Lipofectamine 2000 as previously described [1–3, 7]. Vector schematics are shown in supplemental online Figure 1. At 24 hours post-transfection, hESCs were plated on feeders at low density, and selection was started (1 μg/ml puromycin). After 14 days in selection, colonies were picked onto feeders in 24-well plates.
Southern Blot, Polymerase Chain Reaction, and Sequencing Analysis
DNA from putative targeted clones was extracted using previously described methods [1–3, 7]. Twelve micrograms of DNA was digested with ApaI, run on 0.8% agarose gels, and blotted onto nitrocellulose membrane (Amersham Biosciences, Piscataway, NJ, http://www.amersham.com). Initially, [32P]-CTP (Amersham Biosciences)-labeled cytomegalovirus (CMV) promoter cDNA was used to probe the blot to detect integration of the targeting vector. After exposure, the blot was stripped and reprobed with [32P]-CTP-labeled AAVS1-specific probe (kindly provided by David Russell, University of Washington, Seattle) to localize the integration to the AAVS1 site on chromosome 19. Clones showing bands of exactly matching size with both probes were regarded as putative targeted clones. Copy number was initially estimated from the number of bands present on the CMV promoter blot for each clone. This allowed us to match copy number for future experiments. Nested polymerase chain reaction (PCR) was performed on positive clones using previously described methods and primers as shown in the supplemental online Methods. A 1 kb DNA ladder was from New England BioLabs (Hitchin, U.K., http://www.neb.uk.com).
Fluorescence In Situ Hybridization Analysis
Mitotic spreads and interphase fibers were analyzed by fluorescence in situ hybridization (FISH). The targeting vector pITR-p5-CAG-EGFP-IRES-Puro (IRES, internal ribosome entry site) was used as a probe. For fiber FISH analysis, the chromosome 19-specific probe G248P81017A8: WI2-732B15 (38,354 base pairs [bp]) was used. DNA probe was labeled using a nick translation kit (Invitrogen), following the manufacturer's instructions, and FISH was performed essentially as described previously.
Differentiation of hESCs
Clones (AAVS1-targeted, ITR-flanked random-integrated, and pTP6-EGFP random-integrated) were grown in 60-mm dishes (Corning Enterprises, Corning, NY, http://www.corning.com) on mouse embryonic fibroblast feeders (MEFs). When confluent, hESCs were passaged using 1 mg/ml collagenase, as previously described, and cultured in either chemically defined medium (CDM) (50% Iscove's modified Dulbecco's medium, 50% Ham's F-12 medium NUT-MIX, supplemented with 7 μg/ml of insulin [Roche Diagnostics, Basel, Switzerland, http://www.roche-applied-science.com], 15 μg/ml of transferrin [Roche Diagnostics], 450 μM monothioglycerol [Sigma-Aldrich, St. Louis, http://www.sigmaaldrich.com], and 5 mg/ml bovine serum albumin fraction V [Sigma-Aldrich]) or MEF medium (Dulbecco's modified Eagle's medium supplemented with 10% fetal calf serum and 5 mM glutamine). Human ESCs that had been treated with collagenase were cultured on a rotating shaker to promote the formation of embryoid bodies (EBs) in 5% CO2 at 37°C [1–3]. EBs were subsequently plated on gelatin in 24-well plates and cultured in either CDM or MEF medium for 15 days to generate in vitro-differentiated cell types. Cells were harvested at defined time points using trypsin, and immunocytochemistry was performed, followed by fluorescence-activated cell sorting (FACS) analysis.
Immunocytochemistry and FACS Analysis
Harvested pluripotent hESCs from both targeted and nontargeted clones were stained either for SSEA3 expression using monoclonal anti-SSEA3 IgG antibody (R&D Systems Inc., Minneapolis, http://www.rndsystems.com) and secondary antibody goat anti-mouse IgG-tetramethylrhodamine B isothiocyanate (Invitrogen) or for Tra-1–60 expression using monoclonal anti-Tra-1–60 IgM antibody (Santa Cruz Biotechnology Inc., Santa Cruz, CA, http://www.scbt.com) and secondary goat anti-mouse IgM-allophycocyanin (Jackson Immunoresearch Laboratories, West Grove, PA, http://www.jacksonimmuno.com). Prior to FACS, cells were stained with 7-aminoactinomycin D (7-AAD) to allow the identification of dead cells. For differentiated cells from both targeted and nontargeted clones, cells were harvested using trypsin and stained with 7-AAD prior to FACS analysis for EGFP+ cells.
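The gating logic implied by this staining scheme can be sketched as follows. This is a minimal illustration, not the analysis pipeline used in this study: the `Event` fields, the cutoff values, and the example events are all hypothetical, and real gates would be set on the cytometer against unstained and single-stained controls.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """One flow cytometry event; values are fluorescence intensities (arbitrary units)."""
    seven_aad: float  # dead-cell stain
    ssea3: float      # pluripotency marker
    egfp: float       # transgene reporter


def egfp_positive_fraction(events, dead_cutoff, ssea3_cutoff, egfp_cutoff):
    """Return the fraction of live, SSEA3+ events that are also EGFP+."""
    # Gate 1: exclude dead cells, which take up 7-AAD.
    live = [e for e in events if e.seven_aad < dead_cutoff]
    # Gate 2: keep only cells positive for the pluripotency marker.
    gated = [e for e in live if e.ssea3 >= ssea3_cutoff]
    if not gated:
        raise ValueError("no live SSEA3+ events to score")
    # Score EGFP within the gated population.
    positive = sum(1 for e in gated if e.egfp >= egfp_cutoff)
    return positive / len(gated)


# Hypothetical events, for illustration only.
events = [
    Event(10, 500, 900),   # live, SSEA3+, EGFP+
    Event(12, 450, 20),    # live, SSEA3+, EGFP-
    Event(800, 480, 950),  # dead, excluded by gate 1
    Event(15, 30, 700),    # live, SSEA3-, excluded by gate 2
    Event(9, 520, 870),    # live, SSEA3+, EGFP+
]
fraction = egfp_positive_fraction(
    events, dead_cutoff=100, ssea3_cutoff=100, egfp_cutoff=100
)
print(f"EGFP+ among live SSEA3+ events: {fraction:.2f}")
```

Scoring EGFP only within the live, marker-positive gate keeps dead cells and differentiated cells from distorting the reported percentage.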
Tra-1–60 sorted hESCs were collected and total RNA was extracted using the RNeasy minikit from Qiagen (Hilden, Germany, http://www1.qiagen.com) according to the manufacturer's protocol. DNA contamination was prevented by treating RNA with DNase I (Qiagen). The collected total RNA was suspended in RNase-free water. Complementary DNA was produced from 2 μg of RNA with 200 U of Superscript II reverse transcriptase (Invitrogen) in a 100-μl reaction using a standard protocol. The mixture was heated to 65°C for 5 minutes, and then activation was initiated at 25°C for 15 minutes. This was followed by elongation of the strands at 42°C for 50 minutes and the inactivation of the enzyme at 70°C for 15 minutes. PCRs consisted of 500 ng of the first-strand cDNA and were amplified using the Quantitect EGFP designed primer set (Qiagen). PCRs included a 5-minute denaturation step followed by 35 rounds of 45 seconds at 94°C, 30 seconds at 55°C, and 90 seconds at 72°C. The expression was compared with that of β-Actin under the same conditions.
Quantitative polymerase chain reaction (QPCR) was performed on the Stratagene (La Jolla, CA, http://www.stratagene.com) Mx3005P using the SensiMix kit (Quantace, London, http://www.quantace.com) following the manufacturer's protocol, with the exception that the passive reference dye was obtained from Stratagene. The mixture was denatured at 95°C for 10 minutes followed by 40 cycles of 95°C for 30 seconds, 60°C for 60 seconds, and 72°C for 30 seconds. Primer efficiency was verified through the generation of dissociation curves. All reactions were performed a minimum of two times and normalized to β-Actin.
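Normalization of QPCR data to a reference gene such as β-Actin is commonly expressed with the comparative threshold cycle (2^-ΔΔCt) calculation. The sketch below illustrates that calculation under stated assumptions: the text does not specify which quantification model was used, the function and argument names are hypothetical, the Ct values are invented for illustration, and amplification efficiency is assumed to be 100% for both primer sets.

```python
def relative_expression(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """Fold change of a target transcript relative to a calibrator sample.

    Assumes the 2^-ddCt model with 100% amplification efficiency
    (an assumption; the quantification model is not stated in the text).
    """
    # Normalize the target (e.g., EGFP) to the reference gene (e.g., beta-Actin).
    delta_ct_sample = ct_target - ct_reference
    delta_ct_calibrator = ct_target_cal - ct_reference_cal
    # Compare the normalized sample with the normalized calibrator.
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values, for illustration only.
fold = relative_expression(
    ct_target=20.0, ct_reference=18.0,          # sample of interest
    ct_target_cal=27.0, ct_reference_cal=18.0,  # calibrator sample
)
print(f"fold change relative to calibrator: {fold:.0f}")
```

Because each cycle corresponds to a doubling under this model, a ΔΔCt of -7 corresponds to a 128-fold higher normalized transcript level.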
Results
Gene Expression from Random Integration
The pTP6-EGFP vector (supplemental online Fig. 1) was used to generate stable, randomly integrated hESC clones. Fifteen stable clones were arbitrarily selected and used for EGFP expression analysis in hESCs. We found extensive silencing and variegation, with high clone-to-clone variability, when assessing expression of randomly integrated transgenes in hESCs (Fig. 1A). Even under continuous selection, only approximately half (57% ± 32%) of the SSEA3+ hESCs from randomly integrated clones expressed EGFP, as indicated by FACS analysis. Such variegation under selection may reflect the low levels of puromycin resistance gene expression needed to survive in selection, compared with the less sensitive EGFP detection. Moreover, within 25 days after withdrawal of selection, the fraction of EGFP+ cells among those positive for SSEA3 had dropped to 39% ± 31%, representing a significant loss of EGFP expression in pluripotent hESCs (n = 15; p = .0089).
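Per-clone percentages of this kind are typically summarized as a mean with a measure of spread. The sketch below shows one way to do so, assuming that the ± values denote the sample standard deviation across clones (the text does not say whether standard deviation or standard error is reported). The function name and the input percentages are hypothetical.

```python
from statistics import mean, stdev


def summarise(percentages):
    """Return (mean, sample standard deviation) of per-clone EGFP+ percentages."""
    return mean(percentages), stdev(percentages)


# Hypothetical per-clone EGFP+ percentages, for illustration only.
clone_percentages = [10.0, 50.0, 90.0]
avg, spread = summarise(clone_percentages)
print(f"{avg:.0f}% ± {spread:.0f}%")
```

A spread that is large relative to the mean, as in the randomly integrated clones, indicates that individual clones differ widely rather than all expressing at an intermediate level.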
During differentiation of hESCs, randomly integrated pTP6-EGFP clones also incurred a significant loss of EGFP transgene expression after 15 days of differentiation in either MEF medium (49% loss; p = .0042) or CDM (52% loss; p = .0025) (Fig. 1B, 1C). These data suggest that expression of randomly integrated transgenes in hESCs is subject to integration site-related influences, which result in variegation of transgene expression even under drug selection and lead to further transgene silencing after withdrawal of selective pressure. To overcome such positional effects on transgene expression in hESCs, we developed and assessed an expression system based upon AAV2 targeting of the AAVS1 site on chromosome 19.
Southern Blot Analysis
Prior to targeting the AAVS1 locus, we screened for intact AAVS1 loci in three different hESC lines, H1, H9, and HSF-6. All three cell lines possessed an intact AAVS1 locus, indicating that they were all amenable to AAVS1 integration (supplemental online Fig. 2A). We focused on the H9 line for the remainder of the experiments described here, on the basis of the genetic similarity of the three available hESC lines at their AAVS1 loci.
Previous studies have shown that AAV2-mediated integration at the chromosome 19 AAVS1 locus occurs within ∼2 kilobases downstream of the Rep-binding site (RBS) and terminal-resolution site (TRS) (supplemental online Fig. 3). We used dual-probe Southern blot analysis to screen for targeted integration at the AAVS1 site. By using an integration cassette-specific probe (CMV promoter) and then stripping the blots and reprobing with an AAVS1 locus-specific probe, we were able to identify six putative positive clones that possessed both AAVS1 and integrating cassette-specific bands of the same size. This represents an efficiency of integration at the AAVS1 locus of 1.04% and 4.16% for electroporation and lipofection, respectively (Table 1). We show as an example one of the targeted clones (targeted clone 2 [TC2]) (Fig. 2A, 2B).
Table 1. Efficiencies of targeting the AAVS1 locus in human ESCs using lipofection (4.16%) and electroporation (1.04%)
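Targeting efficiency of this kind is the number of confirmed targeted clones expressed as a percentage of the clones screened. The short sketch below shows the arithmetic only. The function name is hypothetical and the counts are invented for illustration: they are not the clone numbers behind Table 1.

```python
def targeting_efficiency(targeted_clones, screened_clones):
    """Return targeted clones as a percentage of all clones screened."""
    if screened_clones <= 0:
        raise ValueError("at least one clone must be screened")
    return 100.0 * targeted_clones / screened_clones


# Hypothetical counts, for illustration only (not the data behind Table 1).
efficiency = targeting_efficiency(targeted_clones=3, screened_clones=50)
print(f"{efficiency:.2f}%")
```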
Transfection with targeting vector alone without the pREP78 expression plasmid or using the pTP6-EGFP vector did not yield any AAVS1-targeted clones (data not shown), highlighting the importance of REP78 expression for AAVS1 targeting. To ensure that pREP78 was not itself integrated into the targeted hESC clones, we performed Southern blot analysis using REP as a probe. No clones were positive for REP78 integration (supplemental online Fig. 2B).
Putative targeted clones judged positive by Southern blot were further analyzed using previously characterized nested primers. PCR products were sequenced and assessed for the presence of AAVS1 locus and integration cassette-specific sequences. The transgene integration site for TC2 was 1,084 bp downstream of the RBS and TRS. In this case, three additional bases had been inserted between the ITR and the AAVS1 junction (Fig. 2C).
Fluorescence In Situ Hybridization
The AAVS1-targeted clones were further analyzed by FISH, using the integration cassette as a probe. All six targeted clones showed hybridization on chromosome 19 (Fig. 3A) but not at any other locus, thus confirming a single integration site at the AAVS1 locus in each of the targeted clones. Fiber FISH analysis was used to assess integration cassette copy number within the AAVS1 locus. By using two probes, one of known size spanning the AAVS1 target site (38,354 bp) and the other the integration cassette itself (∼5,000 bp), it was possible to estimate the number of cassette insertions at the AAVS1 site. Single copy integration occurred in two of six of the targeted integration events (Fig. 3B; supplemental online Table 1). The remaining targeted clones had either two (TC4) or three (TC2 and TC6) insert copies. All targeted clones were analyzed karyotypically using 4′,6-diamidino-2-phenylindole banding analysis and were found to be karyotypically normal, with no visible translocations (supplemental online Fig. 4). Additional FISH analysis performed on randomly integrated ITR-flanked clones showed that the cassette in such clones had integrated at sites other than the AAVS1 locus (supplemental online Fig. 5).
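The copy number estimate from fiber FISH amounts to calibrating signal length against the probe of known size and dividing the cassette signal by the length of one cassette. The sketch below illustrates this reasoning. It is not the measurement procedure used in this study: the function name and the signal lengths are hypothetical, and it assumes that DNA fibers are stretched uniformly, so that signal length is proportional to sequence length.

```python
REFERENCE_PROBE_BP = 38_354  # chromosome 19 probe spanning the AAVS1 target site
CASSETTE_BP = 5_000          # approximate size of one integration cassette


def estimate_copy_number(cassette_signal_len, reference_signal_len):
    """Estimate cassette copies from two signal lengths measured on one fiber.

    Both lengths must be in the same unit (e.g., micrometers). Assumes uniform
    fiber stretching, which is an idealization.
    """
    # Calibrate base pairs per unit length from the probe of known size.
    bp_per_unit = REFERENCE_PROBE_BP / reference_signal_len
    # Convert the cassette signal to base pairs, then to whole cassette copies.
    cassette_bp = cassette_signal_len * bp_per_unit
    return round(cassette_bp / CASSETTE_BP)


# Hypothetical signal lengths in micrometers, for illustration only.
print(estimate_copy_number(cassette_signal_len=1.3, reference_signal_len=10.0))
print(estimate_copy_number(cassette_signal_len=2.6, reference_signal_len=10.0))
```

Measuring both signals on the same fiber means that fiber-to-fiber differences in stretching cancel out in the calibration.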
Phenotypic Characterization of Targeted hESCs
TC2 was initially used to characterize the phenotypes of AAVS1-targeted hESCs, and the characterization was then extended to the other targeted clones, with identical results (data not shown). TC2 was positive for the pluripotency markers Oct-4, Tra-1–60, and SSEA3 and maintained homogeneous expression of EGFP (Fig. 4A). When TC2 was used to generate differentiated EBs, cell types representative of all three primary germ layers were readily generated. Moreover, there was robust EGFP expression from such AAVS1-targeted transgenes in all differentiated cell types examined (Fig. 4B). Furthermore, many EBs derived from AAVS1-targeted clones readily generated beating structures, all of which possessed homogeneous EGFP expression (data not shown).
Characterization of Transgene Expression from Targeted hESC Clones
In view of the apparent persistence of robust transgene expression in the AAVS1-targeted clones, in contrast to the silencing observed with random integration, it was important to assess the stability of transgene expression in greater detail. EGFP expression data were first obtained for each AAVS1-targeted clone in pluripotent conditions. EGFP expression from AAVS1-targeted clones (n = 6) was compared with EGFP expression from either randomly integrated (ITR-flanked) targeting vector (n = 5) or randomly integrated pTP6-EGFP vector (n = 15). To allow valid comparisons of gene expression between random and targeted clones, the randomly integrated clones used for these experiments were chosen to be equal in copy number to the AAVS1-targeted clones, as observed by Southern blot analysis (data not shown). Using FACS analysis for quantitative assessment in SSEA3-positive hESCs, EGFP expression was found to be significantly less variegated in AAVS1-targeted clones than in randomly integrated clones, with 90% ± 4% of cells expressing detectable EGFP in AAVS1-targeted clones, as compared with 57% ± 32% in randomly integrated pTP6-EGFP vector clones (p = .022) (Fig. 5A). Interestingly, we observed less variegation in EGFP expression in randomly integrated transgenes when they were flanked by AAV2 ITRs (82% ± 16%) compared with randomly integrated pTP6-EGFP (p = .058). This may reflect the previously described insulating effects of ITR viral repeat sequences [22–24]. Further statistical analysis is given in supplemental online Table 2.
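Comparisons between small groups of clones such as these can be illustrated with an exact permutation test on the difference in group means. This is a swapped-in technique chosen because it makes no distributional assumptions and needs only the standard library. The text does not state which statistical test produced the reported p values, and the function name and input percentages below are hypothetical.

```python
from itertools import combinations


def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation test on the difference in group means."""
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)
    extreme = 0
    n_splits = 0
    # Enumerate every way of relabeling n_a of the pooled values as "group A".
    for indices in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in indices)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        # Count relabelings at least as extreme as the observed difference
        # (small tolerance guards against floating-point rounding).
        if diff >= observed - 1e-12:
            extreme += 1
        n_splits += 1
    return extreme / n_splits


# Hypothetical per-clone EGFP+ percentages, for illustration only.
targeted = [90, 92, 88]
random_integrants = [40, 60, 20]
p_value = permutation_p_value(targeted, random_integrants)
print(f"p = {p_value:.2f}")
```

With three clones per group there are only 20 possible relabelings, so the smallest attainable two-sided p value is 0.1. Group sizes like those used here (n = 5, 6, or 15) allow correspondingly finer resolution.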
Transgene expression was also maintained at high levels in undifferentiated hESCs for 25 days (five passages) after withdrawal of puromycin selection in AAVS1-targeted clones (86% ± 7% of cells expressing EGFP), in contrast to randomly integrated clones carrying the equivalent ITR-flanked expression cassette (43% ± 20%) (p = .0051) or pTP6-EGFP vector (39% ± 31%) (p = .0039) (Fig. 5B–5D). Maintenance of EGFP expression was then assessed in two differentiation conditions, using either MEF medium or CDM. EGFP expression after withdrawal of selection was maintained for 15 days of differentiation of hESCs as EBs without substantial silencing in AAVS1-targeted clones in both differentiation conditions (MEF medium, 95% ± 4%; CDM, 97% ± 1%; n = 6), in contrast to randomly integrated clones carrying the equivalent ITR-flanked expression cassette (MEF medium, 69% ± 13%; CDM, 67% ± 14%; n = 5) or pTP6-EGFP vector alone (MEF medium, 31% ± 7%; CDM, 32% ± 7%; n = 5) (Fig. 5E–5J). Further statistical analysis is given in supplemental online Table 2.
Quantitative PCR analysis was then used to analyze EGFP mRNA levels in Tra-1–60-positive hESCs from AAVS1-targeted, randomly integrated ITR-flanked, and randomly integrated pTP6-EGFP clones maintained for 10 passages. The QPCR analysis revealed that mRNA levels differed by several orders of magnitude between AAVS1-targeted clones and randomly integrated clones (Fig. 6A). Furthermore, although randomly integrated clones showed general decreases in EGFP mRNA levels in progressing from 0 to 10 passages, AAVS1-targeted clones maintained high EGFP mRNA levels. This result was confirmed by reverse transcription-PCR analysis (Fig. 6B).
Discussion
Our results show that hESCs are prone to repress transgene expression after random integration, leading to highly variable, variegated transgene expression even when hESCs are maintained in selection. The levels of EGFP expression from random integrants in hESCs varied between 1% and 91%, depending upon the clone analyzed. That such variegation was evident even under selection may be due to the insensitivity of EGFP detection compared with that of puromycin resistance. Indeed, previous studies identifying highly expressing EGFP hESC clones did not report 100% EGFP-positive cells. Although such variegation could be minimized by selecting those clones that express the highest levels of transgene, such transgene expression levels are unlikely to be maintained when the clones undergo differentiation. Previous studies using retroviral vectors or cDNA to generate stable transgene-expressing clones in vitro have similarly observed variegated transgene expression, and this has been linked to either methylation of transgene expression cassettes [4–7] or use of unsuitable promoters. Numerous studies suggest that methylation and other epigenetic changes that occur during differentiation of ESCs cause rapid silencing of transgenes from retroviral vectors [6, 7]. Furthermore, in vitro differentiation of neural precursor cells that have been transduced by either retroviruses or lentiviruses preceding transplantation into the spinal cord induces rapid loss of transgene expression, indicating a close relationship between differentiation and transgene silencing from viral vectors.
Analysis of random integration of expression cassettes shows that the promoter used to drive transgene expression may be partly responsible for variegated transgene expression in hESCs. In that study, the CAG promoter was shown to produce variegated unstable expression, similar to our results. Interestingly, one study where cDNA was randomly integrated in hESCs showed persistent levels of transgene expression, although the frequency of finding such a permissive site for transgene expression was not reported.
Here, we found that transgene integration at the AAVS1 locus can be achieved with sufficient efficiency (4.16%) to warrant its use in generating stably expressing transgenic hESCs. The resistance to transgene silencing observed at the AAVS1 site may reflect the previously documented insulation at the AAVS1 locus. Such insulation would have the dual effect of both protecting an integrated expression cassette from extrachromosomal influences and serving to restrict the influence of the expression cassette on the surrounding genome. Furthermore, previous work has shown that gene expression from the AAVS1 locus can be maintained for more than 18 weeks without selection. In addition, the AAV2 ITRs that flank the expression cassette impart insulator effects to randomly integrated expression cassettes in zebrafish and Xenopus [22, 23]. Here we show that AAV ITRs also impart insulation on the expression cassette in hESCs, resulting in less variegated EGFP expression when the expression cassette was flanked by ITRs, compared with the pTP6-EGFP expression vector alone.
Furthermore, QPCR analysis revealed that high levels of transgene mRNA were achieved and maintained when the expression cassette was targeted to the AAVS1 locus as compared with random integration. Such transgene expression driven from the CAG promoter may thus reflect true promoter strength in a stable expression system not influenced by the integration site.
Unlike retroviruses, which integrate into active transcriptional sites, AAV2 has evolved a mechanism of preferentially targeting an integration hotspot on chromosome 19 [17–21, 24, 26, 27], thus making it a potentially neutral site for transgene insertion in hESCs. In this study, we demonstrated that targeting the AAVS1 locus in hESCs circumvents transgene inactivation, a prominent shortcoming of standard genetic manipulation techniques. The reduction of variegated transgene expression and transgene silencing during differentiation provided by AAVS1 targeting surmounts this otherwise considerable obstacle to carrying out functional genomic studies in hESCs. The use of AAV2 technology in hESC research could potentially facilitate applications that require the expression of transgenes at defined levels in all cells, including short interfering RNA and growth factor-driven differentiation strategies. Furthermore, the generation of hESC lines with minimal integration site-dependent influences provides an opportunity for the isolation of pure populations of hESC-derived cell types using promoter-driven transgene expression strategies without functional perturbation of the host cell genome. If genetically altered hESCs are to be used for clinical purposes in the future, it will be important to achieve transgene integration not only with minimal silencing but also without the consequences of insertion of expression cassettes near oncogenes, as has been observed for retroviral insertion in gene therapy studies.
Disclosure of Potential Conflicts of Interest
The authors indicate no potential conflicts of interest.
Acknowledgments
This work was funded by the Cambridge-MIT Institute (to J.R.S., R.A.P., C.ff.-C.), the Christopher and Dana Reeve Foundation (to J.R.S.), and the U.K. Medical Research Council (to R.A.P.). L.A.D. was supported by the Leukemia and Lymphoma Society. We thank Andre Lieber for kindly providing the pCMV-REP78 plasmid and Steve Russell for the AAVS1 probe. We also thank Paula McPhee for her assistance with the proofreading of the manuscript. C.ff.-C. and R.A.P. were the two principal investigators, and their laboratories contributed equally to this work.