Genome-Wide Expression Profiling of Sox17-Dependent Genes
For a transcriptome-wide assessment of Sox17-dependent genes, ESCs expressing Sox17 or luciferase shRNA were differentiated for up to 10 days by the embryoid body method and then analyzed using Affymetrix microarrays (Supporting Information Fig. S1A and Table S1). Roughly 800 genes (4% of those tested) showed significant changes both across time and in response to Sox17 shRNA. The filtering thresholds were chosen with guidance from the changes in five Sox17-dependent genes of known biological importance, which we had earlier identified by a candidate gene approach (Foxa1, Mesp1, Nkx2–5, Mef2c, Myh6; Supporting Information Fig. S1B).
For the resulting overall set of Sox17-regulated genes, the significant gene ontology (GO) and GenMAPP terms are discussed below. Two GenMAPP categories requiring Sox17 were indicative of cell lineage, namely, striated muscle contraction (p = 1E-13) and smooth muscle contraction (p = 3E-05). Accordingly, many of the most affected GO Biological Process terms were related to cardiovascular development and function (Fig. 1A; 16 of the top 40, including heart morphogenesis, p = 3E-17; heart development, p = 1E-14; vasculogenesis, p = 2E-10). Other highly dependent categories were related to more generic events (multicellular organismal development, p = 4E-30; regulation of transcription, DNA dependent, p = 3E-16; cell fate commitment, p = 5E-07), or, notably, endoderm development (p = 9E-05). Numerous pathways for growth factor signaling were Sox17-dependent (SMADs, p = 1E-06; Wnts, p = 5E-06; BMPs, p = 7E-05; transforming growth factor beta, p = 1E-03). As further steps to refine and visualize these results, the filtered genes were also clustered according to their dynamic profiles (Supporting Information Fig. S1C, S1D) and the temporal clusters were then subjected to GO analysis (Supporting Information Fig. S2A). Cluster III, comprising transiently expressed genes with an onset between days 2 and 4, was notably enriched for the GO processes endoderm formation, ectoderm formation, embryonic heart tube morphogenesis, and heart looping (Supporting Information Figs. S2B, S3).
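For readers unfamiliar with how term-enrichment p-values of this kind are typically obtained, the following is a minimal sketch of an upper-tail hypergeometric test. It is an assumption for illustration only: the text does not state which statistic or software produced the reported values, and the function name, parameter names, and toy counts are hypothetical and not taken from the study.

```python
from math import comb


def enrichment_p_value(hits_in_term, list_size, term_size, background_size):
    """Upper-tail hypergeometric probability P(X >= hits_in_term).

    background_size: genes tested overall
    term_size:       background genes annotated with the term
    list_size:       genes in the filtered (regulated) list
    hits_in_term:    regulated genes annotated with the term
    """
    max_hits = min(list_size, term_size)
    total = comb(background_size, list_size)
    tail = sum(
        comb(term_size, k) * comb(background_size - term_size, list_size - k)
        for k in range(hits_in_term, max_hits + 1)
    )
    return tail / total


# Toy numbers, NOT from the study: 5 of 5 listed genes fall in a
# 5-gene term drawn from a 10-gene background.
p = enrichment_p_value(hits_in_term=5, list_size=5, term_size=5, background_size=10)
print(f"p = {p:.4g}")
```

Exact integer arithmetic via `math.comb` avoids floating-point underflow for very small tail probabilities; dedicated GO tools usually add a multiple-testing correction, which this sketch omits.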
Figure 1. Sox17-dependent genes in differentiating ESCs. (A): The top 20 GO biological process terms that were dysregulated in Sox17-deficient mouse ESCs. Those specific for cardiovascular development and function are highlighted (black), and a p-value of 1E-5 is noted for reference (Supporting Information Figs. S1, S2). (B): Heat map of gene expression levels for 126 Sox17-regulated genes from a curated gene set related to cardiac myogenesis (Supporting Information Table S2). Genes that fulfill the filtering criteria in Supporting Information Figure S1A are presented, grouped according to the temporal clusters obtained from the whole-transcriptome analysis (Supporting Information Fig. S1D). Functional annotations are shown at the right for transcription factors, extracellular/membrane proteins, and muscle-specific genes for cardiac contractility. For the complete set of affected genes, refer to Supporting Information Figure S3. Sox17, Hhex, and Cer1 are highlighted. Abbreviation: GO, gene ontology.
Next, the changes contingent on Sox17 were scrutinized using a manually curated set of 445 genes relevant to cardiac myogenesis, antecedent processes, and selected relatives for multigene families (Supporting Information Table S2). Significant dysregulation was observed in 28% of the genes, that is, a sevenfold enrichment compared to the unbiased 22K chipset (Fig. 1B). Endogenous Sox17 and its direct targets Foxa1 and Foxa2 were suppressed, as expected (preconditions for the knockdown experiment to be valid). Moreover, the genome-wide analysis and specific conditions chosen for data mining were sufficient to capture all the Sox17-dependent genes we had found previously by a limited candidate gene approach. Beyond the representative markers Myh7 and Ryr2, the lack of Sox17 broadly downregulated the genes for diverse cardiac thick filament proteins (Mybpc3, Myh6, Myl2/3/4/7, Mylk3, Myom1, and Ttn), thin filament proteins (Actc1, Actn2, Tnnc1, Tnni1/3, Tnnt2, and Tpm1/2), Z disc proteins (Csrp3/Mlp), Nppa, and regulators of Ca2+ homeostasis (Atp2a2, Pln, and Srl).
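The sevenfold figure can be checked with simple arithmetic. The sketch below uses only numbers stated in the text and in the Figure 1B legend (126 regulated genes out of 445 curated genes, versus 4% of genes regulated on the whole array); the variable names are illustrative, and this is not a reconstruction of the authors' own calculation.

```python
# Values taken from the text and the Figure 1B legend.
curated_hits = 126          # Sox17-regulated genes in the curated set
curated_total = 445         # size of the curated cardiac myogenesis set
background_fraction = 0.04  # ~4% of all tested genes were regulated

curated_fraction = curated_hits / curated_total
fold_enrichment = curated_fraction / background_fraction

print(f"curated set: {curated_fraction:.1%} regulated")
print(f"fold enrichment over background: {fold_enrichment:.1f}")
```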
With regard to cardiogenic transcription factors and their coactivators, the lack of Sox17 resulted in suppression of Foxc1/2, Gata4/5/6, Hopx, Irx3/5, Isl1, Smarcd3/Baf60c, and Smyd1/Bop, in addition to the three factors examined previously (Mesp1, Mef2c, and Nkx2–5). Unlike those mentioned above, Hand1/2 and Tbx20 were upregulated, concomitant with other genes for neural development, consistent with their additional roles, respectively, in neural crest and motoneurons. Thus, in unbiased genome-wide testing, Sox17 expression in ESCs was a prerequisite for the induction of highly diverse cardiogenic transcription factors and cardiac structural genes.
As demonstrated by induction of T, Eomes, Fgf8, Gsc, Cdx2, and Mixl1 to normal or increased levels, suppressing Sox17 did not prevent the induction of primitive mesoderm, mesendoderm, or the primitive streak (Fig. 1B, Supporting Information Fig. S3). In the case of Eomes, a direct activator of Mesp1 [7, 17], this may be due to loss of a known negative feedback loop. Several upregulated mesodermal genes were related to hematopoiesis (Etv2, Fli1, Hoxa9, Hoxb6, Hoxc8, Hba-a1/2, Hba-x, Hbb-bh1, and Hbb-y). Notably, however, key early markers of the multipotential cardiovascular progenitor cell were suppressed (Kdr/Flk1, Pdgfra) [18, 19]. Thus, taking these results together, genome-wide profiling substantiates that Sox17 specifically affects the direction of mesoderm toward a cardiovascular fate, not mesoderm formation per se.
In addition to Foxa1/2, discussed above, multiple markers of early endoderm (Cldn6, Dpp4, Epcam, Foxg1, Nr2f1, Rhox5, and Sparcl1), definitive endoderm (Cd24a, Cxcr4, Foxa1/2, and Kitl), and visceral endoderm (Afp, Cited1, Dab2, Fxyd3, Hesx, and Ttr) were inhibited, the latter notably including Cer1 (Fig. 1B; Supporting Information Fig. S3, cluster III). Several of the endodermal genes cited are reportedly direct targets of Sox17 by chromatin immunoprecipitation (ChIP) and whole-genome promoter tiling arrays, although whether Sox17 directly activates Cer1 is unsubstantiated. Other Sox17-dependent genes included two related F group Sox family members, Sox18 and Sox7, with which Sox17 can be redundant [21, 22]. Given these genes' sequence similarity, we confirmed that the Sox17 shRNAs have no promiscuous effects on cotransfected Sox18 or Sox7 (Supporting Information Fig. S4A). Thus, under the conditions tested, lack of Sox17 downregulates the redundant family members that are required in concert with Sox17 during embryogenesis.
Because the action of Sox17 in mesoderm patterning is cell-nonautonomous, we next tested for operation of a secretory pathway, as opposed to ones requiring cell-cell contact. Doxycycline- (Dox-) dependent “inducer” ESCs bearing Sox17 gain-of-function mutations were able to upregulate Nkx2–5 and Myh6 in wild-type “responder” ESCs, across a semipermeable membrane (Fig. 2A). Thus, Sox17 is sufficient to promote cardiac muscle differentiation from ESCs via one or more soluble signals. In Xenopus embryos, the endodermal genes that most conclusively regulate secreted signals for cardiac specification by primitive mesoderm (the stage regulated by Sox17) are Hhex (whose relevant target is unknown) acting in parallel with Cer1, induced by Wnt antagonists and Nodal, respectively [10, 23]. Using the transmembrane induction assay, Hhex, like Sox17, was sufficient to upregulate Nkx2–5 and Myh6 in responder cells (Fig. 2A). As a prelude to more detailed investigation of these two putative effectors and their potential relationship, we used quantitative real-time RT-PCR (QRT-PCR) to substantiate our microarray finding that Cer1, like Hhex, depended on Sox17 (Fig. 2B).
Figure 2. Sox17 regulates secreted signals for cardiac differentiation. (A): (Left) Schematic cartoon of the experimental design. For details of the Sox17 gain-of-function mutation, refer to Figure 4D and Supporting Information Figure S5B–S5D. (Right) QRT-PCR results for Nkx2–5 and Myh6. (B): Corroboration by QRT-PCR of Cer1 induction as contingent on Sox17. *, p < .05 versus control cells; n ≥ 3. Abbreviation: QRT-PCR, quantitative real-time RT-PCR.
Hhex and Cer1 Act in Series for Mesoderm Patterning to a Cardiac Fate in Differentiating ESCs
To test the requirement for Hhex and Cer1 by RNA interference, we first confirmed the shRNAs' effect on cotransfected Hhex in preparatory studies, then retested these sequences as recombinant lentiviruses and confirmed a block to endogenous Hhex (Supporting Information Fig. S4B). The consequences of Hhex shRNA shown by QRT-PCR strongly resembled those of blocking Sox17 (Fig. 3A): (a) suppression of cardiac structural genes (Myh6 and Ryr2), (b) suppression of cardiogenic transcription factors (Nkx2–5, Myocd, and Mef2c), (c) lack of interference with downregulation of stemness factors (Oct4 and Sox2), and (d) failure to inhibit T, implicating one or more steps later than the formation of primitive mesoderm.
Figure 3. The Sox17-dependent genes Hhex and Cer1 are important for cardiac myogenesis in differentiating ESCs. (A–C): Hhex and (D) Cer1 shRNA suppressed the respective cognate genes in differentiating ESCs and inhibited the induction of cardiac transcription factors and structural genes, acting at a stage subsequent to induction of Mesp1/2. *, p < .05 versus control cells; n ≥ 3. (A, D): Results are shown for the most potent of the shRNAs tested, measured by effectiveness against the endogenous transcripts. For each gene, qualitatively similar results were obtained using at least two independent shRNAs. (B): Partial comparison of the microarray findings with Hhex and Sox17 shRNAs, illustrating the shared impairment of Cer1, cardiac transcription factors, and cardiac structural genes. In addition, a potential positive feedback loop between Hhex and Sox17 is noted. n = 2 for Hhex shRNA; n = 1 for the Luc shRNA controls. (C): Ectopic Cer1 expression rescues cardiac differentiation in Hhex-knockdown ESCs. Cer1 was encoded by a tetO-regulated lentiviral vector and was induced on day 3 by doxycycline. Gene expression was assayed by QRT-PCR. *, p < .05 versus control cells; n ≥ 3.
Among the few differences from Sox17-deficient ESCs, Hhex-deficient ones showed little or no loss of Mesp1/2. Thus, these results suggest that Sox17 acts on Mesp gene induction, whereas Hhex mediates a later stage. In preliminary microarray analyses, further cardiogenic transcription factors and cardiac structural genes were suppressed by Hhex shRNA, and, notably, like Sox17, Hhex was required for the normal induction of Cer1 (Fig. 3B). Conversely, forced expression of Cer1 was sufficient to rescue cardiac differentiation in Hhex-deficient ESCs, as shown by Nkx2–5, Tbx5, and Myh6 expression (Fig. 3C). Thus, Cer1 stands out among the plausible candidates to explain the impact of Sox17, Hhex, or both on cardiac myogenesis.
As done for Hhex, we selected shRNAs against Cer1 and proved their efficacy in transfected 293T cells (Supporting Information Fig. S4B). After lentiviral delivery into ESCs, each shRNA inhibited endogenous Cer1, cardiogenic transcription factors, and cardiac structural genes (Fig. 3D), identical to the results obtained in Hhex-deficient cells. Like Hhex, Cer1 shRNAs did not suppress the tested markers of “stemness” (Oct4 and Sox2), primitive mesoderm (T), or precardiac mesoderm (Mesp1/2). Thus, Cer1—like Hhex—acts at a later stage than the conversion of primitive mesoderm to Mesp-expressing mesoderm and, consequently, later than Sox17. We saw no difference between shRNA-mediated silencing of Hhex versus Cer1, based on the candidate genes studied in both backgrounds. These results do not exclude differences that could emerge from broader surveys or genome-wide profiling. The combined knockdown of Hhex and Cer1 was similar qualitatively and quantitatively to that of either alone (not shown), consistent with the ability of Cer1 to rescue Hhex-deficient cells under these conditions. The lack of any additive effect suggests that Hhex and Cer1 act in series, an interpretation supported strongly by the Cer1 rescue experiment.
Sox17 Couples the Activin/Nodal Pathway to Hhex, Cer1, and Cardiac Specification
In all the experiments above, the requirement for a Sox17-Hhex-Cer1 circuit was demonstrated in differentiating embryoid bodies after aggregation in serum-containing medium, that is, conditions that are spontaneous but biochemically undefined. To ascertain whether this Sox17-Hhex-Cer1 circuit might be essential even if an exogenous differentiating signal were provided, we first tested for induction of these genes in serum-free monolayer culture containing 25 ng/ml Activin (Fig. 4A). Along with transient induction of T on day 4, Activin induced sustained expression of Sox17, Hhex, and Cer1. As expected, Activin was sufficient to provoke a cardiomyocyte phenotype, denoted by Nkx2–5, Tbx5, and Myh6 at day 7. Thus, in addition to the later expression of cardiac markers, Activin elicited the prior induction of Sox17, Hhex, and Cer1.
Figure 4. Sox17 mediates the Activin/Nodal pathway for cardiac myogenesis. (A): Induction of Sox17, Hhex, Cer1, and cardiac genes in Activin-treated embryonic stem cells (ESCs). (B): The Nodal receptor Cripto is essential for induction of the endoderm-associated Sox17-Hhex-Cer1 pathway. (C): Sox17 shRNA recapitulates the Cripto-deficient phenotype in Activin-treated ESCs. Cells were grown in monolayer culture for panels (A) and (C), and as embryoid bodies for panel (B). *, p < .05 versus control cells; n ≥ 3.
Conversely, to assess whether the Nodal/Activin pathway was required to induce the Sox17-Hhex-Cer1 module, we used homozygous-null ESCs lacking the coreceptor Cripto. As reported, cardiac transcription factors (Nkx2–5 and Tbx5) and Myh6 were not induced in the absence of Cripto (Fig. 4B). Similarly, the lack of Cripto reduced Sox17, Hhex, and Cer1 each by 80%–90% (Fig. 4B). By contrast, T was expressed at levels even higher than in wild-type cells, although delayed by 1 day. Thus, in addition to canonical Wnts and BMPs, a third signal is essential to confer Sox17 induction, namely the Nodal/Activin cascade.
To verify whether Sox17 is essential for the induction of Hhex and Cer1 by exogenous Activin, we next compared control and knockdown ESCs in the serum-free monolayer cultures (Fig. 4C). Corresponding to the requirement for Sox17 in embryoid bodies (Fig. 1B; Supporting Information Fig. S3, cluster III), suppressing Sox17 likewise prevented the induction of Hhex and Cer1 by recombinant Activin. Despite forced stimulation of the Activin/Nodal pathway, suppressing Sox17 resulted in the failure of cardiac myocyte differentiation, measured here by Nkx2–5, Myocd, and Myh6 (Fig. 4C). Together, these complementary results clearly position Sox17 upstream from both Hhex and Cer1, mammalian counterparts of the endodermal signals for cardiac myogenesis in Xenopus, and likely explain the absence of Cer1 seen in Cripto-deficient ESCs.
Sox17 Binds and Activates Endogenous Cer1
To distinguish whether Sox17 confers expression of Cer1 only indirectly, via Hhex, or also acts on Cer1 directly, we performed Sox17 ChIP assays, using (A/T)(A/T)CAA(A/T) sites conserved across the mouse, canine, rhesus, and human Cer1 genes, guided by the (A/T)(A/T)CAA(A/T)G consensus binding site for the Sox family and the ATTGT core site for Sox17 itself (Fig. 5A; Supporting Information Table S3). Epitope-tagged Sox17 was transduced into ESCs using a tetracycline-inducible lentiviral system, to obviate potential confounding effects of constitutive expression. Negative controls were randomly selected regions lacking this motif, remote from predicted binding sites. Predicted Sox sites from the Foxa1 and Foxa2 loci (Fig. 6A) also were assayed, as these genes are proven direct targets of Sox17. In concordance with identification of Cer1 as a potential Sox17 target by ChIP-chip, albeit along with 1,800 other genes, we specifically confirmed Sox17 binding at each of the predicted sites we tested from the upstream 6 kbp, with enrichment at least equal to that for the sites in Foxa1/2 (Fig. 5B). Thus, predicted binding sites in the upstream region bind Sox17 efficiently.
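As an illustration of how candidate sites matching the Sox family consensus (A/T)(A/T)CAA(A/T)G might be located on either strand of a sequence, the following is a minimal sketch. It is an assumption for illustration, not the procedure used in the study: it ignores cross-species conservation, and the function names and test sequences are hypothetical.

```python
import re

# Sox family consensus from the text: (A/T)(A/T)CAA(A/T)G
SOX_CONSENSUS = "[AT][AT]CAA[AT]G"
MOTIF_LENGTH = 7
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_sox_sites(seq):
    """List (start, strand, matched_sequence) for consensus matches.

    start is the 0-based position of the leftmost base of the site
    on the forward strand, whichever strand carries the match.
    """
    seq = seq.upper()
    hits = []
    for strand, template in (("+", seq), ("-", reverse_complement(seq))):
        # The lookahead lets overlapping matches be reported as well.
        for match in re.finditer(f"(?=({SOX_CONSENSUS}))", template):
            if strand == "+":
                start = match.start()
            else:
                start = len(seq) - match.start() - MOTIF_LENGTH
            hits.append((start, strand, match.group(1)))
    return sorted(hits)


# Hypothetical toy sequences, not taken from the Cer1 locus.
print(find_sox_sites("GGAACAAAGCC"))
print(find_sox_sites("CCCTTTGTTGG"))
```

Scanning the reverse complement matters because a site such as CTTTGTT on the forward strand reads as AACAAAG on the opposite strand.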
Figure 5. Sox17 binds to evolutionarily conserved Sox sites in the Cer1 upstream region. (A): Predicted Sox17 binding sites in the Cer1, Foxa1, and Foxa2 loci. Primers corresponding to predicted binding sites versus irrelevant control regions are indicated in black and white, respectively. Conservation profiles are shown for the human and rhesus orthologs (range, 50%–100%). Pink bars above the profiles denote regions of conservation with the mouse genome; blue, coding exons; yellow, untranslated regions; salmon, introns; red, intergenic regions; green, transposons and simple repeats. (B): Chromatin immunoprecipitation, assayed by quantitative PCR, shown as the fold enrichment for the indicated regions (V5 antibody, normalized for nonspecific precipitation by nonimmune IgG). Black, predicted Sox binding sites; white, irrelevant regions. For Foxa1 and Foxa2, refer to Figure 6A.
Figure 6. Sox17 activates Cer1. (A–F): Mapping the Sox17 transactivation domain. (A) Schematic representation of the Sox17 deletion mutants. Domain of unknown function 3547 (Pfam 24.0) designates the conserved C-terminal region of F group Sox proteins. (B): Western blot analysis of the constructs in 293T cells. (C): Sox-dependent reporter gene activity (SOP-FLASH) in 293T cells, in the presence of cotransfected Sox17 expression vectors. Deletion of the N terminus (C1) increases the transcriptional activity. Deletions of the conserved C-terminal domain (N3, N4) attenuate transactivation. (D): Schematic representation of the GAL4DBD-Sox17 fusion proteins. (E): Western blot analysis of the constructs in 293T cells, using antibody to Gal4. (F): Reporter gene activity (5xGal4-luc) in 293T cells, induced by GAL4-Sox17 vectors. Deletion of the Sox17 C-terminal domain (GAL4 129–299) cripples transactivation. Other constructs showed activity equal to or greater than that of GAL4-VP16. (G, H): Doxycycline-dependence of the Sox17 vectors, measured in AB2.2 cells by flow cytometry (G), Western blotting (H, above), and transactivation of SOP (H, below). (I): Wild-type Sox17 and the C1 truncation both induce endogenous Cer1. n ≥ 3; *, p < .05 versus control embryonic stem cells. (J): The multimerized Sox17 site at −657 of the Cer1 locus mediates Dox-dependent, sequence-specific trans-activation. n = 6; *, p < .01 versus the absence of Dox.
To test whether exogenous Sox17 suffices to induce Cer1, we first mapped the transactivation domain of mouse Sox17, based on the structural organization of XSox17β and other Sox proteins (Fig. 6A, 6B). Each construct was cotransfected into 293T cells with the Sox-dependent luciferase reporter SOP, containing a concatemer of CTTTGTT (the reverse complement of AACAAAG) (Fig. 6C). Activation was obtained only with wild-type Sox17 or an N-terminal truncation that retains both the DNA binding domain (HMG box) and C-terminus (“C1”). C1 was 14-fold more potent than wild-type Sox17, suggesting the presence of auto-inhibitory elements in the N terminus. None of the constructs activated the inactive control reporter, NOP.
To confirm the putative function of this C-terminal activation domain, we constructed chimeric expression vectors encoding the Sox17 truncations in frame with the Gal4 DNA-binding domain (DBD; Fig. 6D, 6E). Tested using a Gal4-dependent luciferase reporter gene, four of the five fusion proteins (all those preserving the distal C-terminus) were at least as potent as the Gal4-VP16 control, but the fusion containing residues 129–299, which lacks the distal C-terminus, was inactive, similar to the Gal4DBD alone (Fig. 6F). Thus, the Sox17 trans-activation domain is located in the distal C-terminus.
On this basis, wild-type Sox17, the N-terminal portion N3, and the C-terminal portion C1 were compared for their ability to induce endogenous Cer1, using Dox-dependent lentiviral vectors. Flow cytometry for eGFP 1 day after Dox administration demonstrated that roughly 80% of the cells were successfully induced (Fig. 6G), versus 0.5%–0.8% for cells without Dox. All three Sox17 proteins were induced efficiently, without discernible leak (Fig. 6H, upper panel). In agreement with their respective activity toward the SOP reporter (Fig. 6H, lower panel), viruses encoding wild-type Sox17 and C1 conferred Dox-dependent induction of endogenous Cer1 (16–120-fold; Fig. 6I), with C1 (lacking the auto-inhibitory domain) being even more potent than the wild-type protein and N3 (lacking the activation domain) being altogether ineffective.
To determine whether Sox17 can specifically activate the binding sites verified in the Cer1 locus, the inducible expression vectors were then also tested against Cer1 luciferase reporter genes. Construction was based on the SOP reporter, containing instead a 3× multimer of the Cer1 proximal or distal Sox17 binding site, and using each in its wild-type or mutationally inactivated form (−657, Fig. 6J; −6017, not shown). Both wild-type Sox17 and the gain-of-function mutation C1 evoked Dox-dependent, sequence-specific trans-activation via the multimerized proximal site. No induction was seen if the Sox17 binding motif was mutated, a minimal promoter was tested, or N3 was used, lacking the trans-activation domain. As the distal Sox17 site was activated by C1, but not wild-type Sox17, its activation should be interpreted more cautiously, even though sequence-specific. Together with our evidence that Sox17 specifically binds the Cer1 promoter, these gain-of-function studies indicate that Sox17 may directly drive Cer1 transcription, at least in part through the proximal Sox17 site. None of the Sox17 expression vectors upregulated endogenous Hhex (data not shown), suggesting that Sox17 is required but not sufficient for Hhex induction.